file reading

Hi All,
I am having a problem with a basic file operation.

I have an XML file that looks like this:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<x:recording>

<udf3>Gélin</udf3>

</x:recording>

My code reads this file into a string and then calls a database stored
procedure, which parses the XML and stores it in some tables.

e.g.:
FILE * file = fopen("testFile.xml","r+b");

struct _stat buffer;

int result1 = _stat( "testFile.xml", &buffer );

int size = buffer.st_size;
char *temp = new char [(sizeof(char))*(size+1)];
fread(temp,sizeof(char),size,file);

I then pass temp to Ado for SP execution.

Problem:

As you can see, the XML file contains one high-order ASCII character, 'é'.

It reaches the SP as the wrong characters, 'Ã©'.

While debugging, I can also see that temp already holds this wrong value.

I am reading the file in binary mode, so why is this still happening?

Can you please help me resolve this?

Apr 12 '07 #1
ra***************@yahoo.co.in wrote:
> Hi All,
> I am having a problem with a basic file operation.
>
> I have an XML file that looks like this:
> <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
> <x:recording>
>
> <udf3>Gélin</udf3>
>
> </x:recording>
>
> My code reads this file into a string and then calls a database stored
> procedure, which parses the XML and stores it in some tables.
>
> e.g.:
> FILE * file = fopen("testFile.xml","r+b");
>
> struct _stat buffer;
>
> int result1 = _stat( "testFile.xml", &buffer );
>
> int size = buffer.st_size;
> char *temp = new char [(sizeof(char))*(size+1)];

As we are in the C world, that should be malloc, and sizeof(char) is by
definition 1.

> fread(temp,sizeof(char),size,file);
>
> I then pass temp to Ado for SP execution.

What are Ado and SP? Without knowing what is called, it's difficult to
answer the question.

> Problem:
>
> As you can see, the XML file contains one high-order ASCII character, 'é'.
>
> It reaches the SP as the wrong characters, 'Ã©'.

What happens if you use unsigned char? Does the function you are
calling expect ASCII or UTF-8, char, unsigned char or something else?

--
Ian Collins.
Apr 12 '07 #2
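
On the reading side, binary mode already hands back the file's bytes unchanged, so the read itself is not what turns 'é' into two bytes. The snippet in the question does have some unrelated gaps, though: the results of fopen and _stat are never checked, "r+b" opens the file for update when "rb" is enough, and temp is never NUL-terminated. What follows is a minimal sketch in plain C of a more defensive read, assuming only the standard library; read_whole_file is a hypothetical helper name, not something from the original post.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read an entire file as raw bytes.
 * Returns a malloc'd, NUL-terminated buffer and stores the byte count
 * in *out_len; returns NULL on any failure.
 * (read_whole_file is a hypothetical name chosen for this sketch.) */
char *read_whole_file(const char *path, size_t *out_len)
{
    FILE *fp = fopen(path, "rb");   /* read-only, binary: no newline translation */
    if (fp == NULL)
        return NULL;

    size_t cap = 4096, len = 0;
    char *buf = malloc(cap + 1);    /* +1 leaves room for the terminator */
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    for (;;) {
        size_t n = fread(buf + len, 1, cap - len, fp);
        len += n;
        if (len < cap)              /* short read: end of file or error */
            break;
        cap *= 2;                   /* buffer full: grow and keep reading */
        char *bigger = realloc(buf, cap + 1);
        if (bigger == NULL) {
            free(buf);
            fclose(fp);
            return NULL;
        }
        buf = bigger;
    }

    if (ferror(fp)) {               /* distinguish a read error from EOF */
        free(buf);
        fclose(fp);
        return NULL;
    }
    fclose(fp);

    buf[len] = '\0';                /* safe to treat as a C string now */
    *out_len = len;
    return buf;
}
```

The caller owns the returned buffer and releases it with free(). Reading in a loop avoids relying on _stat, which is not part of standard C.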
ra***************@yahoo.co.in wrote:
> I have an XML file that looks like this:
> <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
                               ^^^^^^^
There's your problem.

> As you can see, the XML file contains one high-order ASCII character, 'é'.

No, it doesn't.

> It reaches the SP as the wrong characters, 'Ã©'.

This is what is actually in the file.

Read up on UTF-8. It's a way of encoding Unicode, including characters
_above_ 0xFF (such as Devanagari and other Indian scripts, which may be
one reason why the person who supplied your file uses it), in sequences
of 8-bit bytes. This does mean that all characters above 0x7F must be
encoded in two or more bytes. Either just pass on the UTF-8, or decode it
by hand; it's not hard. The greatest problem is going to be deciding what
to do when (not if!) you do get a Unicode character that won't fit in
your C char.

Richard
Apr 12 '07 #3
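
Decoding by hand, as suggested above, could look roughly like the following sketch. It assumes the consumer wants ISO-8859-1 (Latin-1) text, which the thread does not confirm, and it handles a character that "won't fit in your C char" by substituting '?' for anything above 0xFF or malformed. utf8_decode and utf8_to_latin1 are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Decode one UTF-8 sequence starting at s, with n bytes available.
 * Stores the code point in *cp and returns the number of bytes consumed,
 * or 0 if the bytes are not well-formed UTF-8. */
static size_t utf8_decode(const unsigned char *s, size_t n, unsigned long *cp)
{
    size_t len, i;
    unsigned long c, min;

    if (n == 0)
        return 0;
    if (s[0] < 0x80) { *cp = s[0]; return 1; }            /* plain ASCII */
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; min = 0x10000; }
    else return 0;              /* stray continuation byte or invalid lead */

    if (n < len)
        return 0;               /* sequence truncated */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;           /* not a 10xxxxxx continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values beyond Unicode. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;
    *cp = c;
    return len;
}

/* Convert n bytes of UTF-8 to Latin-1. Code points above 0xFF and
 * malformed bytes become '?' (an assumption of this sketch).
 * dst needs room for n + 1 bytes. Returns bytes written, excluding NUL. */
size_t utf8_to_latin1(const char *src, size_t n, char *dst)
{
    const unsigned char *s = (const unsigned char *)src;
    unsigned char *d = (unsigned char *)dst;
    size_t i = 0, out = 0;

    while (i < n) {
        unsigned long cp;
        size_t used = utf8_decode(s + i, n - i, &cp);
        if (used == 0) {
            d[out++] = '?';     /* skip one bad byte and resynchronise */
            i++;
        } else {
            d[out++] = (cp <= 0xFF) ? (unsigned char)cp : '?';
            i += used;
        }
    }
    d[out] = '\0';
    return out;
}
```

For the file in question, the six bytes of "Gélin" (47 C3 A9 6C 69 6E) come out as five Latin-1 bytes with 0xE9 in the second position. If the stored procedure accepts UTF-8, passing the bytes through unchanged avoids this lossy step altogether.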
On Apr 12, 1:22 pm, r...@hoekstra-uitgeverij.nl (Richard Bos) wrote:
> ramyakrishnaku...@yahoo.co.in wrote:
>> I have an XML file that looks like this:
>> <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
>                               ^^^^^^^
> There's your problem.
>
>> As you can see, the XML file contains one high-order ASCII character, 'é'.
>
> No, it doesn't.
>
>> It reaches the SP as the wrong characters, 'Ã©'.
>
> This is what is actually in the file.
>
> Read up on UTF-8. It's a way of encoding Unicode, including characters
> _above_ 0xFF (such as Devanagari and other Indian scripts, which may be
> one reason why the person who supplied your file uses it), in sequences
> of 8-bit bytes. This does mean that all characters above 0x7F must be
> encoded in two or more bytes. Either just pass on the UTF-8, or decode it
> by hand; it's not hard. The greatest problem is going to be deciding what
> to do when (not if!) you do get a Unicode character that won't fit in
> your C char.
>
> Richard

The file is written by another routine, where all the characters
are written using fwrite.
In that routine the header is hard-coded as "<?xml version="1.0"
encoding="UTF-8" standalone="no" ?>".
I think this conversion of characters is happening after they are written
into the file, right?

Can we change anything (for example, use some other XML format) while
writing the XML file, so that these characters are stored without conversion?

In the reading code, how will it know that these two characters
belong to one character? Or is there some other decoding mechanism?

I am not very familiar with XML.

I also tried reading with Unicode wide chars, but it did not read
properly.

Apr 12 '07 #4
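
On the question of how reading code can tell that two bytes belong to one character: UTF-8 marks this in the high bits of each byte. The first byte of a sequence announces how long the sequence is, and every byte after it has the form 10xxxxxx. Below is a small sketch in C, with hypothetical helper names, that looks only at those bit patterns and does not otherwise validate the input.

```c
#include <stddef.h>

/* Number of bytes in the UTF-8 sequence that begins with `lead`,
 * judged from the bit pattern alone. Returns 0 when `lead` is a
 * continuation byte (10xxxxxx) or cannot start a sequence. */
size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)           return 1;   /* 0xxxxxxx: ASCII       */
    if ((lead & 0xE0) == 0xC0) return 2;   /* 110xxxxx + 1 more     */
    if ((lead & 0xF0) == 0xE0) return 3;   /* 1110xxxx + 2 more     */
    if ((lead & 0xF8) == 0xF0) return 4;   /* 11110xxx + 3 more     */
    return 0;
}

/* Count characters (code points) in n bytes of UTF-8 by counting every
 * byte that is NOT a continuation byte. Assumes the input is valid. */
size_t utf8_count(const char *s, size_t n)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

In the sample file, 0xC3 is a lead byte announcing a two-byte sequence and 0xA9 is its continuation byte, so the pair is read as the single character 'é'.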
ra***************@yahoo.co.in wrote:
> On Apr 12, 1:22 pm, r...@hoekstra-uitgeverij.nl (Richard Bos) wrote:
>> ramyakrishnaku...@yahoo.co.in wrote:
>>> I have an XML file that looks like this:
>>> <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
>>                               ^^^^^^^
>> There's your problem.
>>
>>> As you can see, the XML file contains one high-order ASCII character, 'é'.
>>
>> No, it doesn't.
>>
>>> It reaches the SP as the wrong characters, 'Ã©'.
>>
>> This is what is actually in the file.
>>
>> Read up on UTF-8. It's a way of encoding Unicode,
>
> The file is written by another routine, where all the characters
> are written using fwrite.
> In that routine the header is hard-coded as "<?xml version="1.0"
> encoding="UTF-8" standalone="no" ?>".
> I think this conversion of characters is happening after they are written
> into the file, right?

How the blazes should _I_ know? _You_ have access to (and possibly even
wrote) this "routine", whether that means a function or whatever; I do
not.

> In the reading code, how will it know that these two characters
> belong to one character? Or is there some other decoding mechanism?

My dear boy, if you won't do your own research, you'll never amount to a
programmer. Information on UTF-8 is extremely easy to come by.

Richard
Apr 13 '07 #5
