Parsing through a file and collect data ...

Greetings,

I have a file into which I have written some data in the following
manner:

Charlene1719056:2011392059"1.908.555.1212"07083

The data is arranged in this order:
name, size, unique key, phone number, zip code

There may be hundreds of these entries in the file. I would like to
parse through it, collect this information, and assign each value to a
variable, which I can later insert into a database.

I count the number of entries at the beginning of the read, so I know how
many records I need to parse. I am having difficulty
parsing around the colon and the two quotation marks to grab what is in
between and after.

Any assistance would be greatly appreciated.

Thank you and best regards,

Johnny
Jul 19 '05 #1
Hi Johnny

My first suggestion here is to delimit each field, something like:

Charlene|1719056|:2011392059|"1.908.555.1212"|07083

And then separate each record with a DIFFERENT delimiter. Some people prefer a newline, but you could use a space or any other character that never appears inside a field.
Example:

Charlene|1719056|:2011392059|"1.908.555.1212"|07083 Charlene|1719056|:2011392059|"1.908.555.1212"|07083
-or-
Charlene|1719056|:2011392059|"1.908.555.1212"|07083
Charlene|1719056|:2011392059|"1.908.555.1212"|07083

Then you could simply do..

open( file );

read_a_record(); /* Take each record one at a time. Try looking into strtok() */
-> split_record_up(); /* Take each entry of the record. Try strtok() again */

continue read_a_record() until EOF

close( file );

Hope this helped!

-Elliot :)

---
"One must imagine Sisyphus happy."
Jul 19 '05 #2
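
The read-a-record / split-a-record loop from post #2 could be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: records are separated by newlines, fields by '|', and no field contains either character. It uses std::getline with a delimiter in place of the strtok() mentioned in the post, and the names split_record and read_records are hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Split one record such as
//   Charlene|1719056|:2011392059|"1.908.555.1212"|07083
// into its fields. '|' is the field delimiter suggested in the post.
std::vector<std::string> split_record(const std::string& record, char delim = '|')
{
    std::vector<std::string> fields;
    std::istringstream in(record);
    std::string field;
    while (std::getline(in, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}

// Read newline-separated records from any input stream (for example a
// std::ifstream opened on the data file) until EOF, splitting each one.
// Empty lines are skipped.
std::vector<std::vector<std::string>> read_records(std::istream& in)
{
    std::vector<std::vector<std::string>> records;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            records.push_back(split_record(line));
        }
    }
    return records;
}
```

Passing a std::ifstream to read_records would cover the open / loop / close steps, since the stream closes the file when it goes out of scope.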
Elliot,

Thank you for your suggestions; however, I have no control over the
structure of the data. I have to deal with it as is and manipulate it
as posted.

Any loop suggestions or string manipulation concepts and techniques
would be greatly appreciated.

Thanks,

Johnny
Jul 19 '05 #3
Eek!
Things got a bit tougher, but they shouldn't be too hard.
The only way to get the computer to be able to parse the data would be to make sure the majority of the fields have a DEFINITE LENGTH.

Charlene  1719056  :2011392059  "1.908.555.1212"  07083
name      size     unique key   phone number      zip code

My first logical guess would be to do this:
read the whole string
strrev( string ); /* reverse it so the fixed-width fields come first; note strrev() is a nonstandard extension */
zip_code = strrev( read_chars( 5 ) ); /* 07083 */
phone = strrev( read_chars( 16 ) ); /* "1.908.555.1212" */
key_no = strrev( read_chars( 11 ) ); /* :2011392059 */
size = strrev( read_chars( 7 ) ); /* 1719056 */

name = strrev( read_chars( strlen( string ) - 39 ) ); /* 39 = (5+16+11+7) */

This, of course, assumes that the zip, phone, key, and size fields each have the same length in every record. Unfortunately, I don't expect that your size field will always be constant, i.e. it won't always be a 7-figure number. If so, the problem becomes a bit more technical: after taking off the fixed-width fields, you would have to read the size backwards one digit at a time (as characters!) until you reached a non-digit character, which would be the last character of the name.

I hope this helps!
-Elliot :)
Jul 19 '05 #4
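
The right-to-left idea in post #4 can also be done without reversing the string, by taking substrings from the end. The sketch below is illustrative only and assumes what the post assumes: the zip (5 characters), phone (16, including its quotes) and key (11, including the colon) are fixed-width, the size is a run of digits of any length, and the name does not end in a digit. The Record struct and the function name parse_from_right are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

struct Record {
    std::string name, size, key, phone, zip;
};

// Peel fixed-width fields off the end of the line, then walk left over the
// digits of the variable-width size field. Returns false if the line is too
// short or has no name or size.
bool parse_from_right(const std::string& s, Record& out)
{
    const std::size_t kZip = 5, kPhone = 16, kKey = 11;  // widths from the post
    if (s.size() <= kZip + kPhone + kKey) return false;

    std::size_t pos = s.size();
    pos -= kZip;   out.zip   = s.substr(pos, kZip);      // 07083
    pos -= kPhone; out.phone = s.substr(pos, kPhone);    // "1.908.555.1212" with quotes
    pos -= kKey;   out.key   = s.substr(pos, kKey);      // :2011392059 with colon

    // Everything before pos is name + size; the size is the trailing digits.
    std::size_t size_start = pos;
    while (size_start > 0 &&
           std::isdigit(static_cast<unsigned char>(s[size_start - 1]))) {
        --size_start;
    }
    out.size = s.substr(size_start, pos - size_start);
    out.name = s.substr(0, size_start);
    return !out.size.empty() && !out.name.empty();
}
```

As in the pseudocode, the phone keeps its quotation marks and the key keeps its leading colon; stripping them would be a separate step.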
Elliot,

Thank you for your advice. Since I am not sure how the string will
change over time, I used the string functions to search it
for the first instance of " and the last instance of ", etc. Then I
used a substring function call to grab the data in between. It seems to
be working now.

Thanks,

Johnny
Jul 19 '05 #5
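
Post #5 does not show the code that was used, so the following is only a guess at what searching for the first and last quotation mark and taking substrings might look like with std::string. It assumes the name contains no digits and that the colon always precedes the opening quote; the Record struct and the function name parse_by_markers are hypothetical.

```cpp
#include <cstddef>
#include <string>

struct Record {
    std::string name, size, key, phone, zip;
};

// Locate the colon and the first and last quotation marks, then cut the
// line into fields with substr(). Returns false if a marker is missing.
bool parse_by_markers(const std::string& s, Record& out)
{
    const std::size_t npos  = std::string::npos;
    const std::size_t colon = s.find(':');
    const std::size_t open  = s.find('"');   // first quotation mark
    const std::size_t close = s.rfind('"');  // last quotation mark
    if (colon == npos || open == npos || open == close || colon > open) {
        return false;
    }

    // Text before the colon is name followed by size; split at the first digit.
    const std::string head = s.substr(0, colon);
    const std::size_t digit = head.find_first_of("0123456789");
    if (digit == npos || digit == 0) return false;

    out.name  = head.substr(0, digit);
    out.size  = head.substr(digit);
    out.key   = s.substr(colon + 1, open - colon - 1);
    out.phone = s.substr(open + 1, close - open - 1);
    out.zip   = s.substr(close + 1);
    return true;
}
```

Because nothing here depends on field widths, this variant keeps working if the size, key, or phone number change length, which matches the concern raised in the post.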
I'd read this record by record into a char array, and use sscanf to
split it up. Or perhaps read it record by record into a std::string and
use sscanf and the std::string c_str() method.

--
Simon Elliott
http://www.ctsn.co.uk/


Jul 22 '05 #6
Elliott,

If I have the following:

char ScannedData[256] = "proc x86 family 6 model 7 type 3";

How can I use sscanf to grab x86, 6, 7 and 3?

I then want to replace the x with the value that is after family to create 686.

Thanks,

Johnny
Jul 22 '05 #7
Johnny Sandaire wrote:

Elliott,

If I have the following:

char ScannedData[256] = "proc x86 family 6 model 7 type 3";

How can I use sscanf to grab x86, 6, 7 and 3?

Depends.
Are those texts constant or can they vary? Is the format fixed or is it variable?

I assume the simplest case:

char Filler1[80], Filler2[80], Filler3[80], Filler4[80];
char Proc[80], Family[80], Model[80], Type[80];

/* %79s keeps each token within its 80-byte buffer */
sscanf( ScannedData, "%79s %79s %79s %79s %79s %79s %79s %79s",
        Filler1, Proc,
        Filler2, Family,
        Filler3, Model,
        Filler4, Type );

I then want to replace the x with the value that is after family to create 686.


So Family is always 1 character?

Proc[0] = Family[0];

Of course, the above would need some error checking (for example, testing that sscanf returned 8), etc.
Additionally: this is just one (simple) way to do it. Since your
requirements may vary, so may the way to solve the problem.
Also: since this is C++, a switch from character arrays and sscanf
to std::string and std::stringstream would be a good idea.

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #8
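
The closing suggestion in post #8, moving from character arrays and sscanf to std::string and string streams, might look something like this. It is a sketch that assumes the simplest case described in the post: the text always has the form "proc X family Y model Z type W" with single spaces, and the family value is one character. The ProcInfo struct and the parse_proc name are hypothetical.

```cpp
#include <sstream>
#include <string>

struct ProcInfo {
    std::string proc, family, model, type;
};

// Read alternating label/value tokens from the text. The labels ("proc",
// "family", "model", "type") are read into a throwaway string, mirroring
// the Filler buffers in the sscanf version. Returns false if any token is
// missing.
bool parse_proc(const std::string& text, ProcInfo& out)
{
    std::istringstream in(text);
    std::string label;
    if (!(in >> label >> out.proc
             >> label >> out.family
             >> label >> out.model
             >> label >> out.type)) {
        return false;
    }

    // Replace the leading 'x' of "x86" with the family digit, giving "686".
    // Done only when the family is a single character, as the post assumes.
    if (!out.proc.empty() && out.proc[0] == 'x' && out.family.size() == 1) {
        out.proc[0] = out.family[0];
    }
    return true;
}
```

The stream extraction fails cleanly when tokens are missing, which covers part of the error checking the post says the sscanf version still needs.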
