Hello,
I suspect this comes up quite often, but I haven't found an exact
solution in the FAQ. I have to read and parse a file with arbitrarily
long lines and have come up with the following plan:
1. start with a statically allocated buffer and a pointer of equal size
2. read into the buffer using fgets and append to the pointer
3. if buffer does not contain '\n', reallocate buffer and jump to 2
4. return the pointer
Do you see anything wrong with this? If so, how can I improve it?
Thanks in advance,
Vlad Dogaru
--
Number one reason to date an engineer:
The world does revolve around us; we pick the coordinate system.
Vlad Dogaru said:
> Hello,
> I suspect this comes up quite often, but I haven't found an exact
> solution in the FAQ. I have to read and parse a file with arbitrarily
> long lines and have come up with the following plan:
> 1. start with a statically allocated buffer and a pointer of equal size
> 2. read into the buffer using fgets and append to the pointer
> 3. if buffer does not contain '\n', reallocate buffer and jump to 2
> 4. return the pointer
> Do you see anything wrong with this? If so, how can I improve it?
To start with, you can't reallocate a statically allocated buffer! Nor
can you have a pointer of equal size to a buffer except by sizing the
buffer to be the same size as a pointer. Nor can you append to a
pointer.
Once we get those impossibilities out of the way, we can dispense with
the unnecessary fgets call - your input is already buffered, so why
buffer it again through fgets?
Here's the plan:
Allocate C (greater than 1) bytes of storage space DYNAMICALLY - point
at this allocation with P. Set U to 0. Have a temporary pointer T
kicking about the place.
While you can read a character successfully that isn't a newline:
    If U == C - 1
        You're about to run out of space, so get some more
        T = realloc(P, C * 2)
        If that didn't work, you might want to try lower multipliers
        (1.5, 1.25 maybe) or even use add instead of multiply - and
        warn the caller that you're running low on RAM.
        Eventually, either you give up (in which case tell the user
        you failed), or you succeed, in which case set P = T
        Increase C to describe the new allocation amount accurately
    Endif
    If all is well
        P[U++] = the character you read
    Endif
Endwhile
If all is well
    P[U] = '\0'
Endif
P now contains the line.
For a discussion of long-line issues, an implementation of a full line
capture function, and links to other such implementations, see
http://www.cpax.org.uk/prg/writings/fgetdata.php
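The plan above can be sketched in C roughly as follows. This is a minimal illustration, not the implementation from the linked page: the function name read_line, the starting capacity of 16 and the single "add instead of multiply" fallback are all assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (name and signature are illustrative only).
 * Reads one line from fp into a malloc'd, NUL-terminated buffer, without
 * the newline. Returns NULL at end of file when nothing was read, or on
 * allocation failure. The caller frees the result. */
char *read_line(FILE *fp)
{
    size_t cap = 16;            /* C: bytes currently allocated (> 1) */
    size_t used = 0;            /* U: bytes currently stored */
    char *buf;                  /* P */
    int ch;

    assert(fp != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while ((ch = getc(fp)) != EOF && ch != '\n') {
        if (used == cap - 1) {              /* keep one byte for '\0' */
            size_t newcap = cap * 2;
            char *tmp = realloc(buf, newcap);   /* T */
            if (tmp == NULL) {
                /* doubling failed: try a smaller, additive step */
                newcap = cap + 16;
                tmp = realloc(buf, newcap);
            }
            if (tmp == NULL) {              /* give up, report failure */
                free(buf);
                return NULL;
            }
            buf = tmp;                      /* P = T */
            cap = newcap;
        }
        buf[used++] = (char)ch;
    }

    if (ch == EOF && used == 0) {           /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[used] = '\0';
    return buf;
}
```

A caller would loop with something like `while ((line = read_line(fp)) != NULL) { ...; free(line); }`. Note that this sketch does not distinguish end of file from a read error or an allocation failure; a fuller version would report those separately.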
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Vlad Dogaru wrote:
> Hello,
> I suspect this comes up quite often, but I haven't found an exact
> solution in the FAQ. I have to read and parse a file with arbitrarily
> long lines and have come up with the following plan:
> 1. start with a statically allocated buffer and a pointer of equal size
> 2. read into the buffer using fgets and append to the pointer
> 3. if buffer does not contain '\n', reallocate buffer and jump to 2
> 4. return the pointer
> Do you see anything wrong with this?
Possibly with the phrase "statically allocated".
There are three kinds of storage duration:
1 automatic
2 static
3 allocated
Only allocated memory can be reallocated.
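As a small illustration of the three storage durations, the sketch below declares one object of each kind and grows only the allocated one. The names static_buf, auto_buf and grow_demo are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static char static_buf[8] = "static";   /* static storage duration */

/* Hypothetical demo: builds "autostatic" in allocated memory.
 * Returns NULL on allocation failure. The caller frees the result. */
char *grow_demo(void)
{
    char auto_buf[8] = "auto";          /* automatic storage duration */
    size_t alen = strlen(auto_buf);
    size_t slen = strlen(static_buf);
    char *p, *tmp;

    p = malloc(alen + 1);               /* allocated storage duration */
    if (p == NULL)
        return NULL;
    memcpy(p, auto_buf, alen + 1);

    /* realloc(auto_buf, ...) or realloc(static_buf, ...) would be
     * undefined behaviour: neither pointer came from malloc, calloc
     * or realloc. Only p may be passed to realloc. */
    tmp = realloc(p, alen + slen + 1);
    if (tmp == NULL) {
        free(p);                        /* the old block is still valid */
        return NULL;
    }
    memcpy(tmp + alen, static_buf, slen + 1);
    assert(strlen(tmp) == alen + slen);
    return tmp;
}
```

The result of realloc is stored in a temporary first so that the original block is not leaked if the call fails.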
> If so, how can I improve it?
A few of the regulars here
have written their own getline functions: http://www.cpax.org.uk/prg/writings/...ta.php#related
--
pete
Richard Heathfield wrote:
> Vlad Dogaru said:
>> Hello,
>> I suspect this comes up quite often, but I haven't found an exact
>> solution in the FAQ. I have to read and parse a file with arbitrarily
>> long lines and have come up with the following plan:
>> 1. start with a statically allocated buffer and a pointer of equal size
>> 2. read into the buffer using fgets and append to the pointer
>> 3. if buffer does not contain '\n', reallocate buffer and jump to 2
>> 4. return the pointer
>> Do you see anything wrong with this? If so, how can I improve it?
> To start with, you can't reallocate a statically allocated buffer! Nor
> can you have a pointer of equal size to a buffer except by sizing the
> buffer to be the same size as a pointer. Nor can you append to a
> pointer.
> Once we get those impossibilities out of the way, we can dispense with
> the unnecessary fgets call - your input is already buffered, so why
> buffer it again through fgets?
If anything, my lack of English skills has contributed to the
misunderstanding. I was talking about:
char b[100], *p;
Reading into b with fgets, then reallocating p as necessary to do a
strcat(p, b).
But your solution is much more elegant and now I see why fgets is
unnecessary.
> Here's the plan:
> Allocate C (greater than 1) bytes of storage space DYNAMICALLY - point
> at this allocation with P. Set U to 0. Have a temporary pointer T
> kicking about the place.
> While you can read a character successfully that isn't a newline:
>     If U == C - 1
>         You're about to run out of space, so get some more
>         T = realloc(P, C * 2)
>         If that didn't work, you might want to try lower multipliers
>         (1.5, 1.25 maybe) or even use add instead of multiply - and
>         warn the caller that you're running low on RAM.
>         Eventually, either you give up (in which case tell the user
>         you failed), or you succeed, in which case set P = T
>         Increase C to describe the new allocation amount accurately
>     Endif
>     If all is well
>         P[U++] = the character you read
>     Endif
> Endwhile
> If all is well
>     P[U] = '\0'
> Endif
> P now contains the line.
> For a discussion of long-line issues, an implementation of a full line
> capture function, and links to other such implementations, see
> http://www.cpax.org.uk/prg/writings/fgetdata.php
Thank you for the clarification and the link. I will look into it and I
am confident that I can write a similar function.
Vlad
--
Number one reason to date an engineer:
The world does revolve around us; we pick the coordinate system.
Vlad Dogaru wrote:
> Hello,
> I suspect this comes up quite often, but I haven't found an exact
> solution in the FAQ. I have to read and parse a file with arbitrarily
> long lines and have come up with the following plan:
> 1. start with a statically allocated buffer and a pointer of equal size
> 2. read into the buffer using fgets and append to the pointer
> 3. if buffer does not contain '\n', reallocate buffer and jump to 2
> 4. return the pointer
> Do you see anything wrong with this? If so, how can I improve it?
This may not apply to your particular case, but in some instances I have
encountered with "arbitrarily long lines" one can just read a character
at a time, examine it, perform some action, and then continue. This
removes the need for a huge buffer, which in the worst case, might not
even fit into the computer's memory. Obviously this won't work if any
modification to the front of the line depends on a value near the end of
the line.
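To illustrate the character-at-a-time approach, here is a minimal sketch that finds the length of the longest line in a file without ever storing a line. The task and the function name longest_line are invented for the example; any per-character processing that needs no look-ahead would follow the same shape.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical example task: return the length of the longest line in
 * fp (newline not counted), using constant memory. */
unsigned long longest_line(FILE *fp)
{
    unsigned long cur = 0;      /* length of the line being read */
    unsigned long best = 0;     /* longest length seen so far */
    int ch;

    assert(fp != NULL);
    while ((ch = getc(fp)) != EOF) {
        if (ch == '\n') {
            if (cur > best)
                best = cur;
            cur = 0;
        } else {
            cur++;
        }
    }
    if (cur > best)             /* last line may lack a newline */
        best = cur;
    return best;
}
```

Memory use here is independent of line length, which is the point of the approach.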
If you do go with the expanding buffer method, be sure that you do
NOT use strcat() to append each new chunk of text. Doing so will result
in each such addition scanning from the front of the buffer for the
terminal '\0' in the string. I've seen this bug many, many times.
It can cause a huge performance hit. Instead, keep track of the
length of the string in the buffer and just copy the new string directly
to the appropriate position, then adjust the length variable, and repeat.
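A sketch of the fgets-based variant with a tracked length might look like this. The name read_line_chunked and the 100-byte chunk (taken from the `char b[100]` mentioned earlier in the thread) are assumptions for illustration. Each chunk is copied to the known end of the string with memcpy, so the data already collected is never rescanned.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line in fgets-sized chunks. The newline
 * is kept, as fgets keeps it. Returns NULL if nothing could be read or
 * if allocation fails. The caller frees the result. */
char *read_line_chunked(FILE *fp)
{
    char chunk[100];
    char *line = NULL;
    size_t len = 0;             /* length of the string built so far */

    assert(fp != NULL);
    while (fgets(chunk, sizeof chunk, fp) != NULL) {
        size_t n = strlen(chunk);           /* scans only the new chunk */
        char *tmp = realloc(line, len + n + 1);
        if (tmp == NULL) {
            free(line);
            return NULL;
        }
        line = tmp;
        memcpy(line + len, chunk, n + 1);   /* append at known offset */
        len += n;
        if (n > 0 && line[len - 1] == '\n')
            break;                          /* the line is complete */
    }
    return line;
}
```

This version grows the allocation by one chunk at a time for simplicity; it could be combined with a doubling capacity to reduce the number of realloc calls.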
Regards,
David Mathog
Vlad Dogaru wrote, On 14/08/07 11:46:
> Richard Heathfield wrote:
<snip>
>> To start with, you can't reallocate a statically allocated buffer! Nor
>> can you have a pointer of equal size to a buffer except by sizing the
>> buffer to be the same size as a pointer. Nor can you append to a
>> pointer.
>> Once we get those impossibilities out of the way, we can dispense with
>> the unnecessary fgets call - your input is already buffered, so why
>> buffer it again through fgets?
> If anything, my lack of English skills has contributed to the
> misunderstanding. I was talking about:
> char b[100], *p;
> Reading into b with fgets, then reallocating p as necessary to do a
> strcat(p, b).
Since we do not know what p points to we cannot say whether you are
allowed to realloc what it points to or not. You can only pass pointers
returned by malloc or realloc to realloc.
Also beware of denial-of-service attacks where a user deliberately
creates a file with a line 5GB long.
<snip>
--
Flash Gordon
On 2007-08-14 17:43, Flash Gordon <sp**@flash-gordon.me.uk> wrote:
> Vlad Dogaru wrote, On 14/08/07 11:46:
>> Richard Heathfield wrote:
>>> To start with, you can't reallocate a statically allocated buffer! Nor
>>> can you have a pointer of equal size to a buffer except by sizing the
>>> buffer to be the same size as a pointer. Nor can you append to a
>>> pointer.
> [...]
>> If anything, my lack of English skills has contributed to the
>> misunderstanding. I was talking about:
>> char b[100], *p;
>> Reading into b with fgets, then reallocating p as necessary to do a
>> strcat(p, b).
> Since we do not know what p points to we cannot say whether you are
> allowed to realloc what it points to or not.
We cannot *know*, but I think it is reasonable to assume from the
description that he uses malloc to get the initial value for
p. You don't always have to assume the stupidest possible version if
something isn't specified exactly ;-).
> Also beware of denial-of-service attacks where a user deliberately
> creates a file with a line 5GB long.
ACK. But that's probably not something which should be hard-coded into
the application. After all, the program might run on a machine with 64
GB RAM where 5 GB of memory usage is quite acceptable. You could use a
configurable limit or rely on OS features to limit memory consumption
(e.g. ulimit on unixoid systems).
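One way to make the limit configurable is to pass it in as a parameter rather than hard-coding it. The sketch below is an assumption-laden illustration: the function name read_line_limited, the status codes and the starting capacity are invented here, and where max_len comes from (a command-line option, a configuration file) is left to the application.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical status codes for this example. */
enum { LINE_OK, LINE_EOF, LINE_TOO_LONG, LINE_NO_MEM };

/* Hypothetical helper: reads one line (newline dropped) into *out, but
 * refuses to store more than max_len characters. On LINE_TOO_LONG the
 * rest of the offending line is left unread in the stream. */
int read_line_limited(FILE *fp, size_t max_len, char **out)
{
    size_t cap = 16, used = 0;
    char *buf;
    int ch;

    assert(fp != NULL && out != NULL);
    *out = NULL;
    buf = malloc(cap);
    if (buf == NULL)
        return LINE_NO_MEM;

    while ((ch = getc(fp)) != EOF && ch != '\n') {
        if (used == max_len) {          /* caller-supplied limit hit */
            free(buf);
            return LINE_TOO_LONG;
        }
        if (used == cap - 1) {          /* keep one byte for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return LINE_NO_MEM;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[used++] = (char)ch;
    }
    if (ch == EOF && used == 0) {
        free(buf);
        return LINE_EOF;
    }
    buf[used] = '\0';
    *out = buf;
    return LINE_OK;
}
```

Because the limit is an argument, the same code can run with a small cap on a constrained machine and a much larger one where several gigabytes per line are acceptable.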
hp
--
_ | Peter J. Holzer | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR | with an emu on his shoulder.
| | | hj*@hjp.at |
__/ | http://www.hjp.at/ | -- Sam in "Freefall"
On Aug 20, 1:57 pm, "Peter J. Holzer" <hjp-usen...@hjp.at> wrote:
> On 2007-08-14 17:43, Flash Gordon <s...@flash-gordon.me.uk> wrote:
>> Vlad Dogaru wrote, On 14/08/07 11:46:
>>> Richard Heathfield wrote:
>>>> To start with, you can't reallocate a statically allocated buffer! Nor
>>>> can you have a pointer of equal size to a buffer except by sizing the
>>>> buffer to be the same size as a pointer. Nor can you append to a
>>>> pointer.
>> [...]
>>> If anything, my lack of English skills has contributed to the
>>> misunderstanding. I was talking about:
>>> char b[100], *p;
>>> Reading into b with fgets, then reallocating p as necessary to do a
>>> strcat(p, b).
>> Since we do not know what p points to we cannot say whether you are
>> allowed to realloc what it points to or not.
> We cannot *know*, but I think it is reasonable to assume from the
> description that he uses malloc to get the initial value for
> p. You don't always have to assume the stupidest possible version if
> something isn't specified exactly ;-).
Reading Flash Gordon's post I don't see him assuming anything.
He was simply aiming to cover all possibilities and I'm all for
that; we do aim to be accurate around here.