Hi,
I need to write large quantities of data to a file in C. The data comes from statistics that are continuously gathered from a simulator, and to avoid slowing the whole thing down I want the writes to be as fast and efficient as possible.
Since I/O operations are rather slow, I was thinking that using a large buffer would be better than writing each data point individually. My function is called for each data point, at which point I can do something.
If I understand correctly, fwrite() already uses a buffer, but since my file currently grows in pieces of 4 KiB, I suppose that is its buffer size. I was thinking of something more on the order of MiBs.
I have currently allocated a buffer of 1 MiB, into which I write something and then call fwrite():
sprintf(data->buf, "magic instr %d \n", (int)n);
fwrite(data->buf, 1, strlen(data->buf), data->fd);
memset(data->buf, 0, sizeof(data->buf));
This does not do what I want, which is to fill the buffer as much as possible and then flush it to disk. I understand that my code is wrong, but I don't really know how to fix it. Should I wrap the fwrite in an if-statement that checks whether the buffer is almost full, like
if (strlen(data->buf) + strlen(data_to_write_now) > bufsize) {
fwrite(...);
memset(..);
}
sprintf(data->buf, data_to_write_now);
?
I also wonder whether this approach is the best way to handle this. I have heard about mmap; would it be more efficient?
And then there is the possibility of putting the writing in another thread, so that the main thread puts the data it receives from the simulator in the buffer and the second one does the actual writing. Would this work better?
Thank you very much,
Thomas
AdrianH (Recognized Expert, Top Contributor)
mmap creates a memory-mapped file. It is very efficient, but can be cumbersome. You will also need to truncate the file to the right length when you are done logging.
If you open your file as binary, you can still use fprintf() on it, and it will write only once the buffer is full instead of on a line-by-line basis (I don't know if the buffer size is defined or redefinable anywhere).
You could pass the data on to a separate thread, but be warned that if you pass a string, it had better be copied or it may no longer be there when the thread tries to read it.
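To make the mmap route a bit more concrete, here is a minimal sketch, assuming a POSIX system such as Linux. The helper name `mmap_log_write` and its parameters are made up for illustration. It grows the file to the mapping size, copies the log text into the mapping, and then truncates the file back to the number of bytes actually used, which is the truncation step mentioned above.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): write `len` bytes of `text`
 * to `path` through a memory mapping of `map_size` bytes, then trim the
 * file to `len` bytes. Returns 0 on success, -1 on failure. */
int mmap_log_write(const char *path, size_t map_size,
                   const char *text, size_t len)
{
    assert(path != NULL && text != NULL);
    if (map_size == 0 || len > map_size)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* The file must be at least as large as the mapping before it is touched. */
    if (ftruncate(fd, (off_t)map_size) != 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, map_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* "Writing" to the file is now just a memory copy. */
    memcpy(map, text, len);

    int rc = 0;
    if (msync(map, map_size, MS_SYNC) != 0) rc = -1;  /* push pages to disk */
    if (munmap(map, map_size) != 0)         rc = -1;
    if (ftruncate(fd, (off_t)len) != 0)     rc = -1;  /* drop the unused tail */
    if (close(fd) != 0)                     rc = -1;
    return rc;
}
```

A real logger would keep the mapping open, track a write offset, and remap when the mapping fills up; that bookkeeping is the cumbersome part.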
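The copying caveat can be sketched roughly as follows, assuming POSIX threads are available. All names here (`log_queue`, `log_queue_push`, and so on) are hypothetical. The producer duplicates each string before queuing it, so the caller is free to reuse its own buffer immediately, and the writer thread frees the copy after writing it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct node { struct node *next; char *text; } node;

/* Hypothetical queue shared between the simulator thread and the writer. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    node *head, *tail;
    int   done;
    FILE *out;
    pthread_t writer;
} log_queue;

static void *writer_main(void *arg)
{
    log_queue *q = arg;
    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (q->head == NULL && !q->done)
            pthread_cond_wait(&q->ready, &q->lock);
        if (q->head == NULL)
            break;                          /* stop requested and queue drained */
        node *n = q->head;
        q->head = n->next;
        if (q->head == NULL)
            q->tail = NULL;
        pthread_mutex_unlock(&q->lock);     /* do the slow I/O without the lock */
        fputs(n->text, q->out);
        free(n->text);
        free(n);
        pthread_mutex_lock(&q->lock);
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

int log_queue_start(log_queue *q, FILE *out)
{
    assert(q != NULL && out != NULL);
    q->head = q->tail = NULL;
    q->done = 0;
    q->out = out;
    if (pthread_mutex_init(&q->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&q->ready, NULL) != 0) {
        pthread_mutex_destroy(&q->lock);
        return -1;
    }
    if (pthread_create(&q->writer, NULL, writer_main, q) != 0) {
        pthread_cond_destroy(&q->ready);
        pthread_mutex_destroy(&q->lock);
        return -1;
    }
    return 0;
}

/* Copies `text`, so the caller may overwrite its buffer right away. */
int log_queue_push(log_queue *q, const char *text)
{
    size_t len = strlen(text) + 1;
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->text = malloc(len);
    if (n->text == NULL) {
        free(n);
        return -1;
    }
    memcpy(n->text, text, len);
    n->next = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL) q->tail->next = n; else q->head = n;
    q->tail = n;
    pthread_cond_signal(&q->ready);
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Asks the writer to finish the queued messages, then waits for it. */
void log_queue_stop(log_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->done = 1;
    pthread_cond_signal(&q->ready);
    pthread_mutex_unlock(&q->lock);
    pthread_join(q->writer, NULL);
    pthread_cond_destroy(&q->ready);
    pthread_mutex_destroy(&q->lock);
}
```

Allocating per message is the simplest way to show the copy. Handing over whole filled buffers (double buffering) would cut the allocation cost, but the ownership rule stays the same.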
> if (strlen(data->buf) + strlen(data_to_write_now) > bufsize) {
> fwrite(...);
> memset(..);
> }
> sprintf(data->buf, data_to_write_now);
Yeah, sort of. You will definitely have to check whether you are going to overrun the buffer. One way of using sprintf() is as follows:
bytesWrittenToBuffer += sprintf(buffer + bytesWrittenToBuffer, "%s", stringToOutput);
BUT, you still must be aware of buffer overrun issues. That can be alleviated by using snprintf():
bytesWrittenToBuffer += snprintf(buffer + bytesWrittenToBuffer, bytesAllocatedToBuffer - bytesWrittenToBuffer, "%s", stringToOutput);
Though this will not cause a buffer overrun, it will truncate, making you lose part of your log if you are not careful. Still preferable to your programme starting to do random things though ;). So you should still check for an overrun: snprintf() returns the length the complete string would have had, so a return value greater than or equal to the remaining space means the output was truncated.
If you are not going to be using parameters in your format string, a simple strcpy() or strncpy() would suffice in a similar way to what I described.
I would only use memset() for debugging. It is not necessary once you've gotten the bugs worked out.
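Putting the pieces together, here is one possible sketch of the fill-then-flush buffer with the truncation check. The `log_buf` type and its functions are hypothetical names, not anything from the original code. It uses vsnprintf() so the format arguments can be passed through; when a record does not fit in the remaining space, the buffer is flushed with a single fwrite() and the record is formatted again into the now-empty buffer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical log buffer: `used` bytes of `buf` are waiting to be written. */
typedef struct {
    FILE  *fp;
    char  *buf;
    size_t cap;
    size_t used;
} log_buf;

int log_buf_init(log_buf *lb, FILE *fp, size_t cap)
{
    assert(lb != NULL && fp != NULL && cap > 0);
    lb->buf = malloc(cap);
    if (lb->buf == NULL)
        return -1;
    lb->fp = fp;
    lb->cap = cap;
    lb->used = 0;
    return 0;
}

/* Writes exactly the bytes in use; no memset() or strlen() is needed. */
int log_buf_flush(log_buf *lb)
{
    if (lb->used > 0 && fwrite(lb->buf, 1, lb->used, lb->fp) != lb->used)
        return -1;
    lb->used = 0;
    return 0;
}

/* Appends one formatted record, flushing first if it would not fit.
 * Returns -1 if the record can never fit or on an I/O or format error. */
int log_buf_printf(log_buf *lb, const char *fmt, ...)
{
    for (int attempt = 0; attempt < 2; attempt++) {
        size_t room = lb->cap - lb->used;
        va_list ap;
        va_start(ap, fmt);
        int n = vsnprintf(lb->buf + lb->used, room, fmt, ap);
        va_end(ap);
        if (n < 0)
            return -1;
        if ((size_t)n < room) {          /* fits, including the trailing NUL */
            lb->used += (size_t)n;
            return 0;
        }
        /* Truncated: `used` was not advanced, so the partial text is ignored. */
        if (lb->used == 0)
            return -1;                   /* larger than the whole buffer */
        if (log_buf_flush(lb) != 0)
            return -1;
    }
    return -1;
}

int log_buf_close(log_buf *lb)
{
    int rc = log_buf_flush(lb);
    free(lb->buf);
    lb->buf = NULL;
    return rc;
}
```

With a 1 MiB buffer this would be used as `log_buf_init(&lb, fp, 1 << 20)` followed by `log_buf_printf(&lb, "magic instr %d \n", (int)n)` for each data point and `log_buf_close(&lb)` at the end.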
Adrian
> If you open your file as binary, you can still use fprintf() on it, and it will write only once the buffer is full instead of on a line-by-line basis (I don't know if the buffer size is defined or redefinable anywhere).
I'm on Linux, and I thought that writing as binary or text was the same (i.e. the 'b' option is ignored). From the man page:
The mode string can also include the letter ``b'' either as a last character or as a character between the characters in any of the two-character strings described above. This is strictly for compatibility with C89 and has no effect; the ``b'' is ignored on all POSIX conforming systems, including Linux. (Other systems may treat text files and binary files differently, and adding the ``b'' may be a good idea if you do I/O to a binary file and expect that your program may be ported to non-Unix environments.)
> Though this will not cause a buffer overrun, it will truncate, making you lose part of your log if you are not careful. Still preferable to your programme starting to do random things though ;). So you should still check for a buffer overrun.
What do you mean by the last sentence? I thought snprintf would check for the buffer overrun, right?
> I would only use memset() for debugging. It is not necessary once you've gotten the bugs worked out.
When I don't use memset, I have to make sure that I pass the correct length arguments to fwrite, right? Is it OK to give the size argument of fwrite a value of 1 (byte)?
Thanks, Thomas
AdrianH
> I'm on Linux, and I thought that writing as binary or text was the same (i.e. the 'b' option is ignored). From the man page:
If the man page says the 'b' is ignored on POSIX systems, then that is true. In that case your buffering is going to be as big as the buffer allocated by the stdio library. I know that there are some exceptions: stdin and stdout are line buffered when they are connected to a terminal, but this is not the case for a regular file.
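The stdio buffer size can in fact be changed with the standard setvbuf() function, as long as it is called before any other operation on the stream. A minimal sketch follows; the `big_file` wrapper and its function names are hypothetical. The buffer is allocated explicitly rather than passing NULL, because an implementation is not required to honour the size argument when it allocates the buffer itself, and it must stay alive until after fclose().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: a stream plus the buffer that stdio uses for it. */
typedef struct {
    FILE *fp;
    char *buf;
} big_file;

/* Opens `path` for writing, fully buffered with `buf_size` bytes. */
int big_file_open(big_file *bf, const char *path, size_t buf_size)
{
    assert(bf != NULL && path != NULL && buf_size > 0);
    bf->buf = malloc(buf_size);
    if (bf->buf == NULL)
        return -1;
    bf->fp = fopen(path, "w");
    if (bf->fp == NULL) {
        free(bf->buf);
        return -1;
    }
    /* _IOFBF = full buffering: data goes to the OS only when the buffer
     * fills up, on fflush(), or on fclose(). */
    if (setvbuf(bf->fp, bf->buf, _IOFBF, buf_size) != 0) {
        fclose(bf->fp);
        free(bf->buf);
        return -1;
    }
    return 0;
}

/* Flushes and closes the stream, then releases the buffer. */
int big_file_close(big_file *bf)
{
    int rc = fclose(bf->fp);    /* fclose() flushes what is still buffered */
    free(bf->buf);              /* only safe after the stream is closed */
    bf->fp = NULL;
    bf->buf = NULL;
    return rc == 0 ? 0 : -1;
}
```

With this in place, plain `fprintf(bf.fp, "magic instr %d \n", (int)n)` calls are collected in the 1 MiB buffer by the library itself, which may make a hand-rolled buffer unnecessary.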
> What do you mean by the last sentence? I thought snprintf would check for the buffer overrun, right?
Yes, it does. I meant that if you don't check for an overrun yourself, as you would have to with sprintf(), snprintf() will silently truncate your string.
> When I don't use memset, I have to make sure that I pass the correct length arguments to fwrite, right? Is it OK to give the size argument of fwrite a value of 1 (byte)?
You have to do that whether you use memset() or not. As far as I know, you can. You can set it higher if you want to write an array of structure objects and don't want to calculate the total size yourself.
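To illustrate the two ways of using the size and count arguments, here is a small sketch. The `sample` record and both helper names are invented for the example. fwrite() returns the number of complete items written, so with a size of 1 the return value is a byte count, and with a size of `sizeof(sample)` it is a record count.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record type, standing in for one simulator data point. */
typedef struct {
    int    instr;
    double value;
} sample;

/* size = 1, count = number of bytes: returns the number of bytes written. */
size_t write_text(FILE *fp, const char *text, size_t len)
{
    assert(fp != NULL && text != NULL);
    return fwrite(text, 1, len, fp);
}

/* size = sizeof one record, count = number of records:
 * returns the number of whole records written. */
size_t write_samples(FILE *fp, const sample *s, size_t count)
{
    assert(fp != NULL && s != NULL);
    return fwrite(s, sizeof *s, count, fp);
}
```

In both cases a return value smaller than the count passed in means the write failed part way through.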
> Thanks, Thomas
You're welcome, Adrian ;)