slurping in binary data

1 000100000000000 0001
2 000100000000000 0001
3 100000110010000 00000000001
4 100000110010000 00000000001
5 100000110010000 00000000001
6 100000110010000 00000000001
7 100000110010000 00000000001
8 100000110010000 00000000001
9 100000110010000 00000000001
10 100000110010000 00000000001
11 100000001111100
12 100000001111100
13 100000001111100
14 100000111011011 110000000000000 1
15 100000111011011 110000000000000 1
16 100000111011011 110000000000000 1
17 100000111011011 110000000000000 1
18 100000111011011 110000000000000 1
19 100000111011011 110000000000000 1
20 000100000000000 0001

1
2
*****
*****
*****
*****
*****
*****
*****
***** 10
************ 11
************
************
***** *** 14
***** ***
***** ***
***** ***
***** ***
***** ***
20

3W, 12 B, 5W
1000 0000111 1100

If I wanted to suck in the above 20 lines without line numbers, spaces,
carriage returns, line feeds, or anything that is not a zero, one or EOF,
how would I do it?
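
One way to read the question, sketched in C. It assumes the first whitespace-delimited field on every line is the line number (which itself contains zeros and ones, so it has to be skipped rather than filtered). The name slurp_bits and the caller-supplied buffer are made up for illustration.

```c
#include <stdio.h>
#include <ctype.h>

/* Copies the '0' and '1' characters of a stream into out, skipping the
   first whitespace-delimited field of every line (assumed to be the line
   number). Stores at most cap-1 characters plus a terminating '\0' and
   returns the number of characters stored. */
size_t slurp_bits(FILE *in, char *out, size_t cap)
{
    enum { LEADING, NUMBER, DATA } state = LEADING;
    size_t n = 0;
    int c;

    if (cap == 0)
        return 0;

    while ((c = getc(in)) != EOF) {
        if (c == '\n') {
            state = LEADING;        /* the next line starts with its number */
        } else if (state == LEADING) {
            if (!isspace(c))
                state = NUMBER;     /* first character of the line number */
        } else if (state == NUMBER) {
            if (isspace(c))
                state = DATA;       /* the line number has ended */
        } else if ((c == '0' || c == '1') && n + 1 < cap) {
            out[n++] = (char)c;     /* spaces, '\r' and anything else are dropped */
        }
    }
    out[n] = '\0';
    return n;
}
```

Called as slurp_bits(fp, buf, sizeof buf) on a file opened with fopen(), it leaves one long string of zeros and ones in buf. A line with no line number would lose its first data group, so this only suits input that is known to be well formed.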

Thanks in advance,

--
George

Terrorist attacks can shake the foundations of our biggest buildings, but
they cannot touch the foundation of America. These acts shatter steel, but
they cannot dent the steel of American resolve.
George W. Bush

Picture of the Day http://apod.nasa.gov/apod/
Nov 18 '08
Richard Heathfield wrote:
James Kuyper said:
>George wrote:

<snip>
>>Would you prefer fgets to either Chuck's or Richard's enhancements?
My news server reports that there are currently 18 messages in this
thread. One was posted by Chuck Falconer, and 2 by Richard Heathfield,

and all three of them *after* George's message.
>none by anyone else named Chuck or Richard. All three of those messages
are short comments which don't suggest anything I would call an
enhancement. Are you referring to messages which haven't been showing up
on my news server?

Despite his use of the word "enhancement", George may be referring to Chuck
Falconer's ggets() function.
Ah! If so, I will say that I share your frequently expressed negative
opinion of that function, though I don't disapprove of it quite as
passionately as you do.
To George: Google for messages by Richard Heathfield containing ggets().

My biggest problem with ggets() is the lack of support for re-using
buffers that have already been allocated. On those rare occasions when I
want to do anything remotely similar to ggets(), I usually use a
single-line buffer which I realloc() to a larger size whenever a line
comes in which is too big for the current buffer size. I almost never
want to waste time allocating a brand new buffer for every single line
of the input file.
Nov 20 '08 #21
James Kuyper said:
Richard Heathfield wrote:
<snip>
>Despite his use of the word "enhancement", George may be referring to
Chuck Falconer's ggets() function.

Ah! If so, I will say that I share your frequently expressed negative
opinion of that function, though I don't disapprove of it quite as
passionately as you do.
There's nothing wrong with it that can't be fixed. Unfortunately, the owner
doesn't seem to agree that it needs fixing.

<snip>
My biggest problem with ggets() is the lack of support for re-using
buffers that have already been allocated.
Right. It's not exactly difficult to take a char ** and a size_t *.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Nov 20 '08 #22
James Kuyper <ja*********@verizon.net> writes:
My biggest problem with ggets() is the lack of support for re-using
buffers that have already been allocated. On those rare occasions when
I want to do anything remotely similar to ggets(), I usually use a
single-line buffer which I realloc() to a larger size whenever a line
comes in which is too big for the current buffer size. I almost never
want to waste time allocating a brand new buffer for every single line
of the input file.
I'll put in a plug for something like the getline() function from glibc.

ssize_t getline (char **LINEPTR, size_t *N, FILE *STREAM);

*LINEPTR should be a pointer to a buffer of size *N obtained from
malloc. If it is not large enough, it's expanded with realloc, and
*LINEPTR and *N are updated appropriately.

It's very easy to write your own version in standard C, of course.
Unfortunately glibc's code is tightly wound in with the rest of their
stdio library, otherwise it could be swiped and used verbatim in a GPL
program.
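
A rough sketch of what such a hand-written version might look like in standard C. This is not glibc's code. The name read_line, the initial size of 128 and the doubling policy are assumptions chosen for illustration, and the return type is long because ssize_t is not part of standard C.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper modelled loosely on the getline() interface.
   Reads one line (including the '\n', if present) into *lineptr, growing
   the buffer with realloc() when it is too small. *lineptr must be NULL or
   a pointer obtained from malloc(), and *n its current size.
   Returns the line length, or -1 at end of file with nothing read or when
   realloc() fails (characters already read are lost in that case). */
long read_line(char **lineptr, size_t *n, FILE *stream)
{
    size_t len = 0;
    int c;

    if (lineptr == NULL || n == NULL || stream == NULL)
        return -1;
    if (*lineptr == NULL)
        *n = 0;

    while ((c = getc(stream)) != EOF) {
        if (len + 2 > *n) {         /* need room for c and the final '\0' */
            size_t newsize = (*n == 0) ? 128 : *n * 2;
            char *tmp = realloc(*lineptr, newsize);
            if (tmp == NULL)
                return -1;          /* the old buffer is still valid */
            *lineptr = tmp;
            *n = newsize;
        }
        (*lineptr)[len++] = (char)c;
        if (c == '\n')
            break;
    }

    if (len == 0)
        return -1;                  /* nothing left to read */

    (*lineptr)[len] = '\0';
    return (long)len;
}
```

The caller starts with char *line = NULL; size_t cap = 0; and passes &line and &cap on every call, so the same buffer is reused from line to line and only grows when a longer line arrives. One free(line) at the end releases it.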
Nov 20 '08 #23
James Kuyper wrote:
George wrote:
.... snip ...
>
>Would you prefer fgets to either Chuck's or Richard's enhancements?

My news server reports that there are currently 18 messages in this
thread. One was posted by Chuck Falconer, and 2 by Richard Heathfield,
none by anyone else named Chuck or Richard. All three of those
messages are short comments which don't suggest anything I would call
an enhancement. Are you referring to messages which haven't been
showing up on my news server?
I suspect George is referring to my ggets.c package. This is
available in public domain source form at:

<http://cbfalconer.home.att.net/download/ggets.zip>

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Nov 20 '08 #24
On Thu, 20 Nov 2008 14:44:08 +0000, Richard Heathfield wrote:
>My biggest problem with ggets() is the lack of support for re-using
buffers that have already been allocated.

Right. It's not exactly difficult to take a char ** and a size_t *.
It's difficult enough for me, who is having quite a time disambiguating all
the terms here. Chuck does have a link to your treatment of this material,
right at the bottom of readme.txt.

pg. 263, unleashed
The gets() function takes as its sole arguments a pointer to char.
Starting at that char, it will probably fill memory with data from stdin
until anewline is encountered, at which point gets90 will probably
null-terminate the string and probably return control to its caller. If,
by some miracle, STDIN contains a newline character within the first N
characters, where N is the supplied buffer, you can take the "probably"
from that sentence--for that one call. If there isn't a newline early
enough, gets() will start to trample over memory that it shouldn't be
touching, with undefined results.
If our way to slurp in these data followed roughly the declarations in
Nick's version:

gets (char data[], FILE *in)
{
char buffer [40];
int line_num;
, you wouldn't need the FILE *in part, because that's stdin. Are you
saying that:

14 *10000011101101 111000000000000 011111111111111 1111111111111

would screw it up?
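
For comparison, a small sketch of the bounded alternative. With gets() a line that long would run past a 40-byte buffer. With fgets() the buffer size is passed in, so the extra characters stay in the stream and the overlong line can be detected by the missing newline. The helper name read_bounded_line and its return codes are invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf with fgets().
   Returns  1 for a complete line (newline removed),
            0 if the line did not fit (the excess is read and discarded),
           -1 at end of file. */
int read_bounded_line(char *buf, int size, FILE *in)
{
    char *nl;
    int c;

    if (fgets(buf, size, in) == NULL)
        return -1;

    nl = strchr(buf, '\n');
    if (nl != NULL) {
        *nl = '\0';
        return 1;
    }

    /* No newline was stored: look at what follows in the stream. */
    c = getc(in);
    if (c == '\n' || c == EOF)
        return 1;                   /* the line fit exactly, or the file ended */

    while (c != '\n' && c != EOF)   /* skip the rest of the long line */
        c = getc(in);
    return 0;
}
```

With char buffer[40], the 64-character line above comes back with return value 0 and the first 39 characters in buffer. Nothing outside the array is written, and the next call starts cleanly at the following line.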

--
George

Leadership to me means duty, honor, country. It means character, and it
means listening from time to time.
George W. Bush

Picture of the Day http://apod.nasa.gov/apod/
Nov 21 '08 #25
On Thu, 20 Nov 2008 15:40:54 -0500, CBFalconer wrote:
James Kuyper wrote:
>George wrote:
... snip ...
>>
>>Would you prefer fgets to either Chuck's or Richard's enhancements?

My news server reports that there are currently 18 messages in this
thread. One was posted by Chuck Falconer, and 2 by Richard Heathfield,
none by anyone else named Chuck or Richard. All three of those
messages are short comments which don't suggest anything I would call
an enhancement. Are you referring to messages which haven't been
showing up on my news server?

I suspect George is referring to my ggets.c package. This is
available in public domain source form at:

<http://cbfalconer.home.att.net/download/ggets.zip>

#include <stdio.h>
#include <stdlib.h>
#include "ggets.h"

#define INITSIZE 112 /* power of 2 minus 16, helps malloc */
#define DELTASIZE (INITSIZE + 16)

enum {OK = 0, NOMEM};

int fggets(char* *ln, FILE *f)

! end abridged source of ggets.c

If this is to replace the use of gets, what need have you of that file
pointer? The only use of f is in getc(f).
--
George

I believe that God has planted in every heart the desire to live in
freedom.
George W. Bush

Picture of the Day http://apod.nasa.gov/apod/
Nov 21 '08 #26
On Thu, 20 Nov 2008 02:13:43 GMT, James Kuyper wrote:
George wrote:
>On Tue, 18 Nov 2008 12:26:05 GMT, James Kuyper wrote:
>>George wrote:
On Tue, 18 Nov 2008 00:59:20 -0800 (PST), Nick Keighley wrote:
...
assuming you mean the twenty lines at the beginning. Would fgets()
followed by
>
int line_num;
char data[32];
>
fscanf (line, "%d %32s", &line_num, data);
>
do the job?
Why fgets before scanf?

Key point to keep in mind here: I was thinking of sscanf(), not fscanf()
(or scanf()). The fgets()/sscanf() combo is the best way I know of to
read most text-format files.
I hadn't even disambiguated these. Am I correct that

scanf
sscanf
fscanf

are the only ones that look like another?
>
>>Because scanf() treats newline characters the same way as any other
whitespace character. This is usually not the way they should be
handled. As a result, a single incorrectly formatted line can cause all
following lines to be handled incorrectly, causing bugs that can be a
real pain to track down.

How does fgets know to stop?

It stops at the first newline or when the buffer you've provided it is
full, or at the end of the file, whichever comes first. The key point is
the "newline" - that's what makes this approach more robust when reading
line-oriented files.

ok
>
>Presumably, we want to start scanf'ing with '1'. Let me refresh your memory
of the data set. I call it george.txt to reflect my pseudonym.

1 000100000000000 0001
2 000100000000000 0001
3 100000110010000 00000000001
4 100000110010000 00000000001
5 100000110010000 00000000001
6 100000110010000 00000000001
7 100000110010000 00000000001
8 100000110010000 00000000001
9 100000110010000 00000000001
10 100000110010000 00000000001
11 100000001111100
12 100000001111100
13 100000001111100
14 100000111011011 110000000000000 1
15 100000111011011 110000000000000 1
16 100000111011011 110000000000000 1
17 100000111011011 110000000000000 1
18 100000111011011 110000000000000 1
19 100000111011011 110000000000000 1
20 000100000000000 0001

It's 20 by forty. Given that I premise that the first datum is the line
number, do I still have to fgets?

If your input file is perfectly formatted, and your program is correctly
written, there's no need. However, I think it's poor design to write
code that fails catastrophically when given incorrect inputs. I believe
in designing programs so they fail gracefully when given bad input. That
means that they fail without undefined behavior, and with an informative
error message, if possible. It's a lot harder to achieve that goal with
fscanf() than it is with fgets()/sscanf().

What happens if the line number is missing from, for example, line 11?
With fscanf(), it will try to interpret 100000001111100 as a decimal
integer, and store it into the line number (with undefined behavior
unless INT_MAX is larger than that value), and then put "12" into the
data buffer. fscanf() will return a value of 2, indicating a successful
read, because it has no way of noticing that anything went wrong. With
fgets()/sscanf(), you can check whether sscanf()==2; if it does not, you
immediately know there's a problem with the line.

Continuing processing despite a problem like that can be pointless, or
mandatory, or anywhere in between those two extremes, depending upon
your application. If you keep using fscanf(), it would attempt to read
100000001111100 as the line number and put "13" into the data buffer; it
will stay out of sync with the actual lines until the end of the file,
or the next incorrectly formatted line, whichever comes first.

With fgets()/sscanf(), fgets() will start cleanly at the next line, so
sscanf() can do exactly what you need it to do; the combination of those
two functions won't stay out of sync with the data, the way fscanf() would.
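
A minimal sketch of the fgets()/sscanf() pattern described above, with the return value checked on every line. The function name, the 128-byte line buffer and the exact format string are assumptions made for illustration. The line number is read as a short run of digits rather than with %d, because converting a number too large for the target type is undefined behaviour, and the format insists on at least one blank between the two fields.

```c
#include <stdio.h>

#define LINE_LEN 128    /* assumed to exceed the longest line in the file */

/* Counts the lines of fp that look like "<line number> <data>".
   Lines that do not match are reported on stderr and counted in *bad. */
int count_good_lines(FILE *fp, int *bad)
{
    char line[LINE_LEN];
    char number[10];    /* up to 9 digits of line number, kept as text */
    char data[33];      /* up to 32 characters of data */
    int good = 0;

    *bad = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        /* " %9[0-9]" : optional leading blanks, then 1 to 9 digits
           "%*[ \t]"  : at least one blank or tab, not stored
           "%32s"     : the first group of data characters */
        if (sscanf(line, " %9[0-9]%*[ \t]%32s", number, data) == 2) {
            good++;
        } else {
            fprintf(stderr, "malformed line: %s", line);
            (*bad)++;
        }
    }
    return good;
}
```

If line 11 loses its number, that one line is reported as malformed and line 12 is parsed normally again, because fgets() always resumes at the start of the next line.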
#include <stdio.h>
#include <stdlib.h>

#define PATH "george.txt"
#define NUMBER 100
#define BIN 1000
#define MAXFMTLEN 2000

int main(void)
{
    FILE *fp;
    char pattern[MAXFMTLEN];
    char lnumber[NUMBER];
    char lbin[BIN];
    char line[MAXFMTLEN];

    if ((fp = fopen(PATH, "r")) == NULL) {
        fprintf(stderr, "can't open file\n");
        exit(1);
    }

    /* field widths go in the same order as the buffers passed to
       sscanf: lnumber first, then lbin */
    sprintf(pattern, "%%%ds %%%ds", NUMBER-1, BIN-1);

    while ((fgets(line, MAXFMTLEN, fp)) != NULL) {
        sscanf(line, pattern, lnumber, lbin);
        /*fscanf (fp, "%d %32s", &lnumber, lbin);*/
        printf("%s\n", lbin);
    }

    fclose(fp);
    return 0;
}

Q1) Does the while control satisfy your criticism above?

Q2) Why doesn't the sprintf have to *follow* the while?

whitespace crlf
whitespace crlf
1 000100000000000 0001
2 000100000000000 0001
3 100000110010000 00000000001
4 100000110010000 00000000001
--
George

If you're sick and tired of the politics of cynicism and polls and
principles, come and join this campaign.
George W. Bush

Picture of the Day http://apod.nasa.gov/apod/
Nov 21 '08 #27
On Tue, 18 Nov 2008 09:15:43 -0800, Keith Thompson wrote:
Chad <cd*****@gmail.com> writes:
[66 lines deleted]
>>But he would would fgets() followed by fscanf(), wouldn't that just
suck in one line at a time vs all 20 lines at once? Here is what I
mean. BTW, I used sscanf() and not fscanf().
[85 lines deleted]
>>

Yikes! I forgot that using internet slang is a no no here. BTW = by
the way.

I don't think BTW is a problem; it's common enough that I think almost
everyone understands it. Silly abbreviations like "u" for "you"
are frowned upon.

But you really didn't need to quote the *entire* previous article to
add a one-line comment.
Chad's fine.

#include <stdio.h>
#include <stdlib.h>

#define PATH "george.txt"
#define NUMBER 100
#define BIN 1000
#define MAXFMTLEN 2000

int main(void)
{
    FILE *fp;
    char pattern[MAXFMTLEN];
    char lnumber[NUMBER];
    char lbin[BIN];
    char line[MAXFMTLEN];

    /*int line_num;
    char data[32];*/
    // was this an earlier version?

    if ((fp = fopen(PATH, "r")) == NULL) {
        fprintf(stderr, "can't open file\n");
        exit(1);
    }

    /* field widths go in the same order as the buffers passed to
       sscanf: lnumber first, then lbin */
    sprintf(pattern, "%%%ds %%%ds", NUMBER-1, BIN-1);

    while ((fgets(line, MAXFMTLEN, fp)) != NULL) {
        sscanf(line, pattern, lnumber, lbin);
        /*fscanf (fp, "%d %32s", &lnumber, lbin);*/
        printf("%s\n", lbin);
    }

    fclose(fp);
    return 0;
}

// gcc -o x.exe chad1.c

I see no useful distinction between

#define NUMBER 100
#define BIN 1000
, if there were to be meaningful line numbers. The next part of the
process is that I strip away anything that isn't a one or a zero in a range
of columns, so it's no big deal.

I'm a little sketchy about this line,
sscanf(line, pattern, lnumber, lbin);
in particular the meaning of pattern, when there's already a line, an
lnumber and an lbin.
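
A sketch of what pattern is for, under the assumption that the buffer sizes are the NUMBER and BIN from the program. pattern is the format string: sprintf() builds it once, and sscanf() then uses it to split every line. The two helper names are invented for this example, and snprintf() (C99) stands in for sprintf() so the write is bounded.

```c
#include <stdio.h>

#define NUMBER 100   /* size of the line-number buffer */
#define BIN 1000     /* size of the data buffer */

/* Builds the sscanf() format "%99s %999s". "%%" produces a literal '%',
   and each "%d" is replaced by a width one less than the buffer it
   protects, leaving room for the terminating '\0'. */
int build_pattern(char *pattern, size_t size)
{
    return snprintf(pattern, size, "%%%ds %%%ds", NUMBER - 1, BIN - 1);
}

/* Splits one line into its first two whitespace-separated fields.
   Returns the number of fields stored (2 on success). */
int split_line(const char *line, const char *pattern,
               char lnumber[NUMBER], char lbin[BIN])
{
    return sscanf(line, pattern, lnumber, lbin);
}
```

So line is the input, pattern says how to cut it up, and lnumber and lbin receive the pieces. The format depends only on the buffer sizes, never on the contents of a line, which is why it can be built once before the loop instead of inside it.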

BTW, everyone has equal editorial privileges. As OP, I'm just more
equal. :-)
--
George

Saddam Hussein is a homicidal dictator who is addicted to weapons of mass
destruction.
George W. Bush

Picture of the Day http://apod.nasa.gov/apod/
Nov 21 '08 #28
On Thu, 20 Nov 2008 10:45:13 -0800, Nate Eldredge wrote:
James Kuyper <ja*********@verizon.net> writes:
>My biggest problem with ggets() is the lack of support for re-using
buffers that have already been allocated. On those rare occasions when
I want to do anything remotely similar to ggets(), I usually use a
single-line buffer which I realloc() to a larger size whenever a line
comes in which is too big for the current buffer size. I almost never
want to waste time allocating a brand new buffer for every single line
of the input file.

I'll put in a plug for something like the getline() function from glibc.

ssize_t getline (char **LINEPTR, size_t *N, FILE *STREAM);

*LINEPTR should be a pointer to a buffer of size *N obtained from
malloc. If it is not large enough, it's expanded with realloc, and
*LINEPTR and *N are updated appropriately.

It's very easy to write your own version in standard C, of course.
Unfortunately glibc's code is tightly wound in with the rest of their
stdio library, otherwise it could be swiped and used verbatim in a GPL
program.
Are you on gcc's mailing list? How would you do that without having 20
emails a day in a mailbox that gets 2 personal and important emails a week?

gmane?
--
George

Now, there are some who would like to rewrite history - revisionist
historians is what I like to call them.
George W. Bush

Picture of the Day http://apod.nasa.gov/apod/
Nov 22 '08 #29
George <ge****@example.invalid> writes:
On Thu, 20 Nov 2008 14:44:08 +0000, Richard Heathfield wrote:
>>My biggest problem with ggets() is the lack of support for re-using
buffers that have already been allocated.

Right. It's not exactly difficult to take a char ** and a size_t *.

It's difficult enough for me, who is having quite a time disambiguating all
the terms here. Chuck does have a link to your treatment of this material,
right at the bottom of readme.txt.

pg. 263, unleashed
The gets() function takes as its sole arguments a pointer to char.
Starting at that char, it will probably fill memory with data from stdin
until anewline is encountered, at which point gets90 will probably
null-terminate the string and probably return control to its caller. If,
by some miracle, STDIN contains a newline character within the first N
characters, where N is the supplied buffer, you can take the "probably"
from that sentence--for that one call. If there isn't a newline early
enough, gets() will start to trample over memory that it shouldn't be
touching, with undefined results.
Is that a 99.8% verbatim quote, because it looks horrifically mangled?
(I subtract .2%, as I presume 90 was really () sans shift key, and your
whitespace is a bit dodgy.)

Phil
--
I tried the Vista speech recognition by running the tutorial. I was
amazed, it was awesome, recognised every word I said. Then I said the
wrong word ... and it typed the right one. It was actually just
detecting a sound and printing the expected word! -- pbhj on /.
Nov 22 '08 #30
