Segfault question.

When I started testing the algorithms for my wrap program, I threw together
this snippet of code, which works quite well. Except that it (predictably)
segfaults at the end when it tries to go beyond the file. At some point, I
tried to mend that behavior using feof() but without success. The
functionality is not harmed, but this has started to bug me. What am I
missing here? Sometimes being a code duffer is frustrating!! lol!!!

The code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (int argc, char *argv[])
{
FILE *fp;

int len;
char buf[100];

if ((fp = fopen(argv[1], "r")) == NULL) {
fprintf(stderr, "can't open fp");
return EXIT_FAILURE;
}

while (((len = strlen(fgets(buf, 80, fp))) != 0)) {
printf(" %i\t", len);
printf("%s", buf);
}

fclose(fp); /* Nah, no error checking here... */

return EXIT_SUCCESS;
}

Thanks for reading.
--
Email is wtallman at olypen dot com
Nov 14 '05 #1

"name" <us**@host.domain> wrote in message
news:10*************@corp.supernews.com...
When I started testing the algorithms for my wrap program, I threw together this snippet of code, which works quite well.
No it doesn't. It invokes undefined behavior.
Except that it (predictably)
segfaults at the end when it tries to go beyond the file.
And I can see exactly why. See below.
At some point, I
tried to mend that behavior using feof() but without success.
Guessing rarely will fix the real problem.
The
functionality is not harmed,
Well, no, you can't kill something that's already dead. :-)
but this has started to bug me.
Yes, you have a serious, fatal bug.
What am I
missing here?
You apparently forgot to check the documentation of a library
function, because you didn't allow for its possible failure.
Sometimes being a code duffer is frustrating!! lol!!!
Especially when you try to go too fast, as I suspect you've done.

The code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (int argc, char *argv[])
{
FILE *fp;

int len;
I see below that you store the return value from 'strlen()'
in 'len'. This means its type should be 'size_t', not 'int'.
char buf[100];

if ((fp = fopen(argv[1], "r")) == NULL) {
You should check that 'argv[1]' is indeed a valid pointer
(i.e. make sure argc > 1) before trying to dereference it.
If argc <= 1, then the expression 'argv[1]' invokes
undefined behavior.
fprintf(stderr, "can't open fp");
return EXIT_FAILURE;
}

while (((len = strlen(fgets(buf, 80, fp))) != 0)) {
If 'fgets()' encounters an error or end of file, it will return
NULL. If you pass NULL as the argument to 'strlen()' you get
undefined behavior (which could be manifested as a 'segfault').
printf(" %i\t", len);
printf("%s", buf);
}

fclose(fp); /* Nah, no error checking here... */
Nor did you do error checking where it really mattered.
Check the return value of *any* function which is documented
to possibly return a 'failure' indication (as does 'fgets()').

return EXIT_SUCCESS;
}


-Mike
Nov 14 '05 #2
name wrote:

When I started testing the algorithms for my wrap program, I
threw together this snippet of code, which works quite well.
Except that it (predictably) segfaults at the end when it tries
to go beyond the file. At some point, I tried to mend that
behavior using feof() but without success. The functionality is
not harmed, but this has started to bug me. What am I missing
here? Sometimes being a code duffer is frustrating!! lol!!!

The code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (int argc, char *argv[])
{
FILE *fp;

int len;
char buf[100];

if ((fp = fopen(argv[1], "r")) == NULL) {
fprintf(stderr, "can't open fp");
return EXIT_FAILURE;
}

while (((len = strlen(fgets(buf, 80, fp))) != 0)) {
printf(" %i\t", len);
printf("%s", buf);
}

fclose(fp); /* Nah, no error checking here... */

return EXIT_SUCCESS;
}


Look up what fgets returns when it encounters end-of-file or an
i/o error. Then consider what strlen does when you ask it to chew
on that.

<rant> Please get rid of the excessive indentation in your code.
3 or 4 spaces is quite enough. The excessive space makes lines
too long and causes things to disappear over the right margin
(although the above lines are short enough to avoid that). Don't
use tabs. </rant>

--
"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discoverer of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
Nov 14 '05 #3
CBFalconer wrote:
Look up what fgets returns when it encounters end-of-file or an
i/o error. Then consider what strlen does when you ask it to chew
on that.

I've never been able to figure out when fgets returns NULL. Could you
please explain it?
For example, suppose I have the following file:
aaaaaaaa\n
bbbbbbbbb<EOF>

and I call fgets three times in a row with a big buffer (larger
than 10 bytes or chars). When would fgets return NULL?
<rant> Please get rid of the excessive indentation in your code.
3 or 4 spaces is quite enough. The excessive space makes lines
too long and causes things to disappear over the right margin
(although the above lines are short enough to avoid that). Don't
use tabs. </rant>
IMHO, if lines are too long, it's time to create a new function to handle
part of the whole task.

"name" wrote:
while (((len = strlen(fgets(buf, 80, fp))) != 0)) {

I think that functional style is good enough, so I suggest that you
write a wrapper for strlen.
Something like:
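#include <errno.h>   /* errno, EINVAL (EINVAL comes from POSIX, not ISO C) */
#include <limits.h>  /* INT_MAX */
#include <string.h>  /* strlen */

/* Returns the length of s as an int, or -1 if s is NULL or too long. */
int my_strlen (const char *s)
{
    size_t tmp;

    if (s == NULL)
        return -1;
    tmp = strlen (s);
    if (tmp > INT_MAX) {
        errno = EINVAL;
        return -1;
    }
    return (int)tmp;
}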

--
vir
Nov 14 '05 #4
Victor Nazarov wrote:
CBFalconer wrote:
Look up what fgets returns when it encounters end-of-file or an
i/o error. Then consider what strlen does when you ask it to
chew on that.


I've never been able to figure out when fgets returns NULL. Could
you please explain it?


Look at the last two references in my sig. line.

--
Some useful references:
<http://www.ungerhu.com/jxh/clc.welcome.txt>
<http://www.eskimo.com/~scs/C-faq/top.html>
<http://benpfaff.org/writings/clc/off-topic.html>
<http://anubis.dkuug.dk/jtc1/sc22/wg14/www/docs/n869/> (C99)
<http://www.dinkumware.com/refxc.html> C-library
Nov 14 '05 #5
In article <news:41***************@yahoo.com>
<rant> Please get rid of the excessive indentation in your code.
3 or 4 spaces is quite enough. The excessive space makes lines
too long and causes things to disappear over the right margin
(although the above lines are short enough to avoid that). Don't
use tabs. </rant>


I personally do not mind 8-character-per-lexical-level indentation,
although I do think 4 works better. I do remember hearing, from
the "human/computer interaction" folks and people doing visual
studies, that anything less than three characters is not so good,
because -- depending on one's font -- two-character indentations
may not create sufficient angles to trigger the brain's horizontal
and vertical line detectors. These detectors exist, though, and
run all the time whether we want them to or not; careful indentation
takes advantage of them.

As for tabs: use them or do not, but do not change your system's
interpretation of them. If you want n-character indentation where
"n" differs from the system's interpretation of "hardware" tabs,
just make sure that when you push your "tab" key in your editor,
it inserts spaces and/or tabs in order to get to the n'th column.
(In some cases, this may mean pushing a key other than the one
labelled "tab". For instance, in vi/nvi/vim, use ^T and ^D to
indent and -- assuming you have autoindent set -- de-indent by the
value you have put in the "shiftwidth" setting. In emacs, of
course, the whole thing is fully programmable.) If you do this
*instead* of instructing your editor to re-interpret the hardware
tabs, then anyone using the same underlying system will be able to
edit your code and see the same columnization that you see.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #6
Oops, included wrong file! My bad!! That was a prototype that didn't yield
correct results, as well as being badly constructed. The user version does
yield correct results but is still badly constructed (natch...), so exhibits
the same behavior.

On 2004-09-05, CBFalconer <cb********@yahoo.com> wrote:

Look up what fgets returns when it encounters end-of-file or an
i/o error. Then consider what strlen does when you ask it to chew
on that.


Okay, fgets returns a null pointer if it encounters either an EOF
immediately, or if it encounters an error. In the latter case, the string
array is undefined, so error checking fgets should be the first thing to do,
I gather. Passing a null pointer to strlen is what causes the segfault?
Does that mean that strlen returns that error because it doesn't recognize
what has been passed and so assumes it's outside of its allotted territory?

Or is something else entirely going on and I'm still at sea? <grin>

Thanks!
--
Email is wtallman at olypen dot com
Nov 14 '05 #7
In article <news:10*************@corp.supernews.com>
name <us**@host.domain> wrote:
Okay, fgets returns a null pointer if it encounters either an EOF
immediately, or if it encounters an error.
Yes (although the "error" case is a bit dodgy; some fgets()
implementations will only return NULL on EOF-or-error-at-start,
treating error-in-the-middle as a sign to return a valid C string
that does not end with '\n').
In the latter case, the string array is undefined, so error
checking fgets should be the first thing to do, I gather.
Yes. More precisely, check whether fgets() returned its first
argument or NULL (these are the only two possibilities). (You
can also use (feof(fp) || ferror(fp)) to see whether EOF and/or
error were encountered "along the way", but this may interact
badly with fgets() variants that handle partial input lines, as
I described above.)
Passing a null pointer to strlen is what causes the segfault?
Just so. The effect is officially undefined, but a "nice" system
such as a Linux box will trap the error at runtime and terminate
the program (by default -- programs can override this, and debuggers
can trap the problem before the program sees it). Less-nice systems
might have strlen() return 42.
Does that mean that strlen returns that error because it doesn't recognize
what has been passed and so assumes it's outside of its allotted territory?

Or is something else entirely going on and I'm still at sea? <grin>


On your system, strlen() never returns at all -- so it makes no
sense to say "strlen returns that error". You could say "strlen
produces that result", which at least avoids the word "returns". :-)

The method by which Linux detects the problem and aborts the program
is beyond the scope of this newsgroup.[%] Here, it suffices to say
that strlen() requires a C string, and (char *)NULL does not qualify
as one. (A "C string" is a data structure consisting of one or
more "char"s in sequence, beginning with the char whose address is
given as a value of type "char *", and ending with the first '\0'.
Since NULL never points to a valid C data object, it cannot provide
the first byte of a string. Note that the empty string begins and
ends with its '\0' byte, which makes it quite different from NULL:
there is at least one valid C "char" there holding the '\0'.)

[% Still, I will mention that it has to do with "virtual memory"
and the on-chip MMU, which translates "virtual addresses" used by
running programs into "physical addresses" used to locate actual
bytes in RAM. The translation process has several trapping options,
with varying methods of handling them and "degrees of fatality":
areas can be marked entirely off-limits, or "within limits but not
present in RAM at the moment", or "valid but read-only", and so
on. On some CPUs, areas can even be marked execute-only, so that
it is impossible to read CPU instructions as data. Linux reserves
some areas as "not allocated to the program" and sets up the MMU
so that those areas are marked off-limits, then delivers a segmentation
fault if you attempt to read, write, or execute from such an area.]
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #8
On 2004-09-05, Chris Torek <no****@torek.net> wrote:
<saved densely informative post for further study!!>

I gather that, for my purposes, the segfault at the EOF is comfortable,
because the EOF virtually always follows a newline and will be thus at the
beginning of a string. Doesn't bother me, but... suppose I process a file
where the EOF does show up without being preceded by a newline? At that
point, I can't just live with a segfault unless I'm sure of the data I'm
getting.

Certainly I'm not in the market for the solution to Schroedinger's God
Problem as obtained some thousands of centuries hence in a land far, far
away!! LOL!!! Ummm... that was (will have been?) 42, was it not? <grin>

Perhaps I should just use the venerable ((c=getc(fp))!= EOF) approach. I'm
using that to drive the wrap program and, as near as I can tell, it's simple
enough that it should be considered bullet-proof. I understand that's not
the most efficient way of going, but for what I'm doing, that's really not
relevant. I can say that I am not disposed to even touch the scanf family
of functions! <grin>

I must presume that there are other more sophisticated strategies in use,
but I'm going to have to stick with what I think I can manage to understand,
lest I inundate myself unnecessarily!

In any case, thanks for all the info!
--
Email is wtallman at olypen dot com
Nov 14 '05 #9
On Mon, 06 Sep 2004 04:59:41 -0000, name <us**@host.domain> wrote:
I gather that, for my purposes, the segfault at the EOF is comfortable,
because the EOF virtually always follows a newline and will be thus at the
beginning of a string. Doesn't bother me, but... suppose I process a file
where the EOF does show up without being preceded by a newline? At that
point, I can't just live with a segfault unless I'm sure of the data I'm
getting.


Not a good plan. The segfault you are currently experiencing is the
result of undefined behavior. The thing about undefined behavior is
it need not be consistent. Tomorrow it could manifest itself in a
completely different fashion, such as deleting the file you just
finished processing. Upgrading your hardware, OS, or compiler could
also change the behavior.

<<Remove the del for email>>
Nov 14 '05 #10
