Problems reading from files

Hi all,
I'm having problems reading from files.

I have a text file "files.txt" that contains the names of the files to
be opened, i.e. the contents of files.txt are

Homo_sapiens.fa
Rattus_norvegicus.fa

(They are FA files that can be opened in any text editor.)

Each of the FA files contains a number in the first line and a string
of characters (A,T,G or C). For example, the Homo_sapiens.fa file
would contain

16571
GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATTTGGTATTTT
CGTCTGGGGGGTGTGCACGCGATAGCATTGCGAGACGCTGGAGCCGGAGCACCCTATGTC
GCAGTATCTGTCTTTGATTCCTGCCTCATTCTATTATTTATCGCACCTACGTTCAATATT
ACAGGCGAACATACCTACTAAAGTGTGTTAATTAATTAATGCTTGTAGGACATAATAATA

and so on, with 16571 A,T,G or Cs.

Below is my code:

#include <stdio.h>
#include <stdlib.h>

#define MAX_FILE 100 // maximum length of file name
#define MAX_SEQ 20000 // maximum length of sequence
#define N 2 // total number of sequences

int main(void)
{
FILE *fin, *fin1, *fout;
char input[MAX_FILE+1], seq[N][MAX_SEQ+1], c;
int size[N], i = 0, j = 0;

fin = fopen("files.txt", "r");
fout = fopen("output.txt", "w");
while (fscanf(fin, "%s", input) != EOF)
{
fin1 = fopen(input, "r");
printf("%s\n", input);
fscanf(fin1, "%d ", &size[i]);
printf("%d\n", size[i]);
while ((c = fgetc(fin1)) != EOF)
{
fprintf(fout, "%c", c);
if (c != '\n')
seq[i][j] = c;
j++;
if (j % 100 == 0)
printf("%c", seq[i][j]);
}
fprintf(fout, "\n\n");
j = 0;
i++;
}

fclose(fin);
fclose(fin1);
fclose(fout);
return 0;
}

The printf statements are for me to check my code.

When I try to open 2 files, the first file is read in fine, but the
second file is incomplete. Over 600 characters are not read, and the
program hangs.

I get the output (due to the checking printf statements)

Homo_sapiens.fa
16571
Rattus_norvegicus.fa
16300
<program hangs>

Notice that the statements
if (j % 100 == 0)
printf("%c", seq[i][j]);
are not executed, but if I just print the character seq[0][100], it
comes out correctly.

If I try to open 3 files, the same problem happens, i.e. the first
file is read correctly, but the second file is incomplete and the
third file is not read at all. I get the output

Homo_sapiens.fa
16571
Rattus_norvegicus.fa
16300
Homo_sapiens.fa
16571
Segmentation fault

I tried my program with 2 much smaller files (one has 13 characters
and the other 14), and the program works. Are the 2 files too big and
the program ran out of memory? How do I get around this problem, as I
have to read files even bigger than these 2 later?

Thank you.

Regards,
Rayne

Aug 25 '07 #1

<la********@yahoo.com> wrote in message
news:11**********************@l22g2000prc.googlegroups.com...
Hi all,
I'm having problems reading from files.

I have a text file "files.txt" that contains the names of the files to
be opened, i.e. the contents of files.txt are

Homo_sapiens.fa
Rattus_norvegicus.fa

(They are FA files that can be opened in any text editor.)

Each of the FA files contains a number in the first line and a string
of characters (A,T,G or C). For example, the Homo_sapiens.fa file
would contain

16571
GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATTTGGTATTTT
CGTCTGGGGGGTGTGCACGCGATAGCATTGCGAGACGCTGGAGCCGGAGCACCCTATGTC
GCAGTATCTGTCTTTGATTCCTGCCTCATTCTATTATTTATCGCACCTACGTTCAATATT
ACAGGCGAACATACCTACTAAAGTGTGTTAATTAATTAATGCTTGTAGGACATAATAATA

and so on, with 16571 A,T,G or Cs.

Below is my code:

#include <stdio.h>
#include <stdlib.h>

#define MAX_FILE 100 // maximum length of file name
#define MAX_SEQ 20000 // maximum length of sequence
#define N 2 // total number of sequences

int main(void)
{
FILE *fin, *fin1, *fout;
char input[MAX_FILE+1], seq[N][MAX_SEQ+1], c;
This line could cause problems: seq is too big to go safely on the stack.
Make it static.
int size[N], i = 0, j = 0;

fin = fopen("files.txt", "r");
fout = fopen("output.txt", "w");
Check here:
if(!fin) /* haven't opened fin */
if(!fout) /* haven't opened fout */
while (fscanf(fin, "%s", input) != EOF)
{
fin1 = fopen(input, "r");
Check here:
if (!fin1) /* can't open fin1 */
printf("%s\n", input);
Is this diagnostic doing what you expect? I suspect you don't want fscanf();
you want fgets() to read a whole line, then chop off the trailing newline.
fscanf(fin1, "%d ", &size[i]);
printf("%d\n", size[i]);
while ((c = fgetc(fin1)) != EOF)
{
fprintf(fout, "%c", c);
if (c != '\n')
seq[i][j] = c;
j++;
if (j % 100 == 0)
printf("%c", seq[i][j]);
Check here: if (j >= MAX_SEQ - 1) /* j too big, out of space */
Put a null on the end for convenience, hence the minus 1.
}
fprintf(fout, "\n\n");
j = 0;
i++;
What happens when i becomes greater than 1? You will do an illegal memory
access. You need to check if (i >= N) /* can't continue, out of space */
}

fclose(fin);
fclose(fin1);
fclose(fout);
return 0;
}
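
A minimal sketch of the checks suggested in this reply, for illustration only. The helper names `sequence_slot` and `open_or_report` are hypothetical and do not appear in the original program; `MAX_SEQ` and `N` are copied from the posted code. The array is given static storage duration, which on many implementations avoids the size limits that can apply to automatic variables.

```c
#include <stdio.h>

#define MAX_SEQ 20000   /* maximum length of sequence, as in the original post */
#define N 2             /* total number of sequences, as in the original post */

/* Static storage duration: the array is no longer an automatic variable of main(). */
static char seq[N][MAX_SEQ + 1];

/* Hypothetical helper: returns the buffer for sequence i,
   or NULL when i is outside 0..N-1 (no room for another sequence). */
char *sequence_slot(int i)
{
    if (i < 0 || i >= N)
        return NULL;
    return seq[i];
}

/* Hypothetical helper: fopen() that reports a failure on stderr.
   The caller must still test the result for NULL before using it. */
FILE *open_or_report(const char *name, const char *mode)
{
    FILE *fp = fopen(name, mode);
    if (fp == NULL)
        perror(name);
    return fp;
}
```

With helpers like these, the main loop could stop cleanly when `sequence_slot(i)` or `open_or_report(input, "r")` returns NULL instead of writing past the end of `seq` or reading from a null stream.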

Aug 25 '07 #2
On Sat, 25 Aug 2007 02:35:35 -0700, la********@yahoo.com wrote:
Hi all,
I'm having problems reading from files.

I have a text file "files.txt" that contains the names of the files to
be opened, i.e. the contents of files.txt are

Homo_sapiens.fa
Rattus_norvegicus.fa

(They are FA files that can be opened in any text editor.)
Each of the FA files contains a number in the first line and a string
of characters (A,T,G or C). For example, the Homo_sapiens.fa file
would contain

16571
GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATTTGGTATTTT
CGTCTGGGGGGTGTGCACGCGATAGCATTGCGAGACGCTGGAGCCGGAGCACCCTATGTC
GCAGTATCTGTCTTTGATTCCTGCCTCATTCTATTATTTATCGCACCTACGTTCAATATT
ACAGGCGAACATACCTACTAAAGTGTGTTAATTAATTAATGCTTGTAGGACATAATAATA

and so on, with 16571 A,T,G or Cs.

Below is my code:

#include <stdio.h>
#include <stdlib.h>

#define MAX_FILE 100 // maximum length of file name
stdio.h contains a macro FILENAME_MAX for that purpose.
It already includes room for the terminating null.
#define MAX_SEQ 20000 // maximum length of sequence
#define N 2 // total number of sequences

int main(void)
{
FILE *fin, *fin1, *fout;
char input[MAX_FILE+1], seq[N][MAX_SEQ+1], c;
Try making them static, 40 KB of auto variables could be too much.
int size[N], i = 0, j = 0;

fin = fopen("files.txt", "r");
fout = fopen("output.txt", "w");
You should check whether those work, and cope with that otherwise.
while (fscanf(fin, "%s", input) != EOF)
%s will stop on any whitespace character, not just newlines. Is
that ok? (BTW, what happens if files.txt contains a name which is
too long?)
{
fin1 = fopen(input, "r");
printf("%s\n", input);
fscanf(fin1, "%d ", &size[i]);
printf("%d\n", size[i]);
while ((c = fgetc(fin1)) != EOF)
c is declared as a char. If it is unsigned it will never equal
EOF. If it is signed, some valid character (though none of 'ACGT')
could be mistaken as EOF. fgetc returns an int. See www.c-faq.com,
section 12, question 1.
{
fprintf(fout, "%c", c);
if (c != '\n')
seq[i][j] = c;
j++;
Note that j will be incremented even if c is '\n'; in
that case there will be a gap in the sequence. Add braces where
needed.
if (j % 100 == 0)
printf("%c", seq[i][j]);
You've already incremented j, so seq[i][j] will be uninitialized
at this time. For example, if at the beginning of the loop body
j were 99 and c were 'T' you would write c into seq[i][99],
increment j to 100, and print seq[i][100].
}
fprintf(fout, "\n\n");
j = 0;
i++;
}

fclose(fin);
fclose(fin1);
fclose(fout);
Ideally you should check whether the fclose() worked without
problems.
return 0;
}
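
One way to apply the points in this reply (an int for the fgetc() result, braces so that the store and the increment stay together, and a range check) is sketched below. The function name `read_sequence` and its parameters are illustrative assumptions, not part of the original program, and skipping all whitespace rather than only '\n' is a choice made for this sketch.

```c
#include <ctype.h>
#include <stdio.h>

/* Reads the bases that remain in fp into seq, skipping whitespace.
   cap is the size of seq in bytes, including room for the terminating null.
   Returns the number of bases stored, or -1 if the data does not fit. */
long read_sequence(FILE *fp, char *seq, long cap)
{
    long j = 0;
    int c;                        /* int, so that EOF is distinguishable from data */

    if (cap < 1)
        return -1;

    while ((c = fgetc(fp)) != EOF) {
        if (isspace(c))
            continue;             /* newlines leave no gap: j is not incremented */
        if (j >= cap - 1)
            return -1;            /* out of space, refuse to overrun seq */
        seq[j++] = (char)c;       /* store and increment together */
    }
    seq[j] = '\0';
    return j;
}
```

Because j only advances when a character is stored, seq[j - 1] is always the most recently stored base, which avoids printing the uninitialized seq[i][j] described above.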
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 25 '07 #3
"la********@yahoo.com" wrote:

I'm having problems reading from files.

I have a text file "files.txt" that contains the names of the files to
be opened, i.e. the contents of files.txt are

Homo_sapiens.fa
Rattus_norvegicus.fa

(They are FA files that can be opened in any text editor.)

Each of the FA files contains a number in the first line and a string
of characters (A,T,G or C). For example, the Homo_sapiens.fa file
would contain

16571
GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATTTGGTATTTT
CGTCTGGGGGGTGTGCACGCGATAGCATTGCGAGACGCTGGAGCCGGAGCACCCTATGTC
GCAGTATCTGTCTTTGATTCCTGCCTCATTCTATTATTTATCGCACCTACGTTCAATATT
ACAGGCGAACATACCTACTAAAGTGTGTTAATTAATTAATGCTTGTAGGACATAATAATA

and so on, with 16571 A,T,G or Cs.

Below is my code:

#include <stdio.h>
#include <stdlib.h>

#define MAX_FILE 100 // maximum length of file name
#define MAX_SEQ 20000 // maximum length of sequence
#define N 2 // total number of sequences

int main(void)
{
FILE *fin, *fin1, *fout;
char input[MAX_FILE+1], seq[N][MAX_SEQ+1], c;
int size[N], i = 0, j = 0;

fin = fopen("files.txt", "r");
fout = fopen("output.txt", "w");
You fail to check for success of the fopen calls.
while (fscanf(fin, "%s", input) != EOF) {
fin1 = fopen(input, "r");
printf("%s\n", input);
fscanf(fin1, "%d ", &size[i]);
You fail to check for success of the fscanf call.
printf("%d\n", size[i]);
while ((c = fgetc(fin1)) != EOF) {
c can never be EOF, because you have erroneously declared it a
char. It should be an int.
fprintf(fout, "%c", c);
if (c != '\n')
seq[i][j] = c;
j++;
if (j % 100 == 0)
printf("%c", seq[i][j]);
}
fprintf(fout, "\n\n");
j = 0;
i++;
You fail to close fin1 before attempting to attach it to another
file.
}

fclose(fin);
fclose(fin1);
fclose(fout);
return 0;
}

The printf statements are for me to check my code.

When I try to open 2 files, the first file is read in fine, but the
second file is incomplete. Over 600 characters are not read, and the
program hangs.
The amount of loss (after causing undefined behaviour) leads me to
suspect that your system has INT_MAX set at 32767. If so, you will
need to use long to ensure 32-bit capability.
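
The three points raised in this reply (check fopen, check fscanf, close each file before opening the next) could be combined roughly as follows. This is only a sketch: the function name `read_size` and its return codes are assumptions made for illustration. It reads the length into a long, which is guaranteed to hold values up to at least 2147483647.

```c
#include <stdio.h>

/* Opens the file called name, reads the leading length into *size,
   and closes the file again before returning.
   Returns  0 on success,
           -1 if the file cannot be opened,
           -2 if the first item in the file is not a number,
           -3 if fclose() reports an error. */
int read_size(const char *name, long *size)
{
    int status = 0;
    FILE *fp = fopen(name, "r");

    if (fp == NULL)
        return -1;

    if (fscanf(fp, "%ld", size) != 1)   /* fscanf returns the number of items converted */
        status = -2;

    if (fclose(fp) != 0 && status == 0) /* always close, so no stream is left open */
        status = -3;

    return status;
}
```

In the original loop the same pattern would mean calling fclose(fin1) at the bottom of each iteration rather than once after the loop.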

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>
--
Posted via a free Usenet account from http://www.teranews.com

Aug 25 '07 #4
"la********@yahoo.com" <la********@yahoo.com> writes:
Hi all,
I'm having problems reading from files.
<snip>
Below is my code:
The hang is almost certainly because 'c' should be an int. fgetc
returns int so it can signal EOF. See the FAQ (http://c-faq.com/).

I will note a couple of other things, but I think most have now been
covered.
#include <stdio.h>
#include <stdlib.h>

#define MAX_FILE 100 // maximum length of file name
#define MAX_SEQ 20000 // maximum length of sequence
#define N 2 // total number of sequences

int main(void)
{
FILE *fin, *fin1, *fout;
char input[MAX_FILE+1], seq[N][MAX_SEQ+1], c;
int c; and use FILENAME_MAX.
int size[N], i = 0, j = 0;

fin = fopen("files.txt", "r");
fout = fopen("output.txt", "w");
Check these!
while (fscanf(fin, "%s", input) != EOF)
Danger! Danger! There are pre-processor tricks you can use to get the
correct size into a scanf %s format, but it is probably better to use fgets.
{
fin1 = fopen(input, "r");
printf("%s\n", input);
fscanf(fin1, "%d ", &size[i]);
printf("%d\n", size[i]);
while ((c = fgetc(fin1)) != EOF)
{
fprintf(fout, "%c", c);
if (c != '\n')
seq[i][j] = c;
It is always best (unless you know it is safe) to check that your
indexes are in range.
j++;
if (j % 100 == 0)
printf("%c", seq[i][j]);
}
fprintf(fout, "\n\n");
j = 0;
i++;
}

fclose(fin);
fclose(fin1);
fclose(fout);
return 0;
}

The printf statements are for me to check my code.

When I try to open 2 files, the first file is read in fine, but the
second file is incomplete. Over 600 characters are not read, and the
program hangs.
see above!
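
For reference, both options mentioned above for reading the file names safely might look something like this. It is a sketch under assumptions: the function names `read_name` and `read_word` are hypothetical, and `MAX_FILE` is taken from the posted code. The `STR` macros are the usual two-step stringification idiom, which turns `MAX_FILE` into the text `100` so that the format becomes `"%100s"`.

```c
#include <stdio.h>
#include <string.h>

#define MAX_FILE 100            /* maximum length of file name, as in the original post */
#define STR_(x) #x              /* stringify the argument as written */
#define STR(x) STR_(x)          /* expand the macro first, then stringify */

/* fgets() version: reads one line into buf (size bytes) and removes the
   trailing newline. Returns 1 on success, 0 on end of file or error.
   A line longer than size - 1 characters is split across calls. */
int read_name(FILE *fp, char *buf, int size)
{
    if (fgets(buf, size, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\r\n")] = '\0';
    return 1;
}

/* fscanf() version with a field width: reads one whitespace-delimited word
   of at most MAX_FILE characters. buf must hold MAX_FILE + 1 bytes.
   Returns 1 if a word was read, 0 otherwise. */
int read_word(FILE *fp, char *buf)
{
    return fscanf(fp, "%" STR(MAX_FILE) "s", buf) == 1;
}
```

The fgets() version keeps names that contain spaces intact, while the fscanf() version splits on any whitespace, so which one fits depends on how files.txt is laid out.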

--
Ben.
Aug 25 '07 #5
CBFalconer <cb********@yahoo.com> writes:
"la********@yahoo.com" wrote:
[...]
> while ((c = fgetc(fin1)) != EOF) {

c can never be EOF, because you have erroneously declared it a
char. It should be an int.
[...]

c can compare equal to EOF if plain char happens to be signed. In
that case, the code will *probably* work "correctly"; fgetc() will
eventually return EOF, and the test will work as intended.

It can fail badly if plain char is unsigned, and it can terminate
early if plain char is signed, and the file happens to contain a
character whose value matches EOF (typically EOF is -1 and char is 8
bits, so a character '\xff' in the input file would trigger this).

But rather than spending any time considering how the code can fail,
the OP should fix the bug by declaring c as int. If the program
continues to misbehave in the same way, he'll have narrowed down the
problem to the rest of the program; if not, he'll have fixed one bug.
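
The two failure modes described above can be illustrated with explicit signed char and unsigned char variables, so the result does not depend on whether plain char is signed. This is a sketch with hypothetical function names, and the signed case assumes the typical situation mentioned above (EOF is -1 and char is 8 bits).

```c
#include <stdio.h>

/* Returns 1 if the byte value (0..255), once stored in a signed char,
   compares equal to EOF. This is the "early termination" case.
   Converting values above SCHAR_MAX is implementation-defined; on typical
   8-bit two's complement systems 0xFF becomes -1. */
int signed_char_looks_like_eof(int byte)
{
    signed char c = (signed char)byte;
    return c == EOF;
}

/* Returns 1 if EOF, once stored in an unsigned char, still compares equal
   to EOF. The stored value promotes to a non-negative int while EOF is
   negative, so this returns 0: the loop would never see end of file. */
int unsigned_char_detects_eof(void)
{
    unsigned char c = (unsigned char)EOF;
    return c == EOF;
}
```

Declaring c as int removes both cases, because fgetc() returns every valid character as a non-negative value and reserves the negative EOF for end of file or error.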

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Aug 25 '07 #6
Thank you all, I've revised the code and it works now.

Aug 26 '07 #7
[snips]

On Sat, 25 Aug 2007 10:50:40 +0100, Malcolm McLean wrote:
>#define MAX_FILE 100 // maximum length of file name
#define MAX_SEQ 20000 // maximum length of sequence
#define N 2 // total number of sequences

int main(void)
{
FILE *fin, *fin1, *fout;
char input[MAX_FILE+1], seq[N][MAX_SEQ+1], c;
This line could cause problems, seq is too big to go safely on the
stack.
What stack? Could you kindly show the part of the C standard which
defines "stack" or requires auto variables to be created on the stack?
Aug 27 '07 #8

"Kelsey Bjarnason" <kb********@gmail.com> wrote in message
news:8r************@spanky.localhost.net...
>This line could cause problems, seq is too big to go safely on the
stack.

What stack? Could you kindly show the part of the C standard which
defines "stack" or requires auto variables to be created on the stack?
Oh deary me. There's useful pedantry, and then there's the sort that just
tries to be clever.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Aug 27 '07 #9
On Mon, 27 Aug 2007 19:42:34 +0100, Malcolm McLean wrote:
"Kelsey Bjarnason" <kb********@gmail.com> wrote in message
news:8r************@spanky.localhost.net...
>>This line could cause problems, seq is too big to go safely on the
stack.

What stack? Could you kindly show the part of the C standard which
defines "stack" or requires auto variables to be created on the stack?
Oh deary me. There's useful pedantry, and then there's the sort that just
tries to be clever.
Indeed. Useful pedantry says that since you're using C, and C has no
concept of a stack, to discuss "the stack" is meaningless at best in
this context.

So, since you seem to think there's something wrong with this, I ask
again, could you kindly show the part of the C standard which defines
"stack" or requires auto variables to be created on the stack?

Or perhaps you weren't aware there are actually machines which don't use
stacks? There are - which is probably why C doesn't require stacks.
Aug 29 '07 #10
