
problems with CR (carriage return) and LF (line feed )

I have created a program that downloads a web page and then performs
some text processing on it. The problem is in the text processing:
every line (in the downloaded txt file) ends with a strange symbol,
which is the carriage return and the line feed (hex values 0D and
0A). How are these values represented in C? For instance, whenever
I read one of these characters from the file, I want the function to
ignore it. For example:

....................................................

while((c=fgetc(fp) ) != EOF )
{

    switch(c)
    {
        case '<' :
        {
            tagFlag=true;
            cont=true;
            i=0;
            if(getvalue==1)
            {
                getvalue=0;
                string_found=false ;
            }
            break;
        }
        case '>' :
        {
            tagFlag=false;
            break;
        }
        case <<<<<< What should i put here ??????
        {
            break;
        }
        default :
        {
            if( (string_found == true) )
            {
                if(tagFlag == false )
                {

                    getvalue=1;
                    printf("%c \n",c);
                }
            }
            else if( (string_found==false))
            {
                if( (tagFlag==false) && (cont==true))
                {
                    if(c==target[i])
                    {

                        if(i==(target.GetLen()-1) )
                        {
                            times_found++;

                            string_found=true;
                        }
                        else
                        {
                            i++;

                            cont=true;
                        }
                    }
                }
            }
            break;

        }
    }
}
....................................................

The file is stored like this:
....................................................

if(ret == SOCKET_ERROR)
{

    exit(EXIT_FAILURE);
}

_setmode(_fileno(fp), _O_TEXT);
/* fp is the file pointer */
do
{
    bytesRead = recv(itsSocket, Buffer,
                     sizeof(Buffer), 0);

    /* recv returns 0 when the peer closes and SOCKET_ERROR on failure;
       write only when bytes actually arrived */
    if(bytesRead > 0)
        fwrite(Buffer,sizeof(char),bytesRead,fp);
} while(bytesRead > 0);

(OK, I know socket programming is off-topic, but my question isn't...)
Nov 13 '05 #1
6 Replies


Andrew wrote:
> I have created a program that downloads a web page and then performs
> some text processing on it. The problem is in the text processing:
> every line (in the downloaded txt file) ends with a strange symbol,
> which is the carriage return and the line feed (hex values 0D and
> 0A). How are these values represented in C?


0x0D and 0x0A. Alternatively, '\x0D' and '\x0A'.

It may be that these values happen to correspond to characters in the
execution character set, and can be represented some other way (such as
'\r' or '\n', for example), but this is implementation-dependent.
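
As a minimal sketch of the numeric approach, the function below drops
bytes 0x0D and 0x0A while copying a string. The name copy_without_crlf
and the buffer-based interface are assumptions made for illustration;
the original program reads with fgetc, but the case labels would be the
same.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy src to dst, skipping bytes 0x0D and 0x0A.
   dst must have room for at least strlen(src) + 1 bytes.
   Returns the number of bytes written, not counting the terminator. */
size_t copy_without_crlf(const char *src, char *dst)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        switch ((unsigned char)*src) {
        case 0x0D:          /* carriage return, by numeric value */
        case 0x0A:          /* line feed, by numeric value */
            break;          /* ignore it */
        default:
            dst[n++] = *src;
            break;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Because the labels are plain integers, the switch does not depend on
what '\r' and '\n' happen to mean in the execution character set.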

-Kevin
--
My email address is valid, but changes periodically.
To contact me please use the address from a recent posting.

Nov 13 '05 #2

in comp.lang.c i read:
> I have created a program that downloads a web page and then performs
> some text processing on it. The problem is in the text processing:
> every line (in the downloaded txt file) ends with a strange symbol,
> which is the carriage return and the line feed.

naturally they do, that's what the http specification requires -- i.e.,
http `headers' must all end with crlf. generally files are transported
verbatim, so those bytes are likely present in the file, on the server.

> (Hex values 0D and 0A.) How are these values represented in C?


umm, 0x0d and 0x0a.
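
Since the terminator bytes are likely present in the stored data, one
option is to strip them from each line after it has been read. The
following is only a sketch: strip_eol is a hypothetical helper, and it
assumes the line is already in a writable, NUL-terminated buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: remove any trailing 0x0D / 0x0A bytes from line,
   in place. Returns the new length of the string. */
size_t strip_eol(char *line)
{
    size_t len;

    assert(line != NULL);
    len = strlen(line);
    while (len > 0 &&
           ((unsigned char)line[len - 1] == 0x0A ||
            (unsigned char)line[len - 1] == 0x0D)) {
        line[--len] = '\0';     /* shorten the string by one byte */
    }
    return len;
}
```

Only the end of the buffer is touched, so a 0x0D or 0x0A that appears
in the middle of the line is left alone.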

--
a signature
Nov 13 '05 #3

Andrew wrote:

> I have created a program that downloads a web page and then performs
> some text processing on it. The problem is in the text processing:
> every line (in the downloaded txt file) ends with a strange symbol,
> which is the carriage return and the line feed (hex values 0D and
> 0A). How are these values represented in C? For instance, whenever
> I read one of these characters from the file, I want the function to
> ignore it. For example:

I have taken the liberty of reformatting your code so I can clearly
indicate suggested changes (which are no longer quoted lines).
....................................................

   while ((c = fgetc(fp) ) != EOF ) {
      switch(c) {
      case '<' : tagFlag = true;
                 cont = true;
                 i = 0;
                 if (getvalue == 1) {
                    getvalue = 0;
                    string_found = false ;
                 }
                 break;

      case '>' : tagFlag=false;
                 break;

/* case <<<<<< What should i put here ?????? */
      case '\n':
      case '\r': break;

      default :  if ( (string_found == true) ) {
                    if (tagFlag == false ) {
                       getvalue = 1;
                       printf("%c \n",c);
                    }
                 }
                 else if ( (string_found == false)) {
                    if ( (tagFlag == false) && (cont == true)) {
                       if (c == target[i]) {
                          if (i == (target.GetLen()-1) ) {
                             times_found++;
                             string_found = true;
                          }
                          else {
                             i++;
                             cont = true;
                          }
                       }
                    }
                 }
                 break;
      } /* switch */
   } /* while */


Excessive vertical spacing is just as harmful to comprehensibility
as the lack of breaks. Note that braces around the individual
cases are useless and confusing, as code normally simply executes
in order in the absence of a break.

I believe that the standards for HTML specify that those lines end
in \r\n, so the solution should be portable. However I am not
sure of this. You may want to inject a blank, which you can
probably do by replacing the "break" with "c = ' '" and falling
through. Other than this I am making no allegations about the
accuracy of the code.
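
The blank-injection variant might look like the sketch below. It is a
stripped-down stand-in for the loop above, not the original program:
extract_text and in_tag are hypothetical names, the input is a string
rather than a FILE, and '\r' and '\n' are assumed to match the bytes in
the file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy the text outside <...> tags from html to
   out, turning each CR or LF into a single blank.
   out must have room for at least strlen(html) + 1 bytes.
   Returns the number of bytes written, not counting the terminator. */
size_t extract_text(const char *html, char *out)
{
    size_t n = 0;
    int in_tag = 0;

    assert(html != NULL && out != NULL);
    for (; *html != '\0'; html++) {
        char c = *html;

        switch (c) {
        case '<':
            in_tag = 1;
            break;
        case '>':
            in_tag = 0;
            break;
        case '\n':
        case '\r':
            c = ' ';        /* inject a blank instead of dropping the byte */
            /* fall through */
        default:
            if (!in_tag)
                out[n++] = c;
            break;
        }
    }
    out[n] = '\0';
    return n;
}
```

With this version a CR LF pair becomes two blanks, which keeps words on
adjacent lines from running together.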

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 13 '05 #4

in comp.lang.c i read:
> I believe that the standards for HTML specify that those lines end
> in \r\n, so the solution should be portable.


they specify 0x0d 0x0a. whether those correspond to \r and \n depends on
the implementation. most likely they will, but the key to writing portable
code is in not making assumptions you can avoid.
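
One way to avoid the assumption, or at least to have the compiler check
it, is sketched here. It relies on static_assert from <assert.h>, so it
assumes a C11 or later compiler; WIRE_CR, WIRE_LF and is_wire_eol are
hypothetical names.

```c
#include <assert.h>   /* static_assert (C11 and later) */

/* Line-terminator bytes as they appear on the wire, by numeric value. */
enum { WIRE_CR = 0x0D, WIRE_LF = 0x0A };

/* Compilation fails here on an implementation where the escapes and the
   wire bytes differ, instead of the program silently misbehaving. */
static_assert('\r' == WIRE_CR, "'\\r' is not 0x0D on this implementation");
static_assert('\n' == WIRE_LF, "'\\n' is not 0x0A on this implementation");

/* Nonzero if c is one of the two wire line-terminator bytes. */
int is_wire_eol(int c)
{
    return c == WIRE_CR || c == WIRE_LF;
}
```

Code that tests bytes with is_wire_eol never needs the escapes at all;
the two static_assert lines are only needed where '\r' and '\n' are
still used elsewhere.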

--
a signature
Nov 13 '05 #5

those who know me have no need of my name wrote:
>> I believe that the standards for HTML specify that those lines
>> end in \r\n, so the solution should be portable.
>
> they specify 0x0d 0x0a. whether those correspond to \r and \n
> depends on the implementation. most likely they will, but the
> key to writing portable code is in not making assumptions you
> can avoid.


Of course. But the i/o system would presumably make those
translations if the internal system is not ascii based. At any
rate, the point is that it is a vulnerability to be watched when
porting.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 13 '05 #6

Thank you very, very much, people. It worked fine!
Nov 13 '05 #7
