Difference between binary file and ASCII file

vim
Hello everybody,
Please tell me the difference between a binary file and an ASCII file.
Thanks in advance,
vim

May 13 '06
On Tue, 16 May 2006 08:11:04 UTC, Flash Gordon
<sp**@flash-gordon.me.uk> wrote:
Herbert Rosenau wrote:
On Mon, 15 May 2006 13:58:28 UTC, Flash Gordon
<sp**@flash-gordon.me.uk> wrote:
Without text streams how can you produce a C source file that is
guaranteed to produce a valid text file on whatever system you run the
program on? Historically systems have used rather more schemes than just
terminating lines with CR, CRLF or LF,


It is simple. A stream is an abstract form of data I/O. There is


<snip>

I think you are in violent agreement with me. I was responding to a
question about why C has text streams as well as binary streams with an
explanation of the problems if it did not. You are explaining why C
programs see an abstraction (e.g. text and binary streams), with the
system specifics handled at a lower level.


Yes - but in the case in question it helps nothing. Some years ago I had
the job of writing a program that had to read text files and reformat them
from line mode to stream mode (meaning a paragraph becomes a single line,
independent of how many lines it occupied in the source). Problem: the
files to convert, all on a single machine, came in as native text
- originating from DOS/WIN, OS/2, FTP text mode: \r\n
- originating from 370, FTP binary mode: \n
- originating from 370 virtual console: \r
All were found mixed up in a single directory tree on local disk.
Some of them had been created with a strange program using 0x8d as a soft
line feed.
Reading anything as text failed to produce clean output.

So the solution was reading in binary mode and interpreting
\r\n\r\n as paragraph separator
\r\r "
\n\n "
convert 0x8d to either nothing or a single space
convert (\r)\n\f to \n\n
convert \f to nothing or a single space
\t as a single space - except in tables
\t as a sequence of spaces in tables, to fill up the columns
remove any syllabification (that is, rejoin a word that was split into
syllables across lines) but leave hyphens intact

and then reformatting to 80-column fixed font, leaving tables intact.

No problem so far, but the different newline separators made it
impossible to read the files as text: the only way to tell the
different text modes apart was to read them as a binary stream.

myungetc() and mygetc() were needed to unget multiple chars.
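
A minimal sketch in C of the newline part of the rules above, assuming ASCII byte values (CR 0x0D, LF 0x0A) and a file that has already been read into memory through a stream opened with "rb". The name normalize_newlines is made up for illustration. It uses a one-byte lookahead in the buffer instead of the myungetc()/mygetc() helpers, whose code is not shown in the post, and it maps the 0x8d soft line feed to a single space, which is one of the two options listed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CR 0x0D        /* carriage return, ASCII assumed */
#define LF 0x0A        /* line feed, ASCII assumed */
#define SOFT_LF 0x8D   /* soft line feed used by some of the files */

/* Copy src[0..n) to dst, turning CR LF, a lone CR and a lone LF into a
   single LF, and SOFT_LF into a space. dst needs room for n bytes.
   Returns the number of bytes written to dst. */
size_t normalize_newlines(const unsigned char *src, size_t n,
                          unsigned char *dst)
{
    size_t out = 0;
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++) {
        unsigned char c = src[i];
        if (c == CR) {
            /* Look ahead one byte: CR LF counts as one line end. */
            if (i + 1 < n && src[i + 1] == LF)
                i++;
            dst[out++] = LF;
        } else if (c == SOFT_LF) {
            dst[out++] = ' ';
        } else {
            dst[out++] = c;   /* includes a lone LF, copied as is */
        }
    }
    return out;
}
```

After this pass every paragraph separator in the list (\r\n\r\n, \r\r, \n\n) shows up as two consecutive LF bytes, so the later reformatting step only has to recognise one pattern.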

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 Deutsch ist da!
May 17 '06 #61
P.J. Plauger <pj*@dinkumware.com> wrote:
"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> wrote in message
news:4c*************@individual.net...
Why is there the text mode in the first place? All operations valid
for text streams seem to be valid for binary ones, too. Text streams
are more difficult to handle (eg. you can't calculate offsets, there's
some extra undefinedness). Apart from system compatibility, is there
any advantage to opening files in text mode?


System compatibility is a damned important reason.


All right. But besides that, is there any advantage that text files/mode
offer that binary files/mode don't have?

Suppose I'm serializing data into a textual representation to be read on
another system (with the same charset). Does it matter whether I open
the file in text or binary mode?
Whitesmiths,
Ltd. introduced the text/binary dichotomy in 1978 when porting C to
dozens of non-Unix systems, and other companies did much the same thing
in the coming years. It was a slam dunk to put it in the draft C Standard
begun in 1983.


I've often heard in c.l.c. some systems (Mainframes) had complicated
internal representation of text files. Why was it that way? What did
it solve? Why couldn't they be replaced with simple "binary" files with
'\n' as record separator?

IMHO how text is represented could be viewed as a per-application
convention rather than system-wide. `Sendmail' doesn't have to read
`inetd' configuration files and v.v., so there is no reason why they
should follow the same text representation convention. It means that
on a system several (or even unlimited) conventions might be present.
Why is there in the C language room only for one type of text stream?

There is only one "binary" file (bytes are stored in the file exactly as
written to; no translation is done). Why isn't the default open mode
(such as "r+") binary? It seems to me more natural to have settled it
this way.

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
May 21 '06 #62
"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> writes:
P.J. Plauger <pj*@dinkumware.com> wrote:
"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> wrote in message
news:4c*************@individual.net...
Why is there the text mode in the first place? All operations valid
for text streams seem to be valid for binary ones, too. Text streams
are more difficult to handle (eg. you can't calculate offsets, there's
some extra undefinedness). Apart from system compatibility, is there
any advantage to opening files in text mode?


System compatibility is a damned important reason.


All right. But besides that, is there any advantage that text files/mode
offer that binary files/mode don't have?


Um, yes. Text files represent text.
Suppose I'm serializing data into a textual representation to be read on
another system (with the same charset). Does it matter whether I open
the file in text or binary mode?
Absolutely. For example, as I'm sure you know, Windows represents an
end-of-line by two characters, a CR followed by an LF ('\r' followed
by '\n'). If you write a "text" file on one Windows system and read
it on another in binary mode, there are two possibilities: either the
program that reads the file has to explicitly discard the '\r'
characters, or the file won't be a valid Windows text file, and you
won't be able to process it with other tools, such as ordinary text
editors.
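
A rough way to see the difference being described, sketched in C. The helper name stored_size and the file name used with it are made up for illustration. It writes the same characters through a stream opened in the given mode, then counts the bytes that actually reached the file by rereading it in binary mode.

```c
#include <assert.h>
#include <stdio.h>

/* Write text to path through a stream opened with the given fopen mode,
   then count the bytes stored by rereading the file in binary mode.
   Returns the byte count, or -1 on any I/O error. */
long stored_size(const char *path, const char *mode, const char *text)
{
    FILE *fp;
    long count = 0;

    assert(path != NULL && mode != NULL && text != NULL);
    fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) == EOF)
        return -1;

    fp = fopen(path, "rb");          /* binary: no translation on read */
    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)
        count++;
    fclose(fp);
    return count;
}
```

Calling stored_size("eol_demo.tmp", "w", "one\ntwo\n") would typically give 10 on Windows, because each '\n' is stored as CR LF, and 8 on Unix-like systems, where text and binary streams behave the same. With mode "wb" the usual result is 8 on both.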
Whitesmiths,
Ltd. introduced the text/binary dichotomy in 1978 when porting C to
dozens of non-Unix systems, and other companies did much the same thing
in the coming years. It was a slam dunk to put it in the draft C Standard
begun in 1983.


I've often heard in c.l.c. some systems (Mainframes) had complicated
internal representation of text files. Why was it that way? What did
it solve? Why couldn't they be replaced with simple "binary" files with
'\n' as record separator?


They could have. They weren't.

Historically, files on mainframes were typically stacks of 80-column
punch cards. The complex internal representations of text files were
based on that. (I'm not very familiar with this, so I could be
mistaken.) Changing to a Unix-style format would break compatibility.
IMHO how text is represented could be viewed as a per-application
convention rather than system-wide. `Sendmail' doesn't have to read
`inetd' configuration files and v.v., so there is no reason why they
should follow the same text representation convention. It means that
on a system several (or even unlimited) conventions might be present.
That sounds like a nightmare. Do I have to have one version of vi or
emacs to read sendmail config files and another to read indentd config
files?
Why is there in the C language room only for one type of text stream?
Why does there need to be more than one?
There is only one "binary" file (bytes are stored in the file exactly as
written to; no translation is done). Why isn't the default open mode
(such as "r+") binary? It seems to me more natural to have settled it
this way.


There is no default open mode. You always have to specify whether
you're opening the file in text or binary mode. You specify binary
mode by including a 'b' in the mode argument; you specify text mode by
not including a 'b' in the mode argument.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
May 21 '06 #63
In article <4d*************@individual.net> "S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> writes:
....
I've often heard in c.l.c. some systems (Mainframes) had complicated
internal representation of text files. Why was it that way? What did
it solve? Why couldn't they be replaced with simple "binary" files with
'\n' as record separator?


Because they also did not have simple "binary" files. A "binary" file
consisted of (for instance) fixed length records of (say) 80 bytes
(whatever the size of a byte). This conformed to the Fortran and
Cobol models (also for text files), with only an implicit record
separator. And if there were variable length records available, they
were either represented by a length preceding the content or (on the
CDC Cyber) as a sequence of words, each containing 10 6-bit bytes or
5 12-bit bytes, where the last word in the sequence contained 12 zero
bits in the low order part. The ultimate reason was that I/O was
record oriented, because of speed.
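
To make the "length preceding the content" idea concrete, here is a small sketch in C. The one-byte length prefix and the names put_record and get_record are assumptions chosen for illustration, not the layout of any particular mainframe file system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append one record at offset *pos in buf: a 1-byte length, then the data.
   Returns 0 on success, -1 if the record is too long or does not fit. */
int put_record(unsigned char *buf, size_t cap, size_t *pos,
               const void *data, size_t len)
{
    assert(buf != NULL && pos != NULL && data != NULL);
    if (len > 255 || *pos > cap || cap - *pos < len + 1)
        return -1;
    buf[(*pos)++] = (unsigned char)len;
    memcpy(buf + *pos, data, len);
    *pos += len;
    return 0;
}

/* Copy the record at offset *pos into out (capacity outcap) and advance.
   Returns the record length, or -1 at end of data or on a bad buffer. */
int get_record(const unsigned char *buf, size_t used, size_t *pos,
               void *out, size_t outcap)
{
    size_t len;

    assert(buf != NULL && pos != NULL && out != NULL);
    if (*pos >= used)
        return -1;
    len = buf[*pos];
    if (used - *pos - 1 < len || len > outcap)
        return -1;
    memcpy(out, buf + *pos + 1, len);
    *pos += len + 1;
    return (int)len;
}
```

No separator character is reserved in such a scheme, so a record may contain any byte value, including '\n'. A C library on a system like that has to map between records and the newline-terminated lines that a text stream presents.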
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
May 21 '06 #64
On Sat, 20 May 2006 22:53:22 +0000, S.Tobias wrote:
I've often heard in c.l.c. some systems (Mainframes) had complicated
internal representation of text files. Why was it that way? What did
it solve? Why couldn't they be replaced with simple "binary" files with
'\n' as record separator?


If I can be informatively flippant: the problem was not a complicated
*internal* representation, but of a dominant *external* one -- punched
cards. In a world of cards, why would one waste one of the precious 72
character spaces (the last 8 were often reserved for sequence numbering)
for a marker to show the end of something as obvious as the end of
the card? It ended -- the computer got a signal that the card was done. What
was the point of a marker?

In that world, \n (and \r and \0 used to pad the output) were seen as
control characters sent only to a printer so that it would advance the
paper and re-position the head (if it had one!).

Another consequence was that spaces did not really exist (or more
precisely that there were a lot of implicit ones). Most card punches
(if I remember right) punched nothing where there was a space so you could
not tell where the "line" ended, except for the obvious: after 80
characters (or 72 if you were stripping sequence numbers).
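
A sketch in C of what turning such a card image into a text line could look like, assuming 80-column records with the sequence number in columns 73-80 as described above. The name card_to_line is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CARD_COLS 80   /* columns on one card */
#define TEXT_COLS 72   /* columns 73-80 assumed to hold a sequence number */

/* Convert one card image to a C string: drop the sequence-number columns,
   trim the trailing blanks, and append '\n'.
   line needs room for TEXT_COLS + 2 chars. Returns the string length. */
size_t card_to_line(const char card[CARD_COLS], char *line)
{
    size_t len = TEXT_COLS;

    assert(card != NULL && line != NULL);
    while (len > 0 && card[len - 1] == ' ')
        len--;                 /* unpunched columns look like blanks */
    memcpy(line, card, len);
    line[len++] = '\n';
    line[len] = '\0';
    return len;
}
```

The trimming loop is where information is lost: blanks that the author really typed at the end of a line cannot be told apart from columns that were simply never punched.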

--
Ben.
May 21 '06 #65
Keith Thompson said:
Do I have to have one version of vi or
emacs to read sendmail config files and another to read indentd config
files?


The indentd daemon (after just a quick pint of config down at the /etc)
faithfully chunters along in the background, waiting and watching for that
momentous occasion when the user decides to run indent, a thin client which
opens a connection to the daemon, hurls the C code down it, and says
"whaddya mek o' that, then, laddie?"

Nothing daunted, indentd bravely catches the code, and turns it from IOCCC
material into something approximately approaching readability. Handing it
back to the client with a smart salute and a "have a nice day", indentd
awaits the next urgent case of mangled layout, knowing that every readable
program it produces is another victory for God, Queen, and country.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
May 21 '06 #66
CBFalconer <cb********@yahoo.com> wrote:
"P.J. Plauger" wrote:
If you try to live with just binary mode, then every program either has
to map text files for itself or tolerate a broad assortment of rules for
delimiting text lines. There's precedent for the latter approach too
(see, for example, Java), but Unix gives a powerful precedent for
having a uniform internal convention for representing text streams.


However the user should be aware that everything breaks down if the
input system tries to handle a file as text when that file doesn't
adhere to the conventions for text on the system.


So, as the doctor said to the man who complained that his arm hurt when
he hit his elbow against the wall, Don't Do That, Then. That's why we
have FTP in A mode.

Richard
May 22 '06 #67
Keith Thompson <ks***@mib.org> wrote:
"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> writes:
P.J. Plauger <pj*@dinkumware.com> wrote:
"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> wrote in message
news:4c*************@individual.net...
....
All right. But besides that, is there any advantage that text files/mode
offer that binary files/mode don't have?
Um, yes. Text files represent text.

Well, binary files can contain text, too.
Suppose I'm serializing data into a textual representation to be read on
another system (with the same charset). Does it matter whether I open
the file in text or binary mode?


Absolutely. For example, as I'm sure you know, Windows represents an
end-of-line by two characters, a CR followed by an LF ('\r' followed

[...]

I wasn't clear enough, I have to restate the problem. Suppose the file
is not meant for interaction with other system tools (editors), but is
a means of transferring the results to another instance of a similar
program that will continue calculations (ie. the writing mode is known).

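#include <stdio.h>

#define BINM "b"   /* or "" to open in text mode */

double calculate(void);          /* defined elsewhere */
void cont_calculation(double);   /* defined elsewhere */

int main(void)
{
    double result = calculate();
    FILE *fp = fopen("results.txt", "w" BINM);
    fprintf(fp, "%f\n", result);
    fclose(fp);

    fp = fopen("results.txt", "r" BINM);
    fscanf(fp, "%lf", &result);  /* %lf, not %f, when reading a double */
    fclose(fp);
    cont_calculation(result);
    return 0;
}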

Will it matter if BINM is #defined as "b" or as nothing?
Can binary mode replace the text mode in this way? I'm reading from
the Standard that nul characters may be appended to a binary stream;
could this cause problems if I want to handle binary stream in text
manner, like in the above sketch? (Will append mode + multiple closing
and opening work correctly?)
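
Whichever mode is chosen, a textual representation only lets the second program "continue calculations" faithfully if the digits written are enough to recover the same value. A minimal sketch in C, assuming IEEE 754 doubles (for which 17 significant digits suffice); the names double_to_text and text_to_double are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Format value into buf with 17 significant digits followed by '\n'.
   Returns 0 on success, -1 if buf is too small or formatting fails. */
int double_to_text(double value, char *buf, size_t cap)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, cap, "%.17g\n", value);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

/* Parse a double from text produced by double_to_text.
   Returns 0 on success, -1 if no number could be read. */
int text_to_double(const char *buf, double *value)
{
    char *end;
    double v;

    assert(buf != NULL && value != NULL);
    v = strtod(buf, &end);
    if (end == buf)
        return -1;
    *value = v;
    return 0;
}
```

The same "%.17g" conversion can be passed to fprintf, with "%lf" or strtod on the reading side. By contrast "%f" keeps only six digits after the decimal point, so a value such as 1.0/3.0 would not come back unchanged.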
Whitesmiths,
Ltd. introduced the text/binary dichotomy in 1978 when porting C to
dozens of non-Unix systems, and other companies did much the same thing
in the coming years. It was a slam dunk to put it in the draft C Standard
begun in 1983.
.... IMHO how text is represented could be viewed as a per-application
convention rather than system-wide. `Sendmail' doesn't have to read
`inetd' configuration files and v.v., so there is no reason why they
should follow the same text representation convention. It means that
on a system several (or even unlimited) conventions might be present.


That sounds like a nightmare. Do I have to have one version of vi or
emacs to read sendmail config files and another to read indentd config
files?

Or the editors would have to be able to read multiple text formats.
IIRC, Windows XP WordPad and Notepad can save to plain text, RTF, Unicode
and Utf-8 formats. Can't we consider all these formats as text formats?

It's not that uncommon that special configuration files have dedicated
editors, eg.: vipw, vigr, visudo (however, for different reasons than
text file format convention).
Why is there in the C language room only for one type of text stream?
I was wrong here, actually C does not preclude multiple text modes.
One could specify one like this, as an extension: "r+:DOStext".
There is only one "binary" file (bytes are stored in the file exactly as
written to; no translation is done). Why isn't the default open mode
(such as "r+") binary? It seems to me more natural to have settled it
this way.


There is no default open mode. You always have to specify whether
you're opening the file in text or binary mode. You specify binary
mode by including a 'b' in the mode argument; you specify text mode by
not including a 'b' in the mode argument.

For me the main difference between binary and text modes is that the
first is untranslated and the other is translated. You can "not do"
something only in one way, therefore I feel it would have been more
logical to have the default (not including mode spec in the argument)
binary.

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
May 23 '06 #68

Richard Heathfield wrote:
osmium said:
"P.J. Plauger" writes:
I'll be damned! In Note 2, they defined byte very precisely as a word
that simply means a collection of contiguous bits. They took a widely
used word, that meant something to hundreds of thousands of people and
redefined it to mean something entirely different.

There are about 30 definitions of byte that make the cut on google, and
the *vast* majority say a byte is eight bits.

We forgot to do a web search before we chose that terminology in 1983.


I appreciate your sarcasm and have no desire to argue with anyone - and
most certainly not with you.

But wasn't the word byte pretty much introduced into the world by the IBM
360 in 1964 or thereabouts?


Knuth says that the 8-bit "standardisation" happened in around 1975 or so.
By then, C was already well under way, and dmr was almost certainly
accustomed to using the word in its non-"standard" sense.


Speaking as someone who worked on System/360's, and other computers,
during the 1960's, the word byte was already established as meaning an
8-bit quantity during that time. For sure, there were machines with
other byte sizes, but those were explicitly qualified - "seven-bit
byte", or whatever. In the absence of any indication otherwise, byte always
meant 8 bits, even when C was growing up.
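
For what it is worth, C itself sidesteps the historical argument: a byte is the storage of one char, it has CHAR_BIT bits, and the standard requires CHAR_BIT to be at least 8. A tiny sketch (the helper name bits_in is made up for illustration):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of bits occupied by an object of `bytes` bytes on this
   implementation, where "byte" has the C meaning of CHAR_BIT bits. */
size_t bits_in(size_t bytes)
{
    assert(bytes <= (size_t)-1 / CHAR_BIT);   /* guard against overflow */
    return bytes * CHAR_BIT;
}
```

On most current hardware bits_in(1) is 8, but code that wants to stay portable would use CHAR_BIT rather than the literal 8.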

May 27 '06 #69
