getc and ungetc

Would getc and ungetc be the best and simplest way to parse
expressions for a parser?

Bill

Nov 14 '05 #1
Quoth Bill Cunningham on or about 2004-11-18:
> Would getc and ungetc be the best and simplest way to parse
> expressions for a parser?


ungetc is only guaranteed to work once `in a row' (i.e. without an
intervening read); the standard promises a single character of pushback.
I suspect that makes it a rather unsuitable function for a parser.
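
For illustration, here is a minimal sketch of the one-character lookahead that getc and ungetc do support portably. The helper name peekc is made up for this example. It never calls ungetc twice without a read in between, so it stays within the single character of pushback that the standard guarantees.

```c
#include <assert.h>
#include <stdio.h>

/* peekc (hypothetical helper): return the next character of fp
 * without consuming it, or EOF if there is none. */
int peekc(FILE *fp)
{
    int c;

    assert(fp != NULL);
    c = getc(fp);
    if (c != EOF)
        ungetc(c, fp);   /* push it back so the next read sees it again */
    return c;
}
```

A parser that needs to look further ahead than this has to keep its own buffer.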

-t
Nov 14 '05 #2
Bill Cunningham wrote:
| Would getc and ungetc be the best and simplest way to parse
| expressions for a parser?
|

It's usually easier to do what K&R did in "The C Programming Language"
(in chapter 4, where they build the RPN calculator with getch() and
ungetch()): read in a block of text, and then define your own get/unget
functions to pop and push characters off and on that buffer. You can
read in more text (a block at a time, for efficiency and convenience)
when your get function tries to read beyond the end of your buffer.

You need to write a bit more code yourself, but it's fairly trivial code
and you can reuse it in any other parser you write.
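
A rough sketch of that idea follows. It is not K&R's code: the names struct reader, reader_init, reader_getch and reader_ungetch, and the 4096-byte block size, are all assumptions made for this example. One simplification to be aware of: it can only step back within the block currently held in the buffer.

```c
#include <assert.h>
#include <stdio.h>

#define BLOCK_SIZE 4096   /* assumed block size for this sketch */

struct reader {
    FILE *fp;
    unsigned char buf[BLOCK_SIZE];
    size_t len;   /* number of valid bytes in buf */
    size_t pos;   /* index of the next byte to hand out */
};

void reader_init(struct reader *r, FILE *fp)
{
    assert(r != NULL && fp != NULL);
    r->fp = fp;
    r->len = 0;
    r->pos = 0;
}

/* Return the next character, refilling the buffer a block at a time. */
int reader_getch(struct reader *r)
{
    if (r->pos == r->len) {
        size_t n = fread(r->buf, 1, sizeof r->buf, r->fp);
        if (n == 0)
            return EOF;   /* end of input (or read error) */
        r->len = n;
        r->pos = 0;
    }
    return r->buf[r->pos++];
}

/* Step back one character; returns it, or EOF if already at the
 * start of the current block. */
int reader_ungetch(struct reader *r)
{
    if (r->pos == 0)
        return EOF;
    return r->buf[--r->pos];
}
```

Because stepping back is just a decrement of pos, a parser can back up over as many characters as it has read from the current block.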
Nov 14 '05 #3
"Trent Buck" <NO************ @bigpond.com> wrote
> > Would getc and ungetc be the best and simplest way to parse
> > expressions for a parser?
>
> ungetc can't be applied more than once `in a row' (i.e. sequentially).
> I suspect that makes for a rather unsuitable function for a parser.

It depends on your parser design.
Most simple parsers used for things like computer languages divide the input
stream into tokens, and then parse from left to right with one token of
"lookahead". So if your tokens are single characters then getc() and ungetc()
may be adequate.
Of course if you have ambitions to build a natural language parser, or even
a parser for some simple grammars with unusual characteristics, then this
scheme won't work, and you will need some method of scanning back and forth
over many tokens of input.
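
As a rough sketch of the single-character-token case: the evaluator below parses left to right with one token of lookahead, using nothing but getc() and ungetc(). The grammar (single digits, '+', '-', '*' and parentheses, no whitespace) and the function names are assumptions chosen for this example, and malformed input is only caught by assert.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

int parse_expr(FILE *fp);

/* factor := digit | '(' expr ')' */
int parse_factor(FILE *fp)
{
    int c = getc(fp);

    if (c == '(') {
        int v = parse_expr(fp);
        c = getc(fp);            /* consume the closing parenthesis */
        assert(c == ')');
        return v;
    }
    assert(isdigit(c));
    return c - '0';
}

/* term := factor ('*' factor)* */
int parse_term(FILE *fp)
{
    int v = parse_factor(fp);
    int c;

    while ((c = getc(fp)) == '*')
        v *= parse_factor(fp);
    if (c != EOF)
        ungetc(c, fp);           /* the lookahead token belongs to the caller */
    return v;
}

/* expr := term (('+' | '-') term)* */
int parse_expr(FILE *fp)
{
    int v = parse_term(fp);
    int c;

    while ((c = getc(fp)) == '+' || c == '-') {
        int rhs = parse_term(fp);
        v = (c == '+') ? v + rhs : v - rhs;
    }
    if (c != EOF)
        ungetc(c, fp);
    return v;
}
```

Every ungetc() here is followed by a getc() in the caller before the next pushback, so one character of pushback is enough.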
Nov 14 '05 #4
# Of course if you have ambitions to build a natural language parser, or even
# some simple grammars with unusual characteristics, then this scheme won't
# work, and you will need some method of scanning up and down many tokens on
# input.

Such a parser runs the risk of exponential running time, while a tabular
parser doesn't need to back up and has cubic time worst case. ungetc has
at most marginal usability for some types of lexical scanners.

--
SM Ryan http://www.rawbw.com/~wyrmwif/
There are subtler ways of badgering a witness.
Nov 14 '05 #5

"SM Ryan" <wy*****@tang o-sierra-oscar-foxtrot-tango.fake.org> wrote in
message news:10******** *****@corp.supe rnews.com...
> # Of course if you have ambitions to build a natural language parser, or even
> # some simple grammars with unusual characteristics, then this scheme won't
> # work, and you will need some method of scanning up and down many tokens on
> # input.
>
> Such a parser runs the risk of exponential running time, while a tabular
> parser doesn't need to back up and has cubic time worst case. ungetc has
> at most marginal usability for some types of lexical scanners.

I have K&R2. What about redirecting fgetc to stdio and using it instead of
getc?

Bill
Nov 14 '05 #6

"Bill Cunningham" <no****@nspam.n et> wrote in message
news:cb******** *******@newsfe0 8.lga.highwinds-media.com...

"SM Ryan" <wy*****@tang o-sierra-oscar-foxtrot-tango.fake.org> wrote in
message news:10******** *****@corp.supe rnews.com...
> > # Of course if you have ambitions to build a natural language parser, or even
> > # some simple grammars with unusual characteristics, then this scheme won't
> > # work, and you will need some method of scanning up and down many tokens on
> > # input.
> >
> > Such a parser runs the risk of exponential running time, while a tabular
> > parser doesn't need to back up and has cubic time worst case. ungetc has
> > at most marginal usability for some types of lexical scanners.
>
> I have K&R2. What about redirecting fgetc to stdio and using it instead of
> getc?
>
> Bill


I've heard of top down recursive parsers.
Bill
Nov 14 '05 #7
On Thu, 18 Nov 2004 20:21:41 UTC, "Bill Cunningham" <no****@nspam.net>
wrote:
> Would getc and ungetc be the best and simplest way to parse
> expressions for a parser?


Yes. But be aware that ungetc is only guaranteed to unget ONE char at a time.

On the other hand, you can simply extend getc and ungetc, by macro or
function, to unget more than one char. Your UNGETC() would accept any
number of chars, and your GETC() would give them back in reverse order.
Your GETC() returns the chars UNGETC() has received before it gets new
chars from the stream itself.
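
A minimal sketch of such a pair, written as functions. The capacity of 64 and the single shared stack are assumptions for this example; a real version would probably keep one stack per stream.

```c
#include <assert.h>
#include <stdio.h>

#define PUSHBACK_MAX 64   /* assumed capacity for this sketch */

static int pushed[PUSHBACK_MAX];   /* stack of pushed-back chars */
static int npushed = 0;            /* how many are on the stack */

/* UNGETC: push c back. Returns c, or EOF if c is EOF or the
 * stack is full. */
int UNGETC(int c)
{
    if (c == EOF || npushed == PUSHBACK_MAX)
        return EOF;
    pushed[npushed++] = c;
    return c;
}

/* GETC: hand back pushed chars, most recent first, and only then
 * read new chars from the stream. */
int GETC(FILE *fp)
{
    assert(fp != NULL);
    return (npushed > 0) ? pushed[--npushed] : getc(fp);
}
```

For example, after reading 'x' and 'y' you can call UNGETC('y') and then UNGETC('x'), and the next two GETC() calls return 'x' and 'y' again.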

A slightly tricky technique is to unget a char that you never actually
read. This is legal: nothing forbids it, and ungetc takes the char to
push back as an argument. Be sure that you do NOT try to unget EOF; this
won't work. Ungetting a char you never read can be very useful when you
have a long list of keyword delimiters that mean the same thing for a
long list of similar keywords.

Your parser may convert keywords or keychars into tokens and save the
tokens until all or part of the input stream has been read, and then work
on the generated tokens, or it may work in whatever other way you think
matches your requirements.

getc() (in conjunction with ungetc()) gives you character-by-character
control over the stream, which is as fine-grained as you will ever need.
When needed you can simply count the number of chars, words, lines...
read in as a side effect, and reset these counters as needed. You avoid
having to manage buffers yourself: you don't need one.

Build your parser as a state machine and you can reuse the same code
again and again, apart from the small number of statements you need to
handle a specific (sub)state. You get high flexibility, as you can
easily extend the functionality of the parser by creating a new
(sub)state. That makes maintenance easy.
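
To make the last two points concrete, here is a small sketch of a getc() loop written as a two-state machine that counts chars, words and lines as a side effect of reading. The struct, the state names and count_stream are all made up for this example, and "word" is assumed to mean a run of non-whitespace chars.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

struct counts {
    long chars;
    long words;
    long lines;
};

enum state { BETWEEN_WORDS, IN_WORD };

/* Read fp to the end, one char at a time, and count as we go. */
struct counts count_stream(FILE *fp)
{
    struct counts n = {0, 0, 0};
    enum state st = BETWEEN_WORDS;
    int c;

    assert(fp != NULL);
    while ((c = getc(fp)) != EOF) {
        n.chars++;
        if (c == '\n')
            n.lines++;
        switch (st) {
        case BETWEEN_WORDS:
            if (!isspace(c)) {      /* first char of a new word */
                st = IN_WORD;
                n.words++;
            }
            break;
        case IN_WORD:
            if (isspace(c))         /* the word just ended */
                st = BETWEEN_WORDS;
            break;
        }
    }
    return n;
}
```

Adding a new (sub)state, say for quoted strings, means adding one enum value and one case without touching the read loop.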

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation

Nov 14 '05 #8

"Bill Cunningham" <no****@nspam.n et> wrote

> I've heard of top down recursive parsers.

What is important in terms of the tokenizer is that they be left-to-right
with only one token of lookahead. Most practical parsers are in this class.

Unfortunately, using getc / ungetc as the parser's lookahead means that the
tokens are constrained to be single characters. This may be OK for a very
simple application, but if the tokens are naturally several characters long
it will be a nuisance. You can do it: for instance, if you have a
mathematical function tan() then you could build it up from the letters 't',
'a' and 'n' rather than reading it in as a single token TAN. But it is
generally a lot easier to do the lexical analysis before the parse proper.
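
A minimal sketch of such a separate lexical pass, under some assumptions: the token kinds, struct token, its 32-byte text field and next_token are all invented for this example. getc() and ungetc() are still used underneath, but only to find where each multi-character token ends, which needs just one character of pushback.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

enum token_kind { TOK_EOF, TOK_NUMBER, TOK_IDENT, TOK_TAN, TOK_PUNCT };

struct token {
    enum token_kind kind;
    char text[32];   /* longer tokens are truncated in this sketch */
};

/* Read one token from fp, skipping leading whitespace. */
struct token next_token(FILE *fp)
{
    struct token t = { TOK_EOF, "" };
    size_t n = 0;
    int c;

    assert(fp != NULL);
    while ((c = getc(fp)) != EOF && isspace(c))
        ;
    if (c == EOF)
        return t;

    if (isalpha(c)) {
        do {
            if (n < sizeof t.text - 1)
                t.text[n++] = (char)c;
            c = getc(fp);
        } while (c != EOF && isalpha(c));
        if (c != EOF)
            ungetc(c, fp);   /* first char after the word */
        t.text[n] = '\0';
        t.kind = (strcmp(t.text, "tan") == 0) ? TOK_TAN : TOK_IDENT;
    } else if (isdigit(c)) {
        do {
            if (n < sizeof t.text - 1)
                t.text[n++] = (char)c;
            c = getc(fp);
        } while (c != EOF && isdigit(c));
        if (c != EOF)
            ungetc(c, fp);   /* first char after the number */
        t.text[n] = '\0';
        t.kind = TOK_NUMBER;
    } else {
        t.text[0] = (char)c;
        t.text[1] = '\0';
        t.kind = TOK_PUNCT;
    }
    return t;
}
```

The parser proper then calls next_token() and works with TOK_TAN as a single token, keeping its own one-token lookahead if it needs one.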
Nov 14 '05 #9
