the de-facto way to "parse" input

Hi all,

I am trying my hand at writing a shell for Unix. A very rubbish
shell, but nonetheless, I have come to a point where I am confused.

I would like to have something like

shell> stop xyz

whereupon the command "stop" will take the argument "xyz" and perform
foo action on it. What in your opinion is the best (easiest?) way to
validate the input (perhaps iterate over a "valid commands" table),
and what calls would you use? (getc(), scanf(), a big while loop and
pointer arithmetic, ...)

I hope this question is not too ambiguous.

many thanks
kb
Jun 27 '08 #1
Krumble Bunk wrote:
> I am trying my hands at writing a shell for unix. A very rubbish
> shell, but nonetheless, I come to a point where I am confused.
>
> I would like to have something like
>
> shell> stop xyz
>
> whereupon the command "stop" will take the argument "xyz" and perform
> foo action on it. What in your opinion is the best (easiest?) way to
> validate the input (perhaps iterate over a "valid commands" table),
> and what calls would you use? (getc(), scanf(), a big while loop and
> pointer arithmetic, ...)

I'd use `fgets` to read the line, expanding the buffer as necessary,
carve the line up into space-separated chunks (if we're doing a rubbish
shell, we won't worry about quoting ...) and then I can look the first
chunk up in a table.
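
A minimal sketch of the "carve up and look up" step, assuming the line has already been read into a writable buffer. The names `split_line`, `find_command`, `do_stop` and `do_help`, the handler signature and the 16-token limit are all made up for illustration; nothing here handles quoting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKENS 16   /* arbitrary limit, chosen for the sketch */

/* Hypothetical handler type: receives the token count and the tokens. */
typedef int (*handler_fn)(int argc, char **argv);

struct command {
    const char *name;
    handler_fn  handler;
};

/* Split line in place on blanks; returns the number of tokens stored.
   Tokens beyond max_tokens are silently dropped. */
static int split_line(char *line, char **tokens, int max_tokens)
{
    int count = 0;
    char *tok;

    assert(line != NULL && tokens != NULL && max_tokens > 0);
    for (tok = strtok(line, " \t\n");
         tok != NULL && count < max_tokens;
         tok = strtok(NULL, " \t\n")) {
        tokens[count++] = tok;
    }
    return count;
}

/* Linear search of the table; returns NULL for an unknown command. */
static const struct command *find_command(const struct command *table,
                                          size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    }
    return NULL;
}

/* Placeholder handlers standing in for the real actions. */
static int do_stop(int argc, char **argv)
{
    (void)argv;
    return argc == 2 ? 0 : 1;   /* "stop" expects exactly one argument */
}

static int do_help(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return 0;
}

static const struct command commands[] = {
    { "stop", do_stop },
    { "help", do_help },
};
```

With a table like this, validating the input amounts to checking that `find_command` did not return NULL and then letting the handler check its own arguments.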

If we want something less rubbish, I'd write a recursive-descent
parser for commands. That would force me to be explicit about the
grammar I'm using and what my tokens are supposed to be. I'd build
an abstract-syntax tree for the commands; on no account would I
try and execute them while parsing them.
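
As a rough sketch of what that could look like, here is a recursive-descent parser for a deliberately tiny grammar that I am assuming for illustration: `pipeline := command ('|' command)*` and `command := WORD WORD*`. The node layout, the `MAX_ARGS` limit and all function names are hypothetical. The parser only builds the tree; nothing is executed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 8

enum node_kind { NODE_COMMAND, NODE_PIPE };

struct node {
    enum node_kind kind;
    int argc;                      /* NODE_COMMAND: number of words */
    char *argv[MAX_ARGS + 1];      /* NODE_COMMAND: NULL-terminated words */
    struct node *left, *right;     /* NODE_PIPE: left | right */
};

static struct node *new_node(enum node_kind kind)
{
    struct node *n = malloc(sizeof *n);

    if (n != NULL) {
        n->kind = kind;
        n->argc = 0;
        n->argv[0] = NULL;
        n->left = n->right = NULL;
    }
    return n;
}

static void free_node(struct node *n)
{
    int i;

    if (n == NULL)
        return;
    for (i = 0; i < n->argc; i++)
        free(n->argv[i]);
    free_node(n->left);
    free_node(n->right);
    free(n);
}

/* Advance *pp past blanks and return the character found there. */
static char peek(const char **pp)
{
    *pp += strspn(*pp, " \t\n");
    return **pp;
}

/* command := WORD WORD*   (a WORD is a run of non-blank, non-'|' chars) */
static struct node *parse_command(const char **pp)
{
    struct node *n = new_node(NODE_COMMAND);

    while (n != NULL && peek(pp) != '\0' && **pp != '|') {
        size_t len = strcspn(*pp, " \t\n|");
        char *word = n->argc < MAX_ARGS ? malloc(len + 1) : NULL;
        if (word == NULL) {             /* too many words or out of memory */
            free_node(n);
            return NULL;
        }
        memcpy(word, *pp, len);
        word[len] = '\0';
        n->argv[n->argc++] = word;
        n->argv[n->argc] = NULL;
        *pp += len;
    }
    if (n != NULL && n->argc == 0) {    /* a command needs at least one word */
        free_node(n);
        return NULL;
    }
    return n;
}

/* pipeline := command ('|' command)*   (left-associative).
   Returns NULL on a syntax error, empty input or allocation failure. */
static struct node *parse_pipeline(const char *line)
{
    struct node *left;

    assert(line != NULL);
    left = parse_command(&line);
    while (left != NULL && peek(&line) == '|') {
        struct node *right, *joined;
        line++;                         /* consume '|' */
        right = parse_command(&line);
        joined = right != NULL ? new_node(NODE_PIPE) : NULL;
        if (joined == NULL) {
            free_node(left);
            free_node(right);
            return NULL;
        }
        joined->left = left;
        joined->right = right;
        left = joined;
    }
    return left;
}
```

Each grammar rule maps to one function, which is what makes the grammar explicit. A separate pass can then walk the tree to execute or print it.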

And I'd write unit tests. Lots of unit tests. And get something
working end-to-end as soon as possible. (Because it's very
disheartening spending a day or more writing a Super Duper
Program That Does It All, and then spending a week or more
debugging it until it does /something/, as opposed to writing
the smallest program one can manage that recognisably does
something right. Like, read a command line in, and print out
the tokens, /and do nothing else/.)
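
The "read a line, print the tokens" core might be sketched as below. `read_line` grows its buffer so that long lines are not truncated; both function names and the starting capacity of 64 are assumptions of this sketch. Because each function takes a `FILE *`, it can be unit-tested against a temporary file rather than a terminal.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one line of any length from fp. Returns a malloc'd string without
   the trailing newline, or NULL at end of input or on allocation failure.
   The caller frees the result. */
static char *read_line(FILE *fp)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);

    assert(fp != NULL);
    if (buf == NULL)
        return NULL;
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        char *bigger;

        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';
            return buf;
        }
        /* No newline yet: the buffer is full or input is ending; grow. */
        bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }
    if (len > 0)
        return buf;         /* last line had no trailing newline */
    free(buf);
    return NULL;
}

/* Print each blank-separated token on its own line and return the count.
   Modifies line in place; does nothing else. */
static int print_tokens(char *line, FILE *out)
{
    int count = 0;
    char *tok;

    assert(line != NULL && out != NULL);
    for (tok = strtok(line, " \t"); tok != NULL; tok = strtok(NULL, " \t")) {
        fprintf(out, "token %d: %s\n", count, tok);
        count++;
    }
    return count;
}
```

A driver is then just a loop that calls `read_line(stdin)` until it returns NULL, passes each line to `print_tokens`, and frees it.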

And likely throw away the first attempt, as a learning exercise.

--
"I don't make decisions. I'm a bird." /A Fine and Private Place/

Hewlett-Packard Limited registered office: Cain Road, Bracknell,
registered no: 690597 England Berks RG12 1HN

Jun 27 '08 #2
On Jun 11, 2:46 pm, Chris Dollin <chris.dol...@hp.com> wrote:
Krumble Bunk wrote:

[.....]

the tokens, /and do nothing else/.)

And likely throw away the first attempt, as a learning exercise.

--
"I don't make decisions. I'm a bird." /A Fine and Private Place/

Hewlett-Packard Limited Cain Road, Bracknell, registered no:
registered office: Berks RG12 1HN 690597 England

Very good advice - I will investigate using lex/yacc.

thanks

kb
Jun 27 '08 #3
"Krumble Bunk" <kr*********@gmail.com> wrote in message news
> On Jun 11, 2:46 pm, Chris Dollin <chris.dol...@hp.com> wrote:
>> Krumble Bunk wrote:
>>
>> [.....]
>>
>> the tokens, /and do nothing else/.)
>>
>> And likely throw away the first attempt, as a learning exercise.
>>
>> --
>> "I don't make decisions. I'm a bird." /A Fine and Private Place/
>>
>> Hewlett-Packard Limited Cain Road, Bracknell, registered no:
>> registered office: Berks RG12 1HN 690597 England
>
> Very good advice - I will investigate using lex/yacc.

You could also check out MiniBasic, on my website. Essentially writing a
mini-language for a shell is the same as writing a Basic interpreter, except
expressions consist of pipes and globs and redirections more often than
arithmetical operators. Most shells even have their own looping constructs.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Jun 27 '08 #4
On Jun 11, 7:12 pm, Krumble Bunk <krumbleb...@gmail.com> wrote:
On Jun 11, 2:46 pm, Chris Dollin <chris.dol...@hp.com> wrote:
Krumble Bunk wrote:

[.....]
the tokens, /and do nothing else/.)
And likely throw away the first attempt, as a learning exercise.
--
"I don't make decisions. I'm a bird." /A Fine and Private Place/
Hewlett-Packard Limited Cain Road, Bracknell, registered no:
registered office: Berks RG12 1HN 690597 England

Very good advice - I will investigate using lex/yacc.

thanks

kb
lex is by and large the de facto way of tokenizing. I believe gcc makes
extensive use of lex/yacc (or maybe flex/bison, but that does not
make a hell of a difference).
Jun 27 '08 #5

"Chris Dollin" <ch**********@hp.com> wrote in message
news:g2**********@news-pa1.hpl.hp.com...
> Krumble Bunk wrote:
>> I am trying my hands at writing a shell for unix. A very rubbish
>> shell, but nonetheless, I come to a point where I am confused.
>>
>> I would like to have something like
>>
>> shell> stop xyz
>
> If we want something less rubbish, I'd write a recursive-descent
> parser for commands. That would force me to be explicit about the
> grammar I'm using and what my tokens are supposed to be. I'd build
> an abstract-syntax tree for the commands; on no account would I
> try and execute them while parsing them.

An AST for a simple command-line interpreter?

How complex would this shell have to be to make this worthwhile?

> (Because it's very disheartening spending a day or more writing

Might be more disheartening to start a huge project that doesn't finish
because it's overspecified.

I would have suggested starting with something like the following,
replacing the system() call (and perhaps adjusting the parameter) with the
local equivalent. This would require the handler for each 'command' to be a
separate C program, but has the advantage of the parameters being already
separated.

A bit more work and the commands and parameters can be identified and
executed in the same program.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define llength 1000

void main()
{
    char line[llength];
    size_t n;

    puts("Type exit to exit.");
    puts("");

    while (1) {
        printf("Prompt>");
        fflush(stdout);

        /* stop at end of input or on a read error */
        if (fgets(line, llength, stdin) == NULL) break;

        n = strlen(line);   /* get rid of troublesome trailing \n */
        if (n > 0 && line[n-1] == '\n') line[n-1] = 0;

        if (strcmp(line, "exit") == 0) break;

        /* hand any non-empty line to the host command interpreter */
        if (line[0])
            system(line);
    }
}
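
On a Unix system, one common "local equivalent" of system() is fork() followed by execvp() and waitpid(). The sketch below assumes POSIX and that the line has already been split into a NULL-terminated argument vector; `run_command` is a made-up name, and returning -1 for a child that did not exit normally is a choice made here for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given arguments and wait for it to finish.
   Returns the child's exit status, or -1 if the child could not be
   created or did not exit normally (e.g. it was killed by a signal). */
static int run_command(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: execvp searches PATH and only returns on failure. */
        execvp(argv[0], argv);
        perror(argv[0]);
        _exit(127);         /* conventional "command not found" status */
    }
    /* Parent: wait for this particular child. */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Unlike system(), this does not pass the line through /bin/sh, so pipes, globs and redirections would have to be handled by the shell being written.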

--
Bartc
Jun 27 '08 #6
On Jun 13, 7:07 pm, "Bartc" <b...@freeuk.com> wrote:
"Chris Dollin" <chris.dol...@hp.com> wrote in message

news:g2**********@news-pa1.hpl.hp.com...
Krumble Bunk wrote:
I am trying my hands at writing a shell for unix. A very rubbish
shell, but nonetheless, I come to a point where I am confused.
I would like to have something like
shellstop xyz
If we want something less rubbish, I'd write a recursive-descent
parser for commands. That would force me to be explicit about the
grammar I'm using and what my tokens are supposed to be. I'd build
an abstract-syntax tree for the commands; on no account would I
try and execute them while parsing them.

An AST for a simple command-line interpreter?

How complex would this shell have to be to make this worthwhile?
(Because it's very disheartening spending a day or more writing

Might be more disheartening to start a huge project that doesn't finish
because it's overspecified.

I would have suggesting starting with something like the following,
replacing the system() call (and perhaps adjusting the parameter) with the
local equivalent. This would require the handler for each 'command' to be a
separate C program, but has the advantage of the parameters being already
separated.

A bit more work and the commands and parameters can be identified and
executed in the same program.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void main()
<snip>
Bartc, haven't you been here long enough to remember main returns int?
Jun 27 '08 #7
On Jun 14, 3:41 pm, vipps...@gmail.com wrote:
On Jun 13, 7:07 pm, "Bartc" <b...@freeuk.com> wrote:
"Chris Dollin" <chris.dol...@hp.com> wrote in message
news:g2**********@news-pa1.hpl.hp.com...
Krumble Bunk wrote:
>I am trying my hands at writing a shell for unix. A very rubbish
>shell, but nonetheless, I come to a point where I am confused.
>I would like to have something like
>shellstop xyz
If we want something less rubbish, I'd write a recursive-descent
parser for commands. That would force me to be explicit about the
grammar I'm using and what my tokens are supposed to be. I'd build
an abstract-syntax tree for the commands; on no account would I
try and execute them while parsing them.
An AST for a simple command-line interpreter?
How complex would this shell have to be to make this worthwhile?
>(Because it's very disheartening spending a day or more writing
Might be more disheartening to start a huge project that doesn't finish
because it's overspecified.
I would have suggesting starting with something like the following,
replacing the system() call (and perhaps adjusting the parameter) with the
local equivalent. This would require the handler for each 'command' to be a
separate C program, but has the advantage of the parameters being already
separated.
A bit more work and the commands and parameters can be identified and
executed in the same program.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void main()

<snip>
Bartc, haven't you been here long enough to remember main returns int?
Yes, but I suspect I didn't write that bit. Probably the remnants of a
copy&paste of someone else's code. Not my fault at all..

--
Bartc
Jun 27 '08 #8
Bartc wrote:
> "Chris Dollin" <ch**********@hp.com> wrote in message
> news:g2**********@news-pa1.hpl.hp.com...
>> Krumble Bunk wrote:
>>> I am trying my hands at writing a shell for unix. A very rubbish
>>> shell, but nonetheless, I come to a point where I am confused.
>>>
>>> I would like to have something like
>>>
>>> shell> stop xyz
>> If we want something less rubbish, I'd write a recursive-descent
>> parser for commands. That would force me to be explicit about the
>> grammar I'm using and what my tokens are supposed to be. I'd build
>> an abstract-syntax tree for the commands; on no account would I
>> try and execute them while parsing them.
>
> An AST for a simple command-line interpreter?

"something less rubbish" allows for something that isn't simple.

ASTs aren't complicated, even in C.
> How complex would this shell have to be to make this worthwhile?
Pipes, sequencing, commands. Brackets and built-in commands,
definitely.
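
To give a feel for the size of such a tree type, here is one possible node layout covering commands, pipes, sequencing and brackets, with a function that renders a tree back to text. The type, the field names and `ast_format` are invented for this sketch, and built-in commands are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum ast_kind { AST_COMMAND, AST_PIPE, AST_SEQUENCE, AST_GROUP };

struct ast {
    enum ast_kind kind;
    const char *const *argv;   /* AST_COMMAND: NULL-terminated words */
    const struct ast *left;    /* PIPE/SEQUENCE: first operand; GROUP: body */
    const struct ast *right;   /* PIPE/SEQUENCE: second operand */
};

/* Append s to the string in buf without exceeding cap bytes in total. */
static void append(char *buf, size_t cap, const char *s)
{
    size_t used = strlen(buf);

    if (used + 1 < cap)
        strncat(buf, s, cap - used - 1);
}

/* Append a shell-like rendering of the tree to buf (truncating if full). */
static void ast_format(const struct ast *n, char *buf, size_t cap)
{
    size_t i;

    assert(n != NULL && buf != NULL && cap > 0);
    switch (n->kind) {
    case AST_COMMAND:
        for (i = 0; n->argv[i] != NULL; i++) {
            if (i > 0)
                append(buf, cap, " ");
            append(buf, cap, n->argv[i]);
        }
        break;
    case AST_PIPE:
    case AST_SEQUENCE:
        ast_format(n->left, buf, cap);
        append(buf, cap, n->kind == AST_PIPE ? " | " : "; ");
        ast_format(n->right, buf, cap);
        break;
    case AST_GROUP:
        append(buf, cap, "( ");
        ast_format(n->left, buf, cap);
        append(buf, cap, " )");
        break;
    }
}
```

An executor would have the same shape as `ast_format`: one switch over the node kind, recursing into the operands.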
>>(Because it's very disheartening spending a day or more writing

> Might be more disheartening to start a huge project that doesn't finish
> because it's overspecified.
Where did "huge" come from? And "overspecified"?

--
"Tells of trouble and warns of change to come." /Lothlorien/

Hewlett-Packard Limited registered office: Cain Road, Bracknell,
registered no: 690597 England Berks RG12 1HN

Jun 27 '08 #9
On Jun 16, 8:04 am, Chris Dollin <chris.dol...@hp.com> wrote:
> Bartc wrote:
>> "Chris Dollin" <chris.dol...@hp.com> wrote in message
>> news:g2**********@news-pa1.hpl.hp.com...
>>> Krumble Bunk wrote:
>>>> I am trying my hands at writing a shell for unix. A very rubbish
>>>> shell, but nonetheless, I come to a point where I am confused.
>>>> I would like to have something like
>>>> shell> stop xyz
>>> If we want something less rubbish, I'd write a recursive-descent
>>> parser for commands.
>> An AST for a simple command-line interpreter?
>
> "something less rubbish" allows for something that isn't simple.
>
> ASTs aren't complicated, even in C.
>> How complex would this shell have to be to make this worthwhile?
>
> Pipes, sequencing, commands. Brackets and built-in commands,
> definitely.

I'm not familiar with unix shells. But I don't remember seeing
anything more complicated than a linear series of commands, filenames,
numbers and switches in Windows' shell. But then, maybe Windows' shell
is rubbish.
>(Because it's very disheartening spending a day or more writing
>> Might be more disheartening to start a huge project that doesn't finish
>> because it's overspecified.
>
> Where did "huge" come from? And "overspecified"?

OK, not huge. But I associate ASTs with compilers, and that would seem
overkill for this task.

Perhaps the OP should start by writing the specifications of his/her
syntax, then it might become clearer which approach is best.

--
Bart
Jun 27 '08 #10
