Sequence matching exercise

Andy Green

Emphasis is on efficiancy and speed this is the excercise:
The program should monitor a possibly infinite stream of characters
from the keyboard (standard input). If it detects the sequence "aaa"
it outputs a "0". If it detects the sequence "aba" it outputs a "1".
DO NOT detect sequences within sequences. The program should exit
cleanly when it detects an End Of Input. For example:

The following sequence aababaaabaaa<En d Of Input> would produce the
following result: 100
While the following sequence aaababaaaabbaba ba<End Of Input> would
produce the following result: 0101

Any takers?

Nov 14 '05 #1

Subscribe Reply

1941

Martin Ambuhl

Andy Green wrote:

Emphasis is on efficiancy and speed this is the excercise:
The program should monitor a possibly infinite stream of characters
from the keyboard (standard input). If it detects the sequence "aaa"
it outputs a "0". If it detects the sequence "aba" it outputs a "1".
DO NOT detect sequences within sequences. The program should exit
cleanly when it detects an End Of Input. For example:

The following sequence aababaaabaaa<En d Of Input> would produce the
following result: 100
While the following sequence aaababaaaabbaba ba<End Of Input> would
produce the following result: 0101

Any takers?

The requirement that you monitor stdin suggests that you think
unbuffered input on a character-by-character basis is available. This
suggests that you have a specific implementation in mind and makes the
question off-topic. If, on the other hand, the challenge can be
rewritten to examine strings, wherever they came from, then this is a
simple algorithm question and is off-topic.

Perhaps you can explain what content your question has that makes it a
question about C. Or you could do your own homework.

Nov 14 '05 #2

Eric Sosman

Andy Green wrote:

Emphasis is on efficiancy and speed this is the excercise:
The program should monitor a possibly infinite stream of characters
from the keyboard (standard input). If it detects the sequence "aaa"
it outputs a "0". If it detects the sequence "aba" it outputs a "1".
DO NOT detect sequences within sequences. The program should exit
cleanly when it detects an End Of Input. For example:

The following sequence aababaaabaaa<En d Of Input> would produce the
following result: 100
While the following sequence aaababaaaabbaba ba<End Of Input> would
produce the following result: 0101

Any takers?

The C language Standard has nothing to say about speed
and "efficiancy ," but I think this program will be hard to
beat:

int main(void) {
return 0;
}

You may be wondering why this program works as required, and
your professor may wonder the same thing. The answer lies in
a close reading of the homework statement:

- Output is only required *if* "aaa" or "aba" is
detected. This program detects neither (nowhere
does the homework assignment state a requirement
that anything be detected), and is thus relieved
of any responsibility to produce output.

- The program is expressly forbidden to detect
"sequences within sequences." The stream of
input characters is a sequence (note the use of
the term in the two examples), so the program is
forbidden to detect anything in its input.

- The program is required to "monitor" its input,
but the verb "monitor" is not defined by the C
language Standard, nor by the homework assignment.
Therefore we are free to define it in the manner
we find most convenient. The sixth definition
given by http://www.yourdictionary.com for the
transitive verb "monitor" is "to direct," and
the program above "directs" the input to the
highly efficient bit bucket.

- Actually, the program is not required to do anything
at all with or to the input. The homework assignment
says "The program SHOULD ..." (emphasis mine), and
"should" is merely a hint. A requirement is always
properly expressed with "shall" (and a prohibition
with "shall not"), as described in Section 4 of the
C language Standard.

I hope this helps you get the grade you deserve.

--
Er*********@sun .com

Nov 14 '05 #3

Thomas Matthews

Andy Green wrote:

Emphasis is on efficiancy and speed this is the excercise:
The program should monitor a possibly infinite stream of characters
from the keyboard (standard input). If it detects the sequence "aaa"
it outputs a "0". If it detects the sequence "aba" it outputs a "1".
DO NOT detect sequences within sequences. The program should exit
cleanly when it detects an End Of Input. For example:

The following sequence aababaaabaaa<En d Of Input> would produce the
following result: 100
While the following sequence aaababaaaabbaba ba<End Of Input> would
produce the following result: 0101

Any takers?

This falls more under lexing and parsing. IMO, the natural
course of action is to create a language for this assignment.
Something like:
zero_sequence ::= a a a

one_sequance ::= a b a

Run this through a program such as lex or flex to generate
a parse table. Then write a program to support the parse
table. Keep these pieces, as you will need them to do
your next homework assignments.

Or perhaps a simple state diagram would help you:

b +---+
+------> | 4 | --+
| +---+ |
| |
+---+ a +---+ a +---+ |
O -> | 1 | ---> | 2 | ---> | 3 | |
+---+ +---+ +---+ |
^ | | |
| v v |
+---------------------------+

State 1: stay in this state until an 'a' is detected.

State 2: if an 'a' is detected, transition to state 3.
if a 'b' is detected, transition to state 4.
Otherwise transition to state 1.

State 3: if an 'a' is detected, print '0'.
Transition in all cases to state 1.

State 4: if an 'a' is detected, print '1'.
Transition in all cases to state 1.

If an EOF is detected in any state, the program should
exit (i.e. state 5).

Now, code it up.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.l earn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Nov 14 '05 #4

Nick Austin

On 17 Jun 2004 09:59:23 -0700, an*******@opton line.net (Andy Green)
wrote:

Emphasis is on efficiancy and speed this is the excercise:
The program should monitor a possibly infinite stream of characters
from the keyboard (standard input). If it detects the sequence "aaa"
it outputs a "0". If it detects the sequence "aba" it outputs a "1".
DO NOT detect sequences within sequences. The program should exit
cleanly when it detects an End Of Input. For example:

Use a state machine. I was able to complete the exercise with just
four states.

I created two tables to handle the 'a' and 'b' cases separately. It
would be faster to use a sparse 2 dimensional array, indexed by the
current state and the character read from the stream.

This is the code to handle detection of "aaa" in the input stream:

#include <stdio.h>

static void nothing( void );
static void print_0( void );
static void print_1( void );

typedef struct
{
int new_state;
void ( *action )( void );
} TABLE;

static const TABLE table_a[] =
{
/* 0 */ { 1, nothing },
/* 1 */ { 2, nothing },
/* 2 */ { 0, print_0 },
};

static const TABLE table_b[] =
{
/* 0 */ { 0, nothing },
/* 1 */ { 0, nothing },
/* 2 */ { 0, nothing },
};

int main( void )
{
int state = 0;
int ch;

while ( ch = getc( stdin ), ch != EOF )
{
/* Do actions defined by table */

switch ( ch )
{
case 'a':
table_a[ state ].action();
state = table_a[ state ].new_state ;
break;
case 'b':
table_b[ state ].action();
state = table_b[ state ].new_state ;
break;
default:
state = 0;
}
}
printf( "\n" );

return 0;
}

static void nothing( void )
{
}

static void print_0( void )
{
printf( "0" );
}

static void print_1( void )
{
printf( "1" );
}

Nick.

Nov 14 '05 #5

Andy Green

Thomas Matthews <Th************ *************** *@sbcglobal.net > wrote in message news:<oM******* ********@newssv r16.news.prodig y.com>...

Andy Green wrote:
Emphasis is on efficiancy and speed this is the excercise:
The program should monitor a possibly infinite stream of characters
from the keyboard (standard input). If it detects the sequence "aaa"
it outputs a "0". If it detects the sequence "aba" it outputs a "1".
DO NOT detect sequences within sequences. The program should exit
cleanly when it detects an End Of Input. For example:

The following sequence aababaaabaaa<En d Of Input> would produce the
following result: 100
While the following sequence aaababaaaabbaba ba<End Of Input> would
produce the following result: 0101

Any takers?

This falls more under lexing and parsing. IMO, the natural
course of action is to create a language for this assignment.
Something like:
zero_sequence ::= a a a

one_sequance ::= a b a

Run this through a program such as lex or flex to generate
a parse table. Then write a program to support the parse
table. Keep these pieces, as you will need them to do
your next homework assignments.

Or perhaps a simple state diagram would help you:

b +---+
+------> | 4 | --+
| +---+ |
| |
+---+ a +---+ a +---+ |
O -> | 1 | ---> | 2 | ---> | 3 | |
+---+ +---+ +---+ |
^ | | |
| v v |
+---------------------------+

State 1: stay in this state until an 'a' is detected.

State 2: if an 'a' is detected, transition to state 3.
if a 'b' is detected, transition to state 4.
Otherwise transition to state 1.

State 3: if an 'a' is detected, print '0'.
Transition in all cases to state 1.

State 4: if an 'a' is detected, print '1'.
Transition in all cases to state 1.

If an EOF is detected in any state, the program should
exit (i.e. state 5).

Now, code it up.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.l earn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Its not as complicted as that. It is not a homework assignment, one of
my guys submitted this as one of the questions we ask our applicants.
I'm wondering how long it should take. 15 minutes is the suggested
time. What types of answers I'd get. No credits for this one..

Nov 14 '05 #6

Russell Hanneken

Thomas Matthews wrote:

State 1: stay in this state until an 'a' is detected.

State 2: if an 'a' is detected, transition to state 3.
if a 'b' is detected, transition to state 4.
Otherwise transition to state 1.

State 3: if an 'a' is detected, print '0'.
Transition in all cases to state 1.

State 4: if an 'a' is detected, print '1'.
Transition in all cases to state 1.

If an EOF is detected in any state, the program should
exit (i.e. state 5).

I don't think this design would do the right thing. As I understand the
rules laid out by the original poster, a sequence like

aabaaa

Should lead to a 1 being printed (because of aba). But I think your design
would lead to a 0 being printed (aab is discarded, and then aaa causes the 0
to be printed).

--
Russell Hanneken
eu*******@cbobk .pbz
Use ROT13 to decode my email address.

Nov 14 '05 #7

josh

Andy Green wrote:

Emphasis is on efficiancy and speed this is the excercise:
The program should monitor a possibly infinite stream of characters
from the keyboard (standard input). If it detects the sequence "aaa"
it outputs a "0". If it detects the sequence "aba" it outputs a "1".
DO NOT detect sequences within sequences. The program should exit
cleanly when it detects an End Of Input. For example:

The following sequence aababaaabaaa<En d Of Input> would produce the
following result: 100
While the following sequence aaababaaaabbaba ba<End Of Input> would
produce the following result: 0101

Any takers?

Well, I'd think the most straightforward implementation would be
something like:

#include <stdio.h>

char context[3];
int cp;

void clear_context(v oid)
{
cp = context[0] = context[1] = context[3] = 0;
}

char lookbehind(int distance)
{
int i = (cp - distance + 3) % 3;
return context[i];
}

int main(void)
{
clear_context() ;
while (!feof(stdin))
{
context[cp] = fgetc(stdin);
if (context[cp] == 'a' && lookbehind(2) == 'a')
{
if (lookbehind(1) == 'b')
{
fputc('1', stdout);
clear_context() ;
}
else if (lookbehind(1) == 'a')
{
fputc('0', stdout);
clear_context() ;
}
}
cp = (cp+1) % 3;
}
}

But I bet fgetc returns EOF at the end of file, which should be checked
instead of using feof. It doesn't really make a difference here anyway.

It might be slightly better to use a context array of size 4, so you can
do & instead of %, but your speed is really gonna be limited by the IO
anyway.

-josh

Nov 14 '05 #8

"Andy Green" <an*******@opto nline.net> wrote in message
news:a7******** *************** ***@posting.goo gle.com...

Its not as complicted as that. It is not a homework assignment, one of
my guys submitted this as one of the questions we ask our applicants.
I'm wondering how long it should take. 15 minutes is the suggested
time. What types of answers I'd get. No credits for this one..

Depends on what level you want the answer. I would give my answer in 1
minute and it would be verbal :
"I'll would take the next three characters and compare them to two
posibilities, if match is found print the output and compare again and so
on,
else discard first character , fetch one annd compare again and so on.
If fetching fails the program would stop."
I think you are better of comparing how people perform. How fast one solves
this kind of quite trivial problem, doesn't actually tell that a person can
perform more complicated tasks efficiently. I mean I don't think it matters
wheter this one takes 1 , 5 or 15 minutes. Some people like to think first
carefully before giving the answer, others (like me) just start to give the
anser and figure out the troubles as they go:). Of course if this one takes
two hours, for get the person in question.
-Jyrki
PS. What this has to do with C-language?

Nov 14 '05 #9

Andy Green

josh <sm************ *************@y ahoo.com.NOSPAM > wrote in message news:<vJuAc.677 58$0y.559@attbi _s03>...

Andy Green wrote:
Emphasis is on efficiancy and speed this is the excercise:
The program should monitor a possibly infinite stream of characters
from the keyboard (standard input). If it detects the sequence "aaa"
it outputs a "0". If it detects the sequence "aba" it outputs a "1".
DO NOT detect sequences within sequences. The program should exit
cleanly when it detects an End Of Input. For example:

The following sequence aababaaabaaa<En d Of Input> would produce the
following result: 100
While the following sequence aaababaaaabbaba ba<End Of Input> would
produce the following result: 0101

Any takers?

Well, I'd think the most straightforward implementation would be
something like:

#include <stdio.h>

char context[3];
int cp;

void clear_context(v oid)
{
cp = context[0] = context[1] = context[3] = 0;
}

char lookbehind(int distance)
{
int i = (cp - distance + 3) % 3;
return context[i];
}

int main(void)
{
clear_context() ;
while (!feof(stdin))
{
context[cp] = fgetc(stdin);
if (context[cp] == 'a' && lookbehind(2) == 'a')
{
if (lookbehind(1) == 'b')
{
fputc('1', stdout);
clear_context() ;
}
else if (lookbehind(1) == 'a')
{
fputc('0', stdout);
clear_context() ;
}
}
cp = (cp+1) % 3;
}
}

But I bet fgetc returns EOF at the end of file, which should be checked
instead of using feof. It doesn't really make a difference here anyway.

It might be slightly better to use a context array of size 4, so you can
do & instead of %, but your speed is really gonna be limited by the IO
anyway.

-josh

Thanks to all that responded!!. Sorry if I ruffled some feathers. If I
posted this to the wrong group I apologize. Josh and Nick thanks for
taking a stab at it, troopers!. getchar(), getc(stdin) would be o.k.
as well. Tom thanks for the laugh. We are not going to use this or if
we do we have to re-write it, a lot of people have trouble
interpreting the question.
We might reword it to allow the candidate to use ANY language. The guy
who came up with it says a one line shell script should do.. :). He
might be kidding me I can't tell... - Thanks again all - Andy G.

Nov 14 '05 #10

Similar topics

5483

regexp: not matching a sequence of characters

by: | last post by:

Hi, I'm fairly new to regular expressions, and this may be a rather dumb question, but so far I haven't found the answer in any tutorial or reference yet... If I have f.i. the string "The {{{{power of {{{{regular expressions}}}} comes from}}}} the ability to include alternatives and repetitions in the pattern." from which I want to extract chunks starting with "{{{{" and ending with "}}}}".

PHP

4314

sequence to FULLY test AVL tree implementation???

by: Nobody | last post by:

I recently wrote an AVL tree class, I think its mostly working, but I want to make sure. Assuming it uses integers, is there an insert/delete sequence that is known to exercise every possible case of inserting and deleting?

C / C++

2804

Sequence and/or pattern matching

by: Séb | last post by:

Hi everyone, I'm relatively new to python and I want to write a piece of code who do the following work for data mining purpose : 1) I have a list of connexion between some computers. This list has this format : Ip A Date Ip B .... ... ...

Python

23754

How to reset AutoNumber sequence?

by: deko | last post by:

Is there a way to reset the AutoNumber sequence? I have several tables that use the AutoNumber field as the Primary Key, and I'd like to somehow do an Import/Export that will make remove the breaks in the sequence. A few breaks in sequence is not a big deal, but I have one table with under 200 records, but the last AutoNumber PK ID field is over 1500 - due to a lot of edits....

Microsoft Access / VBA

3245

Sequence question

by: Eric E | last post by:

Hi, I have a question about sequences. I need a field to have values with no holes in the sequence. However, the values do not need to be in order. My users will draw a number or numbers from the sequence and write to the field. Sometimes, however, these sequence numbers will be discarded (after a transaction is complete), and thus available for use. During the transaction, however, any drawn numbers need to be unavailable. I would...

PostgreSQL Database

2788

Match First Sequence in Regular Expression?

by: Roger L. Cauvin | last post by:

Say I have some string that begins with an arbitrary sequence of characters and then alternates repeating the letters 'a' and 'b' any number of times, e.g. "xyz123aaabbaabbbbababbbbaaabb" I'm looking for a regular expression that matches the first, and only the first, sequence of the letter 'a', and only if the length of the sequence is exactly 3.

Python

3752

Matching nonempty sequence of '^' scanf with the "%[" specifier

by: regis | last post by:

Greetings, about scanf matching nonempty sequences using the "%" matches a nonempty sequence of anything except '-' "%" matches a nonempty sequence of anything except ']" matches a nonempty sequence of anything except ']' "%" matches a nonempty sequence of anything except '^' "%" matches a nonempty sequence of '-' "%" matches a nonempty sequence of ']" matches a nonempty sequence of ']' ....but how to match a nonempty sequence of '^'...

C / C++

5629

Find all occurences of a bit sequence in a file

by: Daneel | last post by:

Hello! I'm looking for an algorithm which finds all occurences of a bit sequence (e.g., "0001") in a file. This sequence can start at any bit in the file (it is not byte aligned). I have some ideas of how to approach the problem (1) reading file into unsigned char buffer, 2) defining bit structure, 3) comparing the first 4 bits of the buffer with the bit structure, 4) shifting the char buffer one left, 5) repeat at step 3)) but I'm...

C / C++

3025

Getting Junk Values for Sequence Container Class

by: davydany | last post by:

Hey guys...a n00b Here for this site. I'm making a sequence class for my C++ class. And The thing is in the array that I have, lets say i put in {13,17,38,18}, when i see the current values for the array data from 0 to3, I get this {13, JUNK VALUE, 17,38, 18} and JUNK VALUE is like 1.8e831 or something like that. This happens when I use the attach() function and use the current() function to display the values at data I really want to...

C / C++

8142

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...

Windows Server

8642

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...

C / C++

8294

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...

Windows Server

8444

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...

General

6093

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...

Microsoft Access / VBA

5549

Couldn’t get equations in html when convert word .docx file to html file in C#.

by: conductexam | last post by:

I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...

C# / C Sharp

4058

Trying to create a lan-to-lan vpn between two differents networks

by: TSSRALBI | last post by:

Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...

Networking - Hardware / Configuration

4138

Windows Forms - .Net 8.0

by: adsilva | last post by:

A Windows Forms form does not have the event Unload, like VB6. What one acts like?

Visual Basic .NET

1758

How to add payments to a PHP MySQL app.

by: muto222 | last post by:

How can i add a mobile payment intergratation into php mysql website.

PHP