String Generation using Mask Parsing

Hello,

I am new to C and I am trying to write a few small applications to get
some hands-on practice. I am trying to write a random string
generator, based on a masked input. For example, given the string
"AAANN" it would return a string containing 3 alphanumeric characters
followed by 2 digits. This part I have managed :)

I would now like to add some complexity to this, such as repetitions
and grouping. For example, I'd like to have masks similar to:
"AAN*10", which would return two alphanumeric chars followed by a
sequence of 10 numeric characters. However, the characters could be
grouped, such as: "A(AN)*10", which would now return an alphanumeric
character followed by a sequence of ten alternating alphanumeric/
numeric characters.

I'm not really sure where to start with this next step as I have
minimal experience. Any pointers in the right direction, or sample
code, would be appreciated.

Thanks in advance,
James.
Sep 21 '08 #1
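
For reference, the flat case described in the post above (one output character per mask character) might look like the sketch below. James's own code was not posted, so this is only an illustration: the function name generate_flat, the exact character sets chosen for 'A' and 'N', and the use of rand() are all assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write one random character per mask character.
 * 'A' yields an alphanumeric character, 'N' yields a digit.
 * Returns 0 on success, -1 on an unknown mask character or if out[]
 * is too small to hold the result plus the terminator. */
int generate_flat(const char *mask, char *out, size_t outsize)
{
    static const char alnum[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    static const char digits[] = "0123456789";
    size_t i, len;

    assert(mask != NULL && out != NULL);
    len = strlen(mask);
    if (len + 1 > outsize)
        return -1;

    for (i = 0; i < len; i++) {
        switch (mask[i]) {
        case 'A':
            out[i] = alnum[rand() % (sizeof alnum - 1)];
            break;
        case 'N':
            out[i] = digits[rand() % (sizeof digits - 1)];
            break;
        default:
            return -1;          /* not a mask character we know */
        }
    }
    out[len] = '\0';
    return 0;
}
```

Seed the generator once with srand() and then call, for example, generate_flat("AAANN", buf, sizeof buf). Note that rand() % n is slightly biased; that is usually fine for test data but not for anything security-related.
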
James Arnold wrote:
>
> I am new to C and I am trying to write a few small applications
> to get some hands-on practice. I am trying to write a random
> string generator, based on a masked input. For example, given
> the string "AAANN" it would return a string containing 3
> alphanumeric characters followed by 2 digits. This part I have
> managed :)
>
> I would now like to add some complexity to this, such as
> repetitions and grouping. For example, I'd like to have masks
> similar to: "AAN*10", which would return two alphanumeric chars
> followed by a sequence of 10 numeric characters. However, the
> characters could be grouped, such as: "A(AN)*10", which would
> now return an alphanumeric character followed by a sequence of
> ten alternating alphanumeric/numeric characters.
>
> I'm not really sure where to start with this next step as I
> have minimal experience. Any pointers in the right direction,
> or sample code, would be appreciated.

I think a study of regular expressions, as implemented in Unix and
Linux, would be instructive here.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Sep 21 '08 #2

"James Arnold" <ja****@gmail.com> wrote in message
> I am new to C and I am trying to write a few small applications to get
> some hands-on practice. I am trying to write a random string
> generator, based on a masked input. For example, given the string
> "AAANN" it would return a string containing 3 alphanumeric characters
> followed by 2 digits. This part I have managed :)
>
> I would now like to add some complexity to this, such as repetitions
> and grouping. For example, I'd like to have masks similar to:
> "AAN*10", which would return two alphanumeric chars followed by a
> sequence of 10 numeric characters. However, the characters could be
> grouped, such as: "A(AN)*10", which would now return an alphanumeric
> character followed by a sequence of ten alternating alphanumeric/
> numeric characters.
>
> I'm not really sure where to start with this next step as I have
> minimal experience. Any pointers in the right direction, or sample
> code, would be appreciated.

It turns into a grammar parsing problem.

To understand parsing generally, you could do worse than to check out my
MinBasic project, on my website.

Basically you divide the input stream into "tokens". Here that step is simple
because most tokens are a single char - the exception is the embedded
integers. Then you keep one token of lookahead, and take actions. When you
hit a '(' you need to push a stack of strings. When you hit the matching ')'
you pop it, and look for the '*' and integer to multiply.
You don't actually need to manage the stack yourself. You have a function
char *alphanum() which returns a sequence of As and Ns. When you hit the '('
you call alphanum() recursively, duplicate the string the required number of
times, and append it to the mother alphanum().

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Sep 21 '08 #3
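
The tokenising step described in the reply above (single-character tokens, plus embedded integers that span several characters) could be sketched roughly as follows. The names next_token, struct token and the TOK_* constants are hypothetical, and the sketch makes no attempt to detect overflow in very large repeat counts.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for the mask language discussed above. */
enum token_kind { TOK_ALNUM, TOK_DIGIT, TOK_LPAREN, TOK_RPAREN,
                  TOK_STAR, TOK_INT, TOK_END, TOK_ERROR };

struct token {
    enum token_kind kind;
    int value;                  /* only meaningful for TOK_INT */
};

/* Read one token starting at *pp and advance *pp past it.
 * At the end of the string TOK_END is returned and *pp is unchanged. */
struct token next_token(const char **pp)
{
    struct token t = { TOK_ERROR, 0 };
    const char *p;

    assert(pp != NULL && *pp != NULL);
    p = *pp;

    if (*p == '\0') {
        t.kind = TOK_END;
        return t;
    }
    if (isdigit((unsigned char)*p)) {
        /* the only multi-character token: a decimal integer */
        t.kind = TOK_INT;
        while (isdigit((unsigned char)*p)) {
            t.value = t.value * 10 + (*p - '0');
            p++;
        }
        *pp = p;
        return t;
    }
    switch (*p) {
    case 'A': t.kind = TOK_ALNUM;  break;
    case 'N': t.kind = TOK_DIGIT;  break;
    case '(': t.kind = TOK_LPAREN; break;
    case ')': t.kind = TOK_RPAREN; break;
    case '*': t.kind = TOK_STAR;   break;
    default:  return t;         /* TOK_ERROR, position unchanged */
    }
    *pp = p + 1;
    return t;
}
```

One token of lookahead then just means calling next_token() once up front and keeping the result in a variable until the parser has decided what to do with it.
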
James Arnold wrote:
> I would now like to add some complexity to this, such as repetitions
> and grouping. For example, I'd like to have masks similar to:
> "AAN*10", which would return two alphanumeric chars followed by a
> sequence of 10 numeric characters. However, the characters could be
> grouped, such as: "A(AN)*10", which would now return an alphanumeric
> character followed by a sequence of ten alternating alphanumeric/
> numeric characters.
>
> I'm not really sure where to start with this next step as I have
> minimal experience. Any pointers in the right direction, or sample
> code, would be appreciated.
>
> Thanks in advance,
> James.

It's definitely worth it to have a look at how regular expressions work.
I might also suggest that you alter that syntax to be a little less
ambiguous, like putting the numeric values in braces {} so you can more
easily figure out where the number begins and ends.

After that, my suggestion would be to divide the program into two parts.
The first one would input a string like "AA(AN){10}" and expand it to
something like "AAANANANANANANANANANAN", which is really what you're
looking at. The second part of the program is essentially what you've
already written; it inputs that string and outputs the random one.

Hope this helps.

Trent
Sep 21 '08 #4
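
The first of the two parts suggested above (expanding a mask with brace counts into a flat mask) might be sketched like this. It assumes the brace syntax from that reply, so "AA(AN){10}" rather than "AA(AN)*10", writes into a caller-supplied buffer, and uses the hypothetical names expand_seq and expand_mask. Very large counts are not checked for overflow.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: expand the items starting at *pp into out[],
 * stopping at ')' or at the end of the mask. Returns the number of
 * characters written, or -1 on a syntax error or lack of room. */
static long expand_seq(const char **pp, char *out, size_t room)
{
    size_t used = 0;

    while (**pp != '\0' && **pp != ')') {
        size_t start = used, unit, reps = 1, r;

        if (**pp == 'A' || **pp == 'N') {
            if (used + 1 > room) return -1;
            out[used++] = *(*pp)++;
        } else if (**pp == '(') {
            long n;
            (*pp)++;
            n = expand_seq(pp, out + used, room - used);  /* nested group */
            if (n < 0 || **pp != ')') return -1;
            (*pp)++;
            used += (size_t)n;
        } else {
            return -1;
        }

        if (**pp == '{') {                  /* optional repeat count */
            (*pp)++;
            if (!isdigit((unsigned char)**pp)) return -1;
            reps = 0;
            while (isdigit((unsigned char)**pp))
                reps = reps * 10 + (size_t)(*(*pp)++ - '0');
            if (**pp != '}') return -1;
            (*pp)++;
        }

        unit = used - start;                /* length of the item just written */
        if (reps == 0 || unit == 0) { used = start; continue; }
        for (r = 1; r < reps; r++) {        /* one copy is already in place */
            if (used + unit > room) return -1;
            memcpy(out + used, out + start, unit);
            used += unit;
        }
    }
    return (long)used;
}

/* Expand mask into out[]; returns 0 on success, -1 on error. */
int expand_mask(const char *mask, char *out, size_t outsize)
{
    const char *p = mask;
    long n;

    assert(mask != NULL && out != NULL && outsize > 0);
    n = expand_seq(&p, out, outsize - 1);
    if (n < 0 || *p != '\0') return -1;     /* syntax error or stray ')' */
    out[n] = '\0';
    return 0;
}
```

The flat string this produces can then be handed to whatever routine already handles masks such as "AAANN".
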
> I think a study of regular expressions, as implemented in Unix and
> Linux, would be instructive here.

I am already familiar with regular expressions, but I was under the
impression they can't be used to match braces when nested? So if for
example I wanted to do A(A(N)*10)*5, regular expressions wouldn't be
appropriate?

> It turns into a grammar parsing problem.

I have been looking at Lex/Yacc (well, Flex/Bison) and written a
grammar to handle what I would like. I've compiled it and managed to
get it to output the detected tokens, but it definitely seemed to be
overkill for such a small program. Currently I'm just iterating
through a string and switch()'ing on each character, which covers most
of the functionality I'd like. I figured there must be a way of
tracking nested depth and calling the parse routine recursively for
each matched group?

> After that, my suggestion would be to divide the program into two parts.
> The first one would input a string like "AA(AN){10}" and expand it to
> something like "AAANANANANANANANANANAN", which is really what you're
> looking at.

This is also something I had considered, but I want to be able to use
a range for a specified repetition, e.g. repeat between 5 to 10 times.
This is fine, but if I want to generate 50 different outcomes the full
mask would need to be expanded each time, rather than just the
repeated bit. Surely that is not going to be very efficient? :)

Thanks for the replies!
Sep 21 '08 #5

"James Arnold" <ja****@gmail.com> wrote in message
> This is also something I had considered, but I want to be able to use
> a range for a specified repetition, e.g. repeat between 5 to 10 times.
> This is fine, but if I want to generate 50 different outcomes the full
> mask would need to be expanded each time, rather than just the
> repeated bit. Surely that is not going to be very efficient? :)

Don't worry too much about efficiency.

Your code goes (skeleton)

char *alphanum()
{
    char *answer;

    set answer to the empty string;

    while (1)
    {
        /* gettoken() returns the lookahead token without consuming it;
           match() checks the token and moves on to the next one */
        switch (gettoken())
        {
        case 'A':
            match('A');
            concatenate an 'A';
            break;
        case 'N':
            match('N');
            concatenate an 'N';
            break;
        case '(':
            match('(');
            sub = alphanum();
            match(')');
            match('*');
            repetitions = integer();
            for (i = 0; i < repetitions; i++)
                concatenate sub;
            break;
        case ')':
            return answer;
        case 0:
            return answer;
        }
    }
}

It won't run superfast, but it won't crawl.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Sep 21 '08 #6
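
A compilable take on the skeleton above might look like the following. It is a sketch under a few assumptions, not Malcolm's actual code: alphanum() takes a cursor into the mask instead of reading from a global token stream, '*' is also accepted after a single letter so that "AAN*10" works, and struct buf, buf_append and expand are hypothetical helpers. Huge repeat counts are not guarded against.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Growable string; a hypothetical helper type for this sketch. */
struct buf { char *s; size_t len, cap; };

static int buf_append(struct buf *b, const char *src, size_t n)
{
    if (b->len + n + 1 > b->cap) {
        size_t cap = (b->len + n + 1) * 2;
        char *s = realloc(b->s, cap);
        if (s == NULL) return -1;
        b->s = s;
        b->cap = cap;
    }
    memcpy(b->s + b->len, src, n);
    b->len += n;
    b->s[b->len] = '\0';
    return 0;
}

/* Parse items up to ')' or the end of the mask. Returns a malloc'd
 * string of 'A's and 'N's (caller frees), or NULL on error. */
char *alphanum(const char **pp)
{
    struct buf answer = { NULL, 0, 0 };
    int ok = (buf_append(&answer, "", 0) == 0);     /* start with "" */

    while (ok && **pp != '\0' && **pp != ')') {
        char one[2] = { '\0', '\0' };
        char *sub = NULL;
        const char *unit = one;
        long reps = 1;

        if (**pp == 'A' || **pp == 'N') {
            one[0] = *(*pp)++;
        } else if (**pp == '(') {
            (*pp)++;
            unit = sub = alphanum(pp);              /* recurse into the group */
            if (sub != NULL && **pp == ')')
                (*pp)++;
            else
                ok = 0;
        } else {
            ok = 0;
        }
        if (ok && **pp == '*') {                    /* optional "*count" */
            (*pp)++;
            if (isdigit((unsigned char)**pp)) {
                char *end;
                reps = strtol(*pp, &end, 10);
                *pp = end;
            } else {
                ok = 0;
            }
        }
        while (ok && reps-- > 0)
            ok = (buf_append(&answer, unit, strlen(unit)) == 0);
        free(sub);
    }
    if (!ok) {
        free(answer.s);
        return NULL;
    }
    return answer.s;
}

/* Expand a whole mask; rejects a stray ')' at the top level. */
char *expand(const char *mask)
{
    const char *p = mask;
    char *s;

    assert(mask != NULL);
    s = alphanum(&p);
    if (s != NULL && *p != '\0') {
        free(s);
        s = NULL;
    }
    return s;
}
```

With this, expand("A(AN)*10") gives back a 21-character flat mask that starts "AANAN", and nested masks such as "A(A(N)*10)*5" work because each '(' simply triggers another call to alphanum().
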

[You or your news reader is not adding attribution lines. This is not
a good idea and you should have a look to see if you can fix it.]

James Arnold <ja****@gmail.com> writes:
>> I think a study of regular expressions, as implemented in Unix and
>> Linux, would be instructive here.
>
> I am already familiar with regular expressions, but I was under the
> impression they can't be used to match braces when nested? So if for
> example I wanted to do A(A(N)*10)*5, regular expressions wouldn't be
> appropriate?

I think the suggestion was only that you could look at REs for how to
write your masks. You are right that REs won't be any good as a way of
implementing this. For example, some REs use (abc){3,6} for 3 to 6
repeats of "abc" and you might one day want things like [aeiou] rather
than just A and N indicators.

>> It turns into a grammar parsing problem.
>
> I have been looking at Lex/Yacc (well, Flex/Bison) and written a
> grammar to handle what I would like. I've compiled it and managed to
> get it to output the detected tokens, but it definitely seemed to be
> overkill for such a small program.

Agreed. You have at most brackets and a couple of operators. No
need for lex and yacc.

> Currently I'm just iterating
> through a string and switch()'ing on each character, which covers most
> of the functionality I'd like. I figured there must be a way of
> tracking nested depth and calling the parse routine recursively for
> each matched group?

That's roughly what I'd do. In fact, I'd probably make what you call
the parse routine do the actual generation as well. The parsing will
be so simple that actually storing the parse in some form is probably
not needed.

>> After that, my suggestion would be to divide the program into two parts.
>> The first one would input a string like "AA(AN){10}" and expand it to
>> something like "AAANANANANANANANANANAN", which is really what you're
>> looking at.
>
> This is also something I had considered, but I want to be able to use
> a range for a specified repetition, e.g. repeat between 5 to 10 times.
> This is fine, but if I want to generate 50 different outcomes the full
> mask would need to be expanded each time, rather than just the
> repeated bit. Surely that is not going to be very efficient? :)

I agree. If you parse and generate on the fly, there is no need for
either an intermediate mask or a stored parse tree.

--
Ben.
Sep 21 '08 #7
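
Parsing and generating in one pass, with the range repeats asked about earlier in the thread, could be sketched as below. The RE-style "{n}" and "{m,n}" syntax, the cap of 10000 on a repeat count, and all the names (struct out, emit, repeat_count, skip_item, gen_item, gen_seq, generate) are assumptions made for this illustration. Since the count is written after the item, the sketch first steps over the item to read the count, then re-parses the item once per repetition, so no expanded mask or parse tree is kept. A group that ends up repeated zero times is skipped without its contents being checked.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Fixed-size output buffer; a hypothetical helper type for this sketch. */
struct out { char *s; size_t len, cap; int overflow; };

static void emit(struct out *o, char c)
{
    if (o->len + 1 < o->cap) o->s[o->len++] = c;    /* keep room for '\0' */
    else o->overflow = 1;
}

static char random_char(char kind)
{
    static const char alnum[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    return kind == 'N' ? (char)('0' + rand() % 10)
                       : alnum[rand() % (int)(sizeof alnum - 1)];
}

/* Read an optional "{n}" or "{m,n}" and pick a count in that range.
 * Returns 1 if no brace follows, -1 on a syntax error. */
static int repeat_count(const char **pp)
{
    long lo, hi;
    char *end;

    if (**pp != '{') return 1;
    if (!isdigit((unsigned char)(*pp)[1])) return -1;
    lo = hi = strtol(*pp + 1, &end, 10);
    if (*end == ',') {
        if (!isdigit((unsigned char)end[1])) return -1;
        hi = strtol(end + 1, &end, 10);
    }
    if (*end != '}' || hi < lo || hi > 10000) return -1;  /* arbitrary cap */
    *pp = end + 1;
    return (int)(lo + rand() % (hi - lo + 1));
}

/* Step over one letter or one whole (...) group without generating. */
static int skip_item(const char **pp)
{
    int depth = 0;

    do {
        if (**pp == '\0') return -1;                /* unbalanced '(' */
        if (**pp == '(') depth++;
        else if (**pp == ')') depth--;
        (*pp)++;
    } while (depth > 0);
    return 0;
}

static int gen_seq(const char **pp, struct out *o);

static int gen_item(const char **pp, struct out *o)
{
    const char *start = *pp;
    int n, i;

    if (*start != 'A' && *start != 'N' && *start != '(') return -1;
    if (skip_item(pp) != 0 || (n = repeat_count(pp)) < 0) return -1;
    for (i = 0; i < n; i++) {
        if (*start != '(') {
            emit(o, random_char(*start));
        } else {
            const char *body = start + 1;           /* re-parse the group each time */
            if (gen_seq(&body, o) != 0 || *body != ')') return -1;
        }
    }
    return 0;
}

static int gen_seq(const char **pp, struct out *o)
{
    while (**pp != '\0' && **pp != ')')
        if (gen_item(pp, o) != 0) return -1;
    return 0;
}

/* Returns the number of characters generated, or -1 on error. */
long generate(const char *mask, char *buf, size_t size)
{
    struct out o;
    const char *p = mask;

    assert(mask != NULL && buf != NULL && size > 0);
    o.s = buf; o.len = 0; o.cap = size; o.overflow = 0;
    if (gen_seq(&p, &o) != 0 || *p != '\0' || o.overflow) return -1;
    buf[o.len] = '\0';
    return (long)o.len;
}
```

Generating 50 outcomes is then just 50 calls to generate() with the same mask; a range nested inside a repeated group gets a fresh random count on every pass through the group.
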
