Parsing using RE?

Hello all

I have a huge string that I need to parse

Key <Delim1> Value <Delim2> Key <Delim1> Value <Delim2> Key <Delim1>
Value <Delim3>

Key <Delim1> Value <Delim2> Key <Delim1> Value <Delim2> Key <Delim1>
Value <Delim3>

repeat for a couple hundred thousand times

<Delim1> separates the Key from the Value within a pair
<Delim2> separates two different Key, Value pairs
<Delim3> separates records.

I need to get the Key Value pairs and populate a table with that
information.

Would the .NET regular expressions be worthwhile, and how would I go
about doing it in a clean, optimized fashion?
Thanks

-Ravi Singh

Nov 16 '05 #1
yes, definitely.

you'll need to write your own reg exp though

i'd recommend using an app called Expresso, a free reg exp tester/builder.
http://www.ultrapico.com/Expresso.htm

if all the delimiters are unique, definitely use a reg exp. otherwise, you'll be
looping "while (str.IndexOf("<Delim") >= 0) { ..." etc.
using a regular expression to find matches would be much quicker, and it returns
an array of matches (and fields)

if you get stuck, repost.

HTH
sam

Nov 16 '05 #2
string input = "Key <Delim1> Value <Delim2> Key <Delim1> Value <Delim2> "
    + "Key <Delim1> Value <Delim3>Key <Delim1> Value <Delim2> Key <Delim1> "
    + "Value <Delim2> Key <Delim1> Value <Delim3>";

Regex delim1 = new Regex("<Delim1>");
Regex delim2 = new Regex("<Delim2>");
Regex delim3 = new Regex("<Delim3>");

string[] rets3 = delim3.Split(input);
string[] rets2 = delim2.Split(String.Concat(rets3));
string[] rets1 = delim1.Split(String.Concat(rets2));

rets2 and rets1 are not what I expect them to be. =( Any ideas?

Thanks
-Ravi.

Nov 16 '05 #3
I got it :-)

Thanks

Nov 16 '05 #4
Could you post the solution so we can see it? It might help someone
else in the same situation some day.

Nov 16 '05 #5
Ravi,
In addition to the other comments.

You could use a while loop with Match.NextMatch.

Something like:

// needs: using System.Diagnostics; using System.Text.RegularExpressions;
// (?:;|) is a non-capturing group: an optional trailing ";"
string pattern = @"(?<key>\w+)=(?<value>\w+)(?:;|)";
string input = "a=1;b=2;c=3;d=4;e=5;";

Regex parser = new Regex(pattern, RegexOptions.Compiled);

Match match = parser.Match(input);
while (match.Success)
{
    Debug.WriteLine(match.Groups["key"], "key");
    Debug.WriteLine(match.Groups["value"], "value");
    match = match.NextMatch();
}

Here "=" is Delim1 and ";" is Delim2. Depending on how important Delim3 is, I
would consider using String.Substring to extract the input up to Delim3 and then
use the above code on that piece...
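
A minimal sketch of that idea applied to the original <Delim1>/<Delim2>/<Delim3> layout, assuming the delimiters are literal strings and keys are unique within a record. The class name RegexRecordParser, its Parse method and the sample input are made up for illustration. Records are cut at Delim3 with String.Split (instead of repeated Substring calls), and each record is then walked with Match.NextMatch:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexRecordParser
{
    // Placeholder delimiters taken from the question; swap in the real ones.
    const string Delim1 = "<Delim1>", Delim2 = "<Delim2>", Delim3 = "<Delim3>";

    // Key: shortest text before Delim1. Value: shortest text before Delim2 or
    // the end of the record. Surrounding whitespace stays outside both groups.
    static readonly Regex Pair = new Regex(
        @"\s*(?<key>.+?)\s*" + Regex.Escape(Delim1) +
        @"\s*(?<value>.*?)\s*(?:" + Regex.Escape(Delim2) + @"|$)",
        RegexOptions.Compiled | RegexOptions.Singleline);

    // Returns one dictionary per record (assumes keys do not repeat in a record).
    public static List<Dictionary<string, string>> Parse(string input)
    {
        var records = new List<Dictionary<string, string>>();
        string[] chunks = input.Split(new[] { Delim3 },
                                      StringSplitOptions.RemoveEmptyEntries);
        foreach (string chunk in chunks)
        {
            var record = new Dictionary<string, string>();
            for (Match m = Pair.Match(chunk); m.Success; m = m.NextMatch())
                record[m.Groups["key"].Value] = m.Groups["value"].Value;
            if (record.Count > 0) records.Add(record);
        }
        return records;
    }

    public static void Main()
    {
        string input = "name <Delim1> Ann <Delim2> age <Delim1> 30 <Delim3>" +
                       "name <Delim1> Bob <Delim2> age <Delim1> 41 <Delim3>";
        foreach (var record in Parse(input))
            Console.WriteLine("{0} is {1}", record["name"], record["age"]);
    }
}
```

Regex.Escape is there so that real delimiters containing characters such as "|" or "." are matched literally.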

Hope this helps
Jay

Nov 16 '05 #6
string input = "Key <Delim1> Value <Delim2> Key <Delim1> Value <Delim2> "
    + "Key <Delim1> Value <Delim2> Key <Delim1> Value <Delim2> Key <Delim1> "
    + "Value <Delim2> Key <Delim1> Value <Delim3>";

Regex delim1 = new Regex("<Delim1>");
Regex delim2 = new Regex("<Delim2>");
Regex delim3 = new Regex("<Delim3>");

string[] rets3 = delim3.Split(input);
string[] rets2 = delim2.Split(String.Concat(rets3));
string[] rets1 = delim1.Split(String.Concat(rets2));

There it is. I concatenate the pieces with String.Concat, though String.Join
might be more appropriate.

Thanks
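
One thing to watch with this approach: String.Concat glues every piece back into a single string before the next split, so with more than one record the boundaries found by the first split are lost, and the last pair of one record runs into the first pair of the next. A hedged sketch of an alternative that splits each piece on its own instead of concatenating. The NestedRegexSplit class, the SplitRecords method and the sample input are illustrative names, not from the thread:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NestedRegexSplit
{
    // Same placeholder delimiters as in the post above.
    static readonly Regex RecordSep = new Regex(Regex.Escape("<Delim3>"));
    static readonly Regex PairSep = new Regex(Regex.Escape("<Delim2>"));
    static readonly Regex KeyValueSep = new Regex(Regex.Escape("<Delim1>"));

    // Returns one list of key/value pairs per record, in input order.
    public static List<List<KeyValuePair<string, string>>> SplitRecords(string input)
    {
        var records = new List<List<KeyValuePair<string, string>>>();
        foreach (string record in RecordSep.Split(input))
        {
            var pairs = new List<KeyValuePair<string, string>>();
            foreach (string pair in PairSep.Split(record))
            {
                // At most two parts, so a value may itself contain Delim1.
                string[] kv = KeyValueSep.Split(pair, 2);
                if (kv.Length == 2)
                    pairs.Add(new KeyValuePair<string, string>(kv[0].Trim(), kv[1].Trim()));
            }
            // Skip empty trailing pieces such as the text after the last Delim3.
            if (pairs.Count > 0) records.Add(pairs);
        }
        return records;
    }

    public static void Main()
    {
        string input = "Key <Delim1> Value <Delim2> Key <Delim1> Value <Delim3>" +
                       "Key <Delim1> Value <Delim3>";
        int n = 0;
        foreach (var record in SplitRecords(input))
        {
            Console.WriteLine("record {0}:", n++);
            foreach (var kv in record)
                Console.WriteLine("  {0} = {1}", kv.Key, kv.Value);
        }
    }
}
```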

Nov 16 '05 #7
Ravi Singh (UCSD) wrote:

Would the .NET regular expressions be worth while and how would I go
about doing it in a clean optimized fashion.

RegEx? I'd use PERL :)

hjf
Nov 16 '05 #8
Why not use String.Split? It should be faster and easier to implement.

Nov 16 '05 #9
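
A rough sketch of the String.Split suggestion above, carried through to the "populate a table" part of the original question. It assumes literal delimiters and the String.Split overloads that take StringSplitOptions (.NET 2.0 and later). The SplitToTable class, its Parse method and the column names Record, Key and Value are hypothetical choices for illustration:

```csharp
using System;
using System.Data;

public static class SplitToTable
{
    // Builds a three-column table: record number, key, value.
    public static DataTable Parse(string input)
    {
        var table = new DataTable("Pairs");
        table.Columns.Add("Record", typeof(int));
        table.Columns.Add("Key", typeof(string));
        table.Columns.Add("Value", typeof(string));

        // Placeholder delimiters from the question.
        string[] recordSep = { "<Delim3>" };
        string[] pairSep = { "<Delim2>" };
        string[] keyValueSep = { "<Delim1>" };

        int recordNo = 0;
        foreach (string record in input.Split(recordSep, StringSplitOptions.RemoveEmptyEntries))
        {
            bool addedAny = false;
            foreach (string pair in record.Split(pairSep, StringSplitOptions.RemoveEmptyEntries))
            {
                // Limit to two parts so a value may contain Delim1.
                string[] kv = pair.Split(keyValueSep, 2, StringSplitOptions.None);
                if (kv.Length < 2) continue;   // no key/value separator: skip
                table.Rows.Add(recordNo, kv[0].Trim(), kv[1].Trim());
                addedAny = true;
            }
            if (addedAny) recordNo++;
        }
        return table;
    }

    public static void Main()
    {
        DataTable table = Parse("name <Delim1> Ann <Delim2> age <Delim1> 30 <Delim3>" +
                                "name <Delim1> Bob <Delim3>");
        foreach (DataRow row in table.Rows)
            Console.WriteLine("{0}: {1} = {2}", row["Record"], row["Key"], row["Value"]);
    }
}
```

A long, one-row-per-pair layout is used here because the thread does not say whether every record carries the same keys; if it does, one column per key would be the more natural table shape.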
Here's a little snippet I wrote to do this kind of thing with just 2
delimiters, one to separate the key-value pairs and another to split
apart each actual pair. Since both delimiters are arrays however, you
can specify any number of different delimiters, so in your case you may
have outerDelimiters == { "<Delim2>", "<Delim3>" } ... if I understand
correctly what it is you are after.

Though I haven't tested it, I'm pretty sure the String.Split method
will be much faster than using Regular Expressions; even a simple RE
requires the costly construction of some internal data structures to do
the job, and the RE routines will at least have to do everything that
String.Split() has to do anyway. However if your delimiters are not
predictable recurring strings, RE would be a better way.

The code:

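======================================================
using System;
using System.Collections.Specialized;

public class NameValueCollectionEx : NameValueCollection
{
    public void LoadFrom(string source, string[] outerDelimiters,
        string[] innerDelimiters)
    {
        // Uses the StringSplitOptions overloads; the Split(..., true) overload
        // originally posted here was due to be obsoleted in .NET 2.0
        // (assuming "true" meant "omit empty entries").
        string[] pairs = source.Split(outerDelimiters,
            StringSplitOptions.RemoveEmptyEntries);

        foreach (string pair in pairs)
        {
            string[] elements = pair.Split(innerDelimiters, 2,
                StringSplitOptions.RemoveEmptyEntries);
            // Skip fragments that lack a key or a value instead of throwing.
            if (elements.Length < 2) continue;
            this.Add(elements[0], elements[1]);
        }
    }
}
======================================================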

I don't think you can get things a whole lot more optimized than this.
Though if anyone feels inspired to do a performance comparison vs. RE,
I'd be interested in seeing the results.

Joel
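
For anyone who wants to run that comparison, here is a minimal timing sketch, not a rigorous benchmark. It assumes synthetic input built from the placeholder delimiters, and the SplitBenchmark class with its BuildInput, CountWithStringSplit and CountWithRegexSplit helpers are invented for this example. It only measures tokenizing, not building the key/value collection, and the numbers will vary by machine and runtime:

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class SplitBenchmark
{
    static readonly string[] Delims = { "<Delim1>", "<Delim2>", "<Delim3>" };
    static readonly Regex AnyDelim = new Regex("<Delim[123]>", RegexOptions.Compiled);

    // Builds "k0<Delim1>v0<Delim2>x<Delim1>y<Delim3>k1<Delim1>v1..." (4 tokens per record).
    public static string BuildInput(int records)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < records; i++)
            sb.Append("k").Append(i).Append("<Delim1>v").Append(i)
              .Append("<Delim2>x<Delim1>y<Delim3>");
        return sb.ToString();
    }

    public static int CountWithStringSplit(string input)
    {
        return input.Split(Delims, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    public static int CountWithRegexSplit(string input)
    {
        // Regex.Split keeps empty entries, so count only non-empty tokens.
        int count = 0;
        foreach (string token in AnyDelim.Split(input))
            if (token.Length > 0) count++;
        return count;
    }

    public static void Main()
    {
        string input = BuildInput(200000);

        // Warm-up so JIT and regex compilation are not part of the timing.
        CountWithStringSplit(input);
        CountWithRegexSplit(input);

        Stopwatch sw = Stopwatch.StartNew();
        int a = CountWithStringSplit(input);
        sw.Stop();
        Console.WriteLine("String.Split: {0} tokens in {1} ms", a, sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        int b = CountWithRegexSplit(input);
        sw.Stop();
        Console.WriteLine("Regex.Split:  {0} tokens in {1} ms", b, sw.ElapsedMilliseconds);
    }
}
```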

Nov 16 '05 #10
