RegEx hangs - Please help

Hi All,

I have the following text template that needs to be parsed using a
regular expression:

-- Test Template
This is a Test Template for <#OrderNumber/>

Sender Message is <#SENDERMESSAGE/>

<#ORDERS>
This is Order number <#OrderNumber/>

Order Lines are

<#ORDERLINES>
Product Desc: <#PRODUCTDESCRIPTION/>

Product Information: <#ProductInformation/>

Supplier Name: <#SUPPLIERNAME/>

Supplier Contact: <#SUPPLIERCONTACTNO/>
</#ORDERLINES>
</#ORDERS>

Your voucher Reference number is <#VREFNO/>

-- End of Test Template

I need to extract the string between <#ORDERS> and </#ORDERS>. The
regular expression I am using is (<#ORDERS>(\s|.)*?</#ORDERS>), and it
works fine. However, if the test template is changed slightly, as in
the following example, the regular expression hangs.

--
<#ORDERS>
This is Order number <#OrderNumber/>

Order Lines are

<#ORDERLINES>
Product Desc: <#PRODUCTDESCRIPTION/>

Product Information: <#ProductInformation/>

Supplier Name: <#SUPPLIERNAME/>

Supplier Contact: <#SUPPLIERCONTACTNO/>
</#ORDERLINES>
</#ORDERS
--

Can someone please help and suggest a way to do this? I am fairly new
to regular expressions.

Thanks,
Asif

Nov 17 '05 #1
Hi Asif,

This looks for all the world to me like an XML file. Correct me if I'm
mistaken. If you're parsing an XML file, it would be much better to use the
XmlDocument and related classes to parse it. It certainly could be done
using Regular Expressions, but I'm not sure from your description how the
document is actually arranged, and it can be tricky. Using XML classes, it
can be much easier, and you will not need to use a whole set of Regular
Expressions if you want to parse more than just that one tag.

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer
A watched clock never boils.


Nov 17 '05 #2
On 30 Oct 2005 23:56:03 -0800, sa********@gmail.com wrote:
--
<#ORDERS>
This is Order number <#OrderNumber/>

Order Lines are

<#ORDERLINES>
Product Desc: <#PRODUCTDESCRIPTION/>

Product Information: <#ProductInformation/>

Supplier Name: <#SUPPLIERNAME/>

Supplier Contact: <#SUPPLIERCONTACTNO/>
</#ORDERLINES>
</#ORDERS


I have to agree with the previous poster that you should probably be
using XmlDocument instead of regular expressions.

I do have a few comments to make anyway. First of all, the regular
expression doesn't match the second template because the final tag is
missing its closing '>' character.

The reason your program hangs is that the regular expression is badly
constructed: \s and . can both match the same whitespace characters,
so when no closing tag is found the engine backtracks through every
way of dividing the text between the two alternatives. It should be
@"<#ORDERS>.*?</#ORDERS>", and you should use the
RegexOptions.Singleline option when executing the regular expression.
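
A minimal sketch of that suggestion in C#, assuming the template has
already been loaded into a string. The sample text, the class name, and
the one-second match timeout are illustrative assumptions, not part of
the original post; the timeout constructor only exists in .NET
Framework 4.5 and later.

```csharp
using System;
using System.Text.RegularExpressions;

class OrdersExtractor
{
    static void Main()
    {
        // Shortened, hypothetical stand-in for the template in the question.
        string template =
            "Sender Message is <#SENDERMESSAGE/>\r\n" +
            "<#ORDERS>\r\n" +
            "This is Order number <#OrderNumber/>\r\n" +
            "</#ORDERS>\r\n" +
            "Your voucher Reference number is <#VREFNO/>\r\n";

        // Singleline makes '.' match '\n' as well, so the (\s|.) alternation
        // is no longer needed. The timeout is an optional safeguard so that a
        // pathological input raises RegexMatchTimeoutException instead of
        // hanging.
        var regex = new Regex(
            @"<#ORDERS>.*?</#ORDERS>",
            RegexOptions.Singleline,
            TimeSpan.FromSeconds(1));

        Match match = regex.Match(template);
        if (match.Success)
            Console.WriteLine(match.Value);
        else
            Console.WriteLine("No complete <#ORDERS> block found.");
    }
}
```

With a template whose closing tag is malformed, such as the second
example in the question, this pattern should simply report no match.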

--
Marcus Andrén
Nov 17 '05 #3
Hi Kevin,

Well, this is not an XML file. It is an MS Word RTF formatted
document. It is actually a template that users define themselves
using a set of template codes. The system parses the document and
replaces every template code (yes, <#ORDERS> is a template code) with
the appropriate value.

I hope my question is a bit clearer now.

Thanks for your reply anyway.
Asif

Nov 17 '05 #4
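
The replacement step described in #4 could be sketched roughly as
follows in C#. This is an illustration only: the class name, the
dictionary of values, and the choice to leave unknown codes untouched
are assumptions, and it handles only the self-closing <#NAME/> codes,
not block codes such as <#ORDERS>.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class TemplateFiller
{
    static void Main()
    {
        string template =
            "This is a Test Template for <#OrderNumber/>\n" +
            "Your voucher Reference number is <#VREFNO/>\n";

        // Hypothetical values. Keys are compared case-insensitively because
        // the template mixes <#OrderNumber/> with upper-case codes.
        var values = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "OrderNumber", "12345" },
            { "VREFNO", "V-001" }
        };

        // Capture the code name between "<#" and "/>" and look it up.
        string result = Regex.Replace(template, @"<#(\w+)/>", m =>
        {
            string value;
            // Unknown codes are left as they are rather than guessed.
            return values.TryGetValue(m.Groups[1].Value, out value) ? value : m.Value;
        });

        Console.Write(result);
    }
}
```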
Hi Marcus,

Thanks for the reply. Your regular expression works and solves the
problem.
However, I now have a further issue.

The regular expression I was using before also handled template codes
that are nested within themselves, for example:
<#ORDERS>
<Do some thing>
<#ORDERS>
<Do some thing more>
</#ORDERS>
<Do some final stuff>
</#ORDERS>

but the regular expression you suggested does not return the whole
top-level tag. Instead, it returns something like this:
<#ORDERS><Do some thing><#ORDERS> <Do some thing more></#ORDERS>

Any advice on this issue?

Thanks anyway for the help.

Regards,
Asif

Nov 17 '05 #5
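
For the nested case in #5, one option in .NET is a balancing group
definition, which lets the pattern keep count of opening and closing
tags. The following is a sketch under assumptions, not a tested
drop-in: the class name and the group name "open" are made up for
illustration, and the pattern only tracks <#ORDERS> tags.

```csharp
using System;
using System.Text.RegularExpressions;

class NestedOrdersExtractor
{
    static void Main()
    {
        // Nested sample from the post.
        string template =
            "<#ORDERS>\n" +
            "<Do some thing>\n" +
            "<#ORDERS>\n" +
            "<Do some thing more>\n" +
            "</#ORDERS>\n" +
            "<Do some final stuff>\n" +
            "</#ORDERS>\n";

        // IgnorePatternWhitespace is deliberately not used: in that mode '#'
        // starts a comment and would truncate the tag names.
        string pattern =
            "<#ORDERS>" +
            "(?>" +
                "(?<open><#ORDERS>)" +    // nested opening tag: push onto 'open'
                "|(?<-open></#ORDERS>)" + // nested closing tag: pop; fails if nothing is open
                "|(?!</?#ORDERS>)." +     // any other single character
            ")*" +
            "(?(open)(?!))" +             // fail if a nested tag is still open
            "</#ORDERS>";

        Match match = Regex.Match(template, pattern, RegexOptions.Singleline);
        Console.WriteLine(match.Success
            ? match.Value
            : "No balanced <#ORDERS> block found.");
    }
}
```

On the nested sample, this is intended to match from the first
<#ORDERS> through the last </#ORDERS>, rather than stopping at the
inner closing tag.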
