Text Parsing with Qualifiers

Hi all,

Does anyone know of a GOOD example on parsing text with text qualifiers?

I am hoping to parse text with variable-length delimiters/qualifiers. Also,
qualified text could run onto multiple lines and contain characters like
vbCrLf (thus the multiple lines).

Anyhow, any help would be appreciated. Thanks!

--
Lucas Tam (RE********@rogers.com)
Please delete "REMOVE" from the e-mail address when replying.
http://members.ebay.com/aboutme/coolspot18/
Nov 20 '05 #1
Nak
> Does anyone know of a GOOD example on parsing text with text qualifiers?

What exactly do you mean by text qualifiers? Characters?

Parsing strings in .NET has become even easier than in VB6, and it's
certainly easier than C++. Have you taken a look at the String class? It
contains many methods for manipulating strings.

http://msdn.microsoft.com/library/de...classtopic.asp

You can even look into regular expressions for examining strings for
patterns. They are a bit fiddly to get the hang of to start with, but they
are very, very useful. I recently swapped an HTML parsing routine of mine
for a regular-expression alternative and the code size has been dramatically
reduced.

http://msdn.microsoft.com/library/de...classtopic.asp

Anyway, I hope this information can help you :-) If you let me know a little
bit more about what kind of strings you want to manipulate, I might be
able to give you some more tips.

Nick.

--
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
"No matter. Whatever the outcome, you are changed."

Fergus - September 5th 2003
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
"Lucas Tam" <RE********@rogers.com> wrote in message
news:Xn***************************@140.99.99.130.. .
> Hi all,
>
> I am hoping to parse text with variable length delimiters/qualifiers. Also,
> qualified text could run onto multiple lines and contain characters like
> vbcrlf (thus the multiple lines).
>
> Anyhow, any help would be appreciated. Thanks!
>
> --
> Lucas Tam (RE********@rogers.com)
> Please delete "REMOVE" from the e-mail address when replying.
> http://members.ebay.com/aboutme/coolspot18/

Nov 20 '05 #2
"Nak" <a@a.com> wrote in news:O0**************@TK2MSFTNGP12.phx.gbl:
>> Does anyone know of a GOOD example on parsing text with text qualifiers?
>
> What exactly do you mean by text qualifiers? Characters?

Ya, I am hoping to parse strings like:
"""This is a Quote""",01/01/2003,"Some Interesting Text, Here"

etc etc.

I've seen sample code that only handles single-character delimiters/text
qualifiers, but I am hoping to find code that can handle text
qualifiers/delimiters of any length.

It's not TOO hard to parse such text, but if someone else has already
written some good code, might as well use it.
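
For anyone landing on this thread later, here is a minimal sketch of a hand-written scanner for this kind of input. It is written in Python rather than VB.NET, and the function name split_qualified and its parameters are made up for illustration. It assumes a doubled qualifier inside a qualified field stands for one literal qualifier (as in the """This is a Quote""" sample) and that matching is case-sensitive; neither rule is stated anywhere in the thread.

```python
def split_qualified(text, delimiter=",", qualifier='"'):
    """Split text into fields; delimiter and qualifier may be any non-empty strings.

    Inside a qualified field, a doubled qualifier yields one literal qualifier,
    and delimiters or line breaks are kept as ordinary text.
    """
    if not delimiter or not qualifier:
        raise ValueError("delimiter and qualifier must be non-empty")
    fields = []
    pos = 0
    while True:
        if text.startswith(qualifier, pos):
            # Qualified field: collect pieces up to the closing qualifier.
            pos += len(qualifier)
            pieces = []
            while True:
                end = text.find(qualifier, pos)
                if end == -1:
                    raise ValueError("unterminated qualified field")
                pieces.append(text[pos:end])
                pos = end + len(qualifier)
                if text.startswith(qualifier, pos):
                    # Doubled qualifier: keep one literal copy and carry on.
                    pieces.append(qualifier)
                    pos += len(qualifier)
                else:
                    break
            fields.append("".join(pieces))
            if pos == len(text):
                return fields
            if not text.startswith(delimiter, pos):
                raise ValueError("expected delimiter after closing qualifier")
            pos += len(delimiter)
        else:
            # Unqualified field: runs up to the next delimiter or end of text.
            end = text.find(delimiter, pos)
            if end == -1:
                fields.append(text[pos:])
                return fields
            fields.append(text[pos:end])
            pos = end + len(delimiter)


print(split_qualified('"""This is a Quote""",01/01/2003,"Some Interesting Text, Here"'))
```

Because the scanner works on the whole text rather than line by line, a vbCrLf inside a qualified field simply ends up in that field's value. Passing delimiter="comma" and qualifier="quote" exercises the multi-character case.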

--
Lucas Tam (RE********@rogers.com)
Please delete "REMOVE" from the e-mail address when replying.
http://members.ebay.com/aboutme/coolspot18/
Nov 20 '05 #3
Lucas Tam wrote:
> "Nak" <a@a.com> wrote in news:O0**************@TK2MSFTNGP12.phx.gbl:
>>> Does anyone know of a GOOD example on parsing text with text qualifiers?
>>
>> What exactly do you mean by text qualifiers? Characters?
>
> Ya, I am hoping to parse strings like:
> """This is a Quote""",01/01/2003,"Some Interesting Text, Here"
>
> etc etc.
>
> I've seen sample code that only handles single character delimiters/text
> qualifiers, but I am hoping to find code that can handle any length text
> qualifier/delimiters.


So you really mean something like:

quotequotequoteThis is a Quotequotequotequotecomma01/01/2003commaquoteSome
Interesting Textcomma Herequote

( :-) )
> It's not TOO hard to parse such text, but if someone else has already
> written some good code, might as well use it.


If anyone has some public VB (.NET or otherwise) code for generic handling of
this sort of thing, I'd like to see it too, but in .NET the best option is a
custom Regex, probably with extra code to handle context.

--
Regards,
Mark Hurd, B.Sc.(Ma.) (Hons.)
Nov 20 '05 #4
Mark Hurd wrote:
> Lucas Tam wrote:
>> "Nak" <a@a.com> wrote in news:O0**************@TK2MSFTNGP12.phx.gbl:
>>>> Does anyone know of a GOOD example on parsing text with text
>>>> qualifiers?
>>>
>>> What exactly do you mean by text qualifiers? Characters?
>>
>> Ya, I am hoping to parse strings like:
>> """This is a Quote""",01/01/2003,"Some Interesting Text, Here"
>>
>> etc etc.
>>
>> I've seen sample code that only handles single character delimiters/text
>> qualifiers, but I am hoping to find code that can handle any length text
>> qualifier/delimiters.
>
> So you really mean something like:
>
> quotequotequoteThis is a Quotequotequotequotecomma01/01/2003commaquoteSome
> Interesting Textcomma Herequote
>
> ( :-) )
>
>> It's not TOO hard to parse such text, but if someone else has already
>> written some good code, might as well use it.
>
> If anyone has some public VB (.NET or otherwise) code for generic handling
> of this sort of thing, I'd like to see it too, but in .NET the best option
> is a custom Regex, probably with extra code to handle context.


I should add: if you're talking about parsing anything more complex, you
should look at .NET versions of lex and yacc, etc.

--
Regards,
Mark Hurd, B.Sc.(Ma.) (Hons.)
Nov 20 '05 #5
"Mark Hurd" <ma******@ozemail.com.au> wrote in
news:#V**************@TK2MSFTNGP09.phx.gbl:
>> I've seen sample code that only handles single character
>> delimiters/text qualifiers, but I am hoping to find code that can
>> handle any length text qualifier/delimiters.
>
> So you really mean something like:
>
> quotequotequoteThis is a
> Quotequotequotequotecomma01/01/2003commaquoteSome Interesting
> Textcomma Herequote


Exactly! I'm trying to build an import routine that is as flexible as
possible. Who knows, maybe someone does use odd delimiters like that : )

--
Lucas Tam (RE********@rogers.com)
Please delete "REMOVE" from the e-mail address when replying.
http://members.ebay.com/aboutme/coolspot18/
Nov 20 '05 #6
"Mark Hurd" <ma******@ozemail.com.au> wrote in news:eC6NYQ0eDHA.3248
@tk2msftngp13.phx.gbl:
> I should add: if you're talking about parsing anything more complex, you
> should look at .NET versions of lex and yacc, etc.


Ah, I used Yacc briefly with Java. I didn't know it existed with .NET.
Thanks for the tip!

--
Lucas Tam (RE********@rogers.com)
Please delete "REMOVE" from the e-mail address when replying.
http://members.ebay.com/aboutme/coolspot18/
Nov 20 '05 #7
Lucas Tam wrote:
> "Mark Hurd" <ma******@ozemail.com.au> wrote in
> news:#V**************@TK2MSFTNGP09.phx.gbl:
>>> I've seen sample code that only handles single character
>>> delimiters/text qualifiers, but I am hoping to find code that can
>>> handle any length text qualifier/delimiters.
>>
>> So you really mean something like:
>>
>> quotequotequoteThis is a
>> Quotequotequotequotecomma01/01/2003commaquoteSome Interesting
>> Textcomma Herequote
>
> Exactly! I'm trying to build an import routine that is as flexible as
> possible. Who knows, maybe someone does use odd delimiters like that : )


When I posed the "comma" separated values example I was going to provide a
Regex for it, but at the time I didn't have enough time...

Here it is:

((((quote)(?<quoted>(([^q])|(q[^u])|(qu[^o])|(quo[^t])|(quot[^e])|((quote)(quote)))*)(quote)))|(?<unquoted>(([^c])|(c[^o])|(co[^m])|(com[^m])|(comm[^a]))*))((comma)|$)

I've put in a couple of pairs of brackets to highlight how this could be
produced by an automated generator...
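
A rough idea of what such a generator might look like, sketched in Python instead of VB.NET. The names not_literal and build_pattern are invented for this example. The group_prefix parameter is also an assumption of the sketch: "?" gives the .NET spelling (?<name>...) used in this post, while the default "?P" gives the spelling Python's re module accepts.

```python
import re


def not_literal(word):
    """One branch per prefix of word: "the prefix, then a character that breaks it"."""
    return "|".join(
        "(" + re.escape(word[:i]) + "[^" + re.escape(ch) + "])"
        for i, ch in enumerate(word)
    )


def build_pattern(qualifier, delimiter, group_prefix="?P"):
    """Assemble the regex from this post for an arbitrary qualifier and delimiter."""
    q = re.escape(qualifier)
    d = re.escape(delimiter)
    return (
        "((((" + q + ")(" + group_prefix + "<quoted>(" + not_literal(qualifier)
        + "|((" + q + ")(" + q + ")))*)(" + q + ")))"
        + "|(" + group_prefix + "<unquoted>(" + not_literal(delimiter) + ")*))"
        + "((" + d + ")|$)"
    )


# Prints the .NET-flavoured pattern for the quote/comma example.
print(build_pattern("quote", "comma", group_prefix="?"))

# The Python-flavoured output can be compiled and tried directly.
csv_like = re.compile(build_pattern('"', ","))
print(csv_like.match('"a,b",c').group("quoted"))
```

The bracket pairs around each literal are what make the string easy to assemble piece by piece; a generator that did not care about matching the post character for character could use non-capturing groups instead.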

The intended use of the above regex is to loop through all matches, checking
there are no unmatched gaps (syntax errors) and ignoring the null match at
the end of the string. The <quoted> group needs to have quotequote reduced to
quote -- .Replace("quotequote", "quote") -- and only one of <quoted> or
<unquoted> should have any content.
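
The loop described above could be sketched roughly as follows. This is Python, not VB.NET, so the group names are respelled as (?P<name>...); the function name fields and the choice of ValueError for syntax errors are assumptions of the sketch. Note that following the "ignore the null match at the end" rule literally also drops the empty field after a trailing delimiter.

```python
import re

# The regex from this post, with the group names in Python's (?P<name>...) form.
PATTERN = re.compile(
    r"((((quote)(?P<quoted>(([^q])|(q[^u])|(qu[^o])|(quo[^t])|(quot[^e])"
    r"|((quote)(quote)))*)(quote)))"
    r"|(?P<unquoted>(([^c])|(c[^o])|(co[^m])|(com[^m])|(comm[^a]))*))"
    r"((comma)|$)"
)


def fields(text):
    """Return the field values in text, raising ValueError on a gap between matches."""
    result = []
    expected = 0  # offset where the next match has to start
    for m in PATTERN.finditer(text):
        if m.start() != expected:
            raise ValueError("syntax error at offset %d" % expected)
        if m.start() == m.end() == len(text):
            break  # the null match at the end of the string
        if m.group("unquoted") is not None:
            result.append(m.group("unquoted"))
        else:
            # Reduce doubled qualifiers to single ones.
            result.append(m.group("quoted").replace("quotequote", "quote"))
        expected = m.end()
    if expected != len(text):
        raise ValueError("syntax error at offset %d" % expected)
    return result


sample = ("quotequotequoteThis is a Quotequotequotequotecomma"
          "01/01/2003commaquoteSome Interesting Textcomma Herequote")
print(fields(sample))
```

In .NET the equivalent checks would presumably use each match's Index and Length and the Success flag of the two named groups, but that translation is not shown here.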

Can someone confirm whether there's an optimisation for this Regex using the
extended grouping features?
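
One candidate for that, offered as a sketch rather than a confirmed answer: a negative lookahead, (?!quote). , reads as "any single character that does not begin the word quote" and can stand in for the whole prefix-by-prefix alternation. The comparison below is in Python (group names respelled as (?P<name>...)); the names ORIGINAL, LOOKAHEAD and first_field are invented here. It also shows two inputs where the prefix alternation appears to misbehave: a quoted field ending in "q" and an unquoted last field ending in "c".

```python
import re

# The pattern from this post, with groups renamed to Python's (?P<name>...) form.
ORIGINAL = re.compile(
    r"((((quote)(?P<quoted>(([^q])|(q[^u])|(qu[^o])|(quo[^t])|(quot[^e])"
    r"|((quote)(quote)))*)(quote)))"
    r"|(?P<unquoted>(([^c])|(c[^o])|(co[^m])|(com[^m])|(comm[^a]))*))"
    r"((comma)|$)"
)

# Hypothetical rewrite using negative lookahead. DOTALL lets "." match line
# breaks, so a qualified field can span several lines.
LOOKAHEAD = re.compile(
    r"(?:quote(?P<quoted>(?:quotequote|(?!quote).)*)quote"
    r"|(?P<unquoted>(?:(?!comma).)*))"
    r"(?:comma|$)",
    re.DOTALL,
)


def first_field(pattern, text):
    """Return (kind, value) for the field at the start of text, or None if no match."""
    m = pattern.match(text)
    if m is None:
        return None
    if m.group("unquoted") is not None:
        return ("unquoted", m.group("unquoted"))
    return ("quoted", m.group("quoted"))


for text in ("quoteIraqquote", "abc"):
    print(text, first_field(ORIGINAL, text), first_field(LOOKAHEAD, text))
```

The doubled qualifier still has to be reduced afterwards, exactly as with the original regex. In .NET the same lookahead syntax is available, with (?<name>...) for the groups and RegexOptions.Singleline playing the role of DOTALL.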
--
Regards,
Mark Hurd, B.Sc.(Ma.) (Hons.)
Nov 20 '05 #8
