
How to Parse a string with Embedded Double Quotes

I have a string similar to the following:

" MyString 40 "Hello world" all "

It contains white space that may be spaces or tabs, or a combination, and I
want to produce an array with the following elements

arr(0) = "MyString"
arr(1) = 40
arr(2) = "Hello world"
arr(3) = "all"

Using trim and a regular expression ("\s+"), I can reduce my string to

"MyString 40 "Hello world" all"

and with Split I can get

arr(0) = "MyString"
arr(1) = 40
arr(2) = ""Hello"
arr(3) = "world""
arr(4) = "all"

As you can see, it is not quite what I need. The spaces in "Hello world"
have been reduced to a single space, and Split does not respect the double
quotes, and splits "Hello world" over two elements.

Does anyone have an idea how I could do this? I could process the string
character by character, but I am hoping that there is a straight-forward
technique for doing it, without looping, and using some of the techniques I
already have.

TIA

Charles
Nov 21 '05 #1
16 Replies


Charles,

When I looked at the problem, I wondered whether I should give my answer at
all, because it is so difficult to describe. Then I saw that it was you, so
it should not be a problem.

In this kind of situation I replace the spaces that I do not want to split
on with a character that is certain not to occur in the string.

Then I do the split.

Finally, I replace that character with a space again.
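
One way to pick out only the spaces between the quotes, without walking the string character by character, is to let a regular expression find each quoted section and hand it to a MatchEvaluator. The following is a minimal sketch of the three steps above, not code posted in this thread. The module name, the helper names, and the choice of # as the unused character are assumptions for illustration. Note that it restores every hidden white-space character as a plain space, so a tab inside quotes would come back as a space.

```vb
Imports System
Imports System.Text.RegularExpressions

Module PlaceholderSplitSketch

    ' Hypothetical placeholder; assumed never to occur in the input.
    Private Const Placeholder As Char = "#"c

    ' Called once per quoted section: hides its white space behind the placeholder.
    Private Function HideSpaces(ByVal m As Match) As String
        Return Regex.Replace(m.Value, "\s", Placeholder.ToString())
    End Function

    Function SplitKeepingQuotes(ByVal input As String) As String()
        ' Step 1: replace the white space inside "..." with the placeholder.
        Dim hidden As String = Regex.Replace(input, """[^""]*""", AddressOf HideSpaces)
        ' Step 2: split on the remaining runs of white space.
        Dim parts As String() = Regex.Split(hidden.Trim(), "\s+")
        ' Step 3: put the spaces back and drop the surrounding quotes.
        For i As Integer = 0 To parts.Length - 1
            parts(i) = parts(i).Replace(Placeholder, " "c).Trim(""""c)
        Next
        Return parts
    End Function

    Sub Main()
        For Each part As String In SplitKeepingQuotes("  MyString 40 ""Hello   world""  all ")
            Console.WriteLine("'{0}'", part)
        Next
    End Sub

End Module
```

The quoted section keeps its inner spacing because only the white space outside the quotes is ever seen by the split.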

I assume that this is more than enough explanation for you.

And now that you have read this, you will say, "I knew that as well."

:-)))

Cor

Nov 21 '05 #2

Hi Cor

You read my mind ;-)

I had thought of using something like #, as it will never occur in my
string. But then I started to look at how I would know which spaces to
replace with #, and which to leave. Of course, to the human eye it is
obvious that I only replace the spaces between " and ", but now I am back to
processing each part of the string character by character so that I match
double quotes correctly, and this is what I was trying to avoid.

Perhaps there is a regular expression that will match double quotes, or a
method that parses a string taking these into account, but sadly I do not
know it yet.

But please, keep the suggestions flowing.

Charles


Nov 21 '05 #4


Try

(\s*"([\s\w]*)")|(\s*(\w+))

Then, for each Match object, apply the replacement pattern

$2$4

This returns either your double-quoted string without the quotes, or the
word token without the leading white space, depending on which alternative
the Match object holds.

You just have to love Regular Expressions.

--Robby




Nov 21 '05 #6

Hi Robby

Thanks for the reply. I am not sure that I understand the regular expression

(\s*"([\s\w]*)")|(\s*(\w+))

I tried the following, but of course it gives a syntax error because of the
embedded double quotes:

Dim reg As Regex = New Regex("(\s*"([\s\w]*)")|(\s*(\w+))")

So I tried escaping the double quotes, like this

Dim reg As Regex = New Regex("(\s*""([\s\w]*)"")|(\s*(\w+))")

but this cleared my string out to a couple of spaces when I did a replace.

Any chance of a small snippet to get me on the right track, using the Match
object?

Thanks very much.

Charles



Nov 21 '05 #7


Create a console application

#########################

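Imports System
Imports System.Text.RegularExpressions

Module MainModule

    Sub Main()

        ' Groups 2 and 4 capture the token text: group 2 is the content of a
        ' quoted string, group 4 is a bare word.
        Dim rePost As New Regex("(\s*""([\s\w]*)"")|(\s*(\w+))")
        Dim testString As String = "MyString 40 ""Hello world"" all "
        Dim allMatches As MatchCollection = rePost.Matches(testString)

        Dim matchPiece As Match
        Dim I As Integer

        For I = 0 To allMatches.Count - 1
            matchPiece = allMatches(I)
            ' Only one of the two groups takes part in any given match, so
            ' "$2$4" yields the token without quotes or leading white space.
            Console.WriteLine("Piece {0} -> '{1}'", I, matchPiece.Result("$2$4"))
        Next I

    End Sub

End Module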

####################
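
Since the original question asks for an array, the same pattern can also fill one directly by reading the capture groups instead of printing them. This is a sketch under assumptions: the names TokenArraySketch and Tokenize are made up for illustration, List(Of String) needs .NET 2.0 or later, and every element comes back as a String, so "40" would still need converting if a number is wanted.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module TokenArraySketch

    Function Tokenize(ByVal input As String) As String()
        ' Same pattern as in the console application above.
        Dim re As New Regex("(\s*""([\s\w]*)"")|(\s*(\w+))")
        Dim tokens As New List(Of String)

        For Each m As Match In re.Matches(input)
            If m.Groups(2).Success Then
                ' Quoted alternative: group 2 holds the text between the quotes.
                tokens.Add(m.Groups(2).Value)
            Else
                ' Bare-word alternative: group 4 holds the word.
                tokens.Add(m.Groups(4).Value)
            End If
        Next

        Return tokens.ToArray()
    End Function

    Sub Main()
        Dim arr As String() = Tokenize("  MyString 40 ""Hello   world""  all ")
        For i As Integer = 0 To arr.Length - 1
            Console.WriteLine("arr({0}) = '{1}'", i, arr(i))
        Next
    End Sub

End Module
```

Because the character class inside the quotes is [\s\w], a quoted string containing punctuation would not match as a whole; widening it to [^""] is one possible adjustment.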

--Robby



Nov 21 '05 #8

Hi Robby - me again

I have it now; I just needed to apply a few of those grey cells I have
knocking about.

Cheers.

Charles



Nov 21 '05 #9

Charles,

Can you tell us at the end how much time it took to find the regex, how much
time the straightforward technique would have taken, and then also test
which of the two methods runs faster?

I ask in order to get a good idea for the discussions about Regex versus the
straightforward approach. I looked at it, and I think the straightforward
approach would probably take me less than 30 minutes, so for you probably
less than 15.
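
For comparison, the straightforward character-by-character approach could look roughly like the sketch below. It is an illustration only, not code from this thread: the names CharByCharSketch and Tokenize are assumptions, List(Of String) needs .NET 2.0 or later, and a quote is treated purely as a toggle, with no support for escaped quotes.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text

Module CharByCharSketch

    Function Tokenize(ByVal input As String) As String()
        Dim tokens As New List(Of String)
        Dim current As New StringBuilder()
        Dim inQuotes As Boolean = False
        Dim hasToken As Boolean = False   ' True once the current token has started

        For Each c As Char In input
            If c = """"c Then
                ' A quote toggles quoted mode and is not copied into the token.
                inQuotes = Not inQuotes
                hasToken = True
            ElseIf Char.IsWhiteSpace(c) AndAlso Not inQuotes Then
                ' White space outside quotes ends the current token, if any.
                If hasToken Then
                    tokens.Add(current.ToString())
                    current.Length = 0
                    hasToken = False
                End If
            Else
                current.Append(c)
                hasToken = True
            End If
        Next

        ' Flush the last token when the input does not end in white space.
        If hasToken Then tokens.Add(current.ToString())
        Return tokens.ToArray()
    End Function

    Sub Main()
        For Each token As String In Tokenize("  MyString 40 ""Hello   world""  all ")
            Console.WriteLine("'{0}'", token)
        Next
    End Sub

End Module
```

Timing this against the Regex version on representative input would be the way to settle which one is faster; no measurements are claimed here.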

:-)

Cor



Nov 21 '05 #10

Thanks again Robby. I came up with something similar in the end, using
For Each to go through the match collection, and reg.Replace instead of
match.Result, but it came down to the same thing.
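
That description might correspond to something like the following sketch. The actual code was not posted, so the module name and the loop body are a guess at the approach rather than a record of it.

```vb
Imports System
Imports System.Text.RegularExpressions

Module ForEachReplaceSketch

    Sub Main()
        Dim reg As New Regex("(\s*""([\s\w]*)"")|(\s*(\w+))")
        Dim testString As String = "MyString 40 ""Hello world"" all "

        For Each m As Match In reg.Matches(testString)
            ' Re-apply the pattern to the matched text only and keep groups 2 and 4.
            Dim piece As String = reg.Replace(m.Value, "$2$4")
            Console.WriteLine("'{0}'", piece)
        Next
    End Sub

End Module
```

Calling reg.Replace on each match's own text, rather than on the whole input string, is what keeps the tokens separate.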

Charles


Nov 21 '05 #11

Cor

Do you mean how long does it take to parse the string using RegEx against
parsing it character by character, or how long did it take to come up with
the solution?

I think the RegEx solution is by far the neatest, and most flexible. It is
also fewer lines of code. I do not have a solution parsing
character-by-character, so I cannot measure how long to create or run, but I
think you have it about right.

Charles
"Cor Ligthert" <no************@planet.nl> wrote in message
news:um**************@TK2MSFTNGP10.phx.gbl...
Charles,

Can you give us an idea at the end how much time it took to find the regex
and how much time the straight forward technique and than as well a test
what will be the less time consuming method.

To get a good idea about the discussions using Regex or Straight forward,
I looked at it, and I think Straight forward should take me probably less
than 30 minutes, so for you probably less than 15.

:-)

Cor

"Charles Law" <bl***@nowhere.com>
Hi Robby

Thanks for the reply. I am not sure that I understand the regular
expression
(\s*"([\s\w]*)")|(\s*(\w+))


I tried the following, but of course it gives a syntax error because of
the embedded double quotes:

Dim reg As Regex = New Regex("(\s*"([\s\w]*)")|(\s*(\w+))")

So I tried escaping the double quotes, like this

Dim reg As Regex = New Regex("(\s*""([\s\w]*)"")|(\s*(\w+))")

but this cleared my string out to a couple of spaces when I did a
replace.

Any chance of a small snippet to get me on the right track, using the
Match object?

Thanks very much.

Charles
"Robby" <ed****@not.my.email.com> wrote in message
news:eC**************@TK2MSFTNGP10.phx.gbl...

Try

(\s*"([\s\w]*)")|(\s*(\w+))

Then do a Replace on each Match object with

$2$4

This will return either your double-quoted string without the quotes or
the word token without the whitespace characters, depending on which
match the Match object holds.

You just have to love Regular Expressions.

--Robby


"Charles Law" <bl***@nowhere.com> wrote in message
news:OP****************@TK2MSFTNGP11.phx.gbl...
Hi Cor

You read my mind ;-)

I had thought of using something like #, as it will never occur in my
string. But then I started to look at how I would know which spaces to
replace with #, and which to leave. Of course, to the human eye it is
obvious that I only replace the spaces between " and ", but now I am
back to processing each part of the string character by character so
that I match double quotes correctly, and this is what I was trying to
avoid.

Perhaps there is a regex expression that will match double quotes, or a
method that parses a string taking these into account, but sadly I do
not know it yet.

But please, keep the suggestions flowing.

Charles



Nov 21 '05 #12

P: n/a

Hmm ... It took me two tries to get this regular expression. I don't
have the opportunity to use Regex a lot, but I quite like it and dive in if
I have a spare moment. I'd say 4 to 6 minutes to solve it. In the old VB6
days I would have done a Find to get my quote indexes, split on the double
quote, re-split the pieces outside the double quotes, and then trimmed them.
That would take more time to code and check for errors.
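
For anyone following along, the split-on-quotes approach described above
might look roughly like this in VB.NET. It is only a sketch: the module and
function names are made up for illustration, it assumes .NET 2.0 or later
(for List(Of String) and StringSplitOptions), and it assumes the double
quotes are balanced.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module QuoteSplitSketch
    ' Split on double quotes first. With balanced quotes, the odd-numbered
    ' pieces lie inside quotes and the even-numbered pieces lie outside.
    Public Function SplitRespectingQuotes(ByVal input As String) As List(Of String)
        Dim tokens As New List(Of String)
        Dim separators() As Char = {" "c, ControlChars.Tab}
        Dim pieces() As String = input.Split(""""c)
        For i As Integer = 0 To pieces.Length - 1
            If i Mod 2 = 1 Then
                ' Inside quotes: keep the text verbatim, spaces included
                tokens.Add(pieces(i))
            Else
                ' Outside quotes: split on spaces and tabs, drop empty entries
                For Each word As String In pieces(i).Split(separators, StringSplitOptions.RemoveEmptyEntries)
                    tokens.Add(word)
                Next
            End If
        Next
        Return tokens
    End Function
End Module
```

With Charles's sample string this should give MyString, 40, Hello world and
all. An unmatched quote is not detected: everything after it is simply
treated as quoted text.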

Robby



Nov 21 '05 #13

P: n/a
Charles and Robby,

I meant "thinking, writing and performance".

Because I had said it, I felt I should do it myself.

This is the first time I have seen a regex run faster than a straightforward
loop.

Imports System.Collections
Imports System.Text.RegularExpressions
Public Class Hello
    Public Shared Sub main()
        ' Time the straightforward character loop
        Dim start As Integer = Environment.TickCount
        For y As Integer = 0 To 10000
            Dim mystring As String = _
                " MyString 40 ""Hello world"" all"
            Dim myarrlist As New ArrayList
            Dim endWord As Integer
            For i As Integer = 0 To mystring.Length - 1
                If mystring.Substring(i, 1) <> " " Then
                    If mystring.Substring(i, 1) = """" Then
                        ' Quoted token: take everything up to the closing quote
                        endWord = mystring.Substring(i + 1).IndexOf("""") + 1
                        myarrlist.Add(mystring.Substring(i + 1, endWord - 1))
                        i = i + endWord
                    Else
                        ' Plain token: take everything up to the next space
                        endWord = mystring.Substring(i).IndexOf(" ")
                        If endWord = -1 Then endWord = mystring.Length - i
                        myarrlist.Add(mystring.Substring(i, endWord))
                        i = i + endWord
                    End If
                End If
            Next
        Next
        Console.WriteLine((Environment.TickCount - start).ToString)
        ' Time the Regex
        start = Environment.TickCount
        For y As Integer = 0 To 10000
            Dim testString As String = "MyString 40 ""Hello world"" all "
            Dim rePost As New Regex("(\s*""([\s\w]*)"")|(\s*(\w+))")
            ' Note: Matches is documented as lazily evaluated, so reading
            ' allMatches.Count here would make sure the matching itself is timed
            Dim allMatches As MatchCollection = rePost.Matches(testString)
        Next
        Console.WriteLine((Environment.TickCount - start).ToString)
    End Sub
End Class
///
For me the speed ratio is about 3:2 in favour of the Regex. I did not really
try to make the loop faster, because, as I said, it had to be done in a
short time.

So Charles, next time you have to post "Robby, can you help me again?"
Because what Robby did with the Regex is of course a wonderful thing.

:-)

Cor



Nov 21 '05 #14

P: n/a
Robby

I have just come across a valid (in my context) string that is split into
too many matches. The string is

"PartA PartB PartC(plus) PartD"

The regex breaks it into

PartA
PartB
PartC
plus
PartD

Can you see a refinement for the regular expression to keep PartC(plus) as
one element?
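
The split appears to come from \w, which matches letters, digits and the
underscore but not parentheses, so \w+ stops at "(" and a new match starts
after it. Below is a minimal sketch for reproducing this. It reads groups 2
and 4 directly from each Match rather than doing a Replace. TokensFrom and
the module name are illustrative only, and List(Of String) assumes .NET 2.0
or later.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module MatchGroupsSketch
    ' Runs a pattern shaped like Robby's (four groups) over the input and
    ' collects one token per match.
    Public Function TokensFrom(ByVal pattern As String, ByVal input As String) As List(Of String)
        Dim tokens As New List(Of String)
        For Each m As Match In Regex.Matches(input, pattern)
            If m.Groups(2).Success Then
                ' Group 2 took part in the match: a quoted token, quotes stripped
                tokens.Add(m.Groups(2).Value)
            Else
                ' Otherwise group 4 holds an unquoted word
                tokens.Add(m.Groups(4).Value)
            End If
        Next
        Return tokens
    End Function
End Module
```

Calling TokensFrom("(\s*""([\s\w]*)"")|(\s*(\w+))", "PartA PartB PartC(plus) PartD")
should return the same five pieces listed above, which suggests the character
class is what needs widening.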

Charles



Nov 21 '05 #15

P: n/a
Also, despite what I said earlier to Cor, # could appear in my string, and
it also causes a split when I don't want it to. In fact, the only characters
that should cause a split (outside matched double quotes) are

space, tab, CR, LF, FF, or other control characters

Any ideas?
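
One possible direction, offered only as a sketch, is to define an unquoted
token by what it is not: a run of anything other than whitespace, control
characters or a double quote. The pattern, TokenPattern and Tokenize below
are assumptions for illustration, not something posted in this thread, and
List(Of String) assumes .NET 2.0 or later.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module WhitespaceTokenizerSketch
    ' Group 1: the text between a pair of double quotes.
    ' Group 2: a run of characters that are not whitespace (\s), not control
    ' characters (\p{Cc}) and not a double quote.
    Public Const TokenPattern As String = """([^""]*)""|([^\s\p{Cc}""]+)"

    Public Function Tokenize(ByVal input As String) As List(Of String)
        Dim tokens As New List(Of String)
        For Each m As Match In Regex.Matches(input, TokenPattern)
            If m.Groups(1).Success Then
                tokens.Add(m.Groups(1).Value)   ' quoted token, quotes stripped
            Else
                tokens.Add(m.Groups(2).Value)   ' unquoted token
            End If
        Next
        Return tokens
    End Function
End Module
```

With this, "PartA PartB PartC(plus) PartD" should come back as four tokens,
and # no longer splits. A double quote with no partner is silently dropped,
so unbalanced input would need separate handling.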

Charles



Nov 21 '05 #16

P: n/a
I think I have it. I have changed the pattern (shown as it appears inside
the VB string, with the double quotes doubled) to

(\s*""([\s\w\x23\x28\x29]*)"")|(\s*([\w\x23\x28\x29]+))

which seems to do the trick, unless anyone can spot a flaw in this.
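
One way to look for flaws is to check which characters still break an
unquoted token in two. The sketch below does that for a single character at
a time. SplitsToken and the module name are hypothetical helpers, and the
"Part" / "X" test string is only an example.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module PatternCheckSketch
    ' The pattern above, written as a VB string literal
    Public Const CharlesPattern As String = _
        "(\s*""([\s\w\x23\x28\x29]*)"")|(\s*([\w\x23\x28\x29]+))"

    ' True when the character c, placed in the middle of an unquoted token,
    ' makes the pattern produce more than one match for that token.
    Public Function SplitsToken(ByVal c As Char) As Boolean
        Dim input As String = "Part" & c & "X"
        Return Regex.Matches(input, CharlesPattern).Count > 1
    End Function
End Module
```

SplitsToken("#"c) and SplitsToken("("c) should both be False, but
SplitsToken("-"c) and SplitsToken("."c) should be True, so any punctuation
not listed in the character class would still cause a split.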

Charles



Nov 21 '05 #17

This discussion thread is closed
