How to Parse a string with Embedded Double Quotes

I have a string similar to the following:

" MyString 40 "Hello world" all "

It contains white space that may be spaces or tabs, or a combination, and I
want to produce an array with the following elements

arr(0) = "MyString"
arr(1) = 40
arr(2) = "Hello world"
arr(3) = "all"

Using trim and a regular expression ("\s+"), I can reduce my string to

"MyString 40 "Hello world" all"

and with Split I can get

arr(0) = "MyString"
arr(1) = 40
arr(2) = ""Hello"
arr(3) = "world""
arr(4) = "all"

As you can see, it is not quite what I need. The spaces in "Hello world"
have been reduced to a single space, and Split does not respect the double
quotes, and splits "Hello world" over two elements.

Does anyone have an idea how I could do this? I could process the string
character by character, but I am hoping that there is a straight-forward
technique for doing it, without looping, and using some of the techniques I
already have.

TIA

Charles
Nov 21 '05 #1
Charles,

Looking at the problem, I wondered whether I should give my answer, because
it is so difficult to describe. Then I saw that it was you, so it should not
be a problem.

In this kind of situation I replace the spaces I do not want to split on with
a character that is guaranteed not to occur in the string.

Then I do the split.

And then I replace that character with a space again.

I assume this is more than enough explanation for you.

And now that you have read this you will say, I knew that as well.

:-)))
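
The three steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the placeholder is "#" (a hypothetical choice that must never occur in the input), the names PlaceholderSplit, ProtectQuoted and SplitKeepingQuotes are made up for the example, and the quoted text is assumed not to contain embedded quotes. A small regular expression finds the quoted runs, so no hand-written character loop is needed.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module PlaceholderSplit

    ' Hypothetical placeholder; it must never occur in the input.
    Private Const Placeholder As Char = "#"c

    ' Called once per quoted run: swaps each whitespace character inside
    ' the quotes for the placeholder and drops the quotes themselves.
    Private Function ProtectQuoted(ByVal m As Match) As String
        Return Regex.Replace(m.Groups(1).Value, "\s", Placeholder.ToString())
    End Function

    Public Function SplitKeepingQuotes(ByVal source As String) As String()
        ' 1. Protect the whitespace inside each "..." run.
        Dim protectedText As String = _
            Regex.Replace(source, """([^""]*)""", AddressOf ProtectQuoted)

        ' 2. Trim and split on runs of whitespace, as in the original question.
        Dim parts As String() = Regex.Split(protectedText.Trim(), "\s+")

        ' 3. Put the spaces back (a tab inside quotes comes back as a space).
        For i As Integer = 0 To parts.Length - 1
            parts(i) = parts(i).Replace(Placeholder, " "c)
        Next
        Return parts
    End Function

    Sub Main()
        For Each part As String In SplitKeepingQuotes("  MyString   40 ""Hello  world""  all ")
            Console.WriteLine("'{0}'", part)
        Next
    End Sub

End Module
```

The whitespace inside the quotes is replaced one character for one character, so the number of spaces in the quoted text survives the round trip.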

Cor

Nov 21 '05 #2
Hi Cor

You read my mind ;-)

I had thought of using something like #, as it will never occur in my
string. But then I started to look at how I would know which spaces to
replace with #, and which to leave. Of course, to the human eye it is
obvious that I only replace the spaces between " and ", but now I am back to
processing each part of the string character by character so that I match
double quotes correctly, and this is what I was trying to avoid.

Perhaps there is a regular expression that will match double quotes, or a
method that parses a string taking these into account, but sadly I do not
know it yet.

But please, keep the suggestions flowing.

Charles

Nov 21 '05 #4

Try

(\s*"([\s\w]*)")|(\s*(\w+))

Then expand the following replacement pattern on each Match object (the
Match.Result method does this):

$2$4

This will return either your double-quoted string without the quotes, or the
word token without the leading whitespace, depending on which alternative the
Match object matched.

You just have to love Regular Expressions.
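
For readers who find the one-line pattern hard to follow, here is the same expression spread over several lines with comments. This is only a sketch: it assumes a recent VB.NET compiler (for the If operator and implicit line continuation), and the names AnnotatedPattern and TokenPattern are made up for the example. RegexOptions.IgnorePatternWhitespace makes the regex engine ignore unescaped whitespace in the pattern and treat # as the start of a comment.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module AnnotatedPattern

    ' Each array element is one line of the pattern; a doubled quote is
    ' how a single " is written inside a VB string literal.
    Public ReadOnly TokenPattern As String = String.Join(vbLf, New String() {
        "( \s* ""          # group 1: optional leading whitespace, opening quote",
        "  ( [\s\w]* )     # group 2: quoted text, whitespace and word characters only",
        "  ""              # closing quote",
        ")",
        "|",
        "( \s*             # group 3: optional leading whitespace",
        "  ( \w+ )         # group 4: an unquoted word",
        ")"})

    Sub Main()
        Dim re As New Regex(TokenPattern, RegexOptions.IgnorePatternWhitespace)
        For Each m As Match In re.Matches("MyString 40 ""Hello world"" all")
            ' Group 2 takes part only in the quoted alternative, group 4 only in the other.
            Dim token As String = If(m.Groups(2).Success, m.Groups(2).Value, m.Groups(4).Value)
            Console.WriteLine(token)
        Next
    End Sub

End Module
```

One limitation worth knowing: because group 2 is [\s\w]*, a quoted string containing punctuation (a comma or a hyphen, say) is not matched by the first alternative.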

--Robby

Nov 21 '05 #6
Hi Robby

Thanks for the reply. I am not sure that I understand the regular expression

(\s*"([\s\w]*)")|(\s*(\w+))

I tried the following, but of course it gives a syntax error because of the
embedded double quotes:

Dim reg As Regex = New Regex("(\s*"([\s\w]*)")|(\s*(\w+))")

So I tried escaping the double quotes, like this

Dim reg As Regex = New Regex("(\s*""([\s\w]*)"")|(\s*(\w+))")

but this cleared my string out to a couple of spaces when I did a replace.
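
The exact replace call is not shown in this post, so the sketch below does not try to reproduce the "couple of spaces" result. It only illustrates two things that are safe to say: doubling the quotes is the correct VB escape, so the second pattern is the intended one, and a whole-string Regex.Replace returns one rewritten string rather than a list of tokens. The module and constant names are made up for the example.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QuoteEscaping

    ' "" inside a VB string literal stands for one " character.
    Public Const Pattern As String = "(\s*""([\s\w]*)"")|(\s*(\w+))"

    Sub Main()
        ' Prints the pattern as the regex engine sees it, with single quotes.
        Console.WriteLine(Pattern)

        Dim source As String = "MyString 40 ""Hello world"" all"

        ' Every match is replaced in place, so the tokens run together
        ' in a single string instead of landing in separate elements.
        Console.WriteLine(Regex.Replace(source, Pattern, "$2$4"))
    End Sub

End Module
```

To get the tokens as separate items, the matches have to be enumerated one by one with Regex.Matches rather than rewritten in a single Replace call.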

Any chance of a small snippet to get me on the right track, using the Match
object?

Thanks very much.

Charles

Nov 21 '05 #7

Create a console application

#########################

Imports System
Imports System.Text.RegularExpressions

Module MainModule

    Sub Main()

        ' Alternative 1 captures quoted text in group 2; alternative 2 captures
        ' a bare word in group 4. Quotes are doubled inside a VB string literal.
        Dim rePost As New Regex("(\s*""([\s\w]*)"")|(\s*(\w+))")
        Dim testString As String = "MyString 40 ""Hello world"" all "
        Dim allMatches As MatchCollection = rePost.Matches(testString)

        Dim matchPiece As Match
        Dim I As Integer

        For I = 0 To allMatches.Count - 1
            matchPiece = allMatches(I)
            ' At most one of $2 and $4 is non-empty for any given match.
            Console.WriteLine("Piece {0} -> '{1}'", I, matchPiece.Result("$2$4"))
        Next I

    End Sub

End Module

####################
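
The console program above prints the pieces. Since the original question asked for an array, the same idea can be wrapped in a function, roughly as below. This is a sketch rather than anything posted in the thread: Tokenize and QuoteAwareTokenizer are hypothetical names, and it reads the capture groups directly instead of calling Result.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QuoteAwareTokenizer

    Private ReadOnly TokenRegex As New Regex("(\s*""([\s\w]*)"")|(\s*(\w+))")

    ' Tokenize is a hypothetical helper name, not something from the thread.
    Public Function Tokenize(ByVal source As String) As String()
        Dim found As MatchCollection = TokenRegex.Matches(source)
        Dim tokens(found.Count - 1) As String
        For i As Integer = 0 To found.Count - 1
            ' Same effect as Result("$2$4"): group 2 holds quoted text,
            ' group 4 a bare word, and the one that did not match is empty.
            tokens(i) = found(i).Groups(2).Value & found(i).Groups(4).Value
        Next
        Return tokens
    End Function

    Sub Main()
        Dim arr As String() = Tokenize(" MyString 40 ""Hello   world"" all ")
        For i As Integer = 0 To arr.Length - 1
            Console.WriteLine("arr({0}) = '{1}'", i, arr(i))
        Next
    End Sub

End Module
```

Every element comes back as a String, including "40". If a numeric value is wanted for that element, Integer.TryParse can be applied to it afterwards.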

--Robby

Nov 21 '05 #8
Hi Robby - me again

I have it now; I just needed to apply a few of those grey cells I have
knocking about.

Cheers.

Charles

Nov 21 '05 #9
Charles,

Can you give us an idea, at the end, of how much time it took to find the regex
and how much time the straightforward technique would have taken, and also a
test of which method takes less time to run?

To get a good idea for the discussions about using Regex versus the
straightforward approach, I looked at it, and I think the straightforward
approach would take me probably less than 30 minutes, so for you probably
less than 15.

:-)

Cor

Nov 21 '05 #10
Thanks again Robby. I came up with something similar in the end, using
For ... Each to go through the match collection, and reg.Replace instead of
match.Result, but it came down to the same thing.

Charles

Nov 21 '05 #11
Cor

Do you mean how long does it take to parse the string using RegEx against
parsing it character by character, or how long did it take to come up with
the solution?

I think the RegEx solution is by far the neatest, and most flexible. It is
also fewer lines of code. I do not have a solution parsing
character-by-character, so I cannot measure how long to create or run, but I
think you have it about right.
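
For comparison, the character-by-character alternative discussed here could look something like the sketch below. Nobody in the thread posted such a version, so this is an assumption about what it might be: the names are made up, quotes are treated as simple toggles with no escape sequence, and an unterminated quote simply runs to the end of the string.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text
Imports Microsoft.VisualBasic

Module CharByCharTokenizer

    Public Function Tokenize(ByVal source As String) As String()
        Dim tokens As New List(Of String)()
        Dim current As New StringBuilder()
        Dim inQuotes As Boolean = False
        ' Tracks whether a token has started, so that "" yields an empty token.
        Dim haveToken As Boolean = False

        For Each ch As Char In source
            If ch = """"c Then
                ' A quote toggles quoted mode and is not copied to the token.
                inQuotes = Not inQuotes
                haveToken = True
            ElseIf Char.IsWhiteSpace(ch) AndAlso Not inQuotes Then
                ' Whitespace outside quotes ends the current token, if any.
                If haveToken Then
                    tokens.Add(current.ToString())
                    current.Length = 0
                    haveToken = False
                End If
            Else
                current.Append(ch)
                haveToken = True
            End If
        Next

        ' Flush the last token when the input does not end in whitespace.
        If haveToken Then tokens.Add(current.ToString())
        Return tokens.ToArray()
    End Function

    Sub Main()
        For Each token As String In Tokenize(" MyString" & vbTab & "40 ""Hello   world"" all ")
            Console.WriteLine("'{0}'", token)
        Next
    End Sub

End Module
```

Unlike the [\s\w]* pattern, this loop accepts any character inside the quotes, at the cost of more lines of code.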

Charles
"Cor Ligthert" <no************@planet.nl> wrote in message
news:um**************@TK2MSFTNGP10.phx.gbl...
Charles,

Can you give us an idea at the end how much time it took to find the regex
and how much time the straight forward technique and than as well a test
what will be the less time consuming method.

To get a good idea about the discussions using Regex or Straight forward,
I looked at it, and I think Straight forward should take me probably less
than 30 minutes, so for you probably less than 15.

:-)

Cor

Nov 21 '05 #12

Hmm... It took me two tries to get this regular expression. I don't
have the opportunity to use Regex a lot, but I quite like them and dive in if
I have a spare moment. I'd say 4 to 6 minutes to solve it. In the old VB6
days I would have done a Find to get my quote indexes, split on the double
quote, and resplit the parts outside the double quotes. Then trimmed them.
That would take more time to code and check for errors.
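
For comparison, the split-and-resplit approach described above might look roughly like this in VB.NET. It is only a sketch: SplitQuoted and the module name are made up for illustration, and it assumes the quotes are balanced and never escaped inside a token.

```vb
Imports System
Imports System.Collections
Imports System.Text.RegularExpressions

Public Module QuoteSplitSketch
    ' Hypothetical helper: split on double quotes first, then on whitespace.
    Public Function SplitQuoted(ByVal input As String) As ArrayList
        Dim tokens As New ArrayList
        Dim pieces() As String = input.Split(""""c)
        For i As Integer = 0 To pieces.Length - 1
            If i Mod 2 = 1 Then
                ' Odd pieces sat between a pair of quotes: keep them whole.
                tokens.Add(pieces(i))
            Else
                ' Even pieces are outside quotes: resplit on whitespace runs.
                For Each word As String In Regex.Split(pieces(i).Trim(), "\s+")
                    If word.Length > 0 Then tokens.Add(word)
                Next
            End If
        Next
        Return tokens
    End Function

    Public Sub Main()
        For Each token As String In SplitQuoted(" MyString 40 ""Hello  world"" all ")
            Console.WriteLine("[" & token & "]")
        Next
    End Sub
End Module
```

With the sample string this should give four tokens, with the spacing inside the quoted part left as it was. An unmatched quote would make everything after it count as one quoted token.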

Robby

Nov 21 '05 #13
Charles and Robby,

I meant "thinking, writing and performance".

Because I had said it, I felt I should do it.

This is the first time I have seen a regex run faster than a straightforward
loop.

\\\
Imports System
Imports System.Collections
Imports System.Text.RegularExpressions

Public Class Hello
    Public Shared Sub Main()
        ' Straightforward loop
        Dim start As Integer = Environment.TickCount
        For y As Integer = 0 To 10000
            Dim mystring As String = " MyString 40 ""Hello world"" all"
            Dim myarrlist As New ArrayList
            Dim endWord As Integer
            For i As Integer = 0 To mystring.Length - 1
                If mystring.Substring(i, 1) <> " " Then
                    If mystring.Substring(i, 1) = """" Then
                        ' Quoted token: take everything up to the closing quote
                        endWord = mystring.Substring(i + 1).IndexOf("""") + 1
                        myarrlist.Add(mystring.Substring(i + 1, endWord - 1))
                        i = i + endWord
                    Else
                        ' Bare token: take everything up to the next space
                        endWord = mystring.Substring(i).IndexOf(" ")
                        If endWord = -1 Then endWord = mystring.Length - i
                        myarrlist.Add(mystring.Substring(i, endWord))
                        i = i + endWord
                    End If
                End If
            Next
        Next
        Console.WriteLine((Environment.TickCount - start).ToString)

        ' Regex
        start = Environment.TickCount
        For y As Integer = 0 To 10000
            Dim testString As String = "MyString 40 ""Hello world"" all "
            Dim rePost As New Regex("(\s*""([\s\w]*)"")|(\s*(\w+))")
            ' Note: Matches is evaluated lazily, and nothing here reads the results
            Dim allMatches As MatchCollection = rePost.Matches(testString)
        Next
        Console.WriteLine((Environment.TickCount - start).ToString)
    End Sub
End Class
///
The timings for me are about 3:2 (loop to regex). I did not really try to
make the loop faster, because it had to be done in a short time, as I said.
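
The timing loop above only builds the MatchCollection and never reads the tokens out of it. A minimal sketch of that last step, assuming the pattern from earlier in the thread and reading capture groups 2 and 4 (the same groups as the $2$4 replacement). Tokenize and the module name are hypothetical.

```vb
Imports System
Imports System.Collections
Imports System.Text.RegularExpressions

Public Module MatchGroupsSketch
    ' Pattern from the thread; the doubled quotes are VB string escaping.
    Private ReadOnly TokenPattern As New Regex("(\s*""([\s\w]*)"")|(\s*(\w+))")

    ' Hypothetical helper: returns the tokens as an ArrayList of strings.
    Public Function Tokenize(ByVal input As String) As ArrayList
        Dim tokens As New ArrayList
        For Each m As Match In TokenPattern.Matches(input)
            If m.Groups(2).Success Then
                ' Quoted alternative matched: group 2 is the text between the quotes.
                tokens.Add(m.Groups(2).Value)
            Else
                ' Bare-word alternative matched: group 4 is the word without
                ' its leading whitespace.
                tokens.Add(m.Groups(4).Value)
            End If
        Next
        Return tokens
    End Function

    Public Sub Main()
        For Each token As String In Tokenize(" MyString 40 ""Hello world"" all ")
            Console.WriteLine("[" & token & "]")
        Next
    End Sub
End Module
```

Checking Groups(2).Success rather than testing for an empty string means an empty quoted token ("") is still kept as its own element.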

So Charles, next time you should post "Robby, can you help me again?",
because what Robby did with Regex is of course a wonderful thing.

:-)

Cor

Nov 21 '05 #14
Robby

I have just come across a valid (in my context) string that is split into
too many matches. The string is

"PartA PartB PartC(plus) PartD"

The regex breaks it into

PartA
PartB
PartC
plus
PartD

Can you see a refinement for the regular expression to keep PartC(plus) as
one element?

Charles
"Robby" <ed****@not.my.email.com> wrote in message
news:eC**************@TK2MSFTNGP10.phx.gbl...

Try

(\s*"([\s\w]*)")|(\s*(\w+))

Then do a Replace on each Match object with

$2$4

This will return either your double-quoted string without the quotes or
the word token without the whitespace characters, depending on which match
the Match object holds.

You just have to love Regular Expressions.

--Robby

Nov 21 '05 #15
Also, despite what I said earlier to Cor, # could appear in my string, and
it also causes a split when I don't want it to. In fact, the only characters
that should cause a split (outside matched double quotes) are

space, tab, CR, LF, FF, or other control characters

Any ideas?
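
One possible direction, not taken from the thread: instead of listing the characters a token may contain, match any run of characters that are not whitespace, control characters, or double quotes. The sketch below assumes .NET's \p{Cc} category for control characters and that quoted sections never contain an embedded quote. Tokenize and the module name are hypothetical.

```vb
Imports System
Imports System.Collections
Imports System.Text.RegularExpressions

Public Module WhitespaceOnlySketch
    ' Group 1: text inside a pair of quotes.
    ' Group 2: a run of characters that are not whitespace, control
    ' characters, or double quotes.
    Private ReadOnly TokenPattern As New Regex("""([^""]*)""|([^\s\p{Cc}""]+)")

    ' Hypothetical helper: returns the tokens as an ArrayList of strings.
    Public Function Tokenize(ByVal input As String) As ArrayList
        Dim tokens As New ArrayList
        For Each m As Match In TokenPattern.Matches(input)
            If m.Groups(1).Success Then
                tokens.Add(m.Groups(1).Value)
            Else
                tokens.Add(m.Groups(2).Value)
            End If
        Next
        Return tokens
    End Function

    Public Sub Main()
        Dim sample As String = "PartA PartC(plus) #tag" & vbTab & """Hello  world"""
        For Each token As String In Tokenize(sample)
            Console.WriteLine("[" & token & "]")
        Next
    End Sub
End Module
```

Because the bare-token class is defined by exclusion, #, parentheses and other punctuation stay inside a token without being listed one by one.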

Charles

Nov 21 '05 #16
I think I have it. I have changed the pattern to

(\s*""([\s\w\x23\x28\x29]*)"")|(\s*([\w\x23\x28\x29]+))

which seems to do the trick, unless anyone can spot a flaw in this.
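
One way to look for flaws is to run the old and new patterns over a few awkward inputs and compare the token counts. This is a sketch only: the second sample string is invented, and Tokens is a hypothetical helper. The doubled quotes in the patterns are VB string escaping for a single quote character.

```vb
Imports System
Imports System.Collections
Imports System.Text.RegularExpressions

Public Module PatternProbeSketch
    Public Const OriginalPattern As String = "(\s*""([\s\w]*)"")|(\s*(\w+))"
    Public Const RevisedPattern As String = _
        "(\s*""([\s\w\x23\x28\x29]*)"")|(\s*([\w\x23\x28\x29]+))"

    ' Hypothetical helper: group 2 is a quoted token, group 4 a bare one.
    Public Function Tokens(ByVal pattern As String, ByVal input As String) As ArrayList
        Dim result As New ArrayList
        For Each m As Match In Regex.Matches(input, pattern)
            If m.Groups(2).Success Then
                result.Add(m.Groups(2).Value)
            Else
                result.Add(m.Groups(4).Value)
            End If
        Next
        Return result
    End Function

    Public Sub Main()
        Dim samples() As String = {"PartA PartB PartC(plus) PartD", "Part-E 3.5"}
        For Each sample As String In samples
            Console.WriteLine(sample)
            Console.WriteLine("  original: " & Tokens(OriginalPattern, sample).Count.ToString())
            Console.WriteLine("  revised:  " & Tokens(RevisedPattern, sample).Count.ToString())
        Next
    End Sub
End Module
```

Characters outside the listed set, such as - or ., still end a token, so "Part-E 3.5" comes out as four tokens rather than two under either pattern. Whether that matters depends on what the strings can contain.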

Charles

Nov 21 '05 #17
