regex - better way?

rjb
Hi!

Could somebody have a look and help me optimize the code below?
It may look like a very bad way of coding, but this stuff is very, very new
for me.

I've included just a few lines.

Regex regxUserName = new Regex(@"(?<=User-Name = )\""([^\""]+)\""",
    RegexOptions.None);
Regex regxSessionId = new Regex(@"(?<=Acct-Multi-Session-Id = )\""([^\""]+)\""",
    RegexOptions.None);
Regex regxInputGigawords = new Regex(@"(?<=Acct-Input-Gigawords = )\w*",
    RegexOptions.None);
..
..
..

Match mt = regxUserName.Match(sb.ToString());
strUserName = mt.Groups[1].ToString();
Match mt2 = regxSessionId.Match(sb.ToString());
strSessionId = mt2.Groups[1].ToString();
Match mt3 = regxInputGigawords.Match(sb.ToString());
strInputGigawords = mt3.Groups[0].ToString();
..
..
..

I'm using this to extract data from the following file.

Mon Sep 27 22:17:15 2004
Acct-Status-Type = Interim-Update
User-Name = "0007933B22B9"
NAS-IP-Address = 192.168.10.40
Service-Type = DATA
Acct-Multi-Session-Id = "147738"
Acct-Session-Id = "3"
Acct-Delay-Time = 0
Event-Timestamp = 1096323434
Acct-Session-Time = 153766
Acct-Input-Gigawords = 0
Acct-Output-Gigawords = 0
Acct-Input-Octets = 17970689
Acct-Output-Octets = 8331353
Acct-Terminate-Cause = 0
Framed-IP-Address = 0.0.0.0
Acct-Input-Packets = 0
Acct-Output-Packets = 0
NAS-Port-Type = Async
NAS-Port-Id = 0
thank you
rjb
Nov 16 '05 #1

"rjb" <RJB@no_spam_VP.PL> wrote in message news:cl**********@news.onet.pl...
> Hi!
>
> Could somebody have a look and help me to optimize the code below.
> It may look like very bad way of coding, but this stuff is very, very new
> for me.


Are you having any performance issues with this? If you aren't, then the
easiest and most maintainable solution is fine (regex is easier to grasp,
once you know regex, than string manipulation, and *way* easier to maintain
than a generated parser).
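
If you do stay with regex, one pattern can pick up every "key = value" line instead of one Regex object per field. This is only a sketch, assuming the record text is already in a string; the pattern is not rjb's, and the names RadiusRegex and ParseFields are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RadiusRegex
{
    // One pattern for every "Key = value" line. Optional double quotes
    // around the value are left out of the captured group.
    // RegexOptions.Multiline makes ^ and $ match at line boundaries.
    private static readonly Regex Field = new Regex(
        @"^[ \t]*(?<key>[\w-]+)[ \t]*=[ \t]*""?(?<value>[^""\r\n]*)""?[ \t]*\r?$",
        RegexOptions.Multiline);

    // Returns the fields of one record; lines without '=' (such as the
    // date line) simply do not match and are ignored.
    public static Dictionary<string, string> ParseFields(string record)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in Field.Matches(record))
        {
            fields[m.Groups["key"].Value] = m.Groups["value"].Value.Trim();
        }
        return fields;
    }
}
```

With this, RadiusRegex.ParseFields(sb.ToString())["User-Name"] would replace the separate regxUserName match, and adding a field needs no new pattern.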

Nov 16 '05 #2
I would recommend not using regular expressions,
but rather loading all keys and values from the file into a hashtable
and working with that. It is a very smooth way. Here is a code sample
showing how you would do it:
<code>
// Requires: using System; using System.Collections; using System.IO;
// Load the file (filename).
Hashtable infoTable = new Hashtable();
using (StreamReader sr = new StreamReader(filename))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        // Split on the first '=' only; lines without one (date, blank) are skipped.
        string[] kvPair = line.Split(new char[] { '=' }, 2);
        if (kvPair.Length == 2)
        {
            // The indexer overwrites a repeated key, where Add would throw.
            infoTable[kvPair[0].Trim()] = kvPair[1].Trim();
        }
    }
}
// Print all keys and values.
foreach (DictionaryEntry de in infoTable)
{
    Console.WriteLine("{0} = {1}", de.Key, de.Value);
}
// Print the user name.
Console.WriteLine("The user is {0}.", infoTable["User-Name"]);

</code>

The length check makes sure kvPair really has two elements before it goes
into the table. You might also want to handle any exceptions thrown when you try
to open the file.
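
The two precautions mentioned above could look roughly like this. It is a sketch under my own assumptions: SafeLoader, TryLoad and Parse are invented names, and returning null on failure is just one possible policy.

```csharp
using System;
using System.Collections;
using System.IO;

public static class SafeLoader
{
    // Returns the parsed table, or null if the file could not be read.
    public static Hashtable TryLoad(string filename)
    {
        try
        {
            using (StreamReader sr = new StreamReader(filename))
            {
                return Parse(sr);
            }
        }
        catch (IOException ex) // covers FileNotFoundException and DirectoryNotFoundException
        {
            Console.Error.WriteLine("Could not read {0}: {1}", filename, ex.Message);
            return null;
        }
        catch (UnauthorizedAccessException ex)
        {
            Console.Error.WriteLine("No access to {0}: {1}", filename, ex.Message);
            return null;
        }
    }

    // Reads "key = value" lines; anything without '=' is skipped.
    public static Hashtable Parse(TextReader reader)
    {
        Hashtable table = new Hashtable();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            string[] kvPair = line.Split(new char[] { '=' }, 2);
            if (kvPair.Length != 2)
            {
                continue; // date line, blank line, or other text without '='
            }
            table[kvPair[0].Trim()] = kvPair[1].Trim();
        }
        return table;
    }
}
```

Keeping Parse on a TextReader means the same code can be tried against a StringReader without touching the disk.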

--
Regards,
Dennis JD Myrén
Oslo Kodebureau

Nov 16 '05 #3
rjb
Thank you for your response.

Daniel - I don't have any issue with performance. I just thought that this
looks "bad".
My experience with regular expressions = a couple of hours. Before then I didn't
know such a thing existed :) I'm not a programmer...

Dennis - thank you for your code. I'm very keen on learning new techniques.

To give you the whole picture. My file looks like:

Mon Sep 27 22:17:15 2004
Acct-Status-Type = Interim-Update
User-Name = "0007933B22B9"
NAS-IP-Address = 192.168.10.40
Service-Type = DATA
Acct-Multi-Session-Id = "147738"
Acct-Session-Id = "3"
Acct-Delay-Time = 0
Event-Timestamp = 1096323434
Acct-Session-Time = 153766
Acct-Input-Gigawords = 0
Acct-Output-Gigawords = 0
Acct-Input-Octets = 17970689
Acct-Output-Octets = 8331353
Acct-Terminate-Cause = 0
Framed-IP-Address = 0.0.0.0
Acct-Input-Packets = 0
Acct-Output-Packets = 0
NAS-Port-Type = Async
NAS-Port-Id = 0

Mon Sep 27 22:17:35 2004
Acct-Status-Type = Interim-Update
User-Name = "00079326AAC8"
NAS-IP-Address = 192.168.10.40
Service-Type = DATA
Acct-Multi-Session-Id = "147817"
Acct-Session-Id = "3"
Acct-Delay-Time = 0
Event-Timestamp = 1096323454
Acct-Session-Time = 900
Acct-Input-Gigawords = 0
Acct-Output-Gigawords = 0
Acct-Input-Octets = 130612
Acct-Output-Octets = 2058421
Acct-Terminate-Cause = 0
Framed-IP-Address = 0.0.0.0
Acct-Input-Packets = 0
Acct-Output-Packets = 0
NAS-Port-Type = Async
NAS-Port-Id = 0

Mon Sep 27 22:32:34 2004
Acct-Status-Type = Interim-Update
User-Name = "00079330410A"
NAS-IP-Address = 192.168.10.40
Service-Type = DATA
Acct-Multi-Session-Id = "147813"
Acct-Session-Id = "3"
Acct-Delay-Time = 0
Event-Timestamp = 1096324353
Acct-Session-Time = 19429
Acct-Input-Gigawords = 0
Acct-Output-Gigawords = 0
Acct-Input-Octets = 4137490
Acct-Output-Octets = 11070040
Acct-Terminate-Cause = 0
Framed-IP-Address = 0.0.0.0
Acct-Input-Packets = 0
Acct-Output-Packets = 0
NAS-Port-Type = Async
NAS-Port-Id = 0

.....and so on. A lot of groups.

Basically what I want is to have the output below from each group:

0007933B22B9 27 Sep 2004 147738 0 0 17970689 8331353
00079326AAC8 27 Sep 2004 147817 0 0 130612 2058421
00079330410A 27 Sep 2004 147813 0 0 4137490 11070040
etc...

All this will go to a database.
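
For what it's worth, here is a sketch of how one group could be turned into a line of the shape shown above. It assumes the group has already been parsed into its date line and a dictionary of fields; RecordFormatter and ToOutputLine are made-up names, and the field order is read off the sample output.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class RecordFormatter
{
    // header is the date line of a group, e.g. "Mon Sep 27 22:17:15 2004".
    // fields holds the "key = value" pairs of the same group.
    public static string ToOutputLine(string header, IDictionary<string, string> fields)
    {
        DateTime stamp = DateTime.ParseExact(
            header.Trim(), "ddd MMM d HH:mm:ss yyyy",
            CultureInfo.InvariantCulture, DateTimeStyles.AllowInnerWhite);

        return string.Join(" ", new string[]
        {
            fields["User-Name"].Trim('"'),              // drop the surrounding quotes
            stamp.ToString("dd MMM yyyy", CultureInfo.InvariantCulture),
            fields["Acct-Multi-Session-Id"].Trim('"'),
            fields["Acct-Input-Gigawords"],
            fields["Acct-Output-Gigawords"],
            fields["Acct-Input-Octets"],
            fields["Acct-Output-Octets"]
        });
    }
}
```

A missing key throws KeyNotFoundException here, which is probably what you want before anything goes to a database, but that is a design choice rather than a requirement.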

Thank you for your time.
rjb

Nov 16 '05 #4
Well, then there is some more work.
But it is still very doable using only StreamReader and string.Split.
If you know the file will never be huge, you could just
call ReadToEnd on the StreamReader, split that string
on newlines ('\n'), and then work with the resulting array. That
is easier because you are not bound to forward-only processing of the
data.
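
A rough sketch of that idea, assuming the whole file fits in memory and that groups are separated by blank lines as in the sample. GroupSplitter and SplitIntoGroups are illustrative names only.

```csharp
using System;
using System.Collections.Generic;

public static class GroupSplitter
{
    // Splits the complete file text into groups. Each group is the list
    // of its non-blank lines, with the date line first.
    public static List<List<string>> SplitIntoGroups(string text)
    {
        var groups = new List<List<string>>();
        List<string> current = null;
        foreach (string raw in text.Split('\n'))
        {
            string line = raw.Trim(); // also removes the '\r' of a CRLF line ending
            if (line.Length == 0)
            {
                current = null;       // a blank line closes the current group
                continue;
            }
            if (current == null)
            {
                current = new List<string>();
                groups.Add(current);
            }
            current.Add(line);
        }
        return groups;
    }
}
```

You would call it as GroupSplitter.SplitIntoGroups(sr.ReadToEnd()) and then index into the groups in any order.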

I would suggest you define a class that represents each group to get a little
structure, like:

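// Requires: using System; using System.Collections;
public sealed class Group
{
    // DateTime is a struct, so it cannot be initialized to null.
    private readonly DateTime _timeStamp;
    private readonly Hashtable _table;

    public Group(DateTime timeStamp, Hashtable table)
    {
        _timeStamp = timeStamp;
        _table = table;
    }

    // The date line that starts the group.
    public DateTime TimeStamp
    {
        get { return _timeStamp; }
    }

    // The key/value pairs of the group.
    public Hashtable DataTable
    {
        get { return _table; }
    }
}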
--
Regards,
Dennis JD Myrén
Oslo Kodebureau

Nov 16 '05 #5
Hello RJB,

I agree with Dennis' conclusions. I find simple parsing FAR easier to use,
understand, and debug than regular expressions.
This is especially true since your data repeats in the data file.

I don't agree with Dennis that you need to read the entire document into
memory, though. I've seen data documents like this, and they can be quite
large. Simply detecting the blank line and the date is sufficient to
separate groups and do a little processing.

Below, I've taken Dennis' code and added some logic... (warning: uncompiled
code)

// Requires: using System.Collections; using System.Data.SqlClient; using System.IO;
// initialize your database object
SqlConnection myConnect = new SqlConnection("Your Connection String");
myConnect.Open();

// Load the file (filename).
StreamReader sr = new StreamReader(filename);
Hashtable infoTable = new Hashtable();
string[] kvPair = null;
string line = null;
while (null != (line = sr.ReadLine()))
{
    if (line.Trim().Length == 0) // you've hit a blank line... group ends
    {
        if (infoTable.Count > 0) // ignore repeated blank lines
        {
            // Parameters avoid quoting problems and SQL injection.
            SqlCommand myCommand = new SqlCommand(
                "INSERT INTO MyTable (date, username, inputoctets, outputoctets) " +
                "VALUES (@date, @username, @inputoctets, @outputoctets)", myConnect);
            myCommand.Parameters.AddWithValue("@date", infoTable["Date"]);
            myCommand.Parameters.AddWithValue("@username", infoTable["User-Name"]);
            myCommand.Parameters.AddWithValue("@inputoctets", infoTable["Acct-Input-Octets"]);
            myCommand.Parameters.AddWithValue("@outputoctets", infoTable["Acct-Output-Octets"]);
            myCommand.ExecuteNonQuery();
            infoTable.Clear();
        }
    }
    else
    {
        kvPair = line.Split('=');
        if (kvPair.Length == 1) // this is the date!
        {
            infoTable.Add("Date", line.Trim());
        }
        else
        {
            infoTable.Add(kvPair[0].Trim(), kvPair[1].Trim());
        }
    }
}
sr.Close();
myConnect.Close();

Assumptions: there's a blank line at the end of the file.
There is no blank line at the beginning of the file.

This code was not compiled... please forgive any syntax errors. I'm typing
from memory.
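
If the trailing blank line cannot be guaranteed, the same forward-only loop can hand out the last group when the reader runs dry. A minimal sketch with the database part left out; RecordReader, ReadRecords and the "Date" key are my own choices, not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class RecordReader
{
    // Yields one dictionary per group without loading the whole file.
    // The date line is stored under the made-up key "Date".
    public static IEnumerable<Dictionary<string, string>> ReadRecords(TextReader reader)
    {
        var record = new Dictionary<string, string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            line = line.Trim();
            if (line.Length == 0)
            {
                if (record.Count > 0) // blank line ends a group; extra blanks are ignored
                {
                    yield return record;
                    record = new Dictionary<string, string>();
                }
                continue;
            }
            int eq = line.IndexOf('=');
            if (eq < 0)
            {
                record["Date"] = line; // no '=' means this is the date line
            }
            else
            {
                record[line.Substring(0, eq).Trim()] = line.Substring(eq + 1).Trim();
            }
        }
        if (record.Count > 0)
        {
            yield return record; // last group when the file has no trailing blank line
        }
    }
}
```

The caller can then loop with foreach over RecordReader.ReadRecords(sr) and do the insert per record, so file parsing and database work stay separate.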

--- Nick

Nov 16 '05 #6

"Nick Malik" <ni*******@hotmail.nospam.com> wrote in message
news:Hqsgd.21231$HA.7002@attbi_s01...
> kvPair = line.Split('=');
> if (kvPair.Length == 1) // this is the date!
> {
>     infoTable.Add("Date", line.Trim());
> }
> else
> {
>     infoTable.Add(kvPair[0].Trim(), kvPair[1].Trim());
> }

As a note: if you want to remove the quotes, you'll have to do it here, where
the value is added to the table. As written, this algorithm will put the
quoted strings into your DB.
kvPair[1].Trim().Trim('"'); would be sufficient, if mildly messy.
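
To keep the messiness in one place, the double Trim could sit in a tiny helper. Just a sketch; ValueCleaner and Unquote are invented names.

```csharp
using System;

public static class ValueCleaner
{
    // Trims whitespace first, then any double quotes at either end.
    // Note that Trim('"') removes every leading and trailing quote,
    // so an empty quoted value ("") comes back as an empty string.
    public static string Unquote(string raw)
    {
        return raw.Trim().Trim('"');
    }
}
```

The table line would then read infoTable.Add(kvPair[0].Trim(), ValueCleaner.Unquote(kvPair[1]));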
Nov 16 '05 #7

"Daniel O'Connell [C# MVP]" <onyxkirx@--NOSPAM--comcast.net> wrote in
message news:u8**************@TK2MSFTNGP10.phx.gbl...
> <<clipped code block>>
> As a note. If you want to remove quotes you'll have to process that here.
> This particular algorithm will result in quoted strings being added to your DB.
> kvPair[1].Trim().Trim('"'); would be sufficient, if mildly messy


Good point. I missed that detail.

now, if I just had Edit and Continue...
(just kidding :-)

--- Nick
Nov 16 '05 #8

LOL
Nov 16 '05 #9
