Bytes IT Community

Parsing Amazon Access Log file

adarwish
P: 9
Please help me parse an Amazon S3 access log file using .NET C#.

This is an example of a row from the file:

e9393f8b003f121858490f62e9339f29ab5919261b8da33746cedb356e29f774 alzwad [18/Jan/2012:22:27:24 +0000] 41.155.169.152 - 6A70EBA012DB351A REST.GET.OBJECT Me/mefiles/videos/icons/240x320/18-1-12-9-19-56-egypt-movie3.jpg "GET /Me/mefiles/videos/icons/240x320/18-1-12-9-19-56-egypt-movie3.jpg?id=01229466985&ph=01229466985&width=240&high=320 HTTP/1.1" 200 - 1590 1590 34 33 "-" "Nokia2700c-2/2.0 (07.15) Profile/MIDP-2.1 Configuration/CLDC-1.1 UNTRUSTED/1.0" -
The parameters are separated with a (Space, Double Quotes and Square Brackets).

Can anyone tell me how to parse it?
Jan 20 '12 #1

✓ answered by danp129

9 Replies


Rabbit
Expert Mod 5K+
P: 6,652
There is no "Space, Double Quotes and Square Brackets" combination in the row you posted.
Jan 20 '12 #2

adarwish
P: 9
This is an example

"GET /Me/mefiles/videos/icons/240x320/18-1-12-9-19-56-egypt-movie3.jpg?id=01229466985&ph=01229466985&width=240&high=320 HTTP/1.1"
Jan 20 '12 #3

Rabbit
Expert Mod 5K+
P: 6,652
That has no space followed by double quote followed by square bracket. If that is truly what you want, then you need to redescribe what you want because that's not what you said in the first post.
Jan 20 '12 #4

adarwish
P: 9
Dear Rabbit,

Thanks for replying.

Actually, the problem is this: if I split on spaces, I find that some parameters contain spaces themselves but are surrounded by double quotes or square brackets.
Jan 20 '12 #5

Rabbit
Expert Mod 5K+
P: 6,652
What it sounds like is you don't want to split on spaces at all but on double quotes and brackets.
Jan 20 '12 #6

adarwish
P: 9
I want to split on spaces, but the problem is that some parameters contain spaces and are surrounded by double quotes or square brackets.
Jan 20 '12 #7

Rabbit
Expert Mod 5K+
P: 6,652
Then split on the quotes and brackets first. Then split on spaces on those that remain.
Jan 20 '12 #8
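
The two-stage idea above (pull out the quoted and bracketed spans first, then split whatever remains on spaces) could be sketched roughly as follows. This is only an illustration, not code from the thread: the class name LogTokenizer and the method Tokenize are hypothetical, and the sketch assumes that bare fields never contain a double quote or a square bracket.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original thread.
public static class LogTokenizer
{
    // Matches one [bracketed] or one "quoted" span. Because the pattern is
    // wrapped in a capturing group, Regex.Split keeps each matched span in
    // its output instead of discarding it.
    private static readonly Regex Delimited =
        new Regex("(\\[[^\\]]*\\]|\"[^\"]*\")");

    public static List<string> Tokenize(string line)
    {
        var fields = new List<string>();
        foreach (string piece in Delimited.Split(line))
        {
            bool bracketed = piece.Length >= 2 && piece[0] == '[' && piece[piece.Length - 1] == ']';
            bool quoted = piece.Length >= 2 && piece[0] == '"' && piece[piece.Length - 1] == '"';
            if (bracketed || quoted)
            {
                // Stage 1: a delimited span is one field; drop its delimiters.
                fields.Add(piece.Substring(1, piece.Length - 2));
            }
            else
            {
                // Stage 2: everything else is split on spaces.
                fields.AddRange(piece.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
            }
        }
        return fields;
    }
}
```

Calling LogTokenizer.Tokenize on a log row returns the fields in order, with the timestamp, request line, referrer and user agent each kept as a single entry. A bare field that itself contains a bracket or a quote would break this sketch, so a full-line regular expression may be the safer choice for real logs.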

danp129
Expert 100+
P: 313
using System.Diagnostics;
using System.Text.RegularExpressions;

// Parses one S3 access log line and prints each captured field.
private void Button1_Click(System.Object sender, System.EventArgs e)
{
    string logLine = "e9393f8b003f121858490f62e9339f29ab5919261b8da33746cedb356e29f774 alzwad [18/Jan/2012:22:27:24 +0000] 41.155.169.152 - 6A70EBA012DB351A REST.GET.OBJECT Me/mefiles/videos/icons/240x320/18-1-12-9-19-56-egypt-movie3.jpg \"GET /Me/mefiles/videos/icons/240x320/18-1-12-9-19-56-egypt-movie3.jpg?id=01229466985&ph=01229466985&width=240&high=320 HTTP/1.1\" 200 - 1590 1590 34 33 \"-\" \"Nokia2700c-2/2.0 (07.15) Profile/MIDP-2.1 Configuration/CLDC-1.1 UNTRUSTED/1.0\" -";
    // Bare fields are \S+; the timestamp is in brackets; the request line, referrer and user agent are quoted.
    Regex re = new Regex("(\\S+) (\\S+) \\[(.*?)\\] (\\S+) (\\S+) (\\S+) (\\S+) (\\S+) \"([^\"]+)\" (\\S+) (\\S+) (\\S+) (\\S+) (\\S+) (\\S+) \"([^\"]+)\" \"([^\"]+)\"", RegexOptions.IgnoreCase);
    Match reMatch = re.Match(logLine);
    if (!reMatch.Success) {
        Debug.Print("Line did not match the expected format.");
        return;
    }
    // Group 0 is the whole match, so the captured fields start at index 1.
    for (int i = 1; i < reMatch.Groups.Count; i++) {
        Debug.Print(reMatch.Groups[i].Value);
    }
}
The regex pattern is from https://forums.aws.amazon.com/messag...ssageID=186803
Jan 20 '12 #9
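
For anyone who would rather look fields up by name than by group index, one possible variation is sketched below. It reuses the pattern from the post above unchanged. The class S3LogFields, the method TryParse and the field labels are hypothetical: the labels are this sketch's own, chosen to follow the field order of the sample row and of the S3 server access log format as commonly documented, so treat them as an assumption rather than an official schema.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original thread.
public static class S3LogFields
{
    // One label per capture group, in order (assumed names, see the note above).
    private static readonly string[] Names =
    {
        "BucketOwner", "Bucket", "Time", "RemoteIp", "Requester", "RequestId",
        "Operation", "Key", "RequestUri", "HttpStatus", "ErrorCode", "BytesSent",
        "ObjectSize", "TotalTime", "TurnAroundTime", "Referrer", "UserAgent"
    };

    // Same 17-group pattern as in the post above.
    private static readonly Regex Pattern = new Regex(
        "(\\S+) (\\S+) \\[(.*?)\\] (\\S+) (\\S+) (\\S+) (\\S+) (\\S+) \"([^\"]+)\" " +
        "(\\S+) (\\S+) (\\S+) (\\S+) (\\S+) (\\S+) \"([^\"]+)\" \"([^\"]+)\"");

    // Returns false (and an empty dictionary) when the line does not match.
    public static bool TryParse(string line, out Dictionary<string, string> fields)
    {
        fields = new Dictionary<string, string>();
        Match m = Pattern.Match(line);
        if (!m.Success)
        {
            return false;
        }
        for (int i = 0; i < Names.Length; i++)
        {
            // Group 0 is the whole match, so field i lives in group i + 1.
            fields[Names[i]] = m.Groups[i + 1].Value;
        }
        return true;
    }
}
```

With this in place, a caller can write S3LogFields.TryParse(logLine, out var fields) and then read fields["HttpStatus"] or fields["UserAgent"]. The trailing field after the user agent in the sample row is not captured, because the pattern has no group for it.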

adarwish
P: 9
Thanks, danp129, it worked fine.
Jan 20 '12 #10
