RE: Extract string from log file

From each line, separate out the URL and request parts. Split the request into key-value pairs, then use urllib to unquote each pair, as shown below:

import urllib

# Request portion of one log line (Python 2 syntax).
line = "GET /stat.gif?stat=v&c=F-Secure&v=1.1%20Build%2014231&s=av%7BNorton%20360%20%28Symantec%20Corporation%29+69%3B%7Dsw%7BNorton%20360%20%28Symantec%20Corporation%29+69%3B%7Dfw%7BNorton%20360%20%28Symantec%20Corporation%29+5%3B%7Dv%7BMicrosoft%20Windows%20XP+insecure%3BMicrosoft%20Windows%20XP%20Professional+f%3B26027%3B26447%3B26003%3B22452%3B%7D&r=0.9496 HTTP/1.1"
words = line.split()
for word in words:
    # Only the URL token contains a '?'.
    if word.find('?') >= 0:
        # Keep everything after the '?', i.e. the query string.
        req = word[word.find('?') + 1:]
        # Split the query string into key=value pairs.
        kwds = req.split('&')
        for kv in kwds:
            # Decode %XX escapes, e.g. %20 becomes a space.
            print urllib.unquote(kv)

stat=v
c=F-Secure
v=1.1 Build 14231
s=av{Norton 360 (Symantec Corporation)+69;}sw{Norton 360 (Symantec Corporation)+69;}fw{Norton 360 (Symantec Corporation)+5;}v{Microsoft Windows XP+insecure;Microsoft Windows XP Professional+f;26027;26447;26003;22452;}
r=0.9496
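
For readers on Python 3, where `print` is a function and `unquote` lives in `urllib.parse`, a rough equivalent might look like the sketch below. The function name `decode_query` is made up for illustration, and the request line is shortened (the long `s=` parameter is left out) to keep the example readable.

```python
from urllib.parse import unquote

# Shortened request line, assumed for illustration only.
line = "GET /stat.gif?stat=v&c=F-Secure&v=1.1%20Build%2014231&r=0.9496 HTTP/1.1"


def decode_query(request_line):
    """Return the percent-decoded key=value strings from a request line."""
    decoded = []
    for word in request_line.split():
        # Only the URL token contains a '?'; the method and protocol do not.
        _, found, query = word.partition("?")
        if found:
            decoded.extend(unquote(kv) for kv in query.split("&"))
    return decoded


for pair in decode_query(line):
    print(pair)
```

Like the original, this leaves `+` characters untouched, because `unquote` only decodes `%XX` escapes.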

good luck
Edwin

-----Original Message-----
From: py************************************************ **@python.org
[mailto:py***************************************** *********@python.org]
On Behalf Of jo*********@googlemail.com
Sent: Saturday, August 09, 2008 10:48 AM
To: py*********@python.org
Subject: Extract string from log file
203.114.10.66 - - [01/Aug/2008:05:41:21 +0300] "GET /stat.gif?stat=v&c=F-Secure&v=1.1%20Build%2014231&s=av%7BNorton%20360%20%28Symantec%20Corporation%29+69%3B%7Dsw%7BNorton%20360%20%28Symantec%20Corporation%29+69%3B%7Dfw%7BNorton%20360%20%28Symantec%20Corporation%29+5%3B%7Dv%7BMicrosoft%20Windows%20XP+insecure%3BMicrosoft%20Windows%20XP%20Professional+f%3B26027%3B26447%3B26003%3B22452%3B%7D&r=0.9496 HTTP/1.1" 200 43 "http://dfstage1.f-secure.com/fshc/1.1/release/devbw/1.1.14231/card.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)"

Does anyone know how I can extract certain strings from this log file
using regular expressions in Python, or using XML? Could someone teach me?
--
http://mail.python.org/mailman/listinfo/python-list
The information contained in this message and any attachment may be
proprietary, confidential, and privileged or subject to the work
product doctrine and thus protected from disclosure. If the reader
of this message is not the intended recipient, or an employee or
agent responsible for delivering this message to the intended
recipient, you are hereby notified that any dissemination,
distribution or copying of this communication is strictly prohibited.
If you have received this communication in error, please notify me
immediately by replying to this message and deleting it and all
copies and backups thereof. Thank you.
Aug 9 '08 #1
On Aug 9, 11:22 pm, Edwin.Mad...@VerizonWireless.com wrote:
From each line, separate out the URL and request parts. Split the request into key-value pairs, then use urllib to unquote each pair, as shown below:

import urllib
line = "GET /stat.gif?stat=v&c=F-Secure&v=1.1%20Build%2014231&s=av%7BNorton%20360%20%28Symantec%20Corporation%29+69%3B%7Dsw%7BNorton%20360%20%28Symantec%20Corporation%29+69%3B%7Dfw%7BNorton%20360%20%28Symantec%20Corporation%29+5%3B%7Dv%7BMicrosoft%20Windows%20XP+insecure%3BMicrosoft%20Windows%20XP%20Professional+f%3B26027%3B26447%3B26003%3B22452%3B%7D&r=0.9496 HTTP/1.1"
words = line.split()
for word in words:
    if word.find('?') >= 0:
        req = word[word.find('?') + 1:]
        kwds = req.split('&')
        for kv in kwds:
            print urllib.unquote(kv)

stat=v
c=F-Secure
v=1.1 Build 14231
s=av{Norton 360 (Symantec Corporation)+69;}sw{Norton 360 (Symantec Corporation)+69;}fw{Norton 360 (Symantec Corporation)+5;}v{Microsoft Windows XP+insecure;Microsoft Windows XP Professional+f;26027;26447;26003;22452;}
r=0.9496

good luck
Edwin

-----Original Message-----
From: python-list-bounces+edwin.madari=verizonwireless....@python.org
[mailto:python-list-bounces+edwin.madari=verizonwireless....@python.org]
On Behalf Of josephty...@googlemail.com
Sent: Saturday, August 09, 2008 10:48 AM
To: python-l...@python.org
Subject: Extract string from log file

203.114.10.66 - - [01/Aug/2008:05:41:21 +0300] "GET /stat.gif?stat=v&c=F-Secure&v=1.1%20Build%2014231&s=av%7BNorton%20360%20%28Symantec%20Corporation%29+69%3B%7Dsw%7BNorton%20360%20%28Symantec%20Corporation%29+69%3B%7Dfw%7BNorton%20360%20%28Symantec%20Corporation%29+5%3B%7Dv%7BMicrosoft%20Windows%20XP+insecure%3BMicrosoft%20Windows%20XP%20Professional+f%3B26027%3B26447%3B26003%3B22452%3B%7D&r=0.9496 HTTP/1.1" 200 43 "http://dfstage1.f-secure.com/fshc/1.1/release/devbw/1.1.14231/card.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)"

Does anyone know how I can extract certain strings from this log file
using regular expressions in Python, or using XML? Could someone teach me?

Would you mind explaining further? Based on the source code you gave
me, what will it output? Sorry, I am new to string extraction. I do
understand most of your Python code; the only thing I don't understand
is this part:
for word in words:
    if word.find('?') >= 0:
        req = word[word.find('?') + 1:]
        kwds = req.split('&')
        for kv in kwds:
            print urllib.unquote(kv)

What does this code do?
Also, is this code automatic? What I mean is, can it extract the
strings every time the server writes a new log file?
Aug 9 '08 #2
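
The loop in the reply walks over the whitespace-separated words of the line, picks the one that contains a `?`, keeps everything after the `?`, splits that on `&`, and percent-decodes each piece. Nothing in it watches for new log files; it only processes the line it is given. A minimal Python 3 sketch of how the same idea could be applied to every line of a log is shown below. The names `REQUEST_RE`, `extract_params`, and `process_log` are hypothetical, the regular expression assumes the request is a double-quoted field of the form `METHOD /path?query HTTP/x.y`, and the sample line is shortened for readability.

```python
import re
from urllib.parse import unquote

# Assumed shape of the request field: "METHOD /path?query HTTP/x.y"
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<url>\S+) HTTP/[\d.]+"')


def extract_params(log_line):
    """Return a dict of decoded query parameters, or {} if none are found."""
    match = REQUEST_RE.search(log_line)
    if match is None:
        return {}
    _, found, query = match.group("url").partition("?")
    if not found:
        return {}
    params = {}
    for kv in query.split("&"):
        if not kv:
            continue  # skip empty pieces such as a trailing '&'
        key, _, value = kv.partition("=")
        # If a key repeats, the last value wins in this sketch.
        params[unquote(key)] = unquote(value)
    return params


def process_log(lines):
    """Yield the parameters of every line that carries a query string."""
    for line in lines:
        params = extract_params(line)
        if params:
            yield params


# Shortened sample line, assumed for illustration only.
sample = [
    '203.114.10.66 - - [01/Aug/2008:05:41:21 +0300] '
    '"GET /stat.gif?stat=v&c=F-Secure&v=1.1%20Build%2014231&r=0.9496 HTTP/1.1" 200 43',
]
for params in process_log(sample):
    print(params["c"], params["v"])  # prints: F-Secure 1.1 Build 14231
```

Because `process_log` accepts any iterable of lines, an open file object can be passed in place of the `sample` list. The script would still have to be rerun, or scheduled, for each new log file the server produces.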
