
Parsing a log file, etc

16
I have to make 3 separate functions:

function1. Counts the lines in my 'logfile' and returns the answer in IDLE.

function2. Counts how many times a certain user is in the 'logfile', for instance how many times 'johnny.killed.peter' has been logged in the 'logfile', and returns only the answer.

function3. Shows which files the user 'johnny.killed.peter' has looked at and writes the names of the files to a new file called 'pole3.out', one name per row, in alphabetical order with the most popular one first.

Number 3 can use a dictionary.

Thanks for all your help guys, you are really saving my life!
Oct 25 '07 #1
bartonc
6,596 Expert 4TB
We'll actually need a sample of your log file for this.
Oct 25 '07 #2
bartonc
6,596 Expert 4TB
>>> def NumLines(filename):
...     try:
...         f = open(filename)
...         nLines = len(f.readlines())
...         print 'There are %d lines in %s' %(nLines, filename)
...     except IOError, error:
...         print error
...
>>> NumLines('module1.py')
There are 165 lines in module1.py
>>> NumLines('module1.p')
[Errno 2] No such file or directory: 'module1.p'
Oct 25 '07 #3
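For anyone reading this on Python 3, where the print statement and the "except IOError, error" form used above are no longer valid, a minimal sketch of function 1 might look like the following. The name count_lines is made up for illustration, and the function returns the count rather than printing it, since the original question asked for the answer to be returned.

```python
def count_lines(filename):
    """Return the number of lines in filename, blank lines included."""
    # The with-statement closes the file even if reading fails part-way.
    with open(filename) as f:
        # Iterating over the file yields one line at a time, so the
        # whole file never has to be held in memory at once.
        return sum(1 for _ in f)
```

Called from IDLE, count_lines('logfile') would display the returned integer; a missing file raises FileNotFoundError (a subclass of OSError), which the caller can catch if needed.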
DDCane
16
Has anyone come up with anything on function 2 and 3? I've been fooling around with this for a couple of days and can't seem to get it right. Thanks for all the help.
Oct 29 '07 #4
DDCane
16
def countclient(filename, client):
    f = open(filename)
    x = len(f.readlines(client))
    return x

Does that look like something that could work?
Oct 29 '07 #5
bvdet
2,851 Expert Mod 2GB
If we had a sample of your log file, I am sure we could show you how to get the data from it.
Oct 29 '07 #6
bartonc
6,596 Expert 4TB
readlines() does not accept a search string (its only optional argument is an integer size hint), and it's good practice to close() the file when you are done with it.
def countclient(filename, client):
    f = open(filename)
    x = len(f.readlines())
    f.close()
    return x
Your posting is a bit out of hand. At 14 posts you are required to follow all site rules (Posting Guidelines). Most perturbing is your lack of CODE tags. Instructions are right there on the right-hand side of the page while you post.
Oct 30 '07 #7
bvdet
2,851 Expert Mod 2GB
Function 2 could be something like this:
user = 'bill.smith'
logfile = 'log.txt'

f = open(logfile)
count = 0
for line in f:
    if user in line:
        count += 1
f.close()

print count
Oct 30 '07 #8
bvdet
2,851 Expert Mod 2GB
As a function, it may look something like this:
def log_count(fn, user):
    f = open(fn)
    # Keep every line of the log that mentions the user
    logList = [line for line in f if user in line]
    f.close()

    return logList

user = 'bill.smith'
logfile = 'log.txt'

logList = log_count(logfile, user)
print len(logList)
Oct 30 '07 #9
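Since function 2 is supposed to return only the answer, a Python 3 sketch that returns the number directly (instead of a list of lines) could look like this. The name count_user is hypothetical. Note that it uses the same substring test as the code above, so searching for 'bill.smith' would also count lines containing 'bill.smithson'.

```python
def count_user(filename, user):
    """Return how many lines of filename contain the string user."""
    # The with-statement closes the file automatically.
    with open(filename) as f:
        # Add 1 for every line in which user appears as a substring.
        return sum(1 for line in f if user in line)
```

With the log file and user from the post above, count_user('log.txt', 'bill.smith') would return an integer that can be printed or used in further code.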
ghostdog74
511 Expert 256MB
Just a minor note on using the readlines method: len(readlines()) will count a blank line as a line, e.g. for this sample input:
line1
line2
line3

line4
line5
The above will produce 6. If that's what the OP wants then it's all right. This can be worked around by skipping lines that are empty after .strip() :)

Another method (out of many) that counts without including blank lines:
num = len([line for line in open("file") if line.strip()])
print num
Oct 30 '07 #10
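To make the blank-line point concrete, here is a small Python 3 sketch that reports both numbers side by side. The helper name line_counts is made up for illustration; it treats a line as blank when nothing is left after strip().

```python
def line_counts(filename):
    """Return (total_lines, non_blank_lines) for filename."""
    total = 0
    non_blank = 0
    with open(filename) as f:
        for line in f:
            total += 1
            # strip() removes the newline and any other whitespace, so a
            # blank line becomes the empty string, which is falsy.
            if line.strip():
                non_blank += 1
    return total, non_blank
```

For the six-line sample input above (five lines of text and one blank line), this returns (6, 5).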
ghostdog74
511 Expert 256MB
For function 2:
>>> data=open("file").read()
>>> data.count("johnny.killed.peter")

You can wrap this in your own function.
Oct 30 '07 #11
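One thing worth knowing about the str.count approach is that it counts occurrences of the text, not log lines, so a line that mentions the user twice is counted twice. The Python 3 sketch below contrasts the two; both function names are hypothetical, and which behaviour is wanted depends on the log format, which has not been posted.

```python
def count_occurrences(text, user):
    """Count every non-overlapping occurrence of user in text."""
    return text.count(user)


def count_matching_lines(text, user):
    """Count the lines of text that contain user at least once."""
    return sum(1 for line in text.splitlines() if user in line)
```

For a log whose first line mentions 'johnny.killed.peter' twice and whose other lines do not mention him, count_occurrences gives 2 while count_matching_lines gives 1.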
bvdet
2,851 Expert Mod 2GB
Now that you have some more information from us, can you show us some more of your work? It would be best for you to attempt this on your own, then we can help from there.
Oct 30 '07 #12
bvdet
2,851 Expert Mod 2GB
Assume the visited page is in the format '....GET page_name HTTP....' in each line of the log file.
import re

# Compile a list of entries for a specific 'user' in log file 'fn'
def log_count(fn, user):
    f = open(fn)
    logList = [line for line in f if line.startswith(user)]
    f.close()
    return logList

# Create a dictionary using logList
# Dictionary keys are the web pages visited
# Dictionary values are the number of visits
def visit_count(logList):
    patt = re.compile(r'GET (.+) HTTP')
    dd = {}
    for log in logList:
        page = patt.search(log).group(1)
        if page in dd:
            dd[page] += 1
        else:
            dd[page] = 1
    return dd

# Create list of value, key pairs from dictionary
# Sort on value (descending) and key (ascending)
# Display sorted data
def visit_process(dd):
    def comp(a, b):
        x = cmp(a[0], b[0])
        if not x:
            return cmp(a[1], b[1])
        return -x
    pageList = zip(dd.values(), dd.keys())
    pageList.sort(comp)
    for item in pageList:
        print 'Page "%s" was visited %d time%s.' % (item[1], item[0], ['', 's'][item[0]>1 or 0])
Oct 31 '07 #13
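The code above relies on cmp and on passing a comparison function to sort, neither of which exists in Python 3. A rough Python 3 sketch of the same idea, which also writes the page names to a file as function 3 requires, might look like this. It keeps the same assumptions about the log format (each line starts with the user name and contains 'GET page_name HTTP'); the function names and the \S+ pattern are illustrative choices, not taken from the posts.

```python
import re
from collections import Counter

# Assumed format: '<user> ... GET <page> HTTP...' on each line.
GET_PATTERN = re.compile(r"GET (\S+) HTTP")


def pages_for_user(filename, user):
    """Return a Counter mapping each page the user requested to its hit count."""
    counts = Counter()
    with open(filename) as f:
        for line in f:
            if not line.startswith(user):
                continue
            match = GET_PATTERN.search(line)
            # Skip lines for this user that contain no GET request.
            if match:
                counts[match.group(1)] += 1
    return counts


def write_pages(counts, out_name="pole3.out"):
    """Write one page name per line, most visited first, ties alphabetical."""
    # Negating the count sorts it in descending order while the page
    # name, the second element of the key, still sorts ascending.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    with open(out_name, "w") as out:
        for page, _ in ordered:
            out.write(page + "\n")
    return [page for page, _ in ordered]
```

A call such as write_pages(pages_for_user('log.txt', 'johnny.killed.peter')) would then produce 'pole3.out'. As with startswith in the original, a user name that is a prefix of another user name would match both.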
