Grep Equivalent for Python

Hello all,

I come from a shell/perl background and have just started to learn Python. To
start with, I'm trying to obtain system information from a Linux
server using the /proc FS. For example, in order to obtain the amount
of physical memory on the server, I would do the following in shell:

grep ^MemTotal /proc/meminfo | awk '{print $2}'

That would get me the exact number that I need. Now, I'm trying to do
this in python. Here is where I have gotten so far:

memFile = open('/proc/meminfo')
for line in memFile.readlines():
    print re.search('MemTotal', line)
memFile.close()

I guess what I'm trying to logically do is... read through the file
line by line, grab the pattern I want and assign that to a variable.
The above doesn't really work, it comes back with something like
"<_sre.SRE_Match object at 0xb7f9d6b0>" when a match is found.

Any help with this would be greatly appreciated.
Tom

Mar 14 '07 #1
Regular expressions aren't really needed here. Untested code follows:

for line in open('/proc/meminfo').readlines:
    if line.startswith("MemTotal:"):
        name, amt, unit = line.split()
        print name, amt, unit
        break

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Blog of Note: http://holdenweb.blogspot.com
See you at PyCon? http://us.pycon.org/TX2007

Mar 14 '07 #2
On 14 Mar, 13:37, "tereglow" <tom.rectenw...@eglow.net> wrote:
> Hello all,
>
> I come from a shell/perl background and have just started to learn Python.

Welcome aboard!

> To start with, I'm trying to obtain system information from a Linux
> server using the /proc FS. For example, in order to obtain the amount
> of physical memory on the server, I would do the following in shell:
>
> grep ^MemTotal /proc/meminfo | awk '{print $2}'
>
> That would get me the exact number that I need. Now, I'm trying to do
> this in python. Here is where I have gotten so far:
>
> memFile = open('/proc/meminfo')
> for line in memFile.readlines():
>     print re.search('MemTotal', line)

You could even use the regular expression '^MemTotal' as seen in your
original, or use the match function instead of search. However...

> memFile.close()
>
> I guess what I'm trying to logically do is... read through the file
> line by line, grab the pattern I want and assign that to a variable.
> The above doesn't really work, it comes back with something like
> "<_sre.SRE_Match object at 0xb7f9d6b0>" when a match is found.

This is because re.search and re.match (and other things) return match
objects if the regular expression has been found in the provided
string. See this page in the library documentation:

http://docs.python.org/lib/match-objects.html

> Any help with this would be greatly appreciated.

The easiest modification to your code is to replace the print
statement with this:

match = re.search('MemTotal', line)
if match is not None:
    print match.group()

You can simplify this using various idioms, I imagine, but what you
have to do is to test for a match, then to print the text that
matched. The "group" method lets you get the whole matching text (if
you don't provide any arguments), or individual groups (applicable
when you start putting groups in your regular expressions).
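
A minimal sketch of the group idea, written in Python 3 syntax (the thread itself uses Python 2 print statements). The helper name mem_total_kb, the pattern, and the sample lines are made up for illustration; the parenthesised group picks out just the number, much as awk's $2 does in the shell version.

```python
import re

# Hypothetical pattern: anchored at the line start, one capturing group
# around the digits that follow the field name.
MEMTOTAL_RE = re.compile(r'^MemTotal:\s+(\d+)')

def mem_total_kb(lines):
    """Return the MemTotal value as an int, or None if no line matches."""
    for line in lines:
        match = MEMTOTAL_RE.search(line)
        if match is not None:
            # group() is the whole matched text; group(1) is the first
            # parenthesised group, here the digits only.
            return int(match.group(1))
    return None

# Made-up sample resembling /proc/meminfo.
sample = ["MemTotal:       2075816 kB\n", "MemFree:         123456 kB\n"]
print(mem_total_kb(sample))
```

Because the function accepts any iterable of lines, an open file object such as open('/proc/meminfo') could be passed in place of the sample list.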

Paul

Mar 14 '07 #3
Steve Holden wrote:
> Regular expressions aren't really needed here. Untested code follows:
>
> for line in open('/proc/meminfo').readlines:

for line in open('/proc/meminfo').readlines():

>     if line.startswith("MemTotal:"):
>         name, amt, unit = line.split()
>         print name, amt, unit
>         break
>
> regards
> Steve
Mar 14 '07 #4
> I come from a shell/perl background and have just started to learn Python. To
> start with, I'm trying to obtain system information from a Linux
> server using the /proc FS. For example, in order to obtain the amount
> of physical memory on the server, I would do the following in shell:
>
> grep ^MemTotal /proc/meminfo | awk '{print $2}'
>
> That would get me the exact number that I need. Now, I'm trying to do
> this in python. Here is where I have gotten so far:
>
> memFile = open('/proc/meminfo')
> for line in memFile.readlines():
>     print re.search('MemTotal', line)
> memFile.close()

import sys
# writelines accepts an iterable of strings; write() takes a single string
sys.stdout.writelines(L.split()[1] + '\n' for L in open('/proc/meminfo')
                      if L.startswith('MemTotal'))

--
Regards, Björn
Mar 14 '07 #5
In <et**********@news2.u-psud.fr>, Laurent Pointal wrote:
> Steve Holden wrote:
>> Regular expressions aren't really needed here. Untested code follows:
>>
>> for line in open('/proc/meminfo').readlines:
>
> for line in open('/proc/meminfo').readlines():

for line in open('/proc/meminfo'):

>>     if line.startswith("MemTotal:"):
>>         name, amt, unit = line.split()
>>         print name, amt, unit
>>         break

Of course it's cleaner to assign the file object to a name and close the
file explicitly after the loop.

Ciao,
Marc 'BlackJack' Rintsch
Mar 14 '07 #6
On Mar 14, 11:57 am, Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
> In <et8u0e$k2...@news2.u-psud.fr>, Laurent Pointal wrote:
>> Steve Holden wrote:
>>> Regular expressions aren't really needed here. Untested code follows:
>>> for line in open('/proc/meminfo').readlines:
>> for line in open('/proc/meminfo').readlines():
>
> for line in open('/proc/meminfo'):
>     if line.startswith("MemTotal:"):
>         name, amt, unit = line.split()
>         print name, amt, unit
>         break
>
> Of course it's cleaner to assign the file object to a name and close the
> file explicitly after the loop.
>
> Ciao,
> Marc 'BlackJack' Rintsch

Thanks all for the help with this, am learning a lot; really
appreciate it.
Tom

Mar 14 '07 #7
Okay,

It is now working as follows:

memFile = open('/proc/meminfo')
for line in memFile.readlines():
    if line.startswith("MemTotal"):
        memStr = line.split()
        memTotal = memStr[1]
memFile.close()
print "Memory: " + memTotal + "kB"

I'm learning the whole try, finally exception stuff so will add that
in as well. Now, I'm trying to figure out the CPU speed. In shell,
I'd do:

grep "^cpu MHz" /proc/cpuinfo | awk '{print $4}' | head -1

The "head -1" is added because if the server has 2 or more processors,
2 or more lines will result, and I only need data from the first
line. So, now I'm looking for the equivalent to "head" (or "tail") in
Python. Is this a case where I'll need to break down and use the re
module? No need to give me the answer, a hint in the right direction
would be great though.
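
As a hint-level sketch in Python 3 syntax: inside a for loop, a break after the first matching line already behaves like "head -1". For a more general equivalent, itertools.islice can stand in for "head -n" and a bounded collections.deque for "tail -n". The helper names head and tail and the sample lines below are made up for illustration and are not real /proc/cpuinfo output.

```python
from collections import deque
from itertools import islice

def head(iterable, n=1):
    """Return the first n items, like `head -n`; stops reading after n."""
    return list(islice(iterable, n))

def tail(iterable, n=1):
    """Return the last n items, like `tail -n`; keeps at most n in memory."""
    return list(deque(iterable, maxlen=n))

# Made-up sample resembling /proc/cpuinfo on a two-CPU machine.
sample = [
    "processor\t: 0\n",
    "cpu MHz\t\t: 2400.000\n",
    "processor\t: 1\n",
    "cpu MHz\t\t: 1800.000\n",
]

# The generator expression plays the role of grep + awk: keep matching
# lines and take the text after the colon.
speeds = (line.split(":", 1)[1].strip()
          for line in sample if line.startswith("cpu MHz"))

first = head(speeds)
print(first)
```

No regular expression is needed for this; startswith plus a split on the colon is enough for lines of this shape.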

Thanks again,
Tom

Mar 14 '07 #8
If you are interested in a number of fields I'd create a dict or set
containing the keys you are interested in. For each line, if the text
indicates you are interested in the value then extract the value and
store it in a dict against the text as a key.

Something like (untested):

kwdlist = "cpu MHz|MemTotal"
d = dict((x, None) for x in kwdlist.split("|"))
# "cpu MHz" lives in /proc/cpuinfo, "MemTotal" in /proc/meminfo
for path in ('/proc/meminfo', '/proc/cpuinfo'):
    procFile = open(path)
    for line in procFile.readlines():
        if ":" not in line:
            continue
        keyword, value = line.split(":", 1)
        keyword = keyword.strip()
        if keyword in d and d[keyword] is None:
            # keep only the number, dropping any unit such as "kB"
            d[keyword] = value.split()[0]
    procFile.close()

This should give you a dict with non-None values against the keywords
you have found. Because of the "and d[keyword] is None" test you won't
overwrite an existing value, meaning you only see the first value for
any given keyword.

Again, bear in mind this code is untested.

regards
Steve

Mar 14 '07 #9
On Mar 14, 9:37 am, "tereglow" <tom.rectenw...@eglow.net> wrote:
> Hello all,
>
> I come from a shell/perl background and have just started to learn Python. To
> start with, I'm trying to obtain system information from a Linux
> server using the /proc FS. For example, in order to obtain the amount
> of physical memory on the server, I would do the following in shell:
>
> grep ^MemTotal /proc/meminfo | awk '{print $2}'
>
> That would get me the exact number that I need. Now, I'm trying to do
> this in python. Here is where I have gotten so far:
>
> memFile = open('/proc/meminfo')
> for line in memFile.readlines():
>     print re.search('MemTotal', line)
> memFile.close()
>
> I guess what I'm trying to logically do is... read through the file
> line by line, grab the pattern I want and assign that to a variable.
> The above doesn't really work, it comes back with something like
> "<_sre.SRE_Match object at 0xb7f9d6b0>" when a match is found.

Ok, two things:

1. You don't need the .readlines() part. Python can iterate over the
file object itself.
2. You don't need the re module for this particular situation; you
could simply use the 'in' operator.

You could write it like this:

memFile = open('/proc/meminfo')
for line in memFile:
    if 'MemTotal' in line: print line
memFile.close()
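
One caveat on point 2, sketched here in Python 3 syntax: the 'in' operator matches anywhere in the line, like grep without the ^ anchor, whereas startswith is the closer match for "grep ^MemTotal". The sample lines are made up, and "FakeMemTotal" is not a real /proc/meminfo field; it is only there to show the difference.

```python
# Made-up lines; "FakeMemTotal" is not a real /proc/meminfo field.
sample = [
    "MemTotal:       2075816 kB\n",
    "FakeMemTotal:         42 kB\n",
]

# 'in' behaves like `grep MemTotal`: the text may appear anywhere.
anywhere = [line for line in sample if "MemTotal" in line]

# startswith behaves like `grep ^MemTotal`: anchored at the line start.
anchored = [line for line in sample if line.startswith("MemTotal")]

print(len(anywhere), len(anchored))
```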

[]s
FZero

Mar 14 '07 #10
On Mar 14, 10:57 am, Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
> In <et8u0e$k2...@news2.u-psud.fr>, Laurent Pointal wrote:
>> Steve Holden wrote:
>>> Regular expressions aren't really needed here. Untested code follows:
>>> for line in open('/proc/meminfo').readlines:
>> for line in open('/proc/meminfo').readlines():
>
> for line in open('/proc/meminfo'):

Yeah, that's nicer.

> Of course it's cleaner to assign the file object to a name and close the
> file explicitly after the loop.

For certain definitions of "cleaner" (insert old argument about how
ref-counting semantics or at least immediate gc of locally scoped
variables when leaving scope _should be_ (not _are_) language-guaranteed
because it makes for cleaner, more programmer-friendly code
and often avoids ugly hacks like assigning a spurious name and/or
using "with" constructs).

But if you're going to do that, "with" is the better option IMO:

from __future__ import with_statement
....
with open('/proc/meminfo') as infile:
    for line in infile:

Of course, that alternative requires Python 2.5
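
From Python 2.6 onwards, including Python 3, the with statement works without the __future__ import. A minimal sketch in Python 3 syntax follows; the helper name read_field and the temporary file standing in for /proc/meminfo are illustrative assumptions, chosen so the example runs on any system.

```python
import os
import tempfile

def read_field(path, name):
    """Return the first value for `name` in a 'Name: value unit' file."""
    # The with block closes the file on exit, even if an exception occurs,
    # so no explicit close() call is needed.
    with open(path) as infile:
        for line in infile:
            if line.startswith(name + ":"):
                return line.split(":", 1)[1].split()[0]
    return None

# A temporary file stands in for /proc/meminfo so the sketch runs anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("MemTotal:       2075816 kB\nMemFree:         123456 kB\n")

value = read_field(tmp.name, "MemTotal")
missing = read_field(tmp.name, "NoSuchField")
os.remove(tmp.name)
print(value)
```

On a Linux machine the same helper could be pointed at '/proc/meminfo' directly.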

Mar 14 '07 #11
tereglow <to************@eglow.net> wrote:
> ...
> server using the /proc FS. For example, in order to obtain the amount
> of physical memory on the server, I would do the following in shell:
>
> grep ^MemTotal /proc/meminfo | awk '{print $2}'

If you would indeed do that, maybe it's also worth learning something
more about the capabilities of your "existing" tools, since

awk '/^MemTotal/ {print $2}' /proc/meminfo

is a more compact and faster way to perform exactly the same task.

(You already received a ton of good responses about doing this in
Python, but the "pipe grep into awk instead of USING awk properly in the
first place!" issue has been a pet peeve of mine for almost 30 years
now, and you know what they say about old dogs + new tricks!-).
Alex
Mar 15 '07 #12
Alex Martelli wrote:
> tereglow <to************@eglow.net> wrote:
>> ...
>> server using the /proc FS. For example, in order to obtain the amount
>> of physical memory on the server, I would do the following in shell:
>>
>> grep ^MemTotal /proc/meminfo | awk '{print $2}'
>
> If you would indeed do that, maybe it's also worth learning something
> more about the capabilities of your "existing" tools, since
>
> awk '/^MemTotal/ {print $2}' /proc/meminfo
>
> is a more compact and faster way to perform exactly the same task.

That's correct on small files, but note that at least on my platform
(Linux, GNU grep and awk), "grep" is just massively insanely faster
than awk (to the point where doing the equivalent on
/usr/share/dict/words takes 10-20% of the time with "grep re|awk..."
as it does with "awk '/re...'").

On this small proc file, the plain-awk version is faster (and it's
cleaner looking), as Alex points out.

But for large files, "grep" is often _way_ faster than awk/perl/python/
whatever alternative, easily swamping the fork/exec cost. It's often
quite handy to keep that in mind.

Mar 15 '07 #13
On Mar 15, 1:47 am, a...@mac.com (Alex Martelli) wrote:
> tereglow <tom.rectenw...@eglow.net> wrote:
>> ...
>> server using the /proc FS. For example, in order to obtain the amount
>> of physical memory on the server, I would do the following in shell:
>> grep ^MemTotal /proc/meminfo | awk '{print $2}'
>
> If you would indeed do that, maybe it's also worth learning something
> more about the capabilities of your "existing" tools, since
>
> awk '/^MemTotal/ {print $2}' /proc/meminfo
>
> is a more compact and faster way to perform exactly the same task.
>
> (You already received a ton of good responses about doing this in
> Python, but the "pipe grep into awk instead of USING awk properly in the
> first place!" issue has been a pet peeve of mine for almost 30 years
> now, and you know what they say about old dogs + new tricks!-).
>
> Alex

I had no idea you could do that. Thanks for the tip, I need to start
reading that awk/sed book collecting dust on my shelf!
Mar 18 '07 #14
In article <11**********************@e1g2000hsg.googlegroups.com>,
tereglow <to************@eglow.net> wrote:
> On Mar 15, 1:47 am, a...@mac.com (Alex Martelli) wrote:
>> tereglow <tom.rectenw...@eglow.net> wrote:
>>>
>>> grep ^MemTotal /proc/meminfo | awk '{print $2}'
>>
>> If you would indeed do that, maybe it's also worth learning something
>> more about the capabilities of your "existing" tools, since
>>
>> awk '/^MemTotal/ {print $2}' /proc/meminfo
>>
>> is a more compact and faster way to perform exactly the same task.
>>
>> (You already received a ton of good responses about doing this in
>> Python, but the "pipe grep into awk instead of USING awk properly in the
>> first place!" issue has been a pet peeve of mine for almost 30 years
>> now, and you know what they say about old dogs + new tricks!-).
>
> I had no idea you could do that. Thanks for the tip, I need to start
> reading that awk/sed book collecting dust on my shelf!

Your other option is to completely abandon awk/sed. I started writing
stuff like this in Turbo Pascal back in the early 80s because there
simply wasn't anything like awk/sed available for CP/M. In the 90s, when
I needed to do similar things, I used Perl. Now I use Python.

From my POV, there is really no reason to learn the advanced shell
utilities.
--
Aahz (aa**@pythoncraft.com) <* http://www.pythoncraft.com/

"Typing is cheap. Thinking is expensive." --Roy Smith
Mar 18 '07 #15
Well, my goal is to become proficient enough at Python, that I can
replace most shell functionality with it. I was partially successful
when learning Perl. The trouble is that I started with shell, awk/sed/
grep, that sort of stuff. It is somewhat difficult to break free of
"shell" like thinking and programming when you have created habits of
coding in that style. I've recently started to work with an
application (HP Application Mapping) that has an awful lot of Jython
code running in the background; so that project sort of inspired me to
start really digging into Python; to gain a better understanding of
the application, and for any added benefit; especially in regards to
automating systems administration. Am really enjoying learning it,
though it hurts the head a bit, trying to re-train myself. Things
take time though, I'm not giving up!

Mar 19 '07 #16
