Reading a portion of a file

I am using a script with a single file containing all data in multiple
sections. Each section begins with "#VS:CMD:command:START" and ends
with "#VS:CMD:command:STOP". There is a blank line in between each
section. I'm looking for the best way to grab one section at a time.
Will I have to read the entire file to a string and parse it further
or is it possible to grab the section directly when doing a read? I'm
guessing regex is the best possible way. Any help is greatly
appreciated.

Thanks

Mar 8 '07 #1
On Mar 8, 5:12 pm, cmfvulcan...@gmail.com wrote:
(snipped)

Seems like something along these lines will do:

_file_ = "filepart.txt"

begin_tag = '#VS:CMD:command:START'
end_tag = '#VS:CMD:command:STOP'

sections = []
new_section = []
for line in open(_file_):
    line = line.strip()
    if begin_tag in line:
        # A START marker opens a fresh section.
        new_section = []
    elif end_tag in line:
        # A STOP marker closes the current section.
        sections.append(new_section)
    else:
        # Keep only non-blank lines.
        if line:
            new_section.append(line)

for s in sections:
    print(s)

If you want more control, perhaps flagging "inside_section",
"outside_section" is an idea.
Mar 8 '07 #2
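
The "inside_section" / "outside_section" flag mentioned in post #2 could look roughly like the sketch below. It is not from the original post. It assumes Python 3 and assumes every marker line has the shape "#VS:<something>:START" or "#VS:<something>:STOP"; the name iter_sections and the inline sample text are made up for illustration. Because it is a generator, it hands back one section at a time without reading the whole file into a string.

```python
import io


def iter_sections(lines):
    """Yield (header, body_lines) for each START/STOP pair, one at a time."""
    inside_section = False  # the flag suggested in post #2
    header = None
    body = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not inside_section:
            # Outside a section: skip everything until a START marker.
            if line.startswith("#VS:") and line.endswith(":START"):
                inside_section = True
                header = line[len("#VS:"):-len(":START")]
                body = []
        elif line == "#VS:" + header + ":STOP":
            # The matching STOP marker closes the current section.
            inside_section = False
            yield header, body
        else:
            body.append(line)


# Hypothetical sample standing in for the real file.
sample = io.StringIO(
    "#VS:COMMAND:df:START\n"
    "/dev/vzfs 20971520 517652 20453868 3% /\n"
    "#VS:COMMAND:df:STOP\n"
    "\n"
    "#VS:FILE:/proc/loadavg:START\n"
    "0.00 0.00 0.00 1/32 14543\n"
    "#VS:FILE:/proc/loadavg:STOP\n"
)

sections = list(iter_sections(sample))
for header, body in sections:
    print(header, body)
```

Passing an open file object instead of the StringIO works the same way, since both yield lines when iterated.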
On Mar 8, 11:52 am, "Rune Strand" <rune.str...@gmail.com> wrote:
(snipped)

You probably don't want to use regex for something this simple; it's
likely to make things even more complicated. Is there a space between
the begin_tag and the first word of a section (same question with the
end_tag)?

Mar 8 '07 #3
On Mar 8, 12:46 pm, "Jordan" <jordan.tayl...@gmail.com> wrote:
(snipped)

Sent the post too soon. What is the endline character for the file
type? What type of file is it? An example section would be nice
too. Cheers.

Mar 8 '07 #4
On Mar 8, 12:50 pm, "Jordan" <jordan.tayl...@gmail.com> wrote:
(snipped)

Ok, regex was my first thought because I used to use grep with Perl
and shell scripting to grab everything from one pattern to another
pattern. The file is just an unformatted file. What is below is
exactly what is in the file. There are no spaces between the beginning
and ending tags and the content. Would you recommend using spaces
there? And if so, why?

A sample of the file:

#VS:COMMAND:df:START
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/vzfs 20971520 517652 20453868 3% /
tmpfs 2016032 44 2015988 1% /var/run
tmpfs 2016032 0 2016032 0% /var/lock
tmpfs 2016032 0 2016032 0% /dev/shm
tmpfs 2016032 44 2015988 1% /var/run
tmpfs 2016032 0 2016032 0% /var/lock
#VS:COMMAND:df:STOP

#VS:FILE:/proc/loadavg:START
0.00 0.00 0.00 1/32 14543
#VS:FILE:/proc/loadavg:STOP

#VS:FILE:/proc/meminfo:START
MemTotal: 524288 kB
MemFree: 450448 kB
Buffers: 0 kB
Cached: 0 kB
SwapCached: 0 kB
Active: 0 kB
Inactive: 0 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 524288 kB
LowFree: 450448 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
Mapped: 73840 kB
Slab: 0 kB
CommitLimit: 0 kB
Committed_AS: 248704 kB
PageTables: 0 kB
VmallocTotal: 0 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
#VS:FILE:/proc/meminfo:STOP

#VS:FILE:/proc/stat:START
cpu 67188 0 26366 391669264 656686 0 0
cpu0 24700 0 10830 195807826 373309 0 0
cpu1 42488 0 15536 195861438 283376 0 0
intr 0
swap 0 0
ctxt 18105366807
btime 1171391058
processes 26501285
procs_running 1
procs_blocked 0
#VS:FILE:/proc/stat:STOP

#VS:FILE:/proc/uptime:START
1962358.88 1577059.05
#VS:FILE:/proc/uptime:STOP
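
For comparison, the grep-style idea of grabbing "everything from one pattern to another pattern" might be sketched with a regex as below. This is only an illustration under stated assumptions: Python 3, the whole file already read into one string, and marker lines shaped like the sample above. The group names kind, name and body are labels chosen here, not anything defined by the file format.

```python
import re

SECTION_RE = re.compile(
    r"^#VS:(?P<kind>\w+):(?P<name>[^\n]+):START\n"  # opening marker line
    r"(?P<body>(?:.*\n)*?)"                         # as few whole lines as possible
    r"#VS:(?P=kind):(?P=name):STOP$",               # closing marker with same kind/name
    re.MULTILINE,
)

# Hypothetical excerpt in the same shape as the sample file.
text = (
    "#VS:COMMAND:df:START\n"
    "/dev/vzfs 20971520 517652 20453868 3% /\n"
    "#VS:COMMAND:df:STOP\n"
    "\n"
    "#VS:FILE:/proc/loadavg:START\n"
    "0.00 0.00 0.00 1/32 14543\n"
    "#VS:FILE:/proc/loadavg:STOP\n"
)

parsed = {
    (m.group("kind"), m.group("name")): m.group("body").splitlines()
    for m in SECTION_RE.finditer(text)
}
for key, lines in parsed.items():
    print(key, lines)
```

The backreferences (?P=kind) and (?P=name) make each START pair up only with its own STOP. The trade-off is that the whole file has to sit in memory, which the line-by-line loop from post #2 avoids.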

Mar 8 '07 #5
On Mar 8, 10:35 am, cmfvulcan...@gmail.com wrote:

(snipped)

You can use iterators:

import StringIO
import itertools

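def group(line):
    # Bump the counter on every START line; each line ends with "\n",
    # so the slice [-6:-1] is the last five characters before it.
    if line[-6:-1] == 'START':
        group.current = group.current + 1
    return group.current

group.current = 0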

data = """
#VS:COMMAND:df:START
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/vzfs 20971520 517652 20453868 3% /
tmpfs 2016032 44 2015988 1% /var/run
tmpfs 2016032 0 2016032 0% /var/lock
tmpfs 2016032 0 2016032 0% /dev/shm
tmpfs 2016032 44 2015988 1% /var/run
tmpfs 2016032 0 2016032 0% /var/lock
#VS:COMMAND:df:STOP

#VS:FILE:/proc/loadavg:START
0.00 0.00 0.00 1/32 14543
#VS:FILE:/proc/loadavg:STOP

#VS:FILE:/proc/meminfo:START
MemTotal: 524288 kB
MemFree: 450448 kB
Buffers: 0 kB
Cached: 0 kB
SwapCached: 0 kB
Active: 0 kB
Inactive: 0 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 524288 kB
LowFree: 450448 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
Mapped: 73840 kB
Slab: 0 kB
CommitLimit: 0 kB
Committed_AS: 248704 kB
PageTables: 0 kB
VmallocTotal: 0 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
#VS:FILE:/proc/meminfo:STOP

#VS:FILE:/proc/stat:START
cpu 67188 0 26366 391669264 656686 0 0
cpu0 24700 0 10830 195807826 373309 0 0
cpu1 42488 0 15536 195861438 283376 0 0
intr 0
swap 0 0
ctxt 18105366807
btime 1171391058
processes 26501285
procs_running 1
procs_blocked 0
#VS:FILE:/proc/stat:STOP

#VS:FILE:/proc/uptime:START
1962358.88 1577059.05
#VS:FILE:/proc/uptime:STOP
""".lstrip("\n");

fh = StringIO.StringIO(data)

# Drop blank lines (just "\n"), then group lines by the section counter.
sections = itertools.groupby(
    itertools.ifilter(lambda line: len(line) > 1, fh),
    lambda line: group(line))

for key, section in sections:
    for line in section:
        print key, line,
--
Hope this helps,
Steven
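
The post above is Python 2 (StringIO.StringIO, itertools.ifilter, the print statement). A rough Python 3 equivalent of the same groupby idea might look like this sketch; make_grouper is a hypothetical helper that replaces the function attribute with a closure, and the sample text is shortened. As in the original, the START and STOP marker lines stay inside each group.

```python
import io
import itertools


def make_grouper():
    """Return a key function whose value goes up by one at every START line."""
    current = 0

    def key(line):
        nonlocal current
        if line.rstrip("\n").endswith(":START"):
            current += 1
        return current

    return key


# Shortened, hypothetical stand-in for the data string in the post.
data = (
    "#VS:COMMAND:df:START\n"
    "/dev/vzfs 20971520 517652 20453868 3% /\n"
    "#VS:COMMAND:df:STOP\n"
    "\n"
    "#VS:FILE:/proc/loadavg:START\n"
    "0.00 0.00 0.00 1/32 14543\n"
    "#VS:FILE:/proc/loadavg:STOP\n"
)

fh = io.StringIO(data)
non_blank = filter(lambda line: line.strip(), fh)  # built-in filter is lazy in Python 3

# Each group must be consumed before groupby advances, which the
# comprehension does by building the list right away.
grouped = {
    number: [line.rstrip("\n") for line in lines]
    for number, lines in itertools.groupby(non_blank, make_grouper())
}
for number, lines in grouped.items():
    print(number, lines)
```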

Mar 9 '07 #6
Here is the code I've come up with. Please feel free to critique it
and let me know what you would change. Also, as you can see I call
"open(SERVER,'r')" twice; but I want to only call it once, what would
the best way to do this be?

------------------------------------------------------------

import re

SERVER = "192.168.1.60"

# Pull all data from server file.
FILE = open(SERVER,'r')
ALLINFO = FILE.read()

# Grab a list of all sections in the server file.
SECTIONS = re.findall(r"(?m)^\#VS:\w*:.*:", ALLINFO)

# Remove duplicates from the list.
if SECTIONS:
    SECTIONS.sort()
    LAST = SECTIONS[-1]
    for I in range(len(SECTIONS)-2, -1, -1):
        if LAST == SECTIONS[I]:
            del SECTIONS[I]
        else:
            LAST = SECTIONS[I]

# Pull data from each section and assign it a dictionary item.
# Data can be called using SECTIONDICT['section'], i.e. SECTIONDICT['df']
SECTIONDICT = {}
for SECT in SECTIONS:
    PRESECTNAME1 = SECT[9:len(SECT) - 1]
    PRESECTNAME2 = PRESECTNAME1.split("/")
    SECTNAME = PRESECTNAME2[len(PRESECTNAME1.split("/")) - 1]
    START = SECT + "START"
    STOP = SECT + "STOP"
    for LINE in open(SERVER,'r'):
        LINE = LINE.strip()
        if START in LINE:
            SECTIONLISTTEMP = []
        elif STOP in LINE:
            SECTIONDICT[SECTNAME] = SECTIONLISTTEMP
            SECTIONLISTTEMP = []
            print("-" * 80)
            print("SECTION: %s" % SECTNAME)
            print(SECTIONDICT[SECTNAME])
        else:
            if LINE:
                SECTIONLISTTEMP.append(LINE)

FILE.close()

------------------------------------------------------------
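
Two steps of the code above could be shortened; the following is a sketch, not the poster's code, and assumes Python 3. The sort-and-delete loop can be replaced by sorted(set(...)). The fixed offset in SECT[9:len(SECT) - 1] fits the "#VS:FILE:" prefix, but for "#VS:COMMAND:df:" that slice gives "ND:df", so the sketch splits on ":" instead. The helper short_name and the sample text are hypothetical.

```python
import re

# Hypothetical excerpt in the same shape as the sample file.
text = (
    "#VS:COMMAND:df:START\n"
    "/dev/vzfs 20971520 517652 20453868 3% /\n"
    "#VS:COMMAND:df:STOP\n"
    "\n"
    "#VS:FILE:/proc/loadavg:START\n"
    "0.00 0.00 0.00 1/32 14543\n"
    "#VS:FILE:/proc/loadavg:STOP\n"
)

# Essentially the pattern from the post; set() drops the duplicate
# produced by each START/STOP pair, sorted() gives a stable order.
prefixes = sorted(set(re.findall(r"(?m)^#VS:\w*:.*:", text)))


def short_name(prefix):
    """Map '#VS:FILE:/proc/loadavg:' to 'loadavg' and '#VS:COMMAND:df:' to 'df'."""
    # Strip '#VS:' and the trailing ':', then drop the kind ('FILE', 'COMMAND').
    target = prefix[len("#VS:"):-1].partition(":")[2]
    # Keep only the last path component.
    return target.rsplit("/", 1)[-1]


names = [short_name(p) for p in prefixes]
print(prefixes)
print(names)
```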

Mar 9 '07 #7
On Fri, 09 Mar 2007 11:28:15 -0300, Vulcanius <cm**********@gmail.com> wrote:

> Here is the code I've come up with. Please feel free to critique it
> and let me know what you would change. Also, as you can see I call
> "open(SERVER,'r')" twice; but I want to only call it once, what would
> the best way to do this be?

You got a reply yesterday from rune.strand@g... without regexps that looks
pretty functional; have you seen it?

> SECTIONDICT = {}
> for SECT in SECTIONS:
>     PRESECTNAME1 = SECT[9:len(SECT) - 1]
>     PRESECTNAME2 = PRESECTNAME1.split("/")

Ugh... don't use UPPERCASE names for variables, please!
Better to follow this style guide: http://www.python.org/dev/peps/pep-0008/

--
Gabriel Genellina
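
Putting the two suggestions together (the regex-free loop from post #2 and PEP 8 style names), the question about calling open() only once might be answered along these lines. This is a sketch under assumptions, not a definitive version: Python 3, marker lines shaped like the sample in post #5, and dictionary keys taken from the last path component of each marker. The function name read_sections and the temporary demo file are made up for illustration.

```python
import os
import tempfile


def read_sections(path):
    """Parse the file in a single pass and return {short_name: [lines]}."""
    sections = {}
    name = None  # None means "currently outside any section"
    body = []
    with open(path, "r") as handle:  # opened once, closed automatically
        for raw in handle:
            line = raw.strip()
            if line.startswith("#VS:") and line.endswith(":START"):
                # '#VS:FILE:/proc/loadavg:START' -> 'loadavg'
                target = line[len("#VS:"):-len(":START")].partition(":")[2]
                name = target.rsplit("/", 1)[-1]
                body = []
            elif line.startswith("#VS:") and line.endswith(":STOP"):
                if name is not None:
                    sections[name] = body
                name = None
            elif line and name is not None:
                body.append(line)
    return sections


# Demo on a temporary file standing in for the real server file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(
        "#VS:COMMAND:df:START\n"
        "/dev/vzfs 20971520 517652 20453868 3% /\n"
        "#VS:COMMAND:df:STOP\n"
        "\n"
        "#VS:FILE:/proc/loadavg:START\n"
        "0.00 0.00 0.00 1/32 14543\n"
        "#VS:FILE:/proc/loadavg:STOP\n"
    )
    demo_path = tmp.name

section_dict = read_sections(demo_path)
os.remove(demo_path)
print(section_dict["df"])
print(section_dict["loadavg"])
```

Since every section is collected during the same pass, there is no need for a separate list of section names or a second read of the file.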

Mar 10 '07 #8
