PNY 20
New Member
Hi there,
I am having some trouble with list manipulation and was hoping someone could help me.
I have no problem reading in a text file as a list using filename.readlines(). However, I am having some trouble searching through this list to display the elements that I want.
I have a text file with this general format:
***************
text text text text
text text text text
text text text text
text text text text
***************
text text text text
text text text text
***************
text text text text
text text text text
text text text text
text text text text
text text text text
text text text text
text text text text
***************
etc. and what I'm having trouble trying to do is:
1) group together the text sections which are in between the asterisks, so as to make separate 'records' (in a sense...)
2) if I search for a certain word within those 'records', display every entire 'record' that contains the search word (the word can appear in one or more 'records')
No matter what combination of loops I try, I am unsuccessful. Either the first line of the file prints out forever, or my code isn't able to print anything out at all.
Please help!
bvdet 2,851
Recognized Expert Moderator Specialist
This piece of code will get you started:
fList = [x.strip().split() for x in open('your_file').readlines()]
You now have a list of all lines split into words and stripped of newlines.
Initialize an output list.
Iterate through the list 'fList'.
If the keyword is 'in' the item, append the item to the output list.
If encapsulated in a function, return the output list.
HTH :)
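The steps above could be sketched roughly as follows. The function name find_lines and the sample list are made up for illustration, and note that this works line by line, so it returns matching lines rather than whole records:

```python
def find_lines(lines, keyword):
    """Return every line (as a list of words) that contains keyword."""
    # Strip the newline from each line and split it into words.
    word_lists = [line.strip().split() for line in lines]
    matches = []                  # initialize an output list
    for words in word_lists:      # iterate through the list
        if keyword in words:      # the keyword is 'in' the item
            matches.append(words)
    return matches                # encapsulated in a function, so return it


# Hypothetical stand-in for the result of readlines()
sample = ["text text text text\n", "text word text text\n", "***************\n"]
print(find_lines(sample, "word"))
```

Because each item is a list of words, the 'in' test here matches whole words only.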
Using the sample input you provided, say you want to search for "word":
***************
text text text text
text text text text
text text text text
text text text text
***************
text word text text
text text text text
***************
text text text text
text text text text
text text text text
text text text text
text text text text
text text text text
text text text text
***************
Assuming the number of asterisks in each separator line remains constant:
>>> data = open("file").read().split("***************")
>>> data
['', '\ntext text text text\ntext text text text\ntext text text text\ntext text text text\n', '\ntext word text text\ntext text text text\n', '\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\n', '']
>>> for items in data:
...     if "word" in items:
...         print(items)  # print the block where "word" is found
...

text word text text
text text text text

bartonc 6,596
Recognized Expert Expert
This is certainly a job for regular expressions. Regular expressions (the re module in Python) are a very underutilized tool. This is probably due to their quirky syntax and terse nature. I don't have the skills (yet) to help at this moment, but I am excited to tear into my new copy of Mastering Regular Expressions very soon. There are some website tutorials to get you started.
Basically, you write an expression that says "give me all the text between *** and *** that also has the searched-for word(s)", and the re module returns all that text. It is very worthwhile to (at least) experiment with. Have fun!
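As a rough illustration of that idea (not code from this post), here is one way such an expression might look. It assumes the records themselves never contain an asterisk and that every record is followed by another separator line; the name find_records and the sample text are hypothetical:

```python
import re


def find_records(text, word):
    """Return the body of every asterisk-delimited record containing word."""
    # Record body: any run of non-asterisk characters that includes the
    # search word as a whole word (assumes records contain no '*').
    body = r'[^*]*?\b' + re.escape(word) + r'\b[^*]*'
    # A separator line, then the captured body, then a lookahead for the
    # next separator so it stays available to open the following record.
    pattern = re.compile(r'\*+\n(' + body + r')(?=\*)')
    return pattern.findall(text)


sample = (
    "*****\nalpha beta\n"
    "*****\ntext word text\nmore text\n"
    "*****\nnothing here\n"
    "*****\nlast word\n"
    "*****\n"
)
for record in find_records(sample, "word"):
    print(record)
```

The lookahead matters because neighbouring records share a separator line; consuming it would leave the next record without an opening separator.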
bvdet 2,851
Recognized Expert Moderator Specialist
Oops - I misread the OP: a record is the text in between asterisks, not each line. This is basically the same as ghostdog74's post:
import re

def get_records(fn, key):
    fList = [i for i in re.split(r'\*', open(fn).read()) if i != '']
    records = []
    for item in fList:
        if key in item:
            # to get rid of newlines
            # records.append(' '.join(item.strip().split('\n')))
            records.append(item)
    return records

fn = 'your_file'
key = 'word'
records = get_records(fn, key)
for record in records:
    print(record)
>>>
text word text text
text text text text
PNY 20
New Member
Thanks so much everyone! My code is working!!! Thanks so much for all your help!! :D
Oops - I misread the OP: a record is the text in between asterisks, not each line. This is basically the same as ghostdog74's post:
...
fList = [i for i in re.split(r'\*', open(fn).read()) if i != '']
...
One minor issue here: you are splitting on a single "*", so the raw split output contains many '' entries, one for each extra "*" in a separator line:
['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '\ntext text text text\ntext text text text\ntext text text text\ntext text text text\n', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '\ntext text text text\ntext text text text\n', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\n', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
A better way may be to split on runs of "*", e.g.:
re.split(r'\*+', open("file").read())
Example output:
['', '\ntext text text text\ntext text text text\ntext text text text\ntext text text text\n', '\ntext text text text\ntext text text text\n', '\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\n', '']
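Putting the two suggestions together, a minimal sketch might split on runs of asterisks and still drop the leftover empty pieces. The helper names split_records and search_records and the sample text are assumptions for illustration only:

```python
import re


def split_records(text):
    """Split text into records on runs of one or more asterisks."""
    parts = re.split(r'\*+', text)
    # Drop empty or whitespace-only pieces (before the first separator and
    # after the last one) and trim the newlines around each record.
    return [part.strip('\n') for part in parts if part.strip()]


def search_records(text, key):
    """Return every record whose text contains key."""
    return [record for record in split_records(text) if key in record]


sample = (
    "***************\na b\nc d\n"
    "***************\ntext word text\nmore\n"
    "***************\n"
)
print(search_records(sample, "word"))
```

Here 'key in record' is a plain substring test, so "word" would also match "sword"; a regular expression with \b would be needed for whole-word matching.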
bvdet 2,851
Recognized Expert Moderator Specialist
That's why I excluded the '' in the list comprehension. Good point though.
PNY 20
New Member
Hi again,
I have a (hopefully) simple question:
Now that the code is working and displays the proper search results at the command prompt, I am writing the output to a text file instead. That works, but for some reason it only outputs the last occurrence of the search word (and the record it is found in).
Here is what I have so far:
import re, sys

fn = sys.argv[1]
key = sys.argv[2]

def get_records(fn, key):
    fList = [i for i in re.split(r'\*', open(fn).read()) if i != '']
    records = []
    for item in fList:
        if key in item:
            records.append(item)
    return records

records = get_records(fn, key)
for record in records:
    former, sys.stdout = sys.stdout, open('output.txt', 'w')
    print(record)
    results, sys.stdout = sys.stdout, former
    results.close()
Please help (again!) :)
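A likely cause, if the open() call sits inside the loop as it appears to: mode 'w' truncates the file every time it is opened, so each record overwrites the one before and only the last survives. A minimal sketch of one fix is to open the file once, before the loop. The function name write_records is hypothetical:

```python
def write_records(records, path):
    """Write every record to path, opening the file only once."""
    # 'w' truncates the file, so open it a single time before the loop
    # rather than once per record.
    with open(path, 'w') as out:
        for record in records:
            out.write(record)
            out.write('\n')  # same trailing newline that print would add
```

Calling write_records(records, 'output.txt') after get_records() would then replace the loop that swaps sys.stdout back and forth. Opening with mode 'a' inside the loop would also keep every record, but it would keep appending across separate runs of the script.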
PNY 20
New Member
Hi again,
I realized what my mistake was but I have yet another question regarding the same code.
I would like to insert a line (the 'underscore' line) in between each 'record'. How would I do that?
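One possible approach, sketched under assumptions: trim the newlines around each record and join the records with the separator line, so the line lands only between records. The separator length of 15 underscores and the name join_with_separator are made up for illustration:

```python
SEPARATOR = "_" * 15  # hypothetical underscore line; the length is an assumption


def join_with_separator(records, separator=SEPARATOR):
    """Return the records as one string with separator between each pair."""
    # Trim the newlines left over from splitting so spacing stays even.
    cleaned = [record.strip("\n") for record in records]
    # str.join puts the separator only between items, never after the last.
    return ("\n" + separator + "\n").join(cleaned) + "\n"


print(join_with_separator(["text word text\nmore text\n", "\nlast word\n"]))
```

The resulting string can be written to the output file in a single write() call.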