list manipulation

PNY
20 New Member
Hi there,

I am having some trouble with list manipulation and was hoping someone could help me.

I have no problem reading in a text file as a list using filename.readlines(). However, I am having trouble searching through this list to display the elements that I want.

I have a text file with this general format:

***************
text text text text
text text text text
text text text text
text text text text
***************
text text text text
text text text text
***************
text text text text
text text text text
text text text text
text text text text
text text text text
text text text text
text text text text
***************

etc. What I'm having trouble doing is:
1) grouping the sections of text between the lines of asterisks into separate 'records' (in a sense...)
2) searching for a certain word within those 'records' and displaying every entire 'record' that contains it (the search word can appear in one or more 'records')

No matter what combination of loops I try, I am unsuccessful. Either the first line of the file prints out forever, or my code doesn't print anything at all.

Please help!
Mar 27 '07 #1
bvdet
2,851 Recognized Expert Moderator Specialist
This piece of code will get you started:
    fList = [x.strip().split() for x in open('your_file').readlines()]
You now have a list of all lines split into words and stripped of newlines.

Initialize an output list.
Iterate through the list 'fList'.
If the keyword is 'in' the item, append the item to the output list.
If encapsulated in a function, return the output list.
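The steps above can be sketched roughly as follows. This is a minimal Python 3 sketch, not bvdet's own code: the function name find_lines and the sample text are made up for illustration, and it filters individual lines (as lists of words), not whole records.

```python
def find_lines(lines, keyword):
    """Return every line (as a list of words) that contains keyword as a whole word."""
    matches = []                    # initialize an output list
    for words in lines:             # iterate through the list of split lines
        if keyword in words:        # the keyword is 'in' the item
            matches.append(words)   # append the item to the output list
    return matches                  # encapsulated in a function, so return it


# Hypothetical sample text standing in for the contents of 'your_file'.
sample = "alpha beta\ngamma word delta\n***************\n"
f_list = [line.strip().split() for line in sample.splitlines()]

print(find_lines(f_list, "word"))  # [['gamma', 'word', 'delta']]
```

Because each item is a list of words, the 'in' test only matches whole words; searching for "wor" would return nothing.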

HTH :)
Mar 27 '07 #2
ghostdog74
511 Recognized Expert Contributor
Using the sample input you provided, say you want to search for "word":
    ***************
    text text text text
    text text text text
    text text text text
    text text text text
    ***************
    text word text text
    text text text text
    ***************
    text text text text
    text text text text
    text text text text
    text text text text
    text text text text
    text text text text
    text text text text
    ***************
Assuming the number of asterisks remains constant:
    >>> data = open("file").read().split("***************")
    >>> data
    ['', '\ntext text text text\ntext text text text\ntext text text text\ntext text text text\n', '\ntext word text text\ntext text text text\n', '\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\n', '']
    >>> for items in data:
    ...     if "word" in items:
    ...         print(items)  # print the block where "word" is found
    ...

    text word text text
    text text text text

Mar 27 '07 #3
bartonc
6,596 Recognized Expert Expert
This is certainly a job for regular expressions. Regular expressions (the re module in Python) are a very underutilized tool, probably due to their quirky syntax and terse nature. I don't have the skills (yet) to help at this moment, but I am excited to tear into my new copy of Mastering Regular Expressions very soon. There are some website tutorials to get you started.

Basically, you write an expression that says "give me all the text between *** and *** that also has the searched-for word(s)", and the re module returns all that text. It is very worthwhile to (at least) experiment with. Have fun!
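As a rough illustration of that idea, here is a minimal Python 3 sketch. It is a two-step variant, not the single expression described above: one pattern splits the text on lines made up only of asterisks, and a second pattern tests each record for the word. The names SEPARATOR and find_records and the sample text are assumptions made for this example.

```python
import re

# A whole line consisting only of asterisks (optionally followed by spaces or tabs).
SEPARATOR = re.compile(r"^\*+[ \t]*$", re.MULTILINE)


def find_records(text, word):
    """Return the records (text between asterisk lines) that contain word as a whole word."""
    wanted = re.compile(r"\b" + re.escape(word) + r"\b")
    records = (chunk.strip() for chunk in SEPARATOR.split(text))
    return [record for record in records if record and wanted.search(record)]


# Hypothetical sample text in the format from the original question.
sample = (
    "***************\n"
    "text text text text\n"
    "***************\n"
    "text word text text\n"
    "text text text text\n"
    "***************\n"
    "no swords here\n"
    "***************\n"
)

for record in find_records(sample, "word"):
    print(record)
```

The \b word boundaries keep "swords" from counting as a match for "word"; drop them if substring matches are wanted.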
Mar 27 '07 #4
bvdet
2,851 Recognized Expert Moderator Specialist
Oops - I misread the OP: a record is the text between asterisks, not each line. This is basically the same as ghostdog74's post:
    import re

    def get_records(fn, key):
        # split the file contents on every asterisk and drop the empty strings
        fList = [i for i in re.split(r'\*', open(fn).read()) if i != '']
        records = []
        for item in fList:
            if key in item:
                # to get rid of newlines
                # records.append(' '.join(item.strip().split('\n')))
                records.append(item)
        return records


    fn = 'your_file'
    key = 'word'
    records = get_records(fn, key)
    for record in records:
        print(record)
>>>
text word text text
text text text text
Mar 27 '07 #5
PNY
20 New Member
Thanks so much everyone! My code is working!!! Thanks so much for all your help!! :D


Mar 27 '07 #6
ghostdog74
511 Recognized Expert Contributor
Regarding this line from bvdet's post:
    ...
    fList = [i for i in re.split(r'\*', open(fn).read()) if i != '']
    ...
One minor issue here: you are splitting on a single "*", so the raw output of re.split contains lots of '' entries, because each of the many "*" is split on separately:
    ['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '\ntext text text text\ntext text text text\ntext text text text\ntext text text text\n', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '\ntext text text text\ntext text text text\n', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\n', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
Maybe a better way is to split on runs of "*", e.g.
    re.split(r'\*+', open("file").read())
Example output:
    ['', '\ntext text text text\ntext text text text\ntext text text text\ntext text text text\n', '\ntext text text text\ntext text text text\n', '\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\ntext text text text\n', '']
Mar 28 '07 #7
bvdet
2,851 Recognized Expert Moderator Specialist
That's why I excluded the '' in the list comprehension. Good point though.
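A small Python 3 sketch of why the two approaches end up with the same records once the '' entries are filtered out. The shortened separator and the sample text are made up for this example.

```python
import re

# Hypothetical sample text with a shortened separator line.
text = "*****\nfirst record\n*****\nsecond record\n*****"

# One split per asterisk: many empty strings, all removed by the filter.
single = [chunk for chunk in re.split(r"\*", text) if chunk != ""]

# One split per run of asterisks: only the leading and trailing pieces are empty.
runs = [chunk for chunk in re.split(r"\*+", text) if chunk != ""]

print(single == runs)  # True
print(runs)            # ['\nfirst record\n', '\nsecond record\n']
```

Either way, the non-empty pieces are the stretches of text that contain no asterisk at all, so both patterns would also break a record apart at a stray "*" inside its text.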
Mar 28 '07 #8
PNY
20 New Member
Hi again,

I have a (hopefully) simple question:

Now that the code is working and displays the proper search results in the command prompt, I am outputting to a text file instead. That is all fine and dandy, but for some reason it only outputs the last occurrence of the search word (and the record it is found in).

Here is what I have so far:

    import re, sys

    fn = sys.argv[1]
    key = sys.argv[2]

    def get_records(fn, key):
        fList = [i for i in re.split(r'\*', open(fn).read()) if i != '']
        records = []
        for item in fList:
            if key in item:
                records.append(item)
        return records

    records = get_records(fn, key)
    for record in records:
        former, sys.stdout = sys.stdout, open('output.txt', 'w')
        print(record)
        results, sys.stdout = sys.stdout, former
        results.close()
Please help (again!) :)

Mar 28 '07 #9
PNY
20 New Member
Hi again,

I realized what my mistake was but I have yet another question regarding the same code.

I would like to insert a line (the 'underscore' line) in between each 'record'. How would I do that?
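One possible way to do this, as a minimal Python 3 sketch: open the output file once, outside the loop, and join the records with the underscore line. Opening a file in 'w' mode truncates it, so reopening it for every record leaves only the last one behind. The name write_records, the width of the underscore line, and the sample records are assumptions made for this example.

```python
import os
import tempfile

SEPARATOR = "_" * 15  # hypothetical width for the 'underscore' line


def write_records(records, path):
    """Write the records to path with an underscore line between each pair."""
    # Open the file once; 'w' mode would truncate it on every new open.
    with open(path, "w") as out:
        out.write(("\n" + SEPARATOR + "\n").join(r.strip() for r in records))
        out.write("\n")


# Hypothetical records, shaped like the ones returned by get_records().
records = ["\ntext word text text\n", "\nword again\n"]
path = os.path.join(tempfile.mkdtemp(), "output.txt")
write_records(records, path)

with open(path) as handle:
    print(handle.read())
```

Using join puts the separator only between records, so there is no stray underscore line after the last one.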
Mar 28 '07 #10
