
file to nested list

Newbie
 
Join Date: Feb 2009
Posts: 31
#1: Mar 16 '09
I'm having a lot of trouble with this. Basically, I am given a file, for example:

File =

>123
qwerty
>567
tyuiyuy
>987
poiuyt

and basically the output is supposed to be [[">123" , "qwerty"] , [">567" , "tyuiyuy"] , [">987" , "poiuyt"]]

I have no idea how to do that. Can someone help me, please?
Newbie
 
Join Date: Jan 2009
Posts: 17
#2: Mar 16 '09

re: file to nested list


I'm assuming you know basic file I/O?

Basically, what you need to do is create an empty list (let's call it 'root'), open the file you want with open() (let's call this 'file'), then have a loop where you get two lines at a time using file.readline() (store these in two separate variables, 'a' and 'b' for example). Then you need to do root.append([a, b]), where [a, b] will end up being [">123" , "qwerty"] from your example. When there are no more lines to read, stop the loop, and there you have your list (root).
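The steps above could be sketched roughly like this. It is only a minimal illustration: the function name read_pairs and the in-memory sample (io.StringIO standing in for a real file opened with open()) are assumptions made for the example, and the strip() calls are added so the trailing newlines do not end up in the list.

```python
import io


def read_pairs(handle):
    """Read two lines at a time from an open file-like object."""
    root = []
    while True:
        a = handle.readline()
        b = handle.readline()
        if not a:  # readline() returns '' once the end of the file is reached
            break
        # strip() drops the trailing newline from each line
        root.append([a.strip(), b.strip()])
    return root


# Hypothetical sample data; a real file object from open() works the same way.
sample = io.StringIO(">123\nqwerty\n>567\ntyuiyuy\n>987\npoiuyt\n")
pairs = read_pairs(sample)
print(pairs)
```

Note that this only works when every record is exactly two lines long.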

Does that make any sense?
Newbie
 
Join Date: Feb 2009
Posts: 31
#3: Mar 16 '09

re: file to nested list


Quote:

Originally Posted by musicfreak

I'm assuming you know basic file I/O?

Basically, what you need to do is create an empty list (let's call it 'root'), open the file you want with open() (let's call this 'file'), then have a loop where you get two lines at a time using file.readline() (store these in two separate variables, 'a' and 'b' for example). Then you need to do root.append([a, b]), where [a, b] will end up being [">123" , "qwerty"] from your example. When there are no more lines to read, stop the loop, and there you have your list (root).

Does that make any sense?

Yep, that makes a lot of sense, and I understand fully what you're trying to say, except I forgot to say that the file could also be

file =

>123
qwerty
ioweio
>567
tyuiyuy
>987
poiuyt

output = [[">123" , "qwertyioweio"] , [">567" , "tyuiyuy"] , [">987" , "poiuyt"]]

The output for >123 is "qwertyioweio" because the data for one entry can run over several lines. I kept the example short for simplicity's sake, but basically after "qwerty" the line was full, so the rest went onto a new line. I hope that makes sense.

Also, is there a way to use a for loop for this question?
Newbie
 
Join Date: Jan 2009
Posts: 17
#4: Mar 16 '09

re: file to nested list


Oh I see. Well in that case, you're going to need to make a temporary list for each "sub list": if the line starts with '>', start a new sub-list with that line as its first part; otherwise, add the line onto the second part of the current sub-list. Then just append each sub-list to the 'root'.

You could use a for loop, although it would be easier not to. Is this for an assignment? If you have to use a for loop, then you could read the file all at once into a string and split() it by newlines, and then loop through each line (for line in lines:, or something like that).

EDIT: Actually, now that I think about it, the way I just mentioned would probably be simpler haha.
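A rough sketch of that for-loop idea, assuming the whole file has already been read into one string. The names group_records and current are made up for the example, and skipping blank lines is an extra assumption about the input.

```python
def group_records(text):
    """Group lines into [header, data] sub-lists; headers start with '>'."""
    root = []
    current = None  # the temporary sub-list being filled
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines carry no data
        if line.startswith(">"):
            current = [line, ""]
            root.append(current)
        elif current is not None:
            current[1] += line  # continuation of the current record
    return root


# Hypothetical sample text standing in for the contents of the file.
text = ">123\nqwerty\nioweio\n>567\ntyuiyuy\n>987\npoiuyt\n"
records = group_records(text)
print(records)
```

Data lines that appear before the first '>' line are silently ignored here; whether that is the right behaviour depends on the assignment.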
Newbie
 
Join Date: Feb 2009
Posts: 31
#5: Mar 16 '09

re: file to nested list


Does your way use a while loop? =S (which I hate)
Newbie
 
Join Date: Jan 2009
Posts: 17
#6: Mar 16 '09

re: file to nested list


Yeah, the first method I mentioned would use a while loop (it was the first thing that popped into my head since that's how I learned it), but I guess using a for loop would, in the end, be much simpler and cleaner. Go for the 'for' loop. :)
bvdet
Moderator
 
Join Date: Oct 2006
Location: Nashville, TN
Posts: 1,560
#7: Mar 16 '09

re: file to nested list


Let us assume that you have read the file using file object method readlines().
data = ['>123\n', 'qwerty\n', 'ioweio\n', '>567\n', 'tyuiyuy\n', '>987\n', 'poiuyt\n']

output = []

for item in data:
    if item.startswith('>'):
        output.append([item.strip(), ''])
    else:
        output[-1][1] += item.strip()

print output
Output:
[['>123', 'qwertyioweio'], ['>567', 'tyuiyuy'], ['>987', 'poiuyt']]
Newbie
 
Join Date: Feb 2009
Posts: 31
#8: Mar 16 '09

re: file to nested list


I think I'm close... can you debug my code?

import urllib

def qwerty(f):

    a = urllib.urlopen(f)
    s = ""
    l = a.readlines()
    y = s.join(l)
    #p = y.replace('\r', "")
    q = y.split('\r')

    output = []

    for item in q:
        if item.startswith('>'):
            output.append([item.strip(), ''])
        else:
            output[-1][1] += item.strip()

    print output
When I test it with

qwerty("http://www.utsc.utoronto.ca/~szamosi/a20/assignments/a3/starter/sequences.txt")

it gives me the whole thing as a single list within a list; it doesn't separate the entries. I think that's because in your "data" list the \n appears after each string, whereas in mine it appears before. =s
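A small sketch of what is probably going on here, assuming the downloaded text uses '\r\n' line endings (the thread does not confirm this, and the sample string is made up). Splitting on '\r' leaves a '\n' at the front of every piece after the first, so startswith('>') only matches once.

```python
# Assumption: the file uses Windows-style '\r\n' line endings.
sample = ">123\r\nqwerty\r\n>567\r\ntyuiyuy\r\n"

pieces = sample.split("\r")
print(pieces)

# Only the very first piece still begins with '>'; the rest begin with '\n'.
headers = [p for p in pieces if p.startswith(">")]
print(headers)

# Stripping each piece before testing it removes the leading '\n'.
stripped = [p.strip() for p in pieces if p.strip()]
print(stripped)
```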
bvdet
Moderator
 
Join Date: Oct 2006
Location: Nashville, TN
Posts: 1,560
#9: Mar 16 '09

re: file to nested list


You have made it too complicated. There is no need to join, split, or replace. String method strip() will remove all leading and trailing whitespace. You created an open file object, but never closed it. You should try to make your variable names more descriptive instead of using single letters. In my example, variable data is comparable to your variable l.
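Putting those points together, one possible cleaned-up version might look like the sketch below. It is written for Python 3, where urlopen lives in urllib.request and returns bytes (the thread itself uses the older Python 2 urllib.urlopen); the name read_records, the UTF-8 decoding, and the temporary file used for the offline demonstration are all assumptions for illustration.

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlopen  # Python 3 location of urlopen


def read_records(url):
    """Return [[header, data], ...] for the text found at url."""
    # The with block closes the connection even if reading fails.
    with urlopen(url) as response:
        # Assumption: the text is UTF-8 encoded.
        file_lines = [raw.decode("utf-8") for raw in response.readlines()]

    records = []
    for line in file_lines:
        line = line.strip()  # removes '\r', '\n' and surrounding spaces
        if line.startswith(">"):
            records.append([line, ""])
        elif line and records:
            records[-1][1] += line
    return records


# Offline demonstration: a temporary local file read through a file:// URL.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b">123\r\nqwerty\r\nioweio\r\n>567\r\ntyuiyuy\r\n")
demo_url = Path(tmp.name).as_uri()
records = read_records(demo_url)
os.remove(tmp.name)
print(records)
```

The function returns the list instead of printing it, so the caller can decide what to do with the result.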

HTH
Newbie
 
Join Date: Feb 2009
Posts: 31
#10: Mar 16 '09

re: file to nested list


Ah OK, now I see... thanks for your replies!!
boxfish
Expert
 
Join Date: Mar 2008
Location: California
Posts: 478
#11: Mar 17 '09

re: file to nested list


Quote:

Originally Posted by bvdet

You should try to make your variable names more descriptive instead of using single letters. In my example, variable data is comparable to your variable l.

Sorry, bvdet, but read this. How about calling it file_lines instead?
bvdet
Moderator
 
Join Date: Oct 2006
Location: Nashville, TN
Posts: 1,560
#12: Mar 17 '09

re: file to nested list


Quote:

Originally Posted by boxfish

Sorry, bvdet, but read this. How about calling it file_lines instead?

That would work. I sometimes name a variable of short duration, such as an open file object (f) or loop counter (i), with a single letter. Stringing several in a row makes the code unreadable. I might do something like this:
import urllib

def qwerty(fn): # fn for file name

    f = urllib.urlopen(fn) # **EDIT**
    fList = f.readlines() # or lineList or line_list or lines
    f.close()
boxfish
Expert
 
Join Date: Mar 2008
Location: California
Posts: 478
#13: Mar 18 '09

re: file to nested list


Yeah, all my loop variables are called i, j, k, l, m, n, o, etc. It makes my nested loops unreadable.
Edit:
Wow, urllib is a built-in library I can use! I had no idea.