simple way to un-nest (flatten?) list

djc
There is, I am sure, an easy way to do this, but I seem to be brain dead
tonight. So:

I have a table such that I can do

[line for line in table if line[7]=='JDOC']
and
[line for line in table if line[7]=='Aslib']
and
[line for line in table if line[7]=='ASLIB']
etc

I also have a dictionary
r= {'a':('ASLIB','Aslib'),'j':('JDOC', 'jdoc')}
so I can extract values
r.values()
[('ASLIB', 'Aslib'), ('JDOC', 'jdoc')]

I would like to do

[line for line in table if line[7] in ('JDOC','jdoc','Aslib','ASLIB')]

so how should I get from
{'a':('ASLIB','Aslib'),'j':('JDOC','jdoc')}
to
('Aslib','ASLIB','JDOC','jdoc')

--
djc
Nov 5 '06 #1
djc wrote:
I would like to do

[line for line in table if line[7] in ('JDOC','jdoc','Aslib','ASLIB')]

so how should I get from
{'a':('ASLIB','Aslib'),'j':('JDOC','jdoc')}
to
('Aslib','ASLIB','JDOC','jdoc')
Meet itertools:

from itertools import chain
names = set(chain(*r.itervalues()))
print [line for line in table if line[7] in names]
George

Nov 5 '06 #2
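For readers on Python 3, where `dict.itervalues()` and the `print` statement no longer exist, the same itertools idea might look roughly like the sketch below. Only `r` comes from the original post; the `table` rows are invented for illustration, since the thread never shows the real table.

```python
from itertools import chain

# Dictionary from the original post.
r = {'a': ('ASLIB', 'Aslib'), 'j': ('JDOC', 'jdoc')}

# Hypothetical table: each row has 8 fields and field 7 holds the name.
table = [
    ['x'] * 7 + ['JDOC'],
    ['x'] * 7 + ['Aslib'],
    ['x'] * 7 + ['other'],
]

# Flatten the tuples of names into one set for fast membership tests.
names = set(chain.from_iterable(r.values()))

# Keep only the rows whose field 7 is one of the known names.
matching = [line for line in table if line[7] in names]
print(matching)
```

`chain.from_iterable` avoids unpacking the values with `*`, and the set makes each `in` test independent of how many names there are.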
On Sun, 05 Nov 2006 21:43:33 +0000, djc wrote:
I would like to do

[line for line in table if line[7] in ('JDOC','jdoc','Aslib','ASLIB')]
What is the purpose of the "if line[7]" bit?
so how should I get from
{'a':('ASLIB','Aslib'),'j':('JDOC','jdoc')}
to
('Aslib','ASLIB','JDOC','jdoc')

Assuming you don't care what order the strings are in:

r = {'a':('ASLIB','Aslib'),'j':('JDOC','jdoc')}
result = sum(r.values(), ())

If you do care about the order:

r = {'a':('ASLIB','Aslib'),'j':('JDOC','jdoc')}
keys = r.keys()
keys.sort()
result = []
for key in keys:
    result.extend(r[key])
result = tuple(result)
--
Steven.

Nov 6 '06 #3
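A possible Python 3 rendering of both variants, as a sketch only: in Python 3 `r.keys()` returns a view without a `sort()` method, so `sorted(r)` is used here instead. The variable names `unordered` and `ordered` are illustrative and not from the thread.

```python
r = {'a': ('ASLIB', 'Aslib'), 'j': ('JDOC', 'jdoc')}

# Order follows whatever order the dict yields its values in.
unordered = sum(r.values(), ())

# Deterministic order: walk the keys in sorted order.
ordered = []
for key in sorted(r):
    ordered.extend(r[key])
ordered = tuple(ordered)

print(unordered)
print(ordered)
```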
djc
Thank you everybody.
As it is possible that the tuples will not always be the same word in
variant cases
result = sum(r.values(), ())
will do fine and is as simple as I suspected the answer would be.

--
djc
Nov 6 '06 #4
djc:
As it is possible that the tuples will not always be the same word in
variant cases
result = sum(r.values(), ())
will do fine and is as simple as I suspected the answer would be.
It is simple, but I suggest you take a look at the speed of that
part of the code in your program. With this you can see the
difference:

from time import clock
d = dict((i,range(300)) for i in xrange(300))

t = clock()
r1 = sum(d.values(), [])
print clock() - t

t = clock()
r2 = []
for v in d.values(): r2.extend(v)
print clock() - t

assert r1 == r2

Bye,
bearophile

Nov 6 '06 #5
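The benchmark above is Python 2 (`xrange`, the `print` statement, and `time.clock`, which was removed in Python 3.8). A rough Python 3 equivalent, assuming `time.perf_counter` is an acceptable substitute for `clock`, could be:

```python
from time import perf_counter

n = 300  # same size as in the post above
d = {i: list(range(n)) for i in range(n)}

# Approach 1: sum builds a brand-new list on every addition.
t = perf_counter()
r1 = sum(d.values(), [])
t_sum = perf_counter() - t

# Approach 2: extend grows a single list in place.
t = perf_counter()
r2 = []
for v in d.values():
    r2.extend(v)
t_extend = perf_counter() - t

assert r1 == r2
print(f"sum: {t_sum:.4f}s  extend: {t_extend:.4f}s")
```

The absolute numbers depend on the machine and interpreter, so no expected timings are given here.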
djc
Yes, interesting, and well worth noting

1 for v in d.values(): r1.extend(v)

2 from itertools import chain
set(chain(*d.itervalues()))

3 set(v for t in d.values() for v in t)

4 sum(d.values(), [])

5 reduce((lambda l,v: l+v), d.values())

on IBM R60e [CoreDuo 1.6GHz/2GB]
d = dict((i,range(x)) for i in xrange(x))
   x     t1     t2     t3      t4      t5
 300   0.0    0.02   0.04    0.31    0.32
 500   0.01   0.09   0.1     1.67    1.69
1000   0.02   0.3    0.4    16.17   16.15
       0.03   0.28   0.42   16.37   16.31
1500   0.03   0.76   0.94   57.05   57.13
2000   0.07   1.2    1.66  136.6   136.97
2500   0.11   2.34   2.64  268.44  268.85
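One plausible reading of the t4 and t5 columns: with x lists of x items each, `sum` and the equivalent `reduce` copy the whole accumulated list at every step, so the work grows roughly with the cube of x, while `extend` touches each element about once. The sketch below is a back-of-envelope model that counts element copies only. That is an assumption: it ignores allocation and interpreter overhead. The function names are made up for illustration.

```python
def copies_with_sum(x):
    # Step k creates a new list of k*x elements (accumulator plus next list).
    return sum(k * x for k in range(1, x + 1))

def copies_with_extend(x):
    # Each of the x*x elements is appended once.
    return x * x

for x in (1000, 2000):
    print(x, copies_with_sum(x), copies_with_extend(x))

# Predicted slowdown of the sum approach when x doubles from 1000 to 2000.
ratio = copies_with_sum(2000) / copies_with_sum(1000)
print(round(ratio, 2))
```

The model predicts a factor of about 8 when x doubles from 1000 to 2000; the measured t4 values go from 16.17 to 136.6, a factor of roughly 8.4, which is at least consistent with it.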

but on the other hand, as the intended application is a small command
line app where x is unlikely to reach double figures and there are only
two users, myself included:
d = {'a':['ASLIB','Aslib'],'j':['JDOC','jdoc'],'x':['test','alt','3rd'],'y':['single',]}
0.0 0.0 0.0 0.0 0.0

And sum(d.values(), []) has the advantage of raising a TypeError in the
case of a possible mangled input.

{'a':['ASLIB','Aslib'],'j':['JDOC','jdoc'],'x':['test','alt','3rd'],'y':'single'}
r1
['ASLIB', 'Aslib', 'test', 'alt', '3rd', 'JDOC', 'jdoc', 's', 'i', 'n',
'g', 'l', 'e']
r2
set(['Aslib', 'JDOC', 'g', '3rd', 'i', 'l', 'n', 'ASLIB', 's', 'test',
'jdoc', 'alt', 'e'])
r4 = sum(d.values(), [])
TypeError: can only concatenate list (not "str") to list

--
djc
Nov 8 '06 #6
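The mangled-input behaviour described above can be reproduced in Python 3 with a sketch like the following. The dictionary is a shortened version of the one in the post. The explicit `isinstance` check at the end is an added suggestion, not something proposed in the thread, for the case where one of the faster approaches is preferred but bad entries should still be caught.

```python
from itertools import chain

# 'y' is meant to be a list but is a bare string (the mangled case).
d = {'a': ['ASLIB', 'Aslib'], 'j': ['JDOC', 'jdoc'], 'y': 'single'}

# chain iterates the string character by character without complaint.
silently_split = set(chain.from_iterable(d.values()))
print('s' in silently_split)

# sum refuses to add a str to a list, so the bad entry is caught.
try:
    sum(d.values(), [])
    caught = False
except TypeError as exc:
    caught = True
    print("rejected:", exc)

# Explicit validation: list the keys whose value is not a list or tuple.
bad_keys = [k for k, v in d.items() if not isinstance(v, (list, tuple))]
print(bad_keys)
```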
