
Dictionaries and loops

Hi All
I have a CSV file that I'm reading in, and each row looks like
this:

{None: ['User-ID', 'Count']}
{None: ['576460847178667334', '1']}
{None: ['576460847178632334', '8']}

I want to build a dictionary of the form
{576460847178667334:1, 576460847178632334:8, ..... } covering all rows in
the data file.

My code so far is:

dict1={}
j=1
for row in reader1:
    if j==1:
        j+=1
        continue #thus allowing me to skip the first row
    if j>1:
        for element in row.values():
            for item in element:
                if int(item)%2==0:
                    dict1[int(item)] = int(item)+1
                    # i know this is the problem line as it's not picking the second item
                    # up just finding the first and increasing it, but i can't figure out
                    # how to correct this?
        j+=1

I get one dictionary from this, but not with the correct data inside. Can
anyone help?

Sep 8 '08 #1
Mike P wrote:
> Hi All
> I have a CSV file that I'm reading in, and each row looks like
> this:
>
> {None: ['User-ID', 'Count']}
> {None: ['576460847178667334', '1']}
> {None: ['576460847178632334', '8']}

This doesn't look like a CSV file at all... Is that what you actually
have in the file, or what you get from the csv.reader ???

> I want to build a dictionary of the form
> {576460847178667334:1, 576460847178632334:8, ..... } covering all rows in
> the data file.
>
> My code so far is:
>
> dict1={}
> j=1
> for row in reader1:
>     if j==1:
>         j+=1
>         continue #thus allowing me to skip the first row
>     if j>1:

Drop this, and call reader1.next() before entering the loop.

>         for element in row.values():
>             for item in element:
>                 if int(item)%2==0:
>                     dict1[int(item)] = int(item)+1

You're repeating the same operation (building an int from a string)
three times, where once would be enough:

for item in element:
    item = int(item)
    if item % 2 == 0:  # or: if not item % 2:
        dict1[item] = item + 1

But this code is not going to yield the expected result...
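
To see why, here is a minimal sketch that runs the loop on the two sample rows and compares it with unpacking each row as a pair. It is written for Python 3 (the thread's own code is Python 2), and the names `rows`, `per_item` and `per_pair` are illustrative only, not from the original post:

```python
# Rows as the original poster's reader produced them (header already skipped).
rows = [
    {None: ['576460847178667334', '1']},
    {None: ['576460847178632334', '8']},
]

# The loop from the question: each item is handled on its own, so every even
# number becomes a key mapped to itself + 1 and odd numbers are dropped.
per_item = {}
for row in rows:
    for element in row.values():
        for item in element:
            item = int(item)
            if item % 2 == 0:
                per_item[item] = item + 1

# Unpacking each row as a (user_id, count) pair gives the mapping asked for.
per_pair = {}
for row in rows:
    for element in row.values():
        user_id, count = element
        per_pair[int(user_id)] = int(count)

print(per_item)
print(per_pair)
```

The inner `for item in element` loop never sees the ID and the count together, which is why the count cannot end up as the value.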

> # i know this is the problem line as it's not picking the second item
> # up just finding the first and increasing it, but i can't figure out
> # how to correct this?

Mmm... What about learning Python instead of trying random code?
Programming by accident won't take you very far, and you can't expect
this newsgroup to do your work for you.

OK, assuming your CSV file looks like this - and you never have
duplicate values in the User-ID column:

# source.csv
"User-ID", "Count"
576460847178667334, 1
576460847178632334, 8

Here's a possible solution:

import csv

result = {}
src = open("source.csv", "rb")
try:
    reader = csv.reader(src)
    reader.next()  # skip the header row
    for row in reader:
        user_id, count = int(row[0]), int(row[1])
        result[user_id] = count
finally:
    src.close()

or more tersely:

src = open("source.csv", "rb")
try:
    reader = csv.reader(src)
    reader.next()
    result = dict(map(int, row) for row in reader)
finally:
    src.close()
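
Both snippets above are Python 2 (`reader.next()`, binary-mode `open`). For readers on Python 3, a rough equivalent might look like the sketch below. It assumes a header row followed by two integer columns; `load_counts` and `SAMPLE` are hypothetical names, and an in-memory string stands in for the file so the example is self-contained:

```python
import csv
import io

# Stand-in for the file; with a real file use open("source.csv", newline="").
SAMPLE = "User-ID,Count\n576460847178667334,1\n576460847178632334,8\n"


def load_counts(lines):
    """Return {user_id: count} from CSV lines whose first row is a header."""
    reader = csv.reader(lines)
    next(reader)  # skip the header row (Python 3 spelling of reader.next())
    return {int(user_id): int(count) for user_id, count in reader}


result = load_counts(io.StringIO(SAMPLE))
print(result)
```

As in the original, a duplicate User-ID would silently overwrite the earlier count.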

Sep 8 '08 #2
A few solutions, not much tested:

data = """{None: ['User-ID', 'Count']}
{None: ['576460847178667334', '1']}
{None: ['576460847178632334', '8']}"""

lines = iter(data.splitlines())
lines.next()

identity_table = "".join(map(chr, xrange(256)))
result = {}
for line in lines:
    parts = line.translate(identity_table, "'[]{},").split()
    key, val = map(int, parts[1:])
    assert key not in result
    result[key] = val
print result

(With Python 3 finally that identity_table can be replaced by None)
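
In Python 3, `str.translate` accepts only a table argument, so the two-argument call above does not carry over unchanged to text strings. One way to express the same deletion is a table built with `str.maketrans`. This is a minimal sketch assuming Python 3 `str` input; `DATA` and `STRIP_PUNCTUATION` are illustrative names:

```python
DATA = """{None: ['User-ID', 'Count']}
{None: ['576460847178667334', '1']}
{None: ['576460847178632334', '8']}"""

# Three-argument str.maketrans: every character in the third argument is deleted.
STRIP_PUNCTUATION = str.maketrans("", "", "'[]{},")

lines = iter(DATA.splitlines())
next(lines)  # skip the header line

result = {}
for line in lines:
    # "{None: ['5764...', '1']}" becomes "None: 5764... 1"
    parts = line.translate(STRIP_PUNCTUATION).split()
    key, val = map(int, parts[1:])
    assert key not in result
    result[key] = val
print(result)
```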

# --------------------------------------

import re

patt = re.compile(r"(\d+).+?(\d+)")

lines = iter(data.splitlines())
lines.next()

result = {}
for line in lines:
    key, val = map(int, patt.search(line).groups())
    assert key not in result
    result[key] = val
print result

# --------------------------------------

from itertools import groupby

lines = iter(data.splitlines())
lines.next()

result = {}
for line in lines:
    key, val = (int("".join(g)) for h, g in groupby(line, key=str.isdigit) if h)
    assert key not in result
    result[key] = val
print result

Bye,
bearophile
Sep 8 '08 #3
Bruno Desthuilliers:
> This doesn't look like a CSV file at all... Is that what you actually
> have in the file, or what you get from the csv.reader ???

I presume you are right, the file probably doesn't contain that stuff
like I have assumed in my silly/useless solutions :-)

Bye,
bearophile
Sep 8 '08 #4
be************@lycos.com wrote:
> Bruno Desthuilliers:
>> This doesn't look like a CSV file at all... Is that what you actually
>> have in the file, or what you get from the csv.reader ???
>
> I presume you are right, the file probably doesn't contain that stuff
> like I have assumed in my silly/useless solutions :-)

Yeps. I suspect the OP found a very creative way to misuse
csv.DictReader, but I couldn't figure out how he managed to get such a mess.
Sep 8 '08 #5
Thanks for the solution above,

The raw data looked like
User-ID,COUNTS
576460840144207854,6
576460821700280307,2
576460783848259584,1
576460809027715074,3
576460825909089607,1
576460817407934470,1

and I used

import csv

CSV_INPUT1 = "C:/Example work/Attr_model/Activity_test.csv"
fin1 = open(CSV_INPUT1, "rb")
reader1 = csv.DictReader((fin1), [], delimiter=",")
for row in reader1:
    print row

with the following outcome.
{None: ['User-ID', 'COUNTS']}
{None: ['576460840144207854', '6']}
{None: ['576460821700280307', '2']}
{None: ['576460783848259584', '1']}
{None: ['576460809027715074', '3']}
{None: ['576460825909089607', '1']}
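
The `{None: [...]}` rows come from the empty list passed as the second argument, which `csv.DictReader` treats as the field names. With no names to pair against, every value in a row counts as "extra" and is collected in a list under `restkey`, whose default is `None`; the header line is also returned as ordinary data. A minimal Python 3 sketch of the difference, with an in-memory string standing in for the file (`RAW`, `misused` and `by_header` are illustrative names):

```python
import csv
import io

RAW = "User-ID,COUNTS\n576460840144207854,6\n576460821700280307,2\n"

# An empty fieldnames list: there are no column names to pair with the values.
misused = list(csv.DictReader(io.StringIO(RAW), []))

# No fieldnames argument: the first row is consumed as the header.
by_header = list(csv.DictReader(io.StringIO(RAW)))

print(misused[0])
print(dict(by_header[0]))
```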

So I can see csv.reader is what I should have been using.

Thanks for the help
Sep 8 '08 #6
Mike P wrote:
> Thanks for the solution above,
>
> The raw data looked like
> User-ID,COUNTS
> 576460840144207854,6
> 576460821700280307,2
> 576460783848259584,1
> 576460809027715074,3
> 576460825909089607,1
> 576460817407934470,1
>
> and I used
>
> CSV_INPUT1 = "C:/Example work/Attr_model/Activity_test.csv"
> fin1 = open(CSV_INPUT1, "rb")
> reader1 = csv.DictReader((fin1), [], delimiter=",")

This should have been:

reader1 = csv.DictReader(fin1, delimiter=",")

or even just csv.DictReader(fin1), since IIRC ',' is the default
delimiter (I'll let you check this by yourself...).

With that, you would have had:
[
{'User-ID':'576460840144207854', 'COUNTS':'6'},
{'User-ID':'576460821700280307', 'COUNTS':'2'},
# etc...
]
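
From rows of that shape, the dictionary asked for at the start of the thread is one comprehension away. A minimal Python 3 sketch, assuming the header is exactly `User-ID,COUNTS` as in the raw data shown; `RAW` and `counts` are illustrative names and the string stands in for the real file:

```python
import csv
import io

RAW = "User-ID,COUNTS\n576460840144207854,6\n576460821700280307,2\n"

# With no fieldnames argument, DictReader takes the column names from the first row.
reader = csv.DictReader(io.StringIO(RAW))
counts = {int(row["User-ID"]): int(row["COUNTS"]) for row in reader}
print(counts)
```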

> with the following outcome.
> {None: ['User-ID', 'COUNTS']}
> {None: ['576460840144207854', '6']}
> {None: ['576460821700280307', '2']}
> {None: ['576460783848259584', '1']}
> {None: ['576460809027715074', '3']}
> {None: ['576460825909089607', '1']}

And you didn't notice anything strange ???

> So I can see csv.reader is what I should have been using.

With only 2 values, DictReader is probably a bit overkill, yes.

> Thanks for the help

You're welcome. But do yourself a favour: take time to *learn* Python -
at least the very basic (no pun) stuff like iterating over a sequence.
Sep 8 '08 #7
