
Strange thing with types

TYR
I'm doing some data normalisation, which involves data from a Web site
being extracted with BeautifulSoup, cleaned up with a regex, then
having the current year as returned by time()'s tm_year attribute
inserted, before the data is concatenated with string.join() and fed
to time.strptime().

Here's some code:
timeinput = re.split('[\s:-]', rawtime)
print timeinput #trace statement
print year #trace statement
t = timeinput.insert(2, year)
print t #trace statement
t1 = string.join(t, '')
timeobject = time.strptime(t1, "%d %b %Y %H %M")

year is a Unicode string; so is the data in rawtime (BeautifulSoup
gives you Unicode, dammit). And here's the output:

[u'29', u'May', u'01', u'00'] (OK, so the regex is working)
2008 (OK, so the year is a year)
None (...but what's this?)
Traceback (most recent call last):
File "bothv2.py", line 71, in <module>
t1 = string.join(t, '')
File "/usr/lib/python2.5/string.py", line 316, in join
return sep.join(words)
TypeError
Jun 27 '08 #1
4 Replies


First, don't use the string module anymore. Use, e.g.:

''.join(t)

Second, you can only join strings, but year is an integer, so convert it to
a string first:

t = timeinput.insert(2, str(year))

Diez
Jun 27 '08 #2
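
For anyone reading along later, a minimal sketch of the join advice above. It assumes Python 3 syntax (the thread itself uses Python 2.5), a sample list copied from the trace output, and an integer year as the reply supposes. The single-space separator is also an assumption: the question's code passes '' as the separator, which would run the fields together, while the strptime format has spaces between them.

```python
parts = ["29", "May", "01", "00"]   # sample values from the trace output
year = 2008                          # assumed to be an int, as the reply supposes

# Joining a sequence that contains a non-string raises TypeError.
error = None
try:
    " ".join(parts[:2] + [year] + parts[2:])
except TypeError as exc:
    error = str(exc)

# Converting the year with str() first makes every item a string.
joined = " ".join(parts[:2] + [str(year)] + parts[2:])
print(joined)
```

The str method is called on the separator, so no import of the string module is needed.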

list.insert modifies the list in-place:
>>> l = [1, 2, 3]
>>> l.insert(2, 4)
>>> l
[1, 2, 4, 3]

It also returns None, which is what you're assigning to 't' and then
trying to join.

Replace your usage of 't' with 'timeinput' and it should work.
Jun 27 '08 #3
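
A sketch of the corrected sequence, for reference. It is written in Python 3 syntax rather than the thread's Python 2.5. The rawtime value is a hypothetical sample chosen to reproduce the list in the trace output, year is hard-coded as a string, and the space separator in join is an assumption made so that the joined text matches the strptime format.

```python
import re
import time

rawtime = "29 May 01:00"   # hypothetical input; the real value comes from the scraped page
year = "2008"              # assumed to already be a string

timeinput = re.split(r"[\s:-]", rawtime)

# list.insert changes timeinput in place and returns None,
# so its return value should not be kept.
result = timeinput.insert(2, year)

# Join the modified list itself, not the None returned by insert.
t1 = " ".join(timeinput)

# %b matches the abbreviated month name for the current locale.
timeobject = time.strptime(t1, "%d %b %Y %H %M")
print(timeobject.tm_year, timeobject.tm_mon, timeobject.tm_mday)
```

The only change needed for the None problem is to stop assigning the result of insert; the rest follows the question's code.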

TYR
On May 29, 2:23 pm, "Diez B. Roggisch" <de...@nospam.web.de> wrote:
First, don't use the string module anymore. Use, e.g.:

''.join(t)

Second, you can only join strings, but year is an integer, so convert it to
a string first:

t = timeinput.insert(2, str(year))

Diez

Yes, tm_year is converted to a Unicode string elsewhere in the program.
Jun 27 '08 #4
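
A side note on the trace statements: print shows 2008 for both an integer and a string, so the output alone cannot settle which one year is. A small sketch of a type check, assuming Python 3 syntax and assuming the year is read from time.localtime() (tm_year is an attribute of the struct_time it returns and is an int until converted):

```python
import time

year_int = time.localtime().tm_year   # an int
year_str = str(year_int)              # the converted form

# Printed output is identical for both; repr() and type() tell them apart.
same_when_printed = str(year_int) == str(year_str)
different_repr = repr(year_int) != repr(year_str)
print(type(year_int).__name__, type(year_str).__name__)

parts = ["29", "May", "01", "00"]     # sample values from the trace output
parts.insert(2, year_str)

# True only if every item can safely be passed to join.
all_strings = all(isinstance(p, str) for p in parts)
```

Printing repr(year) instead of year in a trace statement gives the same information with less code.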

TYR
On May 29, 2:24 pm, alex23 <wuwe...@gmail.com> wrote:
list.insert modifies the list in-place:
>>> l = [1, 2, 3]
>>> l.insert(2, 4)
>>> l
[1, 2, 4, 3]

It also returns None, which is what you're assigning to 't' and then
trying to join.

Replace your usage of 't' with 'timeinput' and it should work.
Thank you.
Jun 27 '08 #5
