
Question about concatenation error

I am new to Python and I am confused as to why, when I try to
concatenate 3 strings, it isn't working properly.

Here is the code:

------------------------------------------------------------------------------------------
import string
import sys
import re
import urllib

linkArray = []
srcArray = []
website = sys.argv[1]

urllib.urlretrieve(website, 'getfile.txt')

filename = "getfile.txt"
input = open(filename, 'r')
reg1 = re.compile('href=".*"')
reg3 = re.compile('".*?"')
reg4 = re.compile('http')
Line = input.readline()

while Line:
    searchstring1 = reg1.search(Line)
    if searchstring1:
        rawlink = searchstring1.group()
        link = reg3.search(rawlink).group()
        link2 = link.split('"')
        cleanlink = link2[1:2]
        fullink = reg4.search(str(cleanlink))
        if fullink:
            linkArray.append(cleanlink)
        else:
            cleanlink2 = str(website) + "/" + str(cleanlink)
            linkArray.append(cleanlink2)
    Line = input.readline()

print linkArray
-----------------------------------------------------------------------------------------------

I get this:

["http://www.slugnuts.com/['index.html']",
"http://www.slugnuts.com/['movies.html']",
"http://www.slugnuts.com/['ramblings.html']",
"http://www.slugnuts.com/['sluggies.html']",
"http://www.slugnuts.com/['movies.html']"]

instead of this:

["http://www.slugnuts.com/index.html",
"http://www.slugnuts.com/movies.html",
"http://www.slugnuts.com/ramblings.html",
"http://www.slugnuts.com/sluggies.html",
"http://www.slugnuts.com/movies.html"]

The concatenation isn't working the way I expected it to. I suspect
that I am screwing up by mixing types, but I can't see where...

I would appreciate any advice or pointers.

Thanks.
Sep 7 '05 #1
4 Replies


On Wed, 07 Sep 2005 16:34:25 GMT, colonel <th******@camelrichard.org>
wrote:
I am new to python and I am confused as to why when I try to
concatenate 3 strings, it isn't working properly.


Okay. It works if I change:

    fullink = reg4.search(str(cleanlink))
    if fullink:
        linkArray.append(cleanlink)
    else:
        cleanlink2 = str(website) + "/" + str(cleanlink)

to

    fullink = reg4.search(cleanlink[0])
    if fullink:
        linkArray.append(cleanlink[0])
    else:
        cleanlink2 = str(website) + "/" + cleanlink[0]

So can anyone tell me why "cleanlink" gets converted to a list? Is it
during the slicing?
Thanks.
Sep 7 '05 #2
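
For readers following along, here is a minimal sketch of the corrected loop, written for Python 3 rather than the Python 2 of the original post. The function name `extract_links` and the sample lines are hypothetical, introduced only for illustration; the three regular expressions are the ones from the post. It works on a list of lines instead of a downloaded file so that it can run on its own.

```python
import re

# Same patterns as in the post above.
HREF_RE = re.compile('href=".*"')
QUOTED_RE = re.compile('".*?"')
HTTP_RE = re.compile('http')


def extract_links(lines, website):
    """Collect href targets, prefixing relative ones with `website`."""
    links = []
    for line in lines:
        match = HREF_RE.search(line)
        if not match:
            continue
        quoted = QUOTED_RE.search(match.group()).group()
        # Index with [1] to get the string between the quotes;
        # the slice [1:2] would give a one-element list instead.
        cleanlink = quoted.split('"')[1]
        if HTTP_RE.search(cleanlink):
            links.append(cleanlink)
        else:
            links.append(website + "/" + cleanlink)
    return links


# Hypothetical sample input standing in for the downloaded page.
sample = [
    '<a href="index.html">Home</a>',
    '<p>no link here</p>',
    '<a href="http://example.com/a.html">Elsewhere</a>',
]
result = extract_links(sample, "http://www.slugnuts.com")
print(result)
```

Because `cleanlink` is already a string here, neither `str()` nor `[0]` is needed before concatenating.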

colonel wrote:
so can anyone tell me why "cleanlink" gets converted to a list? Is it
during the slicing?
Thanks.


The statement

cleanlink = link2[1:2]

results in a list of one element. If you want to access element one
(the second in the list) then use

cleanlink = link2[1]

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Sep 7 '05 #3
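
The difference between the slice and the index can be seen in a short sketch (Python 3 syntax; the literal strings are taken from the output in the original post, and the variable names `sliced`, `indexed`, `with_slice`, and `with_index` are illustrative only):

```python
link2 = '"index.html"'.split('"')   # ['', 'index.html', '']

sliced = link2[1:2]    # a slice of a list is a list, here with one element
indexed = link2[1]     # an index gives the element itself, here a string

print(type(sliced).__name__, repr(sliced))
print(type(indexed).__name__, repr(indexed))

# str() of a list is its printed representation, brackets and quotes included,
# which is where the stray ['...'] in the output comes from.
with_slice = "http://www.slugnuts.com" + "/" + str(sliced)
with_index = "http://www.slugnuts.com" + "/" + indexed
print(with_slice)
print(with_index)
```

The first URL reproduces the unwanted result from the question; the second is the intended one.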


"colonel" <th******@camelrichard.org> wrote in message
news:cu********************************@4ax.com...
so can anyone tell me why "cleanlink" gets converted to a list?
Is it during the slicing?


Steve answered for you, but for next time, you could find out faster by
either using the all-purpose debugging tool known as 'print' or, with
Python, the handy-dandy interactive window:

>>> [1,2,3][1:2]
[2]

Terry J. Reedy

Sep 7 '05 #4
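
The same experiment can be extended a little to show why the slice is easy to misread. As a rough sketch in Python 3 (the `samples` list is made up for illustration), slicing returns the same kind of sequence it was given, so only a string slice looks like a single item:

```python
samples = [[1, 2, 3], (1, 2, 3), "123"]

for seq in samples:
    # Left: the one-element slice. Right: the element itself.
    print(repr(seq[1:2]), "vs", repr(seq[1]))

# The slice keeps the container type: list, tuple, str.
slice_types = [type(seq[1:2]) for seq in samples]
print(slice_types)
```

Printing with `repr()` rather than relying on a plain print makes the brackets and quotes visible, which is usually enough to spot a list where a string was expected.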

On Wednesday 07 September 2005 11:34 am, colonel wrote:
I am new to python and I am confused as to why when I try to
concatenate 3 strings, it isn't working properly.

Here is the code:

I'm not taking the time to really study it, but at first
glance, the code looks like it's probably much more
complicated than it needs to be.

["http://www.slugnuts.com/['index.html']",
"http://www.slugnuts.com/['movies.html']",
"http://www.slugnuts.com/['ramblings.html']",
"http://www.slugnuts.com/['sluggies.html']",
"http://www.slugnuts.com/['movies.html']"]


The tail end of that is the string representation of
a list containing one string, not of that string. I
suspect you needed to use ''.join() somewhere. Or,
you could, in principle, have indexed the list, since
you only want one member of it, e.g.:

>>> ['index.html'][0]
'index.html'

--
Terry Hancock ( hancock at anansispaceworks.com )
Anansi Spaceworks http://www.anansispaceworks.com

Sep 7 '05 #5
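
Both suggestions, `''.join()` and indexing, can be compared in a small sketch (Python 3; the variable names are illustrative and the values come from the output quoted in the thread). They agree for a one-element list and differ when the list is empty:

```python
cleanlink = ['index.html']

joined = ''.join(cleanlink)   # concatenates every string in the list
first = cleanlink[0]          # picks out the single element

url = "http://www.slugnuts.com" + "/" + joined
print(joined, first, url)

# With an empty list the two approaches behave differently:
empty = []
joined_empty = ''.join(empty)         # quietly gives an empty string
try:
    empty[0]
    index_failed = False
except IndexError:
    index_failed = True               # indexing raises instead
print(repr(joined_empty), index_failed)
```

Which behavior is preferable depends on whether a missing link should be skipped silently or reported as an error.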
