Simple Question : files and URLLIB

Hi - I'm new to Python. I've been trying to use URLLIB and the 'tidy'
function (part of the mx.tidy package). There's one thing I'm having
real difficulties understanding. When I did this ...

finA= urllib.urlopen('http://www.python.org/')
foutA=open('C:\\testout.html','w')
tidy(finA,foutA,None)

I get ...

Traceback (most recent call last):
File "<interactive input>", line 1, in ?
File "mx\Tidy\Tidy.py", line 38, in tidy
return mxTidy.tidy(input, output, errors, kws)
TypeError: inputstream must be a file object or string

.... what I don't understand is surely the result of a urllib is a file
object ? Isn't it ? To quote the manual at :

http://www.python.org/doc/current/li...le-urllib.html

"If all went well, a file-like object is returned". I can make the
tidy function happy by changing the code to read ...

finA= urllib.urlopen('http://www.python.org/').read()

.... I haven't had time to look into this properly yet but I suspect
finA is now a string not a file handle ?

Anyway if anyone can throw light on this I would be grateful.

thanks

richard.shea.
Jul 18 '05 #1
> "If all went well, a file-like object is returned".

"file-like" means having an interface similar to a file object's (methods
read, readline, etc.), without being a real file.

mxTidy.tidy most probably requires a real file to be passed;
just look into Tidy.py (line 38) and you'll know for sure.
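
To make the distinction concrete, here is a minimal sketch in modern Python 3 (the thread itself uses Python 2). `FileLike`, `duck_typed_read` and `strict_read` are hypothetical names made up for illustration, and the type check only mimics the kind of test that could produce the error above; it is not taken from mx.Tidy's source.

```python
import io


class FileLike:
    """Hypothetical stand-in for a file-like object: has read(), is not a file."""

    def __init__(self, text):
        self._buffer = io.StringIO(text)

    def read(self):
        return self._buffer.read()


def duck_typed_read(source):
    # Accepts anything that exposes a read() method.
    return source.read()


def strict_read(source):
    # Assumption: mimics a library that insists on a real file object or a string.
    if isinstance(source, str):
        return source
    if isinstance(source, io.IOBase):
        return source.read()
    raise TypeError("inputstream must be a file object or string")


print(duck_typed_read(FileLike("<html></html>")))

try:
    strict_read(FileLike("<html></html>"))
except TypeError as exc:
    print("rejected:", exc)
```

The duck-typed function is satisfied by the interface alone, while the strict one rejects the same object because of its type.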

--
bromden[at]gazeta.pl

Jul 18 '05 #2
> finA= urllib.urlopen('http://www.python.org/').read()
>
> ... I haven't had time to look into this properly yet but I suspect
> finA is now a string not a file handle ?


Correct. If you do:
print type(finA)
you obtain the result:
<type 'str'>

If you do:
finA= urllib.urlopen('http://www.python.org/')
print type(finA)
then you obtain the result:
<type 'instance'>

Compare this with:
finA = open("blah", "w")
print type(finA)
which gives the result:
<type 'file'>

According to the docs on urlopen( url[, data[, proxies]]) :
"If all went well, a file-like object is returned."
So the answer would appear to be: "close, but no cigar".
Jul 18 '05 #3

"Richard Shea" <ri*********@fastmail.fm> wrote in message
news:28*************************@posting.google.com...
> ... what I don't understand is surely the result of a urllib is a file
> object ? Isn't it ? To quote the manual at :
>
> http://www.python.org/doc/current/li...le-urllib.html
>
> "If all went well, a file-like object is returned".

A 'file-like object' is different from a 'file object'. From the urllib.py
docstring:
"The object returned by URLopener().open(file) will differ per
protocol. All you know is that is has methods read(), readline(),
readlines(), fileno(), close() and info()."

Why this is not good enough for mx.tidy is a question for its author.

> I can make the tidy function happy by changing the code to read ...
>
> finA= urllib.urlopen('http://www.python.org/').read()
>
> ... I haven't had time to look into this properly yet but I suspect
> finA is now a string not a file handle ?

Yes. So it meets the 'file or string' requirement.
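
One way to satisfy a 'file or string' requirement regardless of what you start with is to normalize the input to a string first. The following is a rough Python 3 sketch; `as_text` is a hypothetical helper name, and decoding bytes as UTF-8 is an assumption that will not hold for every page.

```python
import io


def as_text(source):
    """Return the content of `source` as a string.

    `source` may already be a string, or any object exposing read().
    """
    if isinstance(source, str):
        return source
    data = source.read()
    # Byte streams (such as network responses in Python 3) need decoding.
    # Assumption: UTF-8; undecodable bytes are replaced rather than raising.
    if isinstance(data, bytes):
        return data.decode("utf-8", errors="replace")
    return data


print(as_text("<p>already a string</p>"))
print(as_text(io.StringIO("<p>from a text stream</p>")))
print(as_text(io.BytesIO(b"<p>from a byte stream</p>")))
```

In-memory streams stand in for the network response here so the example runs without a connection.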

Terry J. Reedy
Jul 18 '05 #4
Thanks to everyone for the info/feedback. In particular I didn't know
you could do that ...

type(finA)

.... business (which shows you how new to Python I am probably) but
it'll come in handy.

As I think you realised, I had misunderstood exactly what urllib was
offering; however, the blah.read() approach is quite good enough. Just
out of curiosity though, if 'tidy' demanded a file (rather than being
prepared to take a string as it is), would the only sure approach be to
....

f1=open('C:\\workfile.html','w')
strHTML= urllib.urlopen('http://www.python.org/').read()
f1.write(strHTML)
tidy(f1,strOut,None)

.... that is to take the string that results from the read on urllib
file-like object and write it back out to a file ?

Just wondering ...

Thanks again for the information on my original question.

regards

richard.


Jul 18 '05 #5
