Why does this fail?

New to Python question, why does this fail?

Thanks,
Dave

---testcase.py---
import sys, urllib, htmllib
def Checkit(URL):
    try:
        print "Opening", URL
        f = urllib.open(URL)
        f.close()
        return 1
    except:
        return 0

rtfp = Checkit("http://www.python.org/doc/Summary.html")
if rtfp == 1:
    print "OK"
else:
    print "Fail"

python testcase.py

Jul 18 '05 #1
In article <vv************@corp.supernews.com>,
"Dave Murray" <dl******@micro-net.com> wrote:
New to Python question, why does this fail?

---testcase.py---
import sys, urllib, htmllib
def Checkit(URL):
    try:
        print "Opening", URL
        f = urllib.open(URL)
        f.close()
        return 1
    except:

Here, try

    except Exception, details:
        print "Exception: ", details
        return 0

rtfp = Checkit("http://www.python.org/doc/Summary.html")
if rtfp == 1:
    print "OK"
else:
    print "Fail"


Then you'll see

Opening http://www.python.org/doc/Summary.html
Exception: 'module' object has no attribute 'open'
Fail

You probably mean urlopen.
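
For anyone following along on Python 3, here is a minimal sketch of the same fix. It assumes Python 3, where the urlopen function mentioned above lives at urllib.request.urlopen; the lowercase name checkit and the True/False return values are illustrative choices, not the original poster's code.

```python
import urllib.request


def checkit(url):
    """Return True if the URL can be opened, False otherwise."""
    try:
        print("Opening", url)
        # urlopen is the function the original code meant to call.
        f = urllib.request.urlopen(url)
        f.close()
        return True
    except Exception as details:
        # Report the failure instead of silently swallowing it.
        print("Exception:", details)
        return False
```

Calling checkit("http://www.python.org/doc/Summary.html") needs network access; whatever goes wrong is printed rather than hidden behind a bare "Fail".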

Regards. Mel.
Jul 18 '05 #2
On Sun, 2004-01-04 at 18:58, Dave Murray wrote:
New to Python question, why does this fail?
[...] try: [...] except:

[...]

Because you're treating all errors as if they're what you expect. You
should be more specific in your except clause. Do this and you'll see
what I mean:

try:
    whatever
except:
    raise # raise whatever exception occurred
    return 0

In other words, you should be explicit about the errors you silence.

Also, it's not clear what Checkit() is actually supposed to do. Is it
supposed to verify the URL actually exists? urllib doesn't raise an
error for 404 not found--urllib2 does. Try that instead.
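
A rough sketch of what "being explicit" might look like, assuming Python 3, where urllib.request.urlopen behaves like urllib2 and raises urllib.error.HTTPError for a 404. The function name link_status, its return strings, and the opener parameter (a hook for substituting a fake in tests) are all hypothetical.

```python
import urllib.error
import urllib.request


def link_status(url, opener=urllib.request.urlopen):
    """Classify a URL as 'ok', 'http-error <code>', or 'unreachable'."""
    try:
        response = opener(url)
        response.close()
        return "ok"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        # HTTPError is a subclass of URLError, so it must be caught first.
        return "http-error %d" % err.code
    except urllib.error.URLError:
        # No usable answer at all: unknown scheme, DNS failure, refused connection.
        return "unreachable"
```

Only the two urllib error types are caught here; anything else (a malformed URL raising ValueError, say) still propagates, so unexpected bugs stay visible.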

Cheers,

// m
Jul 18 '05 #3
>>>>> "Dave" == Dave Murray <dl******@micro-net.com> writes:

Dave> New to Python question, why does this fail? Thanks, Dave

Dave> f = urllib.open(URL)

urllib does not have an open function. Instead, it has a constructor called
URLopener, which creates an object with such a method. So instead, you have
to say

opener = urllib.URLopener()
f = opener.open(URL)
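
On Python 3 the URLopener class is deprecated, so the snippet above does not carry over directly. A comparable idiom there, offered only as a sketch, is the opener object returned by urllib.request.build_opener(), which also has an open method. The helper name fetch is made up for illustration, and a data: URL is used so the example runs without a network connection.

```python
import urllib.request

# build_opener() returns an OpenerDirector, an object with an open() method.
opener = urllib.request.build_opener()


def fetch(url):
    """Open url with the shared opener and return the body as bytes."""
    with opener.open(url) as f:
        return f.read()


body = fetch("data:text/plain,hello")
print(body)
```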

Regards,
Isaac.
Jul 18 '05 #4
Thank you all, this is a hell of a news group. The diversity of answers
helped me with some unasked questions, and provided more elegant solutions
to what I thought that I had figured out on my own. I appreciate it.

It's part of a spider that I'm working on to verify my own (and friends') web
pages and check for broken links. Looks like making it follow robot rules
(robots.txt and meta field exclusions) is what's left.
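
For the robots.txt half of that, the standard library already has a parser (the robotparser module in Python 2, urllib.robotparser in Python 3). The sketch below assumes Python 3 and feeds the rules in as a list of lines so it runs offline; the agent name "MySpider", the example.com URLs, and the helper build_robot_rules are placeholders. Meta field exclusions are not covered.

```python
import urllib.robotparser


def build_robot_rules(robots_txt_lines):
    """Parse robots.txt content that has already been fetched and split into lines."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt_lines)
    return parser


# Hypothetical rules: every agent is kept out of /private/.
rules = build_robot_rules([
    "User-agent: *",
    "Disallow: /private/",
])

allowed = rules.can_fetch("MySpider", "http://example.com/index.html")
blocked = rules.can_fetch("MySpider", "http://example.com/private/page.html")
print(allowed, blocked)
```

In a real spider the parser would more likely be pointed at the site with set_url() followed by read(), which fetches the file over the network.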

I have found the library for html/sgml to be not very robust. Big .php and
.html with lots of cascades and external references break it very
ungracefully (sgmllib.SGMLParseError: expected name token). I'd like to be
able to trap that stuff and just move on to the next file, accepting the
error. I'm reading in the external links and printing the title as a sanity
check in addition to collecting href anchors. This problem that I asked
about reared its head when I started testing for a robots.txt file, which
may or may not exist.
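
The "trap it and move on" idea could be sketched roughly as follows. This assumes Python 3 and its html.parser module rather than the sgmllib/htmllib pair discussed here; LinkCollector and scan_pages are hypothetical names, and pages are passed in as already-fetched text so the example stays offline.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags and the text inside <title>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def scan_pages(pages):
    """pages is an iterable of (name, html_text); returns (results, failures)."""
    results, failures = {}, {}
    for name, text in pages:
        collector = LinkCollector()
        try:
            collector.feed(text)
            collector.close()
        except Exception as err:
            # Record the problem and carry on with the next page.
            failures[name] = err
            continue
        results[name] = (collector.title.strip(), collector.links)
    return results, failures
```

Catching Exception per page is deliberately broad here, but the error is kept in failures rather than discarded, so a page that breaks the parser can still be inspected afterwards.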

The real point is to learn the language. When a new grad wrote a useful
utility at work in Python faster than I could have written it in C I decided
that I needed to learn Python. He's very sharp but he sold me on the
language too. Since I often must write utilities, Python seems to be a very
good thing since I normally don't have much time to kill on them.

Dave

Jul 18 '05 #5
On Sun, 2004-01-04 at 20:58, Dave Murray wrote:
[...]
I have found the library for html/sgml to be not very robust. Big .php and
.html with lots of cascades and external references break it very
ungracefully (sgmllib.SGMLParseError: expected name token).


I'd suggest using htmllib.

// m
Jul 18 '05 #6
I could not help replying to this thread...

There are already quite a lot of spider programs existing
in Python. I am the author of one of the first programs of
the kind, called HarvestMan. It is multithreaded and has
many features for downloading websites, checking links etc.
You can get it from the HarvestMan homepage at
http://harvestman.freezope.org. HarvestMan is quite
comprehensive and is a bit more than a link checker or
web crawler. My feeling is that it is not easy to understand
for a Python beginner though the program is distributed
as source code in true Python tradition.

If you want something simpler, try spider.py. You can get
information on it from the PyPI pages.

My point is that there is nothing to gain from re-inventing
the wheel again and again. Spider programs have been written in
Python, so you should try to use them rather than writing code
from scratch. If you think that you have new ideas, then
take the code of HarvestMan (or spider) and customize it or
improve on it. I will be happy to merge the changes back into the
code if I think they improve the program, if it is for HarvestMan.

This is the main reason why developers release programs as
opensource. Help the community, and help yourselves. Re-inventing
the wheel is perhaps not the way to go.

best regards

-Anand

Jul 18 '05 #7
Thank you for the information. I will check them out after I finish my
effort. My purpose isn't to obtain a spider program, it is to learn Python
by doing. If the exercise will result in something that I can use, it gives
me incentive to not abandon the effort because the exercise is interesting
to me. The sources that you pointed out should be rich in information on how
I could have done it better if I had been more experienced in Python
(knowledgeable about its libraries, etc.)

Whenever I learn something new I like to work at it, get help if I'm stuck
on something silly (why waste time?), assess what I did against a higher
standard, repeat. It's just the way that I learn. I can see that this forum
will be just what I need for a chunk of that process. I appreciate it.

Regards,
Dave

----- Original Message -----
From: "Anand Pillai" <py*******@Hotpop.com>

I could not help replying to this thread...

There are already quite a lot of spider programs existing
in Python. --
This is the main reason why developers release programs as
opensource. Help the community, and help yourselves. Re-inventing
the wheel is perhaps not the way to go.

Jul 18 '05 #8
After re-reading this part, I can see that it is an idea that I like. How
does participating in open source work for someone (me) who has signed the
customary intellectual property agreement with the corporation that they
work for? Since programming is part of my job, developing test solutions
implemented on automatic test equipment (the hardware too) I don't know if I
would/could be poison to an open source project. How does that work? I've
never participated. If all the work is done on someone's own time, not using
company resources, yadda-yadda-hadda-hadda, do corporate lawwwyaahhhs have a
history of trying to dispute that and stake a claim? No doubt, many of you
are in the same position.

Regards,
Dave
"Anand Pillai" <py*******@Hotpop.com> wrote in message
news:84**************************@posting.google.com...
This is the main reason why developers release programs as
opensource. Help the community, and help yourselves. Re-inventing
the wheel is perhaps not the way to go.

Jul 18 '05 #9

Dave> How does participating in open source work for someone (me) who
Dave> has signed the customary intellectual property agreement with the
Dave> corporation that they work for? Since programming is part of my
Dave> job, developing test solutions implemented on automatic test
Dave> equipment (the hardware too) I don't know if I would/could be
Dave> poison to an open source project. How does that work?

Only your corporate counsel knows for sure. <wink> Seriously, the degree to
which you are allowed to release code to an open source project and the
manner in which it is released is probably a matter best taken up with your
company's legal department. Some companies are fairly enlightened. Some
are not. You may need very little review to release bug fixes or test cases
(my guess is you might be pretty good at writing test cases ;-), more review
to release a new module or package, and considerable participation by
management and the legal eagles if you want to release a sophisticated
application into the wild.

In any case, if you make large contributions to an open source project such
as Python, I'm pretty sure a release form for substantial amounts of code
will be required at the Python end of things. See here

http://www.python.org/psf/psf-contri...agreement.html

for more details. Note that it hasn't been updated in a couple years. I
don't know if MAL has something which is more up-to-date.

Skip

Jul 18 '05 #10
