Trying to make a spider using mechanize

Hi,

I can read the home page using the mechanize lib. Is there a way to
load web pages using just filename.html instead of servername/
filename.html? Lots of times the links contain only the file name. I'm
trying to read the link names and then visit those pages.

Here is the sample code I am using.
import ClientForm
import mechanize
#get home page
request = mechanize.Request("http://www.activetechconsulting.com")
response = mechanize.urlopen(request)
print response.read()

#sub page (this does not work)
request = mechanize.Request("service.html")
response = mechanize.urlopen(request)
print response.read()

-Ted
Sep 8 '08 #1
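
A note on why the second request fails: "service.html" on its own has no scheme or host, so there is nothing to connect to. A relative link only makes sense next to the URL of the page it was found on. The standard library can join the two. Below is a minimal sketch in Python 3 spelling; on the Python 2 used in the post the same function is urlparse.urljoin. The variable names are made up for illustration.

```python
# Python 3 spelling; on Python 2 use: from urlparse import urljoin
from urllib.parse import urljoin

# Base URL taken from the original post.
base_url = "http://www.activetechconsulting.com"

# A bare file name, as it would appear in an href on the home page.
relative_link = "service.html"

# Resolve the relative link against the page it came from.
absolute_url = urljoin(base_url, relative_link)
print(absolute_url)

# The base may itself be a page inside a directory; the file name of the
# base is dropped and the relative link takes its place.
nested_url = urljoin("http://www.activetechconsulting.com/docs/index.html",
                     "service.html")
print(nested_url)
```

The resulting absolute URL is what should be passed to mechanize.Request in place of the bare file name.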
Hi,

Perhaps you might want to try out a sample spider
I wrote and base your code on it?

See: http://hg.shortcircuit.net.au/index....ples/spider.py

cheers
James

--
"Problems are solved by method"
Sep 8 '08 #2
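
For readers who cannot reach the linked spider, here is a rough sketch of the general idea: collect the links on a page, resolve each one against the page's own URL, and visit the ones not seen yet. This is not the code from the link above. It uses only the Python 3 standard library, and LinkCollector, extract_links, crawl, and the fake pages are all hypothetical names invented for this example. The fetch argument stands in for whatever actually downloads a page (for instance a small wrapper around mechanize.urlopen).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn "service.html" into a full URL before storing it.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the absolute URLs of all links found in html."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first visit; fetch(url) must return the page's HTML as text."""
    seen = {start_url}
    queue = [start_url]
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.pop(0)
        visited.append(url)
        for link in extract_links(url, fetch(url)):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Made-up in-memory site so the sketch runs without a network connection.
fake_pages = {
    "http://site.test/": '<a href="service.html">S</a> <a href="about.html">A</a>',
    "http://site.test/service.html": '<a href="/">Home</a> <a href="about.html">A</a>',
}


def fake_fetch(url):
    # Unknown pages are treated as empty.
    return fake_pages.get(url, "")


visited_pages = crawl("http://site.test/", fake_fetch)
for page in visited_pages:
    print(page)
```

A real spider would also want to skip links that point to other hosts and handle download errors. If you stay with mechanize, its Browser class keeps track of the current page for you: as far as I know, the items returned by Browser.links() carry an absolute_url attribute, so the joining step is done already.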
