urllib equivalent for HTTP requests

K
Hello everyone,

I understand that urllib and urllib2 serve as really simple page
request libraries. I was wondering if there is a library out there
that can get the HTTP requests for a given page.

Example:
URL: http://www.google.com/test.html

Something like: urllib.urlopen('http://www.google.com/test.html').files()

Lists HTTP Requests attached to that URL:
=http://www.google.com/test.html
=http://www.google.com/css/google.css
=http://www.google.com/js/js.css

The other fun part is the inclusion of JS within <script> tags, i.e.
the new Google Analytics script
=http://www.google-analytics.com/ga.js

or css, @imports
=http://www.google.com/css/import.css

I would like to keep track of that, but I realize that Python does not
have a JS engine. :( Does anyone have ideas on how to track these items,
or am I out of luck?

Thanks,
K
Oct 8 '08 #1
K wrote:
> Hello everyone,
>
> I understand that urllib and urllib2 serve as really simple page
> request libraries. I was wondering if there is a library out there
> that can get the HTTP requests for a given page.
>
> Example:
> URL: http://www.google.com/test.html
>
> Something like: urllib.urlopen('http://www.google.com/test.html').files()
>
> Lists HTTP Requests attached to that URL:
> =http://www.google.com/test.html
> =http://www.google.com/css/google.css
> =http://www.google.com/js/js.css

There are no "requests attached" to a URL. There is an HTML document
behind it, which might contain further external references.

> The other fun part is the inclusion of JS within <script> tags, i.e.
> the new Google Analytics script
> =http://www.google-analytics.com/ga.js
>
> or css, @imports
> =http://www.google.com/css/import.css
>
> I would like to keep track of that but I realize that py does not have
> a JS engine. :( Anyone with ideas on how to track these items or am I
> out of luck.
You can use e.g. BeautifulSoup to extract all links from the site.

What you can't do, though, is get the requests that are issued by
JavaScript that is *running*.
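
To make the idea concrete, here is a minimal sketch of collecting the statically referenced resources of a page. It is written for Python 3 and deliberately uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra installs; the approach is the same with BeautifulSoup. The names `ResourceCollector` and `list_static_resources` are made up for this example, and the `@import` regex is a rough approximation, not a CSS parser. It only sees what is written in the markup, so anything loaded by running JavaScript is still missed.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Rough match for CSS rules like: @import url("x.css");  or  @import "x.css";
CSS_IMPORT_RE = re.compile(r"""@import\s+(?:url\(\s*)?['"]?([^'"\)\s;]+)""", re.IGNORECASE)


class ResourceCollector(HTMLParser):
    """Collect URLs of resources referenced statically in an HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []
        self._in_style = False

    def _add(self, url):
        # Resolve relative references against the page URL, skip duplicates.
        absolute = urljoin(self.base_url, url)
        if absolute not in self.resources:
            self.resources.append(absolute)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img") and attrs.get("src"):
            self._add(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            self._add(attrs["href"])
        elif tag == "style":
            self._in_style = True

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        # Inline <style> blocks may pull in further stylesheets via @import.
        if self._in_style:
            for match in CSS_IMPORT_RE.finditer(data):
                self._add(match.group(1))


def list_static_resources(html, base_url):
    """Return absolute URLs referenced by <script>, <img>, <link> and @import."""
    parser = ResourceCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.resources


if __name__ == "__main__":
    sample = (
        '<html><head><link rel="stylesheet" href="/css/google.css">'
        '<style>@import url("/css/import.css");</style>'
        '<script src="http://www.google-analytics.com/ga.js"></script>'
        "</head><body></body></html>"
    )
    for url in list_static_resources(sample, "http://www.google.com/test.html"):
        print(url)
```

For a live page, the HTML string would come from something like `urllib.request.urlopen(url).read().decode(...)`. External stylesheets would need to be fetched and scanned for `@import` separately.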

Diez
Oct 8 '08 #2
Hi All,

I have chosen to use a Django app for a customer site and wish to put
it up on the net.

Before I waste all day trying it myself (and probably getting it
wrong) I thought I would ask the experts here.

My questions are:

- Can most everyday vanilla Linux web hosts run a Django site?

- Can most everyday vanilla Linux web hosts run Python web scripts?

Thanks

David

Oct 8 '08 #3
> My questions are:
>
> - can most everyday vanilla linux web hosts run a django site ?
>
> - can most everyday vanilla linux web hosts run python web scripts?
Depends on your definition of "most everyday vanilla linux web
hosts". :)

The bottom-of-the-barrel hosts will often (but not always) offer
Python CGI. Django "can" run in a CGI (google for "django
cgi"[0]), but it's an unpleasant experience because the entire
Django framework gets reloaded for *every* request.
Doable/tolerable for a private development/family page, but it
will likely flounder under the slightest load.

This is like strapping a jet engine (Django) on a bicycle (CGI).
[1] Doable, but more for the macho-factor of "I got it
working" rather than the practical aspects.

Your lowest-end hosting services won't offer mod_python or WSGI
(either Apache with mod_wsgi, or others like lighttpd with a wsgi
interface) though WSGI is becoming more popular. There are still
some shared-hosting solutions that facilitate using Django[2]
pretty well. They're not super-cheap, but they're affordable.
The canonical catalog of Django-friendly & Django-capable hosting
services can be found at [3]. If you're just starting out with
Django, it might help to pay a bit more for one of the click-n-go
hosts; with the others you'll have to do some of the heavy lifting
(installing Django, as well as possibly other components,
assembling your WSGI startup script, etc.) yourself.
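
For anyone wondering what a "wsgi startup script" amounts to, here is a minimal sketch using only the Python 3 standard library. It is not a Django setup: a Django project would assign `application` from Django's own WSGI handler instead (in current Django that is `django.core.wsgi.get_wsgi_application()`; the API at the time of this thread differed). `STARTED_AT` and the helper `call_once` are made up for this illustration.

```python
import time
from wsgiref.util import setup_testing_defaults

# Module-level code runs once per server process under a persistent WSGI
# host such as mod_wsgi. Under CGI the whole script, including this part,
# runs again for every request, which is why a large framework feels slow there.
STARTED_AT = time.time()  # stand-in for expensive framework start-up


def application(environ, start_response):
    """Minimal WSGI callable; `application` is the name mod_wsgi looks for by default."""
    body = ("Hello from WSGI, path=%s\n" % environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_once(path="/"):
    """Hypothetical helper: invoke the app in-process with a fake request."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body
```

The same callable can be served through plain CGI with the standard library's `wsgiref.handlers.CGIHandler().run(application)`, which is roughly the bicycle-with-a-jet-engine arrangement described above.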

Hope this helps,

-tkc
[0] http://www.google.com/search?q=django%20cgi
[1] http://www.youtube.com/watch?v=SFv1Yu-KxZ8
[2] http://groups.google.com/group/djang...f8c04f3dfb56e/
[3] http://code.djangoproject.com/wiki/D...iendlyWebHosts



Oct 8 '08 #4
Tim Chase wrote:
[In response to David Lyon]
> > My questions are:
> >
> > - can most everyday vanilla linux web hosts run a django site ?
> >
> > - can most everyday vanilla linux web hosts run python web scripts?
>
> Depends on your definition of "most everyday vanilla linux web hosts". :)
>
> The bottom-of-the-barrel hosts will often (but not always) offer Python
> CGI. Django "can" run in a CGI (google for "django cgi"[0]), but it's
> an unpleasant experience because the entire Django framework gets
> reloaded for *every* request. Doable/tolerable for a private
> development/family page, but it will likely flounder under the slightest
> load.
>
> This is like strapping a jet engine (Django) on a bicycle (CGI). [1]
> Doable, but more for the macho-factor of "I got it working" rather than
> the practical aspects.
>
> Your lowest-end hosting services won't offer mod_python or WSGI (either
> Apache with mod_wsgi, or others like lighttpd with a wsgi interface)
> though WSGI is becoming more popular. There are still some
> shared-hosting solutions that facilitate using Django[2] pretty well.
> They're not super-cheap, but they're affordable. The canonical catalog
> of Django-friendly & Django-capable hosting services can be found at
> [3]. If you're just starting out with Django, it might help to pay a
> bit more for one of the click-n-go hosts, while others you'll have to do
> some of the heavy lifting (installing Django, as well as possibly other
> components, assembling your wsgi startup script, etc) yourself.

There's recently been a discussion about hosting on the django-users
list, which I recommend you think about joining. Both WebFaction and
SliceHost got high marks from many users. I personally use OpenHosting,
who are very Python-friendly and mostly just let you get on with what you
want to do, which is great if you are comfortable managing your own
email and web services.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Oct 9 '08 #5
