WebScraping

Can someone steer me to scripts, modules, etc. on web scraping, please?
Ultimately I would like someone to write a script for me.
However, I am still searching for documentation on this subject.
Thanks, Graham

Nov 4 '06 #1
On Sun, 05 Nov 2006 08:09:52 +1000, Graham Feeley wrote:
> Can someone steer me to scripts / modules etc on webscraping please???

The definitive documentation on the built-in Python modules can be found
here: http://docs.python.org/modindex.html
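
As a rough illustration of what the built-in modules alone can do, here is a minimal sketch that collects the links from a page using only the standard library's html.parser and urllib.parse (these are the Python 3 module names, which differ from what was available in 2006). The LinkCollector class, the extract_links helper, and the sample HTML are made up for this example; they are not taken from any of the resources linked in this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (absolute URL, link text) pairs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []      # finished (url, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link we are tracking.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html, base_url):
    """Return every (url, text) pair found in the given HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample page, used here instead of a live download.
SAMPLE_HTML = """
<html><body>
  <a href="/docs/page1.html">Page 1</a>
  <a href="http://example.com/about">About <b>us</b></a>
  <a name="top">no href here</a>
</body></html>
"""

links = extract_links(SAMPLE_HTML, "http://example.com/index.html")
for url, text in links:
    print(url, "->", text)
```

To try it against a live page, the HTML string could come from urllib.request.urlopen(url).read(), decoded with the page's character encoding, instead of the sample above.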

The ActiveState Python cookbook should be useful, e.g.
http://aspn.activestate.com/ASPN/Coo.../Recipe/391929

Also see Beautiful Soup:
http://www.crummy.com/software/BeautifulSoup/

And of course, GIYF ("Google Is Your Friend") http://www.google.com which
leads me to:

http://sig.levillage.org/?p=588
http://sig.levillage.org/2005/03/11/...ython-part-ii/
http://wiki.tcl.tk/2915 (not focused on Python, but may still be useful).

> Ultimately I would like someone to write a script for me.

Are you offering to hire a developer?
--
Steven.

Nov 5 '06 #2
On Sun, 2006-11-05 at 13:40 +1100, Steven D'Aprano wrote:
> On Sun, 05 Nov 2006 08:09:52 +1000, Graham Feeley wrote:
> > Can someone steer me to scripts / modules etc on webscraping please???
>
> The definitive documentation on the built-in Python modules can be found
> here: http://docs.python.org/modindex.html
>
> The ActiveState Python cookbook should be useful, e.g.
> http://aspn.activestate.com/ASPN/Coo.../Recipe/391929
>
> Also see Beautiful Soup:
> http://www.crummy.com/software/BeautifulSoup/

Beautiful Soup is not always speedy, but it sure is the most flexible
scraper I've ever come across. I hacked together a web forum-to-NNTP
gateway using Beautiful Soup. It worked very well.

Michael

Nov 5 '06 #3
ina
This might be of help to you.
http://phlik.ishpeck.net/index.php?P=a1141076600phlik

http://phlik.ishpeck.net/index.php?P=b1134168973phlik

Graham Feeley wrote:
> Can someone steer me to scripts / modules etc on webscraping please???
> Ultimately I would like someone to write a script for me.
> However i am still searching for documentation on this subject
> Thanks Graham
Nov 5 '06 #4
Yup yup, BeautifulSoup is the way to go.

What would you like to scrape, by the way?

Graham Feeley wrote:
> Can someone steer me to scripts / modules etc on webscraping please???
> Ultimately I would like someone to write a script for me.
> However i am still searching for documentation on this subject
> Thanks Graham
Nov 6 '06 #5
Well, I would like to publicly thank Bernard Chhun for actually writing this
script and "prettying" it up for me.
He is truly a talented guy.
He used Beautiful Soup and regex, both of which I am still coming to terms
with. Anyway, thanks again, Bernard.
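
For anyone else coming to terms with the regex half: Bernard's script is not posted in this thread, so the following is only a generic sketch of how a regular expression can pull structured values out of text once a parser such as Beautiful Soup has extracted it from the page. The sample text, the pattern, and the field names are all hypothetical.

```python
import re

# Hypothetical text, standing in for what a parser returned from a page.
PAGE_TEXT = """
Item: Blue Widget   Price: $4.50
Item: Red Gadget    Price: $12.80
Item: Mystery Box   Price: call us
"""

# One named group per field; re.VERBOSE lets the pattern carry comments
# and ignores the whitespace used to lay it out.
ITEM = re.compile(r"""
    Item:\s+(?P<name>.+?)\s+        # item name (non-greedy, stops before Price)
    Price:\s+\$(?P<price>\d+\.\d{2})  # price such as 4.50
""", re.VERBOSE)

# Lines that do not fit the pattern (the third one here) are simply skipped.
results = [
    {"name": m.group("name"), "price": float(m.group("price"))}
    for m in ITEM.finditer(PAGE_TEXT)
]

for row in results:
    print(row["name"], row["price"])
```

Named groups keep the extraction readable, which helps when the page layout changes and the pattern has to be adjusted.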
Regards
graham

"Graham Feeley" <gr***********@optusnet.com.au> wrote in message
news:45***********************@news.optusnet.com.au...
> Can someone steer me to scripts / modules etc on webscraping please???
> Ultimately I would like someone to write a script for me.
> However i am still searching for documentation on this subject
> Thanks Graham

Nov 19 '06 #6
