parsing a site/page that uses/calls javascript functions...

Hi...

I've got a couple of test apps that I use to parse/test different HTML
webpages. However, I'm now looking at how to parse a given site/page that
uses JavaScript calls to dynamically create/display the resulting HTML.

I can see the HTML in the browser page if I manually click the button that
invokes the JavaScript function, but I have no idea how to create an app
that can effectively parse the page.

My test apps use Python, along with mechanize/browser/urllib. I've seen
sites/docs that discuss Selenium, SpiderMonkey, etc... If possible, I'm
trying to find a complete example (one that walks through everything from
setting up the environment to finally extracting the DOM elements of a given
JavaScript page), or I'm looking to find someone I can work with to create
a complete example that can then be posted to the 'net.

I'd really rather have a headless browser solution, as my overall goal is to
run a parsing/crawling job over a number of pages that use JavaScript.
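
For illustration only, here is a rough sketch of what the headless route could look like in Python. It assumes a current Selenium release with Firefox and geckodriver installed, none of which comes from this thread. The names render_page, extract_links, button_id and result_css are hypothetical, and pulling out links is just a stand-in for whichever DOM elements are actually wanted. The parsing half uses only the standard library, so it can be tried on a static HTML string without any browser.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the hrefs found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def render_page(url, button_id, result_css, wait_seconds=10):
    """Load url in headless Firefox, click a button, return the final HTML.

    button_id and result_css are placeholders for the page being scraped:
    the id of the button that triggers the JavaScript, and a CSS selector
    for an element that only exists once the script has run.
    """
    # Imported here so extract_links works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("-headless")  # run Firefox without a window
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        driver.find_element(By.ID, button_id).click()
        # Wait until the script-generated element shows up in the DOM.
        WebDriverWait(driver, wait_seconds).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, result_css))
        )
        return driver.page_source
    finally:
        driver.quit()
```

Under those assumptions, a crawl would call render_page(...) once per URL and hand the returned string to extract_links (or to whatever parser the existing test apps already use), so the browser is only responsible for running the JavaScript.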

Pointers, thoughts, comments, etc will be greatly appreciated.
Thanks!!!

-bruce

Sep 28 '08 #1
