Bytes | Software Development & Data Engineering Community

Parsing a site/page that calls JavaScript functions


I've got a couple of test apps that I use to parse/test different HTML
web pages. However, I'm now looking at how to parse a site/page that
uses JavaScript calls to dynamically create and display the resulting HTML.

I can see the HTML in the browser page if I manually select the button that
invokes the JavaScript function, but I have no idea how to create an app
that can effectively parse the page.
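
A minimal sketch of the problem described above, using only the Python standard library. The page source, the `fillList` function, and the element ids are hypothetical stand-ins for a real site. It shows that a parser working on the raw source (which is what urllib or mechanize would fetch) finds the empty list and the script, but none of the elements the script would create in a browser.

```python
from html.parser import HTMLParser

# Hypothetical page source: the <li> items only exist after a browser
# runs fillList(), so they never appear in the downloaded HTML.
RAW_HTML = """
<html><body>
<button id="load" onclick="fillList()">Load</button>
<ul id="results"></ul>
<script>
function fillList() {
  var item = document.createElement("li");
  item.textContent = "one";
  document.getElementById("results").appendChild(item);
}
</script>
</body></html>
"""


class TagCounter(HTMLParser):
    """Count how many times each start tag appears in the source."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        self.counts[tag] = self.counts.get(tag, 0) + 1


def count_tags(html):
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


counts = count_tags(RAW_HTML)
# The static source contains the empty <ul> and the <script>, but no <li>.
print("ul:", counts.get("ul", 0), "li:", counts.get("li", 0))
```

Nothing here executes the script, which is why the generated items are missing; something has to run the JavaScript before the resulting DOM can be read.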

My test apps use Python, along with mechanize/Browser/urllib. I've seen
sites/docs that discuss Selenium, SpiderMonkey, etc. If possible, I'm
trying to find a complete example (one that walks through everything from
setting up the environment to finally extracting the DOM elements of a given
JavaScript page), or someone I can work with to create a complete example
that can then be posted to the 'net.

I'd really rather have a headless browser solution, as my overall goal is to
run a parsing/crawling job over a number of pages that use JavaScript.
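
One possible shape for a headless approach, sketched with Selenium's Python bindings. This is an illustration under assumptions, not a tested recipe for any particular site: it assumes the `selenium` package (version 4 or later) and Firefox are installed, and the helper names `render_after_click` and `extract_text`, along with the URL and CSS selectors in the usage comment, are hypothetical.

```python
from html.parser import HTMLParser


def render_after_click(url, button_css, result_css, timeout=10):
    """Load url in headless Firefox, click a button, return the rendered HTML."""
    # Imported here so the parsing helper below works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("-headless")  # run without a visible browser window
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        driver.find_element(By.CSS_SELECTOR, button_css).click()
        # Wait until the script has inserted at least one matching element.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, result_css))
        )
        return driver.page_source
    finally:
        driver.quit()


class TextByTag(HTMLParser):
    """Collect the text of every element with a given tag name.

    Assumes elements with this tag are not nested inside each other.
    """

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.items[-1] += data


def extract_text(html, tag):
    parser = TextByTag(tag)
    parser.feed(html)
    parser.close()
    return [item.strip() for item in parser.items]


# Hypothetical usage (URL and selectors are placeholders):
# html = render_after_click("http://example.com/page", "#load", "#results li")
# print(extract_text(html, "li"))
```

Keeping the rendering step separate from the extraction step means the existing parsing code can stay as it is; only the way the HTML is obtained changes.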

Pointers, thoughts, comments, etc. will be greatly appreciated.


Sep 28 '08 #1


