How to Encode Parameters into an HTML Parsing Script

I've written a script that navigates various URLs on a website and
fetches their contents.
The URLs are fed from a list, "urlList". Everything works
splendidly until I try to encode parameters
for a certain URL.
For example, suppose I want to navigate to the encoded URL
http://online.investools.com/landing.iedu?signedin=true rather than
just http://online.investools.com/landing.iedu. How would I do this?
How can I modify the script to urlencode these parameters,
{signedin:true}, and associate them with a specific URL from
urlList?
Thank you!
import datetime, time, re, os, sys, traceback, smtplib, string, urllib2, urllib, inspect
from urllib2 import build_opener, HTTPCookieProcessor, Request
opener = build_opener(HTTPCookieProcessor)
from urllib import urlencode

def urlopen2(url, data=None, user_agent='urlopen2'):
    """Opens Our URLS """
    if hasattr(data, "__iter__"):
        data = urlencode(data)
    # User-Agent for Unspecified Browser
    headers = {'User-Agent' : user_agent}
    return opener.open(Request(url, data, headers))

def badCharCheck(host, url):
    try:
        page = urlopen2("http://"+host+".investools.com/"+url+"", ())
        pageRead = page.read()
        print "Loading:", url
        #print pageRead
    except:
        print "Failed: ", traceback.format_tb(sys.exc_info()[2]), '\n'

if __name__ == '__main__':
    host = "online"
    urlList = ["landing.iedu", "sitemap.iedu"]
    print "\n", "***** Begin BadCharCheck for", host
    for url in urlList:
        badCharCheck(host, url)

    print '***** TEST FINISHED! Total Runs:'
    sys.exit()

OUTPUT:
***** Begin BadCharCheck for online
Loading: landing.iedu
Loading: sitemap.iedu
***** TEST FINISHED! Total Runs:

Jun 22 '07 #1
On Thu, 21 Jun 2007 23:37:07 -0300, <SM********@gmail.com> wrote:
> So for example if I wanted to navigate to an encoded url
> http://online.investools.com/landing.iedu?signedin=true rather than
> just http://online.investools.com/landing.iedu How would I do this?
> How can I modify the script to urlencode these parameters:
> {signedin:true} and to associate them with a specific url from the
> urlList

If you want to use GET, append '?' plus the encoded parameters to the
desired url:

py> data = {'signedin':'true', 'another':42}
py> print urlencode(data)
signedin=true&another=42
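
As a minimal sketch of the "append '?' plus the encoded parameters" step, the parameters can be kept in a mapping keyed by page name so each entry of urlList gets its own query string. The names `url_params` and `build_url` are hypothetical, not part of the original script, and the code is written for Python 3, where `urlencode` lives in `urllib.parse` (in Python 2 it is `urllib.urlencode`, as in the thread).

```python
from urllib.parse import urlencode

# Hypothetical mapping: page name -> query parameters.
# An empty dict means the page is requested without parameters.
url_params = {
    "landing.iedu": {"signedin": "true"},
    "sitemap.iedu": {},
}

def build_url(host, page, params=None):
    """Return the full URL, appending '?' + encoded params when any are given."""
    url = "http://%s.investools.com/%s" % (host, page)
    if params:
        url += "?" + urlencode(params)
    return url

for page, params in url_params.items():
    print(build_url("online", page, params))
```

The resulting string can then be handed to urlopen2 as its url argument, leaving data at its default of None.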

Do not use the data argument to urlopen: supplying it (even an empty value) makes urllib2 send a POST instead of a GET.
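
The difference can be checked without touching the network. This is a small sketch using Python 3's `urllib.request.Request` (the counterpart of `urllib2.Request`); it only builds the request objects and asks each one which HTTP method it would use.

```python
from urllib.parse import urlencode
from urllib.request import Request

base = "http://online.investools.com/landing.iedu"
query = urlencode({"signedin": "true"})

# Parameters appended to the URL: the request stays a GET.
get_req = Request(base + "?" + query)

# Parameters passed as data: the request becomes a POST.
post_req = Request(base, data=query.encode("ascii"))

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url)
```

In the original script, badCharCheck passes `()` as data, which urlopen2 encodes to an empty string, so every page there is requested with POST as well.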

--
Gabriel Genellina

Jun 22 '07 #2
Sweet! I love this Python group.

Jun 22 '07 #3
