Bytes IT Community

Outbound HTML Authentication

Hi,

I was trying to do a simple web scraping tool, but the network they
use at work does some type of internal authentication before it lets
the request out of the network. As a result I'm getting the '401 -
Authentication Error' from the application.

I know when I use a web browser or other application that it uses the
information from my Windows AD to validate my user before it accesses
a website. I'm constantly getting asked to enter in this info before I
use Firefox, and I assume that IE picks it up automatically.

However I'm not sure how to tell the request that I'm building in my
python script to either use the info in my AD account or enter in my
user/pass automatically.

Anyone know how to do this?

Thanks
Nov 29 '07 #1
2 Replies


On Nov 29, 2007 2:22 PM, Mudcat <mn******@gmail.com> wrote:
> Hi,
>
> I was trying to do a simple web scraping tool, but the network they
> use at work does some type of internal authentication before it lets
> the request out of the network. As a result I'm getting the '401 -
> Authentication Error' from the application.
>
> I know when I use a web browser or other application that it uses the
> information from my Windows AD to validate my user before it accesses
> a website. I'm constantly getting asked to enter in this info before I
> use Firefox, and I assume that IE picks it up automatically.
>
> However I'm not sure how to tell the request that I'm building in my
> python script to either use the info in my AD account or enter in my
> user/pass automatically.

You can configure a proxy for urllib2, but your proxy probably uses
NTLM authentication, which urllib2 doesn't support. Your best bet is to
use a local proxy that understands NTLM.
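
A minimal sketch of that proxy setup, assuming Python 3, where urllib2's
functionality lives in urllib.request. The address 127.0.0.1:3128 is an
assumption standing in for wherever your local NTLM-aware proxy listens,
and build_proxy_opener is just an illustrative name, not a library call.

```python
import urllib.request

# Assumption: an NTLM-aware proxy runs locally on this address and handles
# the NTLM handshake with the corporate proxy on the script's behalf.
LOCAL_PROXY = "http://127.0.0.1:3128"


def build_proxy_opener(proxy_url=LOCAL_PROXY):
    """Return an opener that routes HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Usage (needs the local proxy to be running):
#   opener = build_proxy_opener()
#   with opener.open("http://example.com/", timeout=10) as response:
#       html = response.read()
```

The script itself never sees the NTLM exchange; it only talks plain HTTP
to the local proxy.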
Nov 29 '07 #2

twill is a simple language for browsing the Web. It's designed for
automated testing of Web sites, but it can be used to interact with
Web sites in a variety of ways. In particular, twill supports form
submission, cookies, redirects, and HTTP authentication.
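
For comparison, here is a rough sketch of answering an HTTP
authentication challenge with user/pass automatically. It uses the
Python 3 standard library (urllib.request) rather than twill, and it
covers Basic authentication only, not NTLM. The URL, the credentials, and
the name build_auth_opener are placeholders for illustration.

```python
import urllib.request


def build_auth_opener(url, user, password):
    """Return an opener that answers 401 Basic-auth challenges for url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # A realm of None means: use these credentials for any realm under url.
    password_mgr.add_password(None, url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)


# Usage (placeholder URL and credentials):
#   opener = build_auth_opener("http://example.com/", "alice", "secret")
#   with opener.open("http://example.com/report", timeout=10) as response:
#       html = response.read()
```

If the challenge comes from a proxy (status 407) instead of the site
(status 401), urllib.request.ProxyBasicAuthHandler is the corresponding
handler class.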

Mudcat wrote:
> Hi,
>
> I was trying to do a simple web scraping tool, but the network they
> use at work does some type of internal authentication before it lets
> the request out of the network. As a result I'm getting the '401 -
> Authentication Error' from the application.
>
> I know when I use a web browser or other application that it uses the
> information from my Windows AD to validate my user before it accesses
> a website. I'm constantly getting asked to enter in this info before I
> use Firefox, and I assume that IE picks it up automatically.
>
> However I'm not sure how to tell the request that I'm building in my
> python script to either use the info in my AD account or enter in my
> user/pass automatically.
>
> Anyone know how to do this?
>
> Thanks

--
Shane Geiger
IT Director
National Council on Economic Education
sg*****@ncee.net | 402-438-8958 | http://www.ncee.net

Leading the Campaign for Economic and Financial Literacy

Nov 29 '07 #3
