Outbound HTML Authentication

Hi,

I was trying to write a simple web scraping tool, but the network at
work performs some kind of internal authentication before it lets
a request out of the network. As a result I'm getting a '401 -
Authentication Error' from the application.

I know that when I use a web browser or another application, it uses the
information from my Windows AD account to validate my user before it
accesses a website. Firefox constantly asks me to enter this info before
I can use it, and I assume that IE picks it up automatically.

However, I'm not sure how to tell the request I'm building in my
Python script to either use the info in my AD account or enter my
user/pass automatically.

Anyone know how to do this?

Thanks
Nov 29 '07 #1
On Nov 29, 2007 2:22 PM, Mudcat <mn******@gmail.com> wrote:
> Hi,
>
> I was trying to do a simple web scraping tool, but the network they
> use at work does some type of internal authentication before it lets
> the request out of the network. As a result I'm getting the '401 -
> Authentication Error' from the application.
>
> I know when I use a web browser or other application that it uses the
> information from my Windows AD to validate my user before it accesses
> a website. I'm constantly getting asked to enter in this info before I
> use Firefox, and I assume that IE picks it up automatically.
>
> However I'm not sure how to tell the request that I'm building in my
> python script to either use the info in my AD account or enter in my
> user/pass automatically.

You can configure a proxy for urllib2, but your proxy probably uses
NTLM authentication which urllib2 doesn't support. Your best bet is to
use a local proxy which understands NTLM.
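
A minimal sketch of the proxy configuration described above, written against Python 3, where urllib2 became urllib.request. The address 127.0.0.1:3128 is a hypothetical placeholder for wherever a local NTLM-aware proxy happens to listen, and build_proxy_opener is an illustrative helper name, not something from the original reply.

```python
import urllib.request

# Hypothetical address of a local proxy that handles NTLM upstream;
# replace it with whatever your local proxy actually listens on.
LOCAL_PROXY = "http://127.0.0.1:3128"


def build_proxy_opener(proxy_url=LOCAL_PROXY):
    """Return an opener that sends HTTP and HTTPS requests via proxy_url."""
    handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(handler)


opener = build_proxy_opener()

# Example use (left commented out because it needs a running proxy):
# html = opener.open("http://example.com/").read()
```

The script only ever talks to the local proxy, so the NTLM handshake with the corporate proxy stays outside the Python code.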
Nov 29 '07 #2
twill is a simple language for browsing the Web. It's designed for
automated testing of Web sites, but it can be used to interact with
Web sites in a variety of ways. In particular, twill supports form
submission, cookies, redirects, and HTTP authentication.
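
For comparison, here is a rough sketch of supplying a user/pass automatically with the Python 3 standard library alone. This is not twill's API, and it covers HTTP Basic authentication only, not NTLM. The URL, the credentials, and the helper name build_auth_opener are placeholders chosen for illustration.

```python
import urllib.request


def build_auth_opener(url, user, password):
    """Return an opener that answers HTTP Basic auth challenges for url."""
    # With realm None, the credentials apply to any realm under this URL.
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(manager)
    return urllib.request.build_opener(handler)


# Placeholder values; substitute a real site and real credentials.
opener = build_auth_opener("http://example.com/", "myuser", "mypassword")

# Example use (commented out because it needs network access):
# html = opener.open("http://example.com/protected").read()
```

If the challenge comes from a proxy rather than the site itself, urllib.request.ProxyBasicAuthHandler plays the corresponding role for Basic proxy authentication.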

Mudcat wrote:
> Hi,
>
> I was trying to do a simple web scraping tool, but the network they
> use at work does some type of internal authentication before it lets
> the request out of the network. As a result I'm getting the '401 -
> Authentication Error' from the application.
>
> I know when I use a web browser or other application that it uses the
> information from my Windows AD to validate my user before it accesses
> a website. I'm constantly getting asked to enter in this info before I
> use Firefox, and I assume that IE picks it up automatically.
>
> However I'm not sure how to tell the request that I'm building in my
> python script to either use the info in my AD account or enter in my
> user/pass automatically.
>
> Anyone know how to do this?
>
> Thanks

--
Shane Geiger
IT Director
National Council on Economic Education
sg*****@ncee.net | 402-438-8958 | http://www.ncee.net

Leading the Campaign for Economic and Financial Literacy

Nov 29 '07 #3
