Bytes IT Community

Creating a web bot/crawler/spider for multiple websites


I need to create a web bot/crawler/spider that would go to different websites, collect data for us, and store it in a database. The crawler needs to 'READ' the options on a website (from drop-downs, radio buttons or check-boxes) and either create some input itself OR use some generic pre-defined words that we provide it with.
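To make the 'READ' step concrete, here is a minimal sketch using only Python's standard-library `html.parser`. It assumes the form is plain static HTML (not built by JavaScript) and that every option carries a `value` attribute. The names `FormOptionReader`, `read_form_options` and the `SAMPLE` markup are hypothetical, made up for illustration.

```python
from html.parser import HTMLParser


class FormOptionReader(HTMLParser):
    """Collects the choices offered by <select>, radio and checkbox inputs."""

    def __init__(self):
        super().__init__()
        self.choices = {}      # field name -> list of selectable values
        self.text_fields = []  # names of free-text inputs
        self._select = None    # name of the <select> currently open, if any

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "select":
            self._select = a.get("name")
            if self._select:
                self.choices.setdefault(self._select, [])
        elif tag == "option" and self._select:
            value = a.get("value")
            if value:  # skip placeholders with an empty or missing value
                self.choices[self._select].append(value)
        elif tag == "input" and a.get("name"):
            kind = (a.get("type") or "text").lower()
            if kind in ("radio", "checkbox"):
                # browsers submit "on" when a checkbox has no value attribute
                self.choices.setdefault(a["name"], []).append(a.get("value") or "on")
            elif kind == "text":
                self.text_fields.append(a["name"])

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


def read_form_options(html):
    """Return (choices, text_fields) found in the given HTML string."""
    reader = FormOptionReader()
    reader.feed(html)
    return reader.choices, reader.text_fields


# Hypothetical page, loosely modelled on the court-case example
SAMPLE = """
<form action="/search" method="get">
  <input type="text" name="case_number">
  <select name="case_type">
    <option value="">-- choose --</option>
    <option value="permits">Industry Permits</option>
    <option value="civil">Civil Cases</option>
    <option value="criminal">Criminal Cases</option>
  </select>
  <input type="radio" name="status" value="open">
  <input type="radio" name="status" value="closed">
</form>
"""

choices, text_fields = read_form_options(SAMPLE)
print(choices)
print(text_fields)
```

Pages that build their forms with JavaScript would need a real browser engine instead, so this only covers the simple case.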

For example, a webpage might be structured with a text field and some drop-downs. Typically, if the user enters the case number of a court case, the website displays its status, and there might also be different legal documents that can be retrieved through drop-down options such as 'Industry Permits', 'Civil Cases', 'Criminal Cases', etc. The crawler should therefore be able to read the page and generate a list of suitable options itself, then use them to get the data. We want to create a bot/crawler/spider that will automatically enter the information for multiple cases, i.e. case numbers (text field) and case type (from drop-downs), and retrieve the data about the relevant cases available on the website.
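As a rough illustration of the "enter the information for multiple cases" step, the sketch below pairs every supplied case number with every case type and builds one request URL per combination. It assumes the site accepts a plain GET request with the form fields as query parameters; the function name `build_queries`, the field names, the case numbers and the `example.org` URL are all hypothetical.

```python
from itertools import product
from urllib.parse import urlencode


def build_queries(base_url, text_values, choices):
    """Yield one GET URL per combination of text inputs and form options.

    text_values: field name -> list of values we supply (e.g. case numbers)
    choices:     field name -> list of values read from the page
    """
    names = list(text_values) + list(choices)
    pools = [text_values[n] for n in text_values] + [choices[n] for n in choices]
    for combo in product(*pools):
        yield base_url + "?" + urlencode(dict(zip(names, combo)))


# Hypothetical inputs: case numbers we provide, case types found on the page
case_numbers = {"case_number": ["2008-001", "2008-002"]}
case_types = {"case_type": ["civil", "criminal"]}

urls = list(build_queries("https://example.org/search", case_numbers, case_types))
for url in urls:
    print(url)
```

The number of requests grows as the product of all the option lists, so a real crawler would probably want to cap or throttle this and fetch each URL with a polite delay.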

What is the best approach to achieve this? We can write individual bots for each website, but we are trying to come up with a more intelligent bot or crawler that can be used to crawl multiple websites. Please advise on how we can achieve this.

We are not doing anything illegal; everything is perfectly legal.

Oct 21 '08 #1