Bytes | Software Development & Data Engineering Community

Screen scraping

I am thinking of writing a PHP script that, given a URL, parses the page for certain download links and then scrapes each of those linked pages for a specific link. Basically, the script needs to go two links deep. How should I approach this?
Feb 22 '11 #1
Rabbit
Well, at a high level, what you would do is:
  1. Get the HTML source for the link
  2. Scan the source for links
  3. Repeat 1 and 2 for each link in that array
  4. Repeat 1 and 2 for each link in the arrays generated in step 3
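The steps above could be sketched in PHP roughly as follows. This is a minimal sketch, not a definitive implementation: `resolveUrl()`, `extractLinks()` and `crawl()` are hypothetical helper names, the URL resolution is deliberately naive (it ignores ports, `//host` links and `<base>` tags), and the page fetcher is passed in as a callable so you can swap in `file_get_contents`, cURL, or a stub.

```php
<?php
// Sketch only: all function names here are made up for illustration.

/** Turn an href into an absolute URL (naive: no ports, no "//host" links). */
function resolveUrl(string $href, string $baseUrl): string
{
    if (preg_match('#^https?://#i', $href)) {
        return $href; // already absolute
    }
    $parts  = parse_url($baseUrl);
    $origin = $parts['scheme'] . '://' . $parts['host'];
    if ($href[0] === '/') {
        return $origin . $href; // root-relative
    }
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $href; // relative to the current directory
}

/** Step 2: scan HTML source for <a href> links, returned as absolute URLs. */
function extractLinks(string $html, string $baseUrl): array
{
    if ($html === '') {
        return [];
    }
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        if ($href === '' || $href[0] === '#') {
            continue; // empty or same-page anchor
        }
        // Skip mailto:, javascript:, ftp: and other non-http schemes.
        if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $href)
            && !preg_match('#^https?://#i', $href)) {
            continue;
        }
        $links[] = resolveUrl($href, $baseUrl);
    }
    return array_values(array_unique($links));
}

/**
 * Repeat steps 1 and 2 for $rounds rounds, starting from $startUrl.
 * $fetch takes a URL and returns the HTML string, or false on failure.
 * Returns the links found in the last round; URLs already seen are skipped.
 */
function crawl(string $startUrl, int $rounds, callable $fetch): array
{
    $seen     = [$startUrl => true];
    $frontier = [$startUrl];
    for ($round = 0; $round < $rounds; $round++) {
        $next = [];
        foreach ($frontier as $url) {
            $html = $fetch($url);              // step 1
            if (!is_string($html)) {
                continue;                      // fetch failed, move on
            }
            foreach (extractLinks($html, $url) as $link) { // step 2
                if (!isset($seen[$link])) {
                    $seen[$link] = true;
                    $next[] = $link;
                }
            }
        }
        $frontier = $next;
    }
    return $frontier;
}

// Example call (needs network access and allow_url_fopen enabled):
// $links = crawl('https://example.com/downloads', 2, 'file_get_contents');
```

With `$rounds = 2` the script fetches the start page and every page it links to, and returns the links found on those second-level pages; `$rounds = 3` corresponds to steps 1 to 4. You would still need to filter the result for the specific link you are after.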

Just be aware that the number of downloads multiplies with every level you follow rather than growing linearly. If each page has 10 links, then steps 1 to 4 mean downloading 1 + 10 + 100 = 111 pages of data.
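To see how quickly the page count grows, here is a small sketch of the arithmetic. `pagesToFetch()` is a hypothetical helper, and it assumes every page has the same number of links and that no link repeats, which real sites will not satisfy exactly.

```php
<?php
// Sketch only: pagesToFetch() is a made-up name for illustration.

/**
 * Number of pages downloaded after $rounds rounds of "fetch, then scan",
 * assuming every page contains exactly $linksPerPage distinct links.
 */
function pagesToFetch(int $linksPerPage, int $rounds): int
{
    $total        = 0;
    $pagesInRound = 1; // the first round fetches only the start page
    for ($round = 0; $round < $rounds; $round++) {
        $total        += $pagesInRound;
        $pagesInRound *= $linksPerPage; // each page yields this many new pages
    }
    return $total;
}

echo pagesToFetch(10, 3), "\n"; // 1 + 10 + 100 = 111
```

One more round at 10 links per page would add another 1,000 downloads, so it is worth filtering links before following them.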
Feb 22 '11 #2
