Bytes | Software Development & Data Engineering Community

Get Data from a Website

OldBirdman
Access 2002 - Windows XP Home SP3
I want to get some data from a website. Currently I select the desired area of the web page and copy it to the clipboard. The selected area contains text and 2 embedded images, a .jpg (~7K) and a .gif.
When I press Ctrl+V (paste) in an Access textbox, the 2 images are gone. Also gone is whatever the designers of the web page use to put the various pictures, colors, etc. on the page. I know nothing about this.
I then have to return to the web page, separately Save As the .jpg image, and note which .gif image is shown so I can enter it manually using a combobox.

I want to do the following automatically:
1) Get the text without the embedded images.
2) Determine which (of about 20) .gif is present. This may be in the image name, if I could see it.
3) Determine whether the .jpg is 1 of 3 and, if not, Save As strFileName, where I can generate strFileName after analyzing the text in 1).

I notice that pasting the entire webpage into WordPad shows different results in the area I'm interested in than selecting that area from the webpage and pasting only that into WordPad.

Can someone point me toward info on how to do this?
Mar 9 '09 #1
mshmyob
Hello OB,

Have you tried HTML scraping? A member has posted some code here that you should be able to modify. I haven't tried the code myself, but have a look at it or read some articles on HTML scraping for more details.

http://bytes.com/topic/access/answer...-html-scraping
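
For anyone unfamiliar with the term, HTML scraping means reading the page's HTML source in code instead of going through the clipboard. The following is a minimal sketch, not the code from the link above. It assumes the MSXML component that normally ships with Windows XP is installed, and it uses late binding so no library reference is needed. The function name GetPageSource is made up for illustration.

```vba
' Sketch only: return the raw HTML of a page as a string.
' Late binding (CreateObject) avoids the need for a reference.
Public Function GetPageSource(ByVal strURL As String) As String
    Dim objHTTP As Object

    Set objHTTP = CreateObject("MSXML2.XMLHTTP")
    objHTTP.Open "GET", strURL, False      ' False = wait for the reply
    objHTTP.send

    ' 200 is the HTTP status for a successful request;
    ' on any other status the function returns an empty string.
    If objHTTP.Status = 200 Then
        GetPageSource = objHTTP.responseText
    End If

    Set objHTTP = Nothing
End Function
```

The returned string is the page source, so any img tags and their file names appear in it as plain text, which a pasted selection does not preserve.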


cheers,

@OldBirdman
Mar 10 '09 #2
ADezii
@OldBirdman
Couldn't the entire HTML source code be pasted into Word, where it could easily be examined, or saved as text only, so that the whole file can be opened in Access and each line analyzed in turn?
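
If the page source has been saved as a plain text file, analyzing each line in turn from Access could look roughly like this. It is a sketch only: the procedure name ListImageLines is hypothetical, and the caller supplies the path of the saved file.

```vba
' Sketch only: print every line of a saved HTML source file
' that mentions a .gif or .jpg image.
Public Sub ListImageLines(ByVal strPath As String)
    Dim intFile As Integer
    Dim strLine As String

    intFile = FreeFile                     ' next unused file number
    Open strPath For Input As #intFile

    Do While Not EOF(intFile)
        Line Input #intFile, strLine
        ' vbTextCompare makes the search case-insensitive
        If InStr(1, strLine, ".gif", vbTextCompare) > 0 _
           Or InStr(1, strLine, ".jpg", vbTextCompare) > 0 Then
            Debug.Print strLine            ' shown in the Immediate window
        End If
    Loop

    Close #intFile
End Sub
```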
Mar 11 '09 #3
OldBirdman
I cannot get the code from the HTML scraping link to work. I get the message
Compile error:
User-defined type not defined
on Line 2:
    Dim webBrowser As webBrowser
Apparently Access 2002 does not know what a webBrowser is. I don't understand the code, and can't figure it out if I can't run it. I learn from new examples by stepping through the code, but this code won't start for me.
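
This error most likely means the database has no reference to the library that defines the WebBrowser type (Microsoft Internet Controls, under Tools > References in the VBA editor). Late binding sidesteps the reference altogether. Below is a minimal sketch under that assumption. The procedure name ListPageImages is made up, and the code has not been tried against the page in question.

```vba
' Sketch only: open a page in a hidden Internet Explorer instance
' and print the address of every image on it.
Public Sub ListPageImages(ByVal strURL As String)
    Dim objIE As Object
    Dim objImg As Object

    Set objIE = CreateObject("InternetExplorer.Application")
    objIE.Visible = False
    objIE.Navigate strURL

    ' Wait until the page has finished loading (4 = READYSTATE_COMPLETE)
    Do While objIE.Busy Or objIE.ReadyState <> 4
        DoEvents
    Loop

    ' Each img element exposes its address in the src property
    For Each objImg In objIE.Document.images
        Debug.Print objImg.src
    Next objImg

    objIE.Quit
    Set objIE = Nothing
End Sub
```

Because objIE is declared As Object, the compiler no longer needs to know the type in advance, so the code can be stepped through in the debugger.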

ADezii wrote: "Couldn't the entire HTML source code be pasted into Word, where it could easily be examined, or saved as text only, so that the whole file can be opened in Access and each line analyzed in turn?"
I've never tried automation, which I believe this involves. However, I did open Word and paste the entire web page into it. The paste took about 45 seconds (hourglass), even on repeated pastes. Clearing the document in preparation for another paste took 10 seconds.
Loading Word has its own time penalty. I don't know enough about Word to create a command button to read the text. So here is a whole new subject to investigate, but I think not now.
As I look at the Word page after the paste, I still can't find the .jpg picture address to download and save. The .gif name is embedded in the image frame, and that name would be enough for my purposes, if I knew enough to get to it. But the time cost is excessive, and I still would have to return to IE and "SaveAs..." the image.
I currently paste my selected area of the webpage into an unbound textbox, assign to a string variable, and scrape it. This gets me all the info I need except the .gif name and the ability to save the .jpg image.
So far, my steps are:
1) Press "New Record" button in my Access database (myDB)
2) Alt+Tab to Microsoft Internet Explorer (IE)
   and select the correct tab and/or navigate
   to the desired web page
3) Select desired section of web page
4) Ctrl+C to copy to clipboard
5) Alt+Tab to return to myDB
6) Press command button "Paste from Website"
7) If MsgBox "GIF not determined" appears, clear with "OK"
8) Alt+Tab to return to IE
9) Right-click .jpg image and select "Save Picture As..."
      Ctrl+V to paste file name into dialog
          (this was generated in step 6
           and copied to clipboard)
      Press "Enter" or click "Save" to save image
10) Mentally note .gif displayed
11) Alt+Tab to return to myDB
12) If 7) displayed msg, click combobox and
      select row to note .gif displayed
13) Click command button to record that an image
      was actually acquired & saved.
Although this seems like a clumsy set of steps, replacing them with the following doesn't seem to help:
1) same as 1) above
2) same as 2) above
3) Ctrl+A to select the entire web page
4) Ctrl+C to copy web page to clipboard
5) Alt+Tab to Word
6) Ctrl+A to select everything in Word
7) Ctrl+V to paste the selection from step 4,
    overwriting anything already in Word
8) Alt+Tab to myDB
9) Press command button "Scrape from Word"
10) <<I still have no .jpg image, not sure of the steps here>>
What I am aiming for (and may not get to) is:
1) same as 1) above
2) same as 2) above
3) Alt+Tab to myDB
4) Press command button "Get from Website"
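
A "Get from Website" button would need two small helpers once the image addresses are known: one to pull the file name out of an image address (to identify the .gif) and one to save the .jpg without going through "Save Picture As...". The sketch below assumes 32-bit VBA as in Access 2002 and uses the Windows URLDownloadToFile API. The names ImageNameFromURL and SaveImage are hypothetical.

```vba
' The Declare must sit in the declarations section at the top of a
' standard module, before any procedures (32-bit VBA).
Private Declare Function URLDownloadToFile Lib "urlmon" _
    Alias "URLDownloadToFileA" ( _
    ByVal pCaller As Long, ByVal szURL As String, _
    ByVal szFileName As String, ByVal dwReserved As Long, _
    ByVal lpfnCB As Long) As Long

' Returns the part of an image address after the last "/",
' i.e. the file name. If there is no "/", returns the whole string.
Public Function ImageNameFromURL(ByVal strURL As String) As String
    ImageNameFromURL = Mid$(strURL, InStrRev(strURL, "/") + 1)
End Function

' Downloads the image at strURL to the file strFileName.
' Returns True when the API call reports success (return value 0).
Public Function SaveImage(ByVal strURL As String, _
                          ByVal strFileName As String) As Boolean
    SaveImage = (URLDownloadToFile(0, strURL, strFileName, 0, 0) = 0)
End Function
```

The button's click procedure could then compare ImageNameFromURL of the .gif address against the list of about 20 known names, and call SaveImage with the generated strFileName when the .jpg is not one of the 3.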
Mar 11 '09 #4
