473,520 Members | 14,047 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

How to import data from a website using VBA

Seth Schrock
2,965 Recognized Expert Specialist
I'm creating a database that will pull some data from a website and then do the necessary manipulations to it. The BE of the database is SQL Server with an Access FE. I have gotten as far as being able to pull the source code from the website and then extract just the HTML table. I'm now stuck with the HTML table code stored in a string variable and I don't know how to proceed from here. I could use the InStr(), and Mid() functions to get the values out of the table and then write them to a table, but I was hoping that I could just import the data so that I wouldn't have to write as much code that gets difficult to figure out because of nested InStr() and Mid() functions.
Sep 30 '13 #1
8 7848
Rabbit
12,516 Recognized Expert Moderator MVP
Rather than a bunch of mid and instr methods, I would use split or regular expressions. I don't think there's a native function to inport html data.
Sep 30 '13 #2
ADezii
8,834 Recognized Expert Expert
How about Posting a sample of the Data contained in your String Variable.
Oct 1 '13 #3
Hennepin
25 New Member
I wrote this code several years ago to down load baseball stats from espn. It reads one character at a time.
saves the output into a comma seperated file.
You can tweak it to your needs.

Have a reference to microsoft scripting runtime for file system object.

Expand|Select|Wrap|Line Numbers
  1. Option Compare Database
  2. Option Explicit
  3.  
  4. Enum RdProcess
  5.     tableout
  6.     tablein
  7.     tagin
  8.     tagout
  9.     fieldin
  10.     fieldout
  11.     rowskip
  12.     Readstart
  13.     Readend
  14. End Enum
  15.  
  16.  
  17. Private Declare Function URLDownloadToFile Lib "urlmon" Alias _
  18.   "URLDownloadToFileA" (ByVal pCaller As Long, ByVal szURL As String, ByVal _
  19.     szFileName As String, ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long
  20.  
  21. Public Function DownloadFilefromWeb(Url As String, strSavePath As String) As Boolean
  22.     Dim ret As Long
  23.     ret = URLDownloadToFile(0, Url, strSavePath, 0, 0)
  24.     If ret = 0 Then
  25.         DownloadFilefromWeb = True
  26.     Else
  27.         DownloadFilefromWeb = False
  28.     End If
  29. End Function
  30.  
  31. Public Sub dofiles()
  32.     Dim fso As FileSystemObject, tso As TextStream, tsoi As TextStream, i As Integer, spath As String, Url As String
  33.     Dim strSavePath As String
  34.     spath = "E:\strato\"
  35.     Set fso = CreateObject("Scripting.FileSystemObject")
  36.  
  37.     Set tso = fso.OpenTextFile(spath & "newpit.txt", ForWriting, True, 0)
  38.     Url = "http://espn.go.com/mlb/stats/pitching/_/qualified/false/order/false"
  39.     strSavePath = spath & "temp_pit_file.txt"
  40.     i = 1
  41.     Do While DownloadFilefromWeb(Url, strSavePath)
  42.         Set tsoi = fso.OpenTextFile(strSavePath, ForReading, False, 0)
  43.         If get_stats_table(tso, tsoi) Then
  44.             tsoi.Close
  45.             Exit Do
  46.         End If
  47.         tsoi.Close
  48.         Url = "http://espn.go.com/mlb/stats/pitching/_/count/" & i & "/qualified/false/order/false"
  49.         i = i + 40 'forty players per page
  50.     Loop
  51.     tso.Close
  52.  
  53.     Set tso = fso.OpenTextFile(spath & "newbat.txt", ForWriting, True, 0)
  54.     Url = "http://espn.go.com/mlb/stats/batting/_/qualified/false"
  55.     strSavePath = spath & "temp_bat_file.txt"
  56.     i = 1
  57.     Do While DownloadFilefromWeb(Url, strSavePath)
  58.         Set tsoi = fso.OpenTextFile(strSavePath, ForReading, False, 0)
  59.         If get_stats_table(tso, tsoi) Then
  60.             tsoi.Close
  61.             Exit Do
  62.         End If
  63.         tsoi.Close
  64.         Url = "http://espn.go.com/mlb/stats/batting/_/count/" & i & "/qualified/false"
  65.         i = i + 40
  66.     Loop
  67.  
  68.     tso.Close
  69. End Sub
  70. Public Function get_stats_table(tso As TextStream, tsoi As TextStream) As Boolean
  71.     Dim i As Integer, s As String
  72.     Dim tagVal  As String, fval As String
  73.     Dim stat As RdProcess, FoundTable As Boolean, readtable As Boolean, decount As Integer
  74.  
  75.     FoundTable = False
  76.     stat = tableout
  77.  
  78.     Do
  79.         s = tsoi.Read(1)
  80.         Select Case stat
  81.         Case tableout
  82.             tagVal = Right(tagVal, 5) + s
  83.             If tagVal = "<Table" Then stat = tablein
  84.         Case tablein
  85.             If s = "<" Then ' start reading a tag else ignor the stuff
  86.                 stat = tagin
  87.                 tagVal = "<"
  88.             End If
  89.         Case tagin
  90.             tagVal = tagVal + s
  91.             If s = ">" Then
  92.                 stat = Handle_the_Tag(tagVal, fval, tso, FoundTable, decount)
  93.                 tagVal = ""
  94.                 If stat = Readend Then
  95.                     Exit Do
  96.                 End If
  97.             End If
  98.         Case tagout ' decifer the tag
  99.             If s = "<" Then
  100.                 stat = tagin
  101.                 tagVal = s
  102.             End If
  103.         Case fieldin
  104.             If s = "<" Then
  105.                 tagVal = "<"
  106.                 stat = tagin
  107.             Else
  108.                 fval = fval + s
  109.             End If
  110.         Case rowskip
  111.             tagVal = Right(tagVal, 4) + s
  112.             If tagVal = "</tr>" Then stat = tagout
  113.         Case Readend
  114.             Exit Do
  115.         End Select
  116.     Loop Until tsoi.AtEndOfStream
  117.       '  tso.Write (s)
  118.     get_stats_table = CBool(decount < 40)
  119. End Function
  120. '<table     </table>
  121. '<td        </td>             <td align="left">WAS</td>
  122. '<tr        </tr>
  123. Function Handle_the_Tag(tagVal As String, fval As String, tso As TextStream, FoundTable As Boolean, decount As Integer) As RdProcess
  124.     Dim p As Integer
  125.     If tagVal Like "</td>" Then 'write  comma space
  126.         If FoundTable Then
  127.             tso.Write fval & ", "
  128.         Else
  129.             If fval = "Sortable Batting" Then
  130.                 FoundTable = True
  131.             End If
  132.             If fval = "Sortable Pitching" Then
  133.                 FoundTable = True
  134.             End If
  135.         End If
  136.         fval = ""
  137.         Handle_the_Tag = tagout
  138.     ElseIf tagVal Like "<td*>" Then
  139.         fval = ""
  140.         Handle_the_Tag = fieldin
  141.     ElseIf tagVal = "</table>" Then
  142.         If FoundTable Then
  143.             Handle_the_Tag = Readend        'done reading file
  144.         Else
  145.             Handle_the_Tag = tableout
  146.         End If
  147.     ElseIf tagVal = "</tr>" Then 'write newline
  148.         If FoundTable Then tso.Write vbCrLf
  149.         Handle_the_Tag = tagout
  150.     ElseIf tagVal Like "<tr class=""colhead""*>" Then
  151.         Handle_the_Tag = rowskip
  152.     ElseIf tagVal Like "<a href=*" Then   'get Id and write id comma space
  153.         If FoundTable Then
  154.             p = InStrRev(tagVal, "id/") '<a href="/mlb/players/profile?playerId=4949"> Willie Harris</a>
  155.             '<a href="http://espn.go.com/mlb/player/_/id/6261/scott-baker">   Scott Baker
  156.             fval = "" & val(Mid(tagVal, p + 3))
  157.             decount = decount + 1
  158.             p = InStrRev(tagVal, "/")
  159.             fval = fval & ", " & Mid(tagVal, p + 1)
  160.             fval = Replace(fval, """>", "")
  161.  
  162.             tso.Write fval & ", "
  163.             fval = ""
  164.             Handle_the_Tag = fieldin
  165.         Else
  166.             Handle_the_Tag = tagout
  167.         End If
  168.     Else
  169.         Handle_the_Tag = fieldin
  170.     End If
  171. End Function
  172.  
  173.  
Oct 1 '13 #4
Seth Schrock
2,965 Recognized Expert Specialist
Expand|Select|Wrap|Line Numbers
  1. <table class="table1" cellpadding="0" cellspacing="0" width="185">
  2. <colgroup class="stdColgroup">
  3. <col span="1" class="hdrCol"><col span="4" class="stdCol">
  4. </colgroup>
  5. <thead>
  6.    <tr><th colspan="5"><a href="/pmms/" target="_top">Freddie Mac's Primary Mortgage Market Survey</a>&reg;</th></tr>
  7.    <tr class="theadSub"><td>&nbsp;</td><td>30YR FRM</td><td>15YR FRM</td><td>5YR ARM</td><td>1YR ARM</td></tr>
  8. </thead>    
  9.  
  10. <tfoot>
  11.    <tr><td colspan="5">Copyright 2013, <a href="http://www.freddiemac.com/" target="_top">Freddie Mac</a>. 
  12.    Averages are for conforming mortgages with 20% down.</td></tr>
  13. </tfoot>
  14. <tr><th>Avg.</th><td>4.32 </td><td>3.37 </td> 
  15.     <td>3.07 </td><td>2.63 </td></tr>
  16. <tr><th>Fees&nbsp;& Points</th><td>0.7 </td><td>0.7 </td>
  17.     <td>0.5 </td><td>0.4 </td></tr>
  18.  
  19. </table>    
  20.  
That is the HTML Table code. I know that I can import an HTML document using the import wizard in Access 2010, so I suppose that I could save the text as a file and then import the file, but I don't know how to do those steps either.

I will have to look at the Split and Regular functions to see how I can use them as I'm not familiar with those.
Oct 1 '13 #5
ADezii
8,834 Recognized Expert Expert
Within a couple of minutes I was able to extract most of the Primary Data as such:
Expand|Select|Wrap|Line Numbers
  1. Dim strHTML As String
  2. Dim varSplit As Variant
  3. Dim intCtr As Integer
  4.  
  5. strHTML = "<table class='table1' cellpadding='0' cellspacing='0' width='185'>" & _
  6.           "<colgroup class='stdColgroup'>" & _
  7.           "<col span='1' class='hdrCol'><col span='4' class='stdCol'>" & _
  8.           "</colgroup>" & _
  9.           "<thead>" & _
  10.           "<tr><th colspan='5'><a href='/pmms/' target='_top'>Freddie Mac's Primary Mortgage Market Survey</a>&reg;</th></tr>" & _
  11.           "<tr class='theadSub'><td>&nbsp;</td><td>30YR FRM</td><td>15YR FRM</td><td>5YR ARM</td><td>1YR ARM</td></tr>" & _
  12.           "</thead>" & _
  13.           "<tfoot>" & _
  14.           "<tr><td colspan='5'>Copyright 2013, <a href='http://www.freddiemac.com/' target='_top'>Freddie Mac</a>." & _
  15.           "Averages are for conforming mortgages with 20% down.</td></tr>" & _
  16.           "</tfoot>" & _
  17.           "<tr><th>Avg.</th><td>4.32 </td><td>3.37 </td>" & _
  18.           "<td>3.07 </td><td>2.63 </td></tr>" & _
  19.           "<tr><th>Fees&nbsp;& Points</th><td>0.7 </td><td>0.7 </td>" & _
  20.           "<td>0.5 </td><td>0.4 </td></tr>" & _
  21.           "</table>"
  22.  
  23. varSplit = Split(strHTML, "<td>")
  24.  
  25. For intCtr = LBound(varSplit) To UBound(varSplit)
  26.   Debug.Print Split(varSplit(intCtr), " ")(0)
  27. Next
OUTPUT:
Expand|Select|Wrap|Line Numbers
  1. <table
  2. &nbsp;</td>
  3. 30YR
  4. 15YR
  5. 5YR
  6. 1YR
  7. 4.32
  8. 3.37
  9. 3.07
  10. 2.63
  11. 0.7
  12. 0.7
  13. 0.5
  14. 0.4
Oct 1 '13 #6
Seth Schrock
2,965 Recognized Expert Specialist
I appreciate all the help. Unfortunately, I have been pulled into another task right now, but hopefully I will be able to test all of your solutions in the next few days. I will post back the results and if I need any help. Thanks again.
Oct 1 '13 #7
Seth Schrock
2,965 Recognized Expert Specialist
I have finally gotten back to this. Thanks ADezii. It worked perfectly and it is extremely easy to implement.
Oct 22 '13 #8
ADezii
8,834 Recognized Expert Expert
You are quite weldcome, Seth.
Oct 24 '13 #9

Sign in to post your reply or Sign up for a free account.

Similar topics

3
45184
by: John Kolvereid | last post by:
How does one import data INTO a table. I am forced to read files (\i) w/ a ton of insert statements. I am trying to import approx 800 lines into the table. However, I must bundle them into a series of files because of the bug in PGSQL (the buffer overflows). Interestingly, the literature claims that the one feature of PostgreSQL is SPEED....
0
1047
by: William Fields | last post by:
Hello, I've used FrontPage to build some simple websites and am interested in what Visual Studio and ASP.NET can give me. Is there a way to import a website into a Visual Studio Web project like you can with FrontPage? Thanks.
3
2673
by: Schultz | last post by:
I would like to know if it is possible to import data from MS Excel 2000 to a dataset using asp.net. The excel file would need to be uploaded to the server from a web page first, before the loaded into a dataset. After the data is stored in the dataset, i would need to write the information to an access db. If it is possible to do this, where...
1
2315
by: Deep | last post by:
i whould u like to import data from sql server database into excel sheet using vb.net
0
12754
by: barrybevel | last post by:
Hi, I'm trying to login to the www.vodafone.ie website using HttpWebRequest. It works fine with IE/Firefox and the .NET Web Control too, just not with my code. I think it's a redirect 302 problem. I'm using this code in a ASP.NET 2.0 application just in case that matters, maybe someone knows a better way to do this?
1
3005
by: Nick Craig-Wood | last post by:
I've been dumping a database in a python code format (for use with Python on S60 mobile phone actually) and I've noticed that it uses absolutely tons of memory as compared to how much the data structure actually needs once it is loaded in memory. The programs below create a file (z.py) with a data structure in which looks like this --...
1
3726
by: crs27 | last post by:
Hai All, I want to import data from Oracle 10g to POstgreSQl 8.2.Would like to know if their is any way to import data from pgAdmin III. Thought of using Copy command,but for this the files needs to be either in Ascii or Binary formate. Kindly suggest me any means of importing data from oracle to postgres. Thank You
0
1245
by: nuray | last post by:
I have a spreadsheet where I run a query in Access and copy it into the specific spreadsheets. I was trying to use Data->import external data->import data tool in excel to get the data automaticly w/o me going into Access and running the query and pasting into the spreadsheet. However, I can go to the database using this tool in excel, but I can...
9
5762
by: a | last post by:
Dear friends I want import data from CSV file to mdb file How can I do that in vb.net?
1
5018
by: cindy7 | last post by:
how can i read excel data and import into sql using cfm??? I have a CFM form for user to import excel file into database and save file in the folder at server location. I would like to read that excel file and import data into sql database each time the user upload the file and create new record. Any idea how to solve this problem, every...
0
7228
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language...
0
7620
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that...
1
7195
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...
0
7585
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the...
0
5761
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then...
1
5150
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes...
0
3295
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in...
0
3292
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
0
521
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.