Output html file to text file

jimleon
74
Hi People,

I have a link in my VBA code:

Application.FollowHyperlink "http://www.xxxxxx.co.uk/"

I don't want to open a browser (or show it minimised); I need to dump the resulting HTML page to a text file so that I can search it for a string.

I have tried:
Application.FollowHyperlink "http://www.xxxxxx.co.uk/" >c:\temp.txt

but that doesn't work.

Any ideas?
Oct 7 '09 #1

NeoPa
32,556 Expert Mod 16PB
I cannot help much, other than to point you towards the Microsoft Web Browser OLE class. It needs a Reference set up to the Microsoft HTML Reference Library.

Hope this helps.

I'm pretty sure using the .FollowHyperlink won't get you what you want.
Oct 7 '09 #2
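If the aim is only to get the page source without showing a browser window, one possible alternative (not the approach taken in the replies below) is a plain HTTP request through the MSXML2.XMLHTTP object. This is a minimal sketch, assuming that object is available on the machine; the function name FetchPageText is hypothetical, and error handling for network failures is left to the caller.

```vba
Option Explicit

'Hypothetical helper (not from the thread): downloads strUrl and returns the
'response body as text, or a zero-length string if the status is not HTTP 200.
Public Function FetchPageText(ByVal strUrl As String) As String
    Dim objHttp As Object

    Set objHttp = CreateObject("MSXML2.XMLHTTP")   'late bound, no reference needed
    objHttp.Open "GET", strUrl, False              'False = wait for the response
    objHttp.send
    If objHttp.Status = 200 Then
        FetchPageText = objHttp.responseText
    End If
    Set objHttp = Nothing
End Function
```

A call such as InStr(1, FetchPageText("http://www.xxxxxx.co.uk/"), "some text", vbTextCompare) > 0 would then test for a string directly. Note that this returns the source as the server sends it, so anything the page builds with client-side script will not be present.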
Delerna
1,134 Expert 1GB
Here is my attempt. This is the first time I have done this; I like it and I will use it.
It's something I have contemplated for quite a while but never got around to.
I have kept the code as simple as possible so its workings are obvious.
You will need to add error checking etc etc.


Step 1
add a reference to the "Microsoft Internet Controls" type library

Step 2
add the code and modify for your situation
    Private ieBrowser As InternetExplorer

    Private Sub Form_Load()
       'Create a browser object
       Set ieBrowser = CreateObject("internetexplorer.application")
       ieBrowser.Navigate "http://www.delerna.com/Index.asp"

       'Wait for the page to load
       dtStartTime = Now
       Do While ieBrowser.readyState <> READYSTATE_COMPLETE
          If DateDiff("s", dtStartTime, Now) > 240 Then Exit Sub
       Loop

       'Get the page contents
       sDocHTML = ieBrowser.Document.documentElement.innerHTML

       'And save it
       Open "c:\Test.txt" For Output As 1
       Print #1, sDocHTML
       Close #1

       'destroy the browser object
       Set ieBrowser = Nothing
    End Sub

Good luck and thanks for influencing me to actually sit down and finally do it.
Oct 16 '09 #3
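Since the original goal was to search the saved page for a string, the file written by the code above can be read back and tested with InStr. The following is a minimal sketch, assuming the file is small enough to load into a single string; the function name FileContainsText is hypothetical and not from the thread.

```vba
Option Explicit

'Hypothetical helper (not from the thread): returns True if strNeedle occurs
'anywhere in the file at strPath, ignoring case.
Public Function FileContainsText(ByVal strPath As String, _
                                 ByVal strNeedle As String) As Boolean
    Dim intFile As Integer
    Dim strText As String

    intFile = FreeFile                      'next unused file number
    Open strPath For Binary Access Read As #intFile
    strText = Space$(LOF(intFile))          'buffer sized to the file length
    Get #intFile, , strText                 'read the whole file in one go
    Close #intFile

    'vbTextCompare makes the search case-insensitive
    FileContainsText = (InStr(1, strText, strNeedle, vbTextCompare) > 0)
End Function
```

For example, Debug.Print FileContainsText("c:\Test.txt", "some text") in the Immediate window reports whether the saved page contains that text.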
NeoPa
32,556 Expert Mod 16PB
@Delerna
I found a Microsoft HTML Object Library reference Delerna, but nothing similar to that :S
Oct 16 '09 #4
NeoPa
32,556 Expert Mod 16PB
Like you I think I'm ready to start playing in this area. I'd be interested in following your code more closely, but I find there are items which are not declared (sDocHTML, etc). Is this because it's declared elsewhere or do you not have Option Explicit set? I ask this not to criticise you understand, but if you don't use this as standard then may I suggest you reconsider that approach. I have a short article on the matter here (Require Variable Declaration).

Again, I expect you may be doing this already and just omitted a couple of lines of your code. In that case please just ignore this (but I'm interested in the full code anyway ;)).
Oct 16 '09 #5
Delerna
1,134 Expert 1GB
Hi NeoPa,
Yes, I normally use Option Explicit, but this code is something I threw together in about 15 minutes for the sole purpose of answering the posted question. So, to answer your question: I did not use Option Explicit, and the posted code is complete.

I hope I can be forgiven for the bad programming practice. I could try to excuse it by the simplicity of the program, but in reality there is no excuse :{
No offence is taken, you are just....right!

As to the "Microsoft Internet Controls" library, that is definitely the one.
I also have a reference to the "Microsoft HTML Object" Library, so I tried removing the reference to "Microsoft Internet Controls" and then checked IntelliSense by typing a space immediately after "As" in

Private ieBrowser As InternetExplorer

and there was no InternetExplorer in the list.
So it's definitely "Microsoft Internet Controls".

The code is actually written from ideas presented on a few sites I found with Google. Much of the code was copy and paste, as evidenced by the variable you point out, sDocHTML. My normal practice would have it as strDocHTML.
Anyway, here is a cleaned-up version of the code:

    Option Compare Database
    Option Explicit

    Private ieBrowser As InternetExplorer

    Private Sub Form_Load()
       Dim strDocHTML As String, dteStartTime As Date
       Dim blnLoaded As Boolean, intFile As Integer

       'Create a browser object
       Set ieBrowser = CreateObject("internetexplorer.application")
       ieBrowser.Navigate "http://www.delerna.com/Index.asp"

       'Wait for the page to load. Give up, saving nothing, if loading the page takes too long
       dteStartTime = Now
       blnLoaded = True
       Do While ieBrowser.readyState <> READYSTATE_COMPLETE
          If DateDiff("s", dteStartTime, Now) > 240 Then
             blnLoaded = False
             Exit Do
          End If
          DoEvents   'yield so the loop does not lock up Access while waiting
       Loop

       If blnLoaded Then
          'Get the page contents
          strDocHTML = ieBrowser.Document.documentElement.innerHTML

          'And save it
          intFile = FreeFile   'next free file number instead of a hard-coded 1
          Open "c:\Test.txt" For Output As #intFile
          Print #intFile, strDocHTML
          Close #intFile
       End If

       'Close the hidden browser process, then destroy the browser object
       ieBrowser.Quit
       Set ieBrowser = Nothing
    End Sub

Oct 18 '09 #6
NeoPa
32,556 Expert Mod 16PB
Perfectly timed Delerna :)

I was just answering a thread where the OP wanted information about the public facing IP address they were published as (I know. Don't even ask). I tried to explain why this was less straightforward than they imagined but ...

Anyway, I wanted to post a link to this thread but I was struggling to find it again. At this point you posted. Nice one!

FYI: The other thread is Anyone out there have a clean Get_External_IP_Address function?.
Oct 18 '09 #7
Delerna
1,134 Expert 1GB
Where to get "Microsoft Internet Controls"
I had the impression from the web sites I visited that it was included with Access. I have Access 2003.
I rarely use Access these days; we use ASP and HTML as front ends to SQL Server. That's part of the reason I come here to answer questions...it maintains my Access skillset....Access is a great tool and it also happens to be where I learned much of what I know.

Anyway, if it's not part of Access, I also have the following installed on my computer:
.NET Framework SDK v2.0
Visual Web Developer 2008 Express
Visual Studio Pro 2005

Maybe it came from one of those?
Oct 18 '09 #8
NeoPa
32,556 Expert Mod 16PB
@Delerna
I did some searching and it seems that the Internet Client SDK is required for that reference.
Microsoft Internet Controls Life Saver
The Internet Client SDK can be downloaded from http://www.microsoft.com/ie/ie50
Thanks for your clearly explained answer. That has helped me find what I needed to proceed on this :) It should also help the OP of the linked thread. Bonus!
Oct 18 '09 #9
Delerna
1,134 Expert 1GB
I needed to download the content of a web page on a regular basis so that I could monitor the current state of some facts and figures. So I wrote a VBScript version of the above that I could schedule. I don't know if anyone is interested, but I thought I would post it here.
The conversion was not very difficult, so you probably could have figured it out anyway.

    Const READYSTATE_COMPLETE = 4

    Dim ieBrowser

    'Create a browser object
    Set ieBrowser = CreateObject("internetexplorer.application")
    'No parentheses here: VBScript does not allow them around a multi-argument Sub call
    SavePageContent "c:\scripts\Test.txt", "page URL"
    ieBrowser.Quit            'close the hidden browser process
    Set ieBrowser = Nothing   'destroy the browser object
    MsgBox "Done"

    Sub SavePageContent(Pth, Pge)
       Dim dteStartTime

       ieBrowser.Navigate Pge

       'Wait for the page to load. Exit sub, doing nothing, if loading the page takes too long
       dteStartTime = Now
       Do While ieBrowser.readyState <> READYSTATE_COMPLETE
          If DateDiff("s", dteStartTime, Now) > 1000 Then Exit Sub
          WScript.Sleep 100   'pause briefly instead of spinning the CPU
       Loop

       'And save it
       Dim fso, MyFile
       Set fso = CreateObject("Scripting.FileSystemObject")
       Set MyFile = fso.CreateTextFile(Pth, True)   'True = overwrite an existing file
       MyFile.WriteLine ieBrowser.Document.documentElement.innerHTML
       MyFile.Close
       Set MyFile = Nothing
       Set fso = Nothing
    End Sub

Nov 24 '09 #10
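The VBScript version needs no library reference because it declares ieBrowser without a type and defines READYSTATE_COMPLETE itself. The same late-binding approach should also work in VBA, which may help when the "Microsoft Internet Controls" reference cannot be found. This is a minimal sketch under that assumption; the procedure name SavePageLateBound and its parameters are hypothetical.

```vba
Option Explicit

'Defined locally because no type library supplies it under late binding
Private Const READYSTATE_COMPLETE As Long = 4

'Hypothetical procedure (not from the thread): saves the page at strUrl to strPath
Public Sub SavePageLateBound(ByVal strUrl As String, ByVal strPath As String)
    Dim objBrowser As Object        'late bound, so no reference is required
    Dim dteStartTime As Date
    Dim blnLoaded As Boolean
    Dim intFile As Integer

    Set objBrowser = CreateObject("InternetExplorer.Application")
    objBrowser.Navigate strUrl

    'Wait up to 240 seconds for the page, as in the earlier posts
    dteStartTime = Now
    blnLoaded = True
    Do While objBrowser.readyState <> READYSTATE_COMPLETE
        If DateDiff("s", dteStartTime, Now) > 240 Then
            blnLoaded = False
            Exit Do
        End If
        DoEvents
    Loop

    If blnLoaded Then
        intFile = FreeFile
        Open strPath For Output As #intFile
        Print #intFile, objBrowser.Document.documentElement.innerHTML
        Close #intFile
    End If

    'Always close the browser process, whether or not the page loaded
    objBrowser.Quit
    Set objBrowser = Nothing
End Sub
```

The trade-off is that late binding gives up IntelliSense and compile-time checking of the browser's members, so a misspelled property only fails at run time.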
