| re: how we extract data from html file
HTML is not a data structure, so it cannot be accessed like one.
XML is a data structure, and it has rules that HTML does not. What you are doing is called "screen scraping" and is a dubious activity at best.
Here's an example:
In XML, tags can identify a string as a piece of data, such as a <firstname>, <lastname>, or <date>. HTML tags can only identify how the string is to be displayed (big, bold, and blue). HTML is a display structure.
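To make the difference concrete, here is a rough sketch of pulling a named field out of XML in VB6. It assumes the MSXML library is available on the machine; the function name GetFirstNameFromXML and the sample element names are made up for this example.
'***************************************************************
' Sketch only: function name and element names are invented for illustration.
Public Function GetFirstNameFromXML(ByVal sXML As String) As String
    Dim oDoc As Object
    Dim oNode As Object

    ' Late-bound, so no project reference to MSXML is needed
    Set oDoc = CreateObject("MSXML2.DOMDocument")
    oDoc.async = False

    ' loadXML returns False when the text is not well-formed XML
    If Not oDoc.loadXML(sXML) Then Exit Function

    ' The tag name itself says what the data is
    Set oNode = oDoc.selectSingleNode("//firstname")
    If Not oNode Is Nothing Then GetFirstNameFromXML = oNode.Text
End Function
'***************************************************************
Called as GetFirstNameFromXML("<person><firstname>Ann</firstname><lastname>Lee</lastname></person>"), this should hand back Ann no matter where the element sits in the document or how it is displayed. Nothing in typical HTML gives you a handle like that.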
You can only hope that whoever creates the HTML will never change how they do it, but they will. HTML is also typically not well-formed (unless it is XHTML), and therefore cannot be parsed using standard XML tools.
To parse HTML you will pretty much be forced to do it all manually with the string functions: InStr, Left, Right, Mid, Replace, etc. It gets pretty ugly, and the code has to change every time the page author changes their mind.
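Most of that manual work comes down to finding a marker and cutting out what sits next to it. A minimal sketch of that pattern, assuming you know a piece of text that comes just before the value and one that comes just after it (GetTextBetween is a hypothetical helper name, not a built-in):
'***************************************************************
' Returns the text found between sStart and sEnd,
' or an empty string if either marker is missing.
Public Function GetTextBetween(ByVal sHTML As String, _
                               ByVal sStart As String, _
                               ByVal sEnd As String) As String
    Dim lStart As Long
    Dim lEnd As Long

    lStart = InStr(1, sHTML, sStart, vbTextCompare)
    If lStart = 0 Then Exit Function
    lStart = lStart + Len(sStart)      ' skip past the opening marker

    lEnd = InStr(lStart, sHTML, sEnd, vbTextCompare)
    If lEnd = 0 Then Exit Function

    GetTextBetween = Trim$(Mid$(sHTML, lStart, lEnd - lStart))
End Function
'***************************************************************
For example, GetTextBetween("<b>Name:</b> Smith<br>", "</b>", "<br>") returns Smith. The moment the page author swaps <b> for something else, it returns an empty string, which is exactly the fragility described above.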
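'***************************************************************
' Returns the date found after the marker text "aktiva" in sHTML.
' Returns 1/1/1970 if the marker is missing or the text is not a valid date.
' Replace the two bracketed placeholders with your own numbers before compiling.
Public Function GetDatefromHTML(ByVal sHTML As String) As Date
    Dim sTemp As String
    Dim lPos As Long

    ' Fallback value; built with DateSerial so it does not depend on locale
    GetDatefromHTML = DateSerial(1970, 1, 1)

    lPos = InStr(1, sHTML, "aktiva", vbTextCompare)
    If lPos = 0 Then Exit Function   ' marker not found; Mid$ would fail on position 0
    sTemp = Mid$(sHTML, lPos)

    On Error GoTo BadDate
    GetDatefromHTML = CDate(Mid$(sTemp, [number of chars into sTemp the date begins], [length of the date string]))
    Exit Function

BadDate:
    ' CDate could not convert the text; the fallback date set above is returned
End Function
'***************************************************************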
There are lots of other ways to do this - this is pretty down and dirty. The CDate function (built into VB) converts most any valid date string into a Date type. It reads the string according to your computer's locale settings, and the result displays however your locale is set.
If the HTML changes, though, or the date is bad, it will return 1/1/1970 (you can pick any date, but the function has to return a date).
|