Line Input Method

Hello, all.

I am writing a program that parses an html report (generated by
another program).

I use the Open "C:\...." for Input As myValue method to open
the html, and read through it line by line, using "Line Input
#myValue, myLine" to do so.

My problem is that there are, occasionally, unusual Unicode characters
in the report. When the Line Input method hits a line with one of
these square-looking characters, it interprets it as an End-Of-File
marker. There is more information after it but VB refuses to access
it.

I cannot force another Line Input, and cannot even have a Mid()
function read any of the characters beyond this Unicode character.

Does anyone have any ideas/suggestions on what command I can use to
trap this type of event? I was thinking I could use FileLen() to
identify how large the file is, and then compare it against a running
tally of the size of each line and say that, if fileLen >
runningTally, force past the current character.

Any help would be appreciated.

Trevor Fairchild
Jul 17 '05 #1
Binary input

On 23 Dec 2003 08:14:28 -0800, MR*******@e-crime.on.ca (Trevor
Fairchild) wrote:
> Hello, all.
>
> I am writing a program that parses an html report (generated by
> another program).
>
> I use the Open "C:\...." for Input As myValue method to open
> the html, and read through it line by line, using "Line Input
> #myValue, myLine" to do so.


<snip>
Jul 17 '05 #2
Trevor Fairchild <MR*******@e-crime.on.ca> wrote in message
f5**************************@posting.google.com...
> Hello, all.

Hello Trevor,

> I am writing a program that parses an html report
> (generated by another program).
>
> I use the Open "C:\...." for Input As myValue method
> to open the html, and read through it line by line, using "Line
> Input #myValue, myLine" to do so.
>
> My problem is that there are, occasionally, unusual Unicode
> characters in the report. When the Line Input method hits a
> line with one of these square-looking characters, it interprets
> it as an End-Of-File marker.

That would be highly unusual ... The *only* character that's interpreted as
an EOF is CTRL-Z.

> There is more information after it but VB refuses to access it.
>
> I cannot force another Line Input, and cannot even have
> a Mid() function read any of the characters beyond this
> unicode character.
>
> Does anyone have any ideas/suggestions on what command
> I can use to trap this type of event? I was thinking I could
> use FileLen() to identify how large the file is, and then
> compare it against a running tally of the size of each line and
> say that, if fileLen > runningTally, force past the current
> character.


... which is half of what you are supposed to do :-)

1) Open the file "for binary" instead of "for input"
2) Replace the "eof(MyValue)" with "(seek(MyValue) > lof(MyValue))"

That should be all ...

Regards,
Rudy Wieser
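
One way to check the CTRL-Z point above is to look at the raw bytes of the report. The following is only a minimal VB6 sketch, not something posted in this thread: `FindCtrlZ`, `CheckReport` and `reportPath` are hypothetical names, and it assumes the file is small enough to load into memory in one go.

```vb
' Hypothetical helper (not from the thread): returns the 1-based position
' of the first CTRL-Z byte (value 26, &H1A) in a file, or 0 if none exists.
Public Function FindCtrlZ(ByVal fileName As String) As Long
    Dim f As Integer
    Dim fileSize As Long
    Dim bytes() As Byte
    Dim i As Long

    f = FreeFile
    Open fileName For Binary Access Read As #f
    fileSize = LOF(f)
    If fileSize > 0 Then
        ReDim bytes(0 To fileSize - 1)
        Get #f, , bytes             ' raw bytes, no text conversion
    End If
    Close #f

    ' The loop body never runs for an empty file (0 To -1).
    For i = 0 To fileSize - 1
        If bytes(i) = &H1A Then
            FindCtrlZ = i + 1       ' file positions are 1-based in VB
            Exit Function
        End If
    Next i
End Function

' Usage sketch: reportPath stands in for the real report location.
Public Sub CheckReport(ByVal reportPath As String)
    Debug.Print "First CTRL-Z at byte: "; FindCtrlZ(reportPath)
End Sub
```

If the position reported matches the place where Line Input gives up, the "square" most likely contains a byte with the value 26.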

Jul 17 '05 #3
I had tried the binary on this issue and it didn't work - each time I
try to read the line, it only comes back with a portion of it...
I can't go into too much detail, but the program that makes this
report goes through a hard drive and rebuilds folder structures from
Unallocated Clusters. Because these are folders that had been deleted
and then rebuilt, they are not always intact and, as such, non-ASCII
characters appear where the folder name cannot be 100% reconstructed.
Here is a sample of the problem:

partsᨀ\00000001.did<br>

It is the square after "parts" that is the problem.

When I use the Binary method, myLine only reads as "˙ūp" - I don't
know where the two other characters come from...
If I use the Input method, myLine comes out as "˙ūparts"

I can't even get VB6 to read that square - if I could, then I could
have it ignore it, or something...
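
The two stray characters in front of "p" may be a clue. If they are the bytes FF FE, that is the byte-order mark of a UTF-16 little-endian file, where every character takes two bytes. In such a file any character with &H1A as one of its two bytes would look like CTRL-Z to a byte-oriented read. The thread does not confirm the encoding, so the VB6 sketch below is a guess under that assumption. `ReadReportText` and `ReplaceNonAscii` are hypothetical names.

```vb
' Sketch only: loads a file into a VB String, treating it as UTF-16 LE
' when it starts with the bytes FF FE (an assumption, not confirmed here).
Public Function ReadReportText(ByVal fileName As String) As String
    Dim f As Integer
    Dim fileSize As Long
    Dim bytes() As Byte
    Dim result As String

    f = FreeFile
    Open fileName For Binary Access Read As #f
    fileSize = LOF(f)
    If fileSize >= 2 Then
        ReDim bytes(0 To fileSize - 1)
        Get #f, , bytes
    End If
    Close #f

    If fileSize < 2 Then Exit Function      ' returns an empty string

    If bytes(0) = &HFF And bytes(1) = &HFE Then
        result = bytes                      ' bytes copied as-is: VB Strings are UTF-16
        ReadReportText = Mid$(result, 2)    ' drop the byte-order mark character
    Else
        ReadReportText = StrConv(bytes, vbUnicode)  ' treat as single-byte ANSI text
    End If
End Function

' Replaces every character above code 127 with "?" so that later
' parsing only ever sees plain ASCII.
Public Function ReplaceNonAscii(ByVal s As String) As String
    Dim i As Long
    Dim code As Long

    For i = 1 To Len(s)
        ' AscW returns a signed Integer; the mask makes it 0..65535.
        code = AscW(Mid$(s, i, 1)) And &HFFFF&
        If code > 127 Then Mid$(s, i, 1) = "?"
    Next i
    ReplaceNonAscii = s
End Function
```

With this approach the unreadable folder-name characters become "?" instead of ending the read, which is roughly the "ignore it" behaviour asked for above.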

Jul 17 '05 #4
even as I look at my message post, I see the square has been replaced
with



lol, you'll have to trust me, though, it looks like a square on my
side, and even in my message - I guess Google groups, or Internet
Explorer changed it after I submitted it - it was a square.
Jul 17 '05 #5
> I had tried the binary on this issue and it didn't work - each time I
> try to read the line, it only comes back with a portion of it...


In binary mode you get all the data, no lines. You have to parse
the data manually to break it up in any meaningful way. If you're
lucky, you might get End Of Line characters to help you, but don't
count on it.

Data in any file is simply data, so why didn't binary access work for
you?

LFS
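
The manual parsing described above could look something like this in VB6. It is a rough sketch that assumes single-byte (ANSI) text and a file that fits in memory; a two-byte-per-character file would need its bytes converted before splitting. `ReadLinesBinary` and `DumpLines` are hypothetical names.

```vb
' Sketch only: reads the whole file in Binary mode and splits it into
' lines by hand, so no EOF test is involved in the read.
Public Function ReadLinesBinary(ByVal fileName As String) As String()
    Dim f As Integer
    Dim content As String

    f = FreeFile
    Open fileName For Binary Access Read As #f
    content = Space$(LOF(f))        ' size the buffer to the whole file
    Get #f, , content               ' one read fills the buffer
    Close #f

    ' Normalise CRLF and lone CR to LF, then split on LF.
    content = Replace(content, vbCrLf, vbLf)
    content = Replace(content, vbCr, vbLf)
    ReadLinesBinary = Split(content, vbLf)
End Function

' Usage sketch: print every line of the file to the Immediate window.
Public Sub DumpLines(ByVal fileName As String)
    Dim reportLines() As String
    Dim i As Long

    reportLines = ReadLinesBinary(fileName)
    For i = LBound(reportLines) To UBound(reportLines)
        Debug.Print reportLines(i)
    Next i
End Sub
```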


Jul 17 '05 #6
On 24 Dec 2003 07:10:01 -0800, MR*******@e-crime.on.ca (Trevor
Fairchild) wrote:
> even as I look at my message post, I see the square has been replaced
> with
>
> lol, you'll have to trust me, though, it looks like a square on my
> side, and even in my message - I guess Google groups, or Internet
> Explorer changed it after I submitted it - it was a square.

Unicode is not really supported on the Web
Jul 17 '05 #7
