
Counting lines/characters in an uploaded .DOC/.RTF file using ASP.NET

j
Guest
 
Posts: n/a
#1: Nov 17 '05
Hi,
I've been trying to do line/character counts on documents that are
being uploaded. As well as the "counting" I also have to remove
certain sections from the file.
So, firstly I was working with uploaded MS Word .doc files, using code
like that below:

' sr is an open StreamReader, lc an Integer counter,
' sbFileContent a StringBuilder
strLine = sr.ReadLine()
While Not IsNothing(strLine) 'Not eof
    If Trim(strLine) <> "" Then 'Not blank
        'increment counter & capture line text
        lc += 1
        'Put CR into string to mark line break
        sbFileContent.Append(strLine & vbCr)
    End If
    strLine = sr.ReadLine()
End While
sr.Close()

and with a subsequent count on the number of vbCr in the
string-builder contents (sbFileContent) I was hoping to count the
number of "visible" & non-blank lines (and thus characters) in the
file.

My first problem:
If you type in Word without using any line break characters (vbCr,
vbLf, vbCrLf etc.), the text naturally wraps at the edge of the page,
so that on visual inspection a document might have 1 paragraph
consisting of 8 lines, but what you actually have is 1 continuous
string with no line breaks. I guess I'm wondering how you can count
lines in a Word file the way its native line counter does, but
without using Word on the server.
How does Word do it anyway? Does it calculate the number of lines by
dividing the total number of characters in the file by the width of
the page in characters?
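
To make the "characters divided by page width" idea concrete, here is a
rough sketch of it. It assumes a fixed number of characters per line and
wrapping only at spaces or tabs; EstimateWrappedLines and charsPerLine
are made-up names for illustration, and this is not how Word itself
counts lines.

```vbnet
Imports System

Module WrapEstimate
    ' Estimates how many display lines one paragraph occupies when it is
    ' wrapped at word boundaries to charsPerLine characters.
    ' charsPerLine is an assumed fixed page width, not a value taken from Word.
    Function EstimateWrappedLines(ByVal paragraph As String, ByVal charsPerLine As Integer) As Integer
        If charsPerLine < 1 Then Throw New ArgumentOutOfRangeException("charsPerLine")
        If String.IsNullOrEmpty(paragraph) Then Return 0

        Dim separators() As Char = {" "c, ControlChars.Tab}
        Dim words() As String = paragraph.Split(separators, StringSplitOptions.RemoveEmptyEntries)

        Dim lines As Integer = 0
        Dim used As Integer = 0 ' characters already placed on the current line

        For Each word As String In words
            If used > 0 AndAlso used + 1 + word.Length <= charsPerLine Then
                ' The word fits after a single space on the current line.
                used += 1 + word.Length
            Else
                ' Start a new line with this word.
                lines += 1
                used = word.Length
            End If

            ' A word longer than the line spills onto extra lines.
            While used > charsPerLine
                lines += 1
                used -= charsPerLine
            End While
        Next

        Return lines
    End Function

    Sub Main()
        Dim text As String = "The quick brown fox jumps over the lazy dog"
        Console.WriteLine(EstimateWrappedLines(text, 10))
    End Sub
End Module
```

Word's own count comes from the actual page layout (fonts, margins,
tabs, hyphenation), so a fixed-width estimate like this will only
approximate it, especially with proportional fonts.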

My second problem:
I have to edit the file to remove some sections and then re-save it.
When the file is an MS Word .doc file this is problematic, considering
I don't have Word on the server: the file just gets corrupted, and when
I open it later I just get gibberish.

So, I thought about using an RTF file saved from Word as the uploaded
document. The benefit of RTF is that I can definitely do the
search & replace and re-save the document without causing any
corruption.
However, I have much the same "line counting" problems as I had with
Word, except that now I also have the RTF formatting markup to deal
with, since it is part of the actual content of the file. So, I guess
I'm wondering how to count the visible, non-blank lines in an RTF file
while ignoring the RTF markup. Again, I'm going to have the same
problem where word wrapping, not a line break character, is what
breaks a continuous paragraph into a number of lines.
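
One possible way to ignore the RTF markup without Word is to let the
.NET RichTextBox control parse it and hand back the plain text. The
sketch below assumes the web application can reference the
System.Windows.Forms assembly; CountNonBlankParagraphs is a made-up name.
It counts non-blank paragraphs (hard line breaks) only, so it does not
solve the word-wrap part of the problem.

```vbnet
Imports System
Imports System.Windows.Forms

Module RtfLineCount
    ' Loads RTF markup into a RichTextBox (never shown on screen) and
    ' counts the non-blank paragraphs in the resulting plain text.
    Function CountNonBlankParagraphs(ByVal rtf As String) As Integer
        Using box As New RichTextBox()
            box.Rtf = rtf ' throws ArgumentException if the markup is not valid RTF

            Dim breaks() As Char = {ControlChars.Cr, ControlChars.Lf}
            Dim count As Integer = 0
            For Each line As String In box.Text.Split(breaks)
                If line.Trim().Length > 0 Then count += 1
            Next
            Return count
        End Using
    End Function

    Sub Main()
        ' Small hand-written RTF sample: two text paragraphs and one empty one.
        Dim sample As String = "{\rtf1\ansi First line\par \par Second line\par }"
        Console.WriteLine(CountNonBlankParagraphs(sample))
    End Sub
End Module
```

The plain text from box.Text could also be used for the character
count, since the control words are gone by that point.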

So, I need a solution that will allow me to count the number of
visible lines in either a Word or an RTF file, and a suggestion for how
to edit (Search/Replace & Save) that file after the counting process.
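
For the RTF case, the Search/Replace & Save step might look like the
sketch below. It assumes the file can be treated as single-byte text
(read and written as ISO-8859-1 so every byte round-trips unchanged);
ReplaceInRtf and its parameter names are made up for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module RtfReplace
    ' Replaces every occurrence of oldText with newText in an RTF file
    ' and saves the file in place.
    Sub ReplaceInRtf(ByVal rtfPath As String, ByVal oldText As String, ByVal newText As String)
        ' Code page 28591 is ISO-8859-1: each byte maps to exactly one
        ' character, so untouched parts of the file are written back as-is.
        Dim latin1 As Encoding = Encoding.GetEncoding(28591)

        Dim content As String = File.ReadAllText(rtfPath, latin1)
        File.WriteAllText(rtfPath, content.Replace(oldText, newText), latin1)
    End Sub
End Module
```

A plain string replace only finds the text if it is stored as one
unbroken run. If the formatting changes in the middle of a phrase, RTF
control words sit between the words in the file and the search will
miss it.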

If anyone has any suggestions, bright ideas, hacks, references,
code, or sleep they'd like to give me, I'd be very grateful!
Thanks for listening,
J

Kevin Spencer
Guest
 
Posts: n/a
#2: Nov 17 '05

re: Counting lines/characters in an uploaded .DOC/.RTF file using ASP.NET


I would suggest you post this question to the Word/Office newsgroups. This
is not an ASP.Net-related question.

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer
http://www.takempis.com
Big things are made up of
lots of little things.

