Counting lines/characters in an uploaded .DOC/.RTF file using ASP.NET

j
Hi,
I've been trying to do line/character counts on documents that are
being uploaded. As well as the counting, I also have to remove
certain sections from the file.
So, firstly I was working with uploaded MS WORD .doc files, using code
like that below:

strLine = sr.ReadLine()
While Not IsNothing(strLine) 'Not eof
    If Trim(strLine) <> "" Then 'Not blank
        'Increment counter & capture line text
        lc += 1
        'Put CR into string to mark line break
        sbFileContent.Append(strLine & vbCr)
    End If
    strLine = sr.ReadLine()
End While
sr.Close()

and with a subsequent count on the number of vbCr in the
string-builder contents (sbFileContent) I was hoping to count the
number of "visible" & non-blank lines (and thus characters) in the
file.

My first problem:
If you type in WORD WITHOUT using any line break characters (vbCr,
vbLf, vbCrLf etc), the text naturally wraps at the edge of the page,
so that on visual inspection a document might have 1 paragraph
consisting of 8 lines BUT in fact what you actually have is 1
continuous string with no line breaks. I guess I'm wondering how you
can count lines in a WORD file like its native line counter does, but
without using WORD on the server!
How does WORD do it anyway? Does it calculate the number of lines by
dividing the total number of characters in the file by the width of
the page in characters?
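
Word's count depends on the laid-out page (fonts, margins, hyphenation), so dividing characters by page width can only approximate it. As a rough sketch of that approximation, assuming a fixed number of characters per line and greedy wrapping at word boundaries (the `charsPerLine` parameter and the function name are hypothetical, not anything Word exposes):

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module WrapEstimateSketch
    ' Estimates how many display lines one paragraph occupies when it is
    ' wrapped greedily at word boundaries. charsPerLine is an ASSUMED
    ' average line width; real layout depends on fonts and margins.
    Function EstimateWrappedLines(ByVal paragraph As String, ByVal charsPerLine As Integer) As Integer
        If charsPerLine < 1 Then Throw New ArgumentOutOfRangeException("charsPerLine")

        Dim separators() As Char = {" "c, ControlChars.Tab}
        Dim words() As String = paragraph.Split(separators, StringSplitOptions.RemoveEmptyEntries)
        If words.Length = 0 Then Return 0 ' blank paragraphs are not counted

        Dim lines As Integer = 1
        Dim used As Integer = 0 ' characters already placed on the current line
        For Each word As String In words
            Dim needed As Integer = word.Length
            If used > 0 Then needed += 1 ' one space before the word
            If used > 0 AndAlso used + needed > charsPerLine Then
                lines += 1 ' word does not fit, start a new line
                used = word.Length
            Else
                used += needed
            End If
            ' A word longer than a whole line spills onto extra lines.
            While used > charsPerLine
                lines += 1
                used -= charsPerLine
            End While
        Next
        Return lines
    End Function
End Module
```

Summing this over every non-blank paragraph gives an estimated line count for the document. With proportional fonts the result will drift from what WORD reports, so it is best treated as an estimate only.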

My second problem:
I have to edit the file to remove some sections. I need to edit the
file and re-save it which, when the file is a MS WORD .doc file, is
problematic considering I don't have WORD on the server. The file just
gets corrupted and when I have to open it later I just get gibberish.

So, I thought about using an RTF file saved from WORD as the uploaded
document. Now, the benefit of RTF is that I can definitely do the
search & replace function and resave the document WITHOUT causing any
corruption of the document.
However, I have much the same "line counting" problems as I had with
WORD, except that now I also have the RTF formatting markup to deal
with, which is in the actual content of the file. So, I guess I'm
wondering how to count the visible, non-blank lines in an RTF file
while ignoring the RTF markup. Again, I'm going to have the same
problem counting lines where word wrapping, rather than a line break
character, is what breaks a continuous paragraph into a number of
lines.
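
One possible way to ignore the RTF markup without WORD, sketched here under assumptions: the .NET Framework's Windows Forms `RichTextBox` control can parse RTF through its `Rtf` property and return the plain text through `Text`. This assumes System.Windows.Forms can be referenced from the web application; the module and function names are hypothetical.

```vbnet
Imports System
Imports System.Windows.Forms
Imports Microsoft.VisualBasic

Module RtfTextSketch
    ' Lets RichTextBox parse the RTF and hand back only the visible text.
    ' Setting Rtf throws an ArgumentException if the input is not valid RTF.
    Function RtfToPlainText(ByVal rtf As String) As String
        Using box As New RichTextBox()
            box.Rtf = rtf
            Return box.Text
        End Using
    End Function

    ' Counts the non-blank paragraphs (text separated by CR and/or LF).
    Function CountNonBlankParagraphs(ByVal plainText As String) As Integer
        Dim count As Integer = 0
        Dim breaks() As Char = {ControlChars.Cr, ControlChars.Lf}
        For Each line As String In plainText.Split(breaks)
            If line.Trim().Length > 0 Then count += 1
        Next
        Return count
    End Function
End Module
```

This yields paragraphs and characters, not displayed lines: a paragraph that wraps across several lines on the page still counts once, so the wrapped-line count would still have to be estimated separately.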

So, I need a solution that will allow me to count the number of
visible lines in either a WORD or RTF file AND a suggestion of how to
edit (Search/Replace & Save) that file, after the counting process!!!
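
For the search/replace and save step on an RTF file, a minimal sketch could read the file as text, replace, and write it back. It assumes the text to remove appears contiguously in the raw RTF (WORD may split a phrase with control words) and that the replacement contains no `\`, `{` or `}`, which would need escaping. ISO-8859-1 is used so that every byte round-trips unchanged; `filePath`, `oldText` and `newText` are hypothetical names.

```vbnet
Imports System.IO
Imports System.Text

Module RtfReplaceSketch
    ' Replaces every occurrence of oldText in the raw RTF and saves the file.
    ' ASSUMPTION: oldText is not interrupted by RTF control words in the file.
    Sub ReplaceInRtfFile(ByVal filePath As String, ByVal oldText As String, ByVal newText As String)
        ' ISO-8859-1 maps each byte to one character, so untouched bytes survive.
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        Dim content As String = File.ReadAllText(filePath, latin1)
        File.WriteAllText(filePath, content.Replace(oldText, newText), latin1)
    End Sub
End Module
```

Working on a copy of the uploaded file, rather than in place, keeps the original available if the result does not open cleanly.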

If anyone has any suggestions, bright ideas, hacks, references,
code, or sleep they'd like to give me, I'd be very grateful!
Thanks for listening,
J
Nov 17 '05 #1
I would suggest you post this question to the Word/Office newsgroups. This
is not an ASP.Net-related question.

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer
http://www.takempis.com
Big things are made up of
lots of little things.

Nov 17 '05 #2
