File Read Progress Indicator

I am working on a program that reads and processes large text files (on
the order of 32 MB, so not too huge), so I wanted to add a progress
indicator so I can estimate when it will finish. I just need an
estimate, so the exact byte count isn't essential.

// reduced code
// assume necessary #include's and using declarations for std
// components

ifstream file(filename.c_str());
// read 2 header lines
for (int i = 0; i != 2; ++i) {
    string header;
    getline(file, header);
}
ifstream::pos_type start_of_data = file.tellg();

file.seekg(0, ios::end);
ifstream::pos_type end_of_data = file.tellg();
file.seekg(start_of_data);
for (string line; getline(file, line); ) {
    do_something_with(line);

    int percent_done =
        static_cast<unsigned long>(file.tellg()) * 100 / end_of_data;

    cout << percent_done << "%\n";
}

This outline seems to work well. My question is: is the cast from the
return type of ifstream::tellg() to unsigned long well-defined? The
reason I am casting to an unsigned type in the first place is that
without the cast, eventually negative percents were being displayed.

Also, are there any other issues with my usage of tellg()? I remember
reading somewhere that the result of tellg() isn't guaranteed to be able
to represent any valid filesize, but I don't know if there is any way
around this issue using only standard components.

--
Marcus Kwok
Replace 'invalid' with 'net' to reply
May 18 '07 #1
> int percent_done =
>     static_cast<unsigned long>(file.tellg()) * 100 / end_of_data;
> This outline seems to work well. My question is: is the cast from the
> return type of ifstream::tellg() to unsigned long well-defined? The
> reason I am casting to an unsigned type in the first place is that
> without the cast, eventually negative percents were being displayed.

It seems that the negativity problem you're seeing would have to be
when you're hitting that limit, but specifically when file.tellg() *
100 hits that limit. You probably should do this calculation in
doubles, then convert back to ints at the end.

How many bits does unsigned long have on your system? If it's 64,
then ignore the previous paragraph, as you're very unlikely to be
hitting that limit.

Michael

May 18 '07 #2
On May 18, 10:17 pm, ricec...@gehennom.invalid (Marcus Kwok) wrote:
> This outline seems to work well. My question is: is the cast from the
> return type of ifstream::tellg() to unsigned long well-defined?

No. First, the return type is a streampos, which may not even
be convertible to an integral type. Second, even when it is
convertible, there is not necessarily a direct relationship
between the numeric value and the number of bytes in the file.
Third, even on systems where there is an exact relationship
(Unix), or a more or less rough relationship (Windows),
unsigned long is generally not large enough. (Unix defines a
special type, off_t, for this; Microsoft uses a struct
LARGE_INTEGER.) If you're sure that the files can never be more
than, say, 100 MB, then this is not necessarily a consideration.

> The reason I am casting to an unsigned type in the first place is that
> without the cast, eventually negative percents were being displayed.

Overflow. The length of a file often doesn't fit into a long to
begin with, and then you go ahead and multiply it by 100. Since
you're interested in per cent, and exact precision isn't an
issue, I'd cast it to double, and use floating point arithmetic.

> Also, are there any other issues with my usage of tellg()? I remember
> reading somewhere that the result of tellg() isn't guaranteed to be able
> to represent any valid filesize, but I don't know if there is any way
> around this issue using only standard components.

There's no real solution if you want to remain 100% standard,
because there are real systems where what you want simply isn't
possible. If you're willing to limit portability to Windows and
Unix, however, converting the results of tellg() to double, and
using it, should work. (The results may be off by a couple of
percent under Windows, but typically, the error will be more or
less the same for each call, so your calculations of percent
will probably end up more precise than expected. Supposing that
the file has more or less homogeneous contents, at least.)

--
James Kanze (Gabi Software) email: ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

May 19 '07 #3
Michael <mc******@aol.com> wrote:
>> int percent_done =
>>     static_cast<unsigned long>(file.tellg()) * 100 / end_of_data;
>> This outline seems to work well. My question is: is the cast from the
>> return type of ifstream::tellg() to unsigned long well-defined? The
>> reason I am casting to an unsigned type in the first place is that
>> without the cast, eventually negative percents were being displayed.
>
> It seems that the negativity problem you're seeing would have to be
> when you're hitting that limit, but specifically when file.tellg() *
> 100 hits that limit. You probably should do this calculation in
> doubles, then convert back to ints at the end.

Thanks, that's the same advice James Kanze gave as well.

> How many bits does unsigned long have on your system? If it's 64,
> then ignore the previous paragraph, as you're very unlikely to be
> hitting that limit.

sizeof(unsigned long) * CHAR_BIT = 32 on my platform (Windows XP, VS
2005).

--
Marcus Kwok
Replace 'invalid' with 'net' to reply
May 21 '07 #5
James Kanze <ja*********@gmail.com> wrote:
> On May 18, 10:17 pm, ricec...@gehennom.invalid (Marcus Kwok) wrote:
>> This outline seems to work well. My question is: is the cast from the
>> return type of ifstream::tellg() to unsigned long well-defined?
>
> No. First, the return type is a streampos, which may not even
> be convertible to an integral type. Second, even when it is
> convertible, there is not necessarily a direct relationship
> between the numeric value and the number of bytes in the file.
> Third, even on systems where there is an exact relationship
> (Unix), or a more or less rough relationship (Windows),
> unsigned long is generally not large enough. (Unix defines a
> special type, off_t, for this; Microsoft uses a struct
> LARGE_INTEGER.) If you're sure that the files can never be more
> than, say, 100 MB, then this is not necessarily a consideration.
>
>> The reason I am casting to an unsigned type in the first place is that
>> without the cast, eventually negative percents were being displayed.
>
> Overflow. The length of a file often doesn't fit into a long to
> begin with, and then you go ahead and multiply it by 100. Since
> you're interested in per cent, and exact precision isn't an
> issue, I'd cast it to double, and use floating point arithmetic.

Thanks, I think I'll go this route.

As an aside, is the conversion from streampos to double well-defined,
or will it just work in practice? Right now it only needs to work on
Windows but we may use it on HP-UX in the future.

>> Also, are there any other issues with my usage of tellg()? I remember
>> reading somewhere that the result of tellg() isn't guaranteed to be able
>> to represent any valid filesize, but I don't know if there is any way
>> around this issue using only standard components.
>
> There's no real solution if you want to remain 100% standard,
> because there are real systems where what you want simply isn't
> possible. If you're willing to limit portability to Windows and
> Unix, however, converting the results of tellg() to double, and
> using it, should work.

I see, so I guess this answers my above question :)

--
Marcus Kwok
Replace 'invalid' with 'net' to reply
May 21 '07 #6
On May 21, 8:03 pm, ricec...@gehennom.invalid (Marcus Kwok) wrote:
> James Kanze <james.ka...@gmail.com> wrote:
>> On May 18, 10:17 pm, ricec...@gehennom.invalid (Marcus Kwok) wrote:
>>> This outline seems to work well. My question is: is the cast from the
>>> return type of ifstream::tellg() to unsigned long well-defined?
>> No. First, the return type is a streampos, which may not even
>> be convertible to an integral type. Second, even when it is
>> convertible, there is not necessarily a direct relationship
>> between the numeric value and the number of bytes in the file.
>> Third, even on systems where there is an exact relationship
>> (Unix), or a more or less rough relationship (Windows),
>> unsigned long is generally not large enough. (Unix defines a
>> special type, off_t, for this; Microsoft uses a struct
>> LARGE_INTEGER.) If you're sure that the files can never be more
>> than, say, 100 MB, then this is not necessarily a consideration.
>>> The reason I am casting to an unsigned type in the first place is that
>>> without the cast, eventually negative percents were being displayed.
>> Overflow. The length of a file often doesn't fit into a long to
>> begin with, and then you go ahead and multiply it by 100. Since
>> you're interested in per cent, and exact precision isn't an
>> issue, I'd cast it to double, and use floating point arithmetic.
> Thanks, I think I'll go this route.
> As an aside, is the conversion from streampos to double well-defined,
> or will it just work in practice? Right now it only needs to work on
> Windows but we may use it on HP-UX in the future.

First, it's not defined at all; there is (in the standard) no
direct conversion from streampos to an arithmetic type. There
is an implicit conversion from streampos to streamoff, however,
and streamoff is required to be convertible to an integral type;
in most implementations, streamoff is in fact a typedef of an
integral type. If streamoff is a typedef to an integral type,
streampos will convert implicitly to any arithmetic type; if it
is a user defined type, you'll need some explicit conversion in
there somewhere.

More significantly, of course, the semantics of the conversion
are more or less undefined; there is a set of operations which
are required to work, but there's nothing to stop the resulting
integral type from being a magic number, or (more likely), some
formatted representation, with different bits having different
meanings.

In practice, of course: under Unix or Windows, streamoff will be
an integral type, and it will represent the number of bytes at
the system level from the start of the file. Under Unix, this
means exactly the number of bytes that you read; under Windows,
the number may be slightly higher, but perfectly adequate for
things like a progress bar. This solution typically won't work
on mainframes, but then, mainframes don't usually have the sort
of terminals attached to them where a running indication of
progress would make sense. (And they're different enough from
Unix/Windows that there are probably other things in your code
which would require fixing.)

>>> Also, are there any other issues with my usage of tellg()? I remember
>>> reading somewhere that the result of tellg() isn't guaranteed to be able
>>> to represent any valid filesize, but I don't know if there is any way
>>> around this issue using only standard components.
>> There's no real solution if you want to remain 100% standard,
>> because there are real systems where what you want simply isn't
>> possible. If you're willing to limit portability to Windows and
>> Unix, however, converting the results of tellg() to double, and
>> using it, should work.
> I see, so I guess this answers my above question :)

Yes. And Windows and Unix (which includes Mac) is a pretty
large world; I'd say that if you're concerned about a user
sitting in front of a terminal, they pretty much cover that
environment. (Today---this wasn't always true, and even today,
you might run into a legacy system here and there. But if you
don't already have one, your company isn't going to go out and
acquire one in the future.)

--
James Kanze (GABI Software) email: ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

May 22 '07 #7
James Kanze <ja*********@gmail.com> wrote:
> First, it's not defined at all; there is (in the standard) no
> direct conversion from streampos to an arithmetic type. There
> is an implicit conversion from streampos to streamoff, however,
> and streamoff is required to be convertible to an integral type;
> in most implementations, streamoff is in fact a typedef of an
> integral type. If streamoff is a typedef to an integral type,
> streampos will convert implicitly to any arithmetic type; if it
> is a user defined type, you'll need some explicit conversion in
> there somewhere.

Thanks. The conversion from streampos to double works for me, today, on
my current platform :)

[snip rest]

--
Marcus Kwok
Replace 'invalid' with 'net' to reply
May 22 '07 #8
