
lseek and write question

Hello,

I have a question about write and lseek. The code is at the end of
this post, but first some background.

I am trying to identify the cause of some latency in writing to disk.
My user claims that performance is much slower on SAN than on local
disk. The developer gave me a C++ program that performs a write test,
and it confirmed his suspicions. I modified the code to better fit my
needs.

What I found during the test is that fsync is an expensive operation:
it blocks until the disk device confirms the write. What I am trying
to understand is the lseek function. From what I read, it simply moves
the file offset of the file descriptor as directed. When I call lseek
after each write (the -l option below), the writes are faster.

My question is why? When I use the write command, does the pointer
get reset, so that each write has to search for EOF?

This is running on a Linux system.

Thanks in advance:
#include <sys/types.h>
#include <sys/time.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
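int main(int argc, char **argv)
{
    struct timeval start, end;
    int ch, fd;
    int ops = 0;
    int bytes = 0;
    const char *fname = "";
    bool dosync = true, doSeek = false;

    // -b <bytes per write>  -o <number of writes>  -f <file>
    // -s  skip the fsync after each write
    // -l  lseek back to offset 0 after each write
    while ((ch = getopt(argc, argv, "b:o:f:sl")) != -1)
        switch (ch) {
        case 'b':
            bytes = atoi(optarg);
            break;
        case 'o':
            ops = atoi(optarg);
            break;
        case 'f':
            fname = optarg;
            break;
        case 's':
            dosync = false;
            break;
        case 'l':
            doSeek = true;
            break;
        default:
            fprintf(stderr, "usage: -b bytes -o ops -f file [-s] [-l]\n");
            return 1;
        }
    argc -= optind;
    argv += optind;

    // Accept the file name as a positional argument if -f was not given.
    if (*fname == '\0' && argc > 0)
        fname = argv[0];
    if (bytes <= 0 || ops <= 0 || *fname == '\0') {
        fprintf(stderr, "usage: -b bytes -o ops -f file [-s] [-l]\n");
        return 1;
    }

    // Allocate the buffer only after -b has been parsed, so that it
    // really is `bytes` long.
    char *buf = new char[bytes];
    gettimeofday(&start, NULL);
    memset(buf, 0, bytes);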
    if (dosync) {
        printf("Processing %d bytes with %d Operations of fsync :\t", bytes, ops);
    } else {
        printf("Processing %d bytes with %d Operations of fsync :\t", bytes, 1);
    }
// unlink(fname);
    if ((fd = open(fname, O_RDWR | O_CREAT, 0666)) == -1)
    {
        int errNum = errno;
        fprintf(stderr, "ERROR: failed to open %s: %s\n", fname, strerror(errNum));
        return 1;
    }
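    for (int idx = 0; idx < ops; idx++)
    {
        // write() continues from the current file offset and advances it.
        if (write(fd, buf, bytes) != bytes)
        {
            printf("write: %s\n", strerror(errno));
            exit(1);
        }
        if (dosync) {
            if (fsync(fd) != 0)
            {
                printf("fsync: %s\n", strerror(errno));
                exit(1);
            }
        }
        if (doSeek)
        {
            // Rewind, so the next write overwrites the same bytes at offset 0.
            if (lseek(fd, (off_t)0, SEEK_SET) == -1)
            {
                printf("lseek: %s\n", strerror(errno));
                exit(1);
            }
        }
    }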
    // One last sync
    if (fsync(fd) != 0)
    {
        printf("fsync: %s\n", strerror(errno));
        exit(1);
    }
    gettimeofday(&end, NULL);
    // Borrow one second if the microsecond part would go negative.
    if (start.tv_usec > end.tv_usec) {
        end.tv_usec += 1000000;
        end.tv_sec--;
    }
    long totalSec = end.tv_sec - start.tv_sec;
    long totalUSec = end.tv_usec - start.tv_usec;
    long t = totalSec;
    printf("%ld Hours ", t / (60 * 60));
    t %= (60 * 60);
    printf("%ld Minutes ", t / 60);
    t %= 60;
    // Zero-pad the microseconds so that 5 usec prints as .000005, not .5
    printf("%ld.%06ld Seconds ", t, totalUSec);
    printf("%ld.%06ld Seconds\n", totalSec, totalUSec);
    close(fd);
    delete[] buf;
    return 0;
}
Nov 16 '07 #1
3 Replies


golden wrote:
I am going to ask a question regarding
write and lseek. I will provide code at the end of this, but first
some background.
[..]
What I found during the test is that fsync is an expensive operation
and will block waiting for a confirmation from the disk device. What
I am trying to understand is the lseek function.
From what I read, it simply moves the pointer in the file descriptor
as directed. When I use this lseek function, writes are faster.
My question is why? When I use the write command, does the pointer
get reset and on each write, it will search for EOF?
This is running on a Linux system.

[..]
First a nitpick: 'write' is not a command. It's a function. IIRC,
it's a POSIX function, which isn't really on topic here. Now that's
out of the way, second, in C++ we'd use the 'fwrite' function (from
the C Standard Library). Have you tried switching to using 'fwrite'
instead?

And the last point: you might want to consider asking in the Linux
newsgroup since I/O performance depends greatly on the platform, and
there is no real explanation from the language point of view why
'write' is so slow without 'lseek'.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Nov 16 '07 #2

On Nov 16, 3:45 pm, "Victor Bazarov" <v.Abaza...@comAcast.net> wrote:
Thanks... the nitpicking will make me better, so I welcome that. I am
so used to programming in Perl that "command" comes out automatically.
I will try fwrite and visit the Linux group. Thanks for the reply.
Nov 17 '07 #3

On Nov 16, 9:45 pm, "Victor Bazarov" <v.Abaza...@comAcast.net> wrote:
First a nit pick: 'write' is not a command. It's a function. IIRC,
it's a POSIX function, which isn't really on topic here. Now that's
out of the way, second, in C++ we'd use the 'fwrite' function (from
the C Standard Library). Have you tried switching to using 'fwrite'
instead?
It won't work. He's using a functionality (synchronized
writing) which isn't available in the standard library. The
most you can ever guarantee with the standard library (either
FILE* or iostream) is that the data has been transferred to the
OS; his call to fsync guarantees that it has been physically
written on the medium.
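
To make the distinction concrete, here is a minimal sketch of the three stages a buffer passes through when it is written via a FILE* on a POSIX system. The helper name write_durably is hypothetical, and fileno and fsync are POSIX, not part of the C or C++ standard library, which is the point made above: the standard library alone stops at fflush.

```cpp
#include <cassert>
#include <stdio.h>
#include <unistd.h>   // fsync (POSIX, not standard C or C++)

// Hypothetical helper: write `len` bytes through a FILE* and then ask
// the kernel to push them to the storage device.
// Returns true only if every stage succeeded.
bool write_durably(FILE *fp, const void *data, size_t len)
{
    assert(fp != NULL);

    // 1. fwrite copies the data into the stdio buffer (user space).
    if (fwrite(data, 1, len, fp) != len)
        return false;

    // 2. fflush hands the stdio buffer to the OS. This is as far as
    //    the standard library goes.
    if (fflush(fp) != 0)
        return false;

    // 3. fsync blocks until the kernel reports that the data has been
    //    transferred to the device. fileno (POSIX) exposes the
    //    underlying file descriptor.
    return fsync(fileno(fp)) == 0;
}
```

Steps 1 and 2 are cheap because they only copy memory; step 3 is the expensive call that the original test program issues after every write unless -s is given.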
And the last point: you might want to consider asking in the Linux
newsgroup since I/O performance depends greatly on the platform, and
there is no real explanation from the language point of view why
'write' is so slow without 'lseek'.
With regard to his particular question, the answer seems
obvious (and will probably be the same on any system, anytime he
doesn't use synchronized writes): because of the seek, he's
always writing the data at the same place on the disk, which
means that the system can always reuse the same sector cache,
and never has to go to disk. Without the seek, he's writing a
fairly large file, and the system probably won't keep all of the
cached data around, but will write to disk.

Is it really surprising that writing a file with one record is
significantly faster than writing one with ops records (where
ops is probably fairly large)?
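
The effect is easy to check by looking at the resulting file size. The following is only a sketch under stated assumptions: the helper name write_records and its parameters are hypothetical, and it leaves out fsync and timing entirely. It only shows that write advances the file offset on its own, and that seeking back to 0 makes every write land on the same bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: write `ops` zero-filled records of `bytes` bytes
// each and return the resulting file size, or -1 on error.
// If seek_back is true, rewind to offset 0 after every write, which is
// what the -l option of the test program does.
off_t write_records(const char *path, int ops, std::size_t bytes, bool seek_back)
{
    assert(ops > 0 && bytes > 0);
    std::vector<char> buf(bytes, 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (fd == -1)
        return -1;

    for (int i = 0; i < ops; ++i) {
        // write() starts at the current file offset and advances it by
        // the number of bytes written; it does not search for EOF.
        if (write(fd, buf.data(), bytes) != static_cast<ssize_t>(bytes)) {
            close(fd);
            return -1;
        }
        if (seek_back && lseek(fd, 0, SEEK_SET) == -1) {
            close(fd);
            return -1;
        }
    }

    struct stat st;
    off_t size = (fstat(fd, &st) == 0) ? st.st_size : -1;
    close(fd);
    return size;
}
```

With seek_back set, the file ends up exactly `bytes` long no matter how large `ops` is; without it, the file grows to `ops * bytes`. The two runs therefore do not write the same amount of data, so their timings are not directly comparable.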

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Nov 17 '07 #4
