Bytes | Software Development & Data Engineering Community

Performance Optimizations

Hello everyone,

I am writing a utility. Part of its function is to do a block-mode copy of
files and generate MD5 / SHA1 hashes on the fly. The functionality is
similar to that of the Unix dd / dcfldd utilities. I am using API calls to
get handles to files and FileStreams to perform the copy block by block,
hashing along the way. The relevant portion of the code is below. I hash each
block as it is transferred to avoid copying the file and then reopening it to
hash it, which would double the work and result in poor performance.

I realize that hashing the files as they are being copied will generate
overhead. I am comparing performance to utilities like robocopy, xcopy,
plain Windows copy, and dcfldd. I am pretty much on par with dcfldd (on
Windows), sometimes better. My throughput is about 90% to 95% of that of
command-line utilities like robocopy and xcopy. Windows copy has too much
overhead and is much slower. dcfldd on Windows is not very fast and is not a
good reference point. I would like to get to 98% or 99% of the throughput of
robocopy and xcopy.

I have several questions.

1. Can anyone recommend a good list of block sizes to use for various
environments? For example, what should the block size be for HDD to HDD
copying, LAN to LAN, within the same HDD, LAN to HDD, slow connections, etc.?
I tried setting the block size to 512 bytes when copying from HDD to HDD to
match the NTFS cluster size, but that resulted in much worse transfer speed
than a block size of 32K. I experimented with setting the block size to match
the default LAN MTU size minus the packet header sizes, but that resulted in
poor performance as well. I am just not sure how to determine a good block
size other than by trial and error.

2. Is my code below optimized? Am I wasting any CPU cycles?

3. Is there a better way of doing this? I would not call myself an
experienced programmer. I would appreciate any criticism.
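
For question 1, the trial and error can at least be made repeatable. The following is only a minimal sketch of a timing harness, not a recommendation of particular sizes: the module name BlockSizeTrial, the function MeasureCopy, the list of candidate sizes, and the two command-line paths are all hypothetical and chosen for illustration. It copies one file with several block sizes and prints the throughput for each.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.IO

Module BlockSizeTrial

    ' Hypothetical helper: copies srcPath to dstPath using the given
    ' block size and returns the measured throughput in MB/s.
    Function MeasureCopy(srcPath As String, dstPath As String, blockSize As Integer) As Double
        Dim buffer(blockSize - 1) As Byte
        Dim total As Long = 0
        Dim timer As Stopwatch = Stopwatch.StartNew()

        Using src As New FileStream(srcPath, FileMode.Open, FileAccess.Read, FileShare.Read, blockSize, FileOptions.SequentialScan)
            Using dst As New FileStream(dstPath, FileMode.Create, FileAccess.Write, FileShare.None, blockSize)
                ' Read returns 0 at end of file; n may be smaller than the buffer.
                Dim n As Integer = src.Read(buffer, 0, buffer.Length)
                While n > 0
                    dst.Write(buffer, 0, n)
                    total += n
                    n = src.Read(buffer, 0, buffer.Length)
                End While
            End Using
        End Using

        timer.Stop()
        Return (total / 1048576.0) / timer.Elapsed.TotalSeconds
    End Function

    Sub Main(args As String())
        ' args(0) = source file, args(1) = scratch destination (assumed paths).
        ' The candidate sizes below are arbitrary starting points.
        For Each kb As Integer In New Integer() {4, 32, 64, 256, 1024}
            Dim rate As Double = MeasureCopy(args(0), args(1), kb * 1024)
            Console.WriteLine("{0,5} KB: {1:F1} MB/s", kb, rate)
        Next
    End Sub

End Module
```

Note that after the first pass the source file is likely to be served from the operating system's file cache, so later passes can look faster than they really are. Using a file larger than available RAM, or a different source file per pass, should give more honest numbers.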

Thank you in advance!

'get MAC times of the source file
If GetFileTime(hFlHandle, dtCreated, dtAccessed, dtModified) = False Then
    Logger.writeLN("Copy Error: " & APIErrorMessage(GetLastError))
    Throw New Exception("Unable to get MAC times from sourcefile")
End If

While SourceStream.Position < SourceStream.Length ' write until EOF

    'clear the buffer block
    ReDim transferBlock(iBlockSize - 1)
    If SourceStream.Length - SourceStream.Position > CLng(iBlockSize) Then

        'read a block of data
        iBytesRead = SourceStream.Read(transferBlock, 0, transferBlock.Length)
        'hash the block
        objMD5.TransformBlock(transferBlock, 0, transferBlock.Length, transferBlock, 0)
        'write to destination file
        DestStream.Write(transferBlock, 0, transferBlock.Length)

    Else

        'last block: size the buffer to the bytes that remain
        ReDim transferBlock(SourceStream.Length - SourceStream.Position - 1)
        iBytesRead = SourceStream.Read(transferBlock, 0, transferBlock.Length)
        'hash final block
        objMD5.TransformFinalBlock(transferBlock, 0, transferBlock.Length)
        'write to destination file
        DestStream.Write(transferBlock, 0, transferBlock.Length)

    End If

    iTotalBytesRead += iBytesRead
    iCurrentFileCopied += iBytesRead

End While

'set MAC times for the destination file
If SetFileTime(hDestHandle, dtCreated, dtAccessed, dtModified) = False Then
    Logger.writeLN("Copy Error: " & APIErrorMessage(GetLastError))
    Throw New Exception("Unable to set MAC times for destination file")
End If
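
For comparison with the loop above, here is one possible variant, offered as a sketch and not as a measured improvement. It allocates the buffer once instead of calling ReDim on every pass, uses the byte count returned by Read (which may be smaller than the buffer), and finalizes the hash with a zero-length block after the loop. The names CopyAndHash and CopyWithMd5 are hypothetical, and the in-memory streams in Main stand in for the real FileStreams.

```vbnet
Imports System
Imports System.IO
Imports System.Security.Cryptography
Imports System.Text

Module CopyAndHash

    ' Hypothetical helper: copies src to dst block by block and returns
    ' the MD5 digest of the bytes that were copied.
    Function CopyWithMd5(src As Stream, dst As Stream, blockSize As Integer) As Byte()
        Dim buffer(blockSize - 1) As Byte   ' allocated once, reused on every pass

        Using hasher As MD5 = MD5.Create()
            Dim n As Integer = src.Read(buffer, 0, buffer.Length)
            While n > 0
                ' Hash and write only the n bytes actually read.
                hasher.TransformBlock(buffer, 0, n, buffer, 0)
                dst.Write(buffer, 0, n)
                n = src.Read(buffer, 0, buffer.Length)
            End While

            ' A zero-length final block completes the hash computation.
            hasher.TransformFinalBlock(buffer, 0, 0)
            Return hasher.Hash
        End Using
    End Function

    Sub Main()
        Dim data As Byte() = Encoding.ASCII.GetBytes("abc")
        Using src As New MemoryStream(data), dst As New MemoryStream()
            ' A block size of 2 forces a short final read.
            Dim hash As Byte() = CopyWithMd5(src, dst, 2)
            ' MD5("abc") is the RFC 1321 test vector:
            ' 900150983cd24fb0d6963f7d28e17f72
            Console.WriteLine(BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant())
        End Using
    End Sub

End Module
```

A second hash such as SHA1 could be fed from the same buffer inside the loop, so the file is still read only once. Whether any of this closes the gap to robocopy would need to be measured.
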
Oct 4 '08 #1