Bytes | Software Development & Data Engineering Community

Fill a Server

I think this is a silly task, but I have to do it. I have to fill a
file server (1 TB SATA RAID Array) with files. I wrote a Python script
to do this, but it's a bit slow... here it is:

import shutil
import os
import sys
import time

src = "G:\\"
des = "C:\\scratch"

os.chdir(src)
try:
    for x in xrange(5000):
        for root, dirs, files in os.walk(src):
            for f in files:
                # suffix each copy with the pass number so names stay unique
                shutil.copyfile(os.path.join(root, f),
                                os.path.join(des, "%s%s" % (f, x)))
    print "Done!!!"

except Exception, e:
    print e
    time.sleep(15)
    sys.exit()

The problem with this is that it only copies about 35 GB/hour. I would
like to copy at least 100 GB/hour... more if possible. I have tried to
copy from the IDE CD drive to the SATA array with the same results. I
understand the throughput on SATA to be roughly 60 MB/sec, which comes
out to 3.6 GB/min, or 216 GB/hour. Can someone show me how I might do
this faster? Is shutil the problem?

Also, my first attempt at this did a recursive copy, creating subdirs in
dirs as it copied. It would crash every time it went 85 subdirs deep.
This is an NTFS filesystem. Would this limitation be in the filesystem
or Python?

Thanks
Bob Smith

"I once worked for a secret government agency in South Africa... among
other things, we developed and tested AIDS. I am now living in an
undisclosed location in North America. The guys who live with me never
let me drive the same route to work and they call me 'Bob Smith 17280'
as there were more before me." -- Bob Smith

Jul 18 '05 #1
bo*************@hotmail.com wrote:
[snip code involving copyfile:]
shutil.copyfile(os.path.join(root, f),

The problem with this is that it only copies about 35 GB/hour. I would
like to copy at least 100 GB/hour... more if possible. I have tried to
copy from the IDE CD drive to the SATA array with the same results. I
understand the throughput on SATA to be roughly 60MB/sec which comes
out to 3.6 GB/min which should be 216 GB/hour. Can someone show me how
I might do this faster? Is shutil the problem?
Have you tried doing this from some kind of batch file, or
manually, measuring the results? Have you got any way to
achieve this throughput, or is it only a theory? I see
no reason to try to optimize something if there's no real
evidence that it *can* be optimized.
Also, my first attempt at this did a recursive copy creating subdirs in
dirs as it copied. It would crash every time it went 85 subdirs deep.
This is an NTFS filesystem. Would this limitation be in the filesystem
or Python?


In general, when faced with the question "Is this a limitation
of Python or of this program X of Microsoft origin?", the answer
should be obvious... ;-)

More practically, perhaps: use your script to create one of those
massively nested folders. Wait for it to crash. Now go in
"manually" (with CD or your choice of fancy graphical browser)
to the lowest level folder and attempt to create a subfolder
with the same name the Python script was trying to use. Report
back here on your success, if any. ;-)

(Alternatively, describe your failure in terms other than "crash".
Python code rarely crashes. It does, sometimes, fail and print
out an exception traceback. These are printed for a very good
reason: they are more descriptive than the word "crash".)

-Peter
Jul 18 '05 #2
<bo*************@hotmail.com> wrote:
Also, my first attempt at this did a recursive copy creating subdirs in
dirs as it copied. It would crash every time it went 85 subdirs deep.
This is an NTFS filesystem. Would this limitation be in the filesystem
or Python?


see the "Max File Name Length" on this page (random google link)
for an explanation:

http://www.ntfs.com/ntfs_vs_fat.htm

(assuming that "crash" meant "raise an exception", that is)

</F>

Jul 18 '05 #3
Also, my first attempt at this did a recursive copy creating subdirs in
dirs as it copied. It would crash every time it went 85 subdirs deep.
This is an NTFS filesystem. Would this limitation be in the filesystem
or Python?


see the "Max File Name Length" on this page (random google link)
for an explanation:

http://www.ntfs.com/ntfs_vs_fat.htm


also:

print len(os.path.join("c:\\scratch", *map(str, range(85))))

</F>

Jul 18 '05 #4
You are correct Peter, the exception read something like this:

"Folder 85 not found."

I am paraphrasing, but that is the crux of the error. It takes about an
hour to produce the error so if you want an exact quote from the
exception, let me know and give me awhile. I looked through the nested
dirs several times after the crash and they always went from 0 - 84...
sure enough, directory 85 had not been created... why I do not know.
Doesn't really matter now as the script I posted achieves similar
results without crashing... still slow though.

As far as drive throughput, it's my understanding that SATA is
theoretically capable of 150 MB/sec (google for it). However, in
practice, one can normally expect a sustained throughput of 60 to 70
MB/sec. The drives are 7,200 RPM... not the more expensive 10,000 RPM
drives. I have no idea how RAID 5 might impact performance either. It's
hardware RAID on a top-of-the-line Dell server. I am not a hardware
expert, so I don't understand how *sustained* drive throughput, RPM and
RAID together factor into this scenario.

Jul 18 '05 #5
I think you solved it Fredrik.

The first ten folders looked like this:

D:\0\1\2\3\4\5\6\7\8\9

22 chars long.

The rest looked like this:

\10\11\12\13....\82\83\84

~222 chars long.

Subdir 84 had one file in it named XXXXXXXXXXX.bat

That file broke the 255-character limit, so subdir 85 was never created,
and when the script tried to copy a file into 85, an exception was
raised. Not that it matters now. Interesting to know that these limits
still exist and that this is an NTFS issue.
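The arithmetic above can be checked with a short sketch (Python 3 syntax; the drive letter and file name are taken from this post):

```python
# Rebuild the nested path described above: D:\0\1\...\84, plus the
# .bat file that finally pushed the full path past 255 characters.
parts = [str(i) for i in range(85)]
first_ten = "D:" + "".join("\\" + p for p in parts[:10])
dir_path = "D:" + "".join("\\" + p for p in parts)
file_path = dir_path + "\\XXXXXXXXXXX.bat"

print(len(first_ten))  # 22, as reported above
print(len(dir_path))   # 247
print(len(file_path))  # 263 -- past the 255-character limit
```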

Jul 18 '05 #6
