473,396 Members | 1,676 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

SQL Server 2008 - Filestream

Hi,

we just set up a SQL Server 2008 on Windows Server 2008 (configured as file
/ webserver, 16 GB memory, 4 dual core processors, 6 internal disks + 15
disc ISCSI array with overall almost 3 TB space) to use filestream and ran
into the following problem:

We extracted binaries from our 'original' database and imported it into the
new 08 database using filestream - so far, so good.

In the beginning everything looked pretty good, we extracted / imported over
100.000 rows per hour but over time the performance degraded significantly
(down to a roughly 4500 per hour). At this point, we have about 1 million
files in our filesteam filegroup (which means in one directory on our binary
array) but there are 500.000 more to go - and at this speed this will take
days (not even thinking about what it might mean for performance
afterwards).

Searching for reasons, we realized that the inserts into the new table took
most of the time (over 80%), so we first stopped and rebuilt the only index
on that one (primary key) and, after realizing that this did not help at
all, we dropped it completely (since uniqueness is assured through the
original table anyways) - same result, no change, still way too slow.

Did anybody run into a similar problem so far or is it in the end not even a
SQL server but a Windows / File System problem (since through filestreaming
millions of files go into a single folder) ?!

Any ideas or tips would be appreciated ... thanks in advance !

Kind regards

Andy

Jun 27 '08 #1
8 2839
Andreas Zimmermann (so*******@nurfuerspam.de) writes:
we just set up a SQL Server 2008 on Windows Server 2008 (configured as
file / webserver, 16 GB memory, 4 dual core processors, 6 internal disks
+ 15 disc ISCSI array with overall almost 3 TB space) to use filestream
and ran into the following problem:

We extracted binaries from our 'original' database and imported it into
the new 08 database using filestream - so far, so good.

In the beginning everything looked pretty good, we extracted / imported
over 100.000 rows per hour but over time the performance degraded
significantly (down to a roughly 4500 per hour). At this point, we have
about 1 million files in our filesteam filegroup (which means in one
directory on our binary array) but there are 500.000 more to go - and at
this speed this will take days (not even thinking about what it might
mean for performance afterwards).

Searching for reasons, we realized that the inserts into the new table
took most of the time (over 80%), so we first stopped and rebuilt the
only index on that one (primary key) and, after realizing that this did
not help at all, we dropped it completely (since uniqueness is assured
through the original table anyways) - same result, no change, still way
too slow.
I don't have any answer for you, so I've forwarded your post to my MVP
colleagues, to see if they have anything to add. But I have one question:
how do you write the data? Through INSERT, or do you use the
OpenSqlFilestream to get a file handle for the files?
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Jun 27 '08 #2
Did anybody run into a similar problem so far or is it in the end not even
a
SQL server but a Windows / File System problem (since through
filestreaming
millions of files go into a single folder) ?!
How many paths in your FILESTREAM filegroup? If you have only a single
path, I'm curious if adding additional paths will improve performance.

--
Hope this helps.

Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/

"Andreas Zimmermann" <so*******@nurfuerspam.dewrote in message
news:g2**********@aioe.org...
Hi,

we just set up a SQL Server 2008 on Windows Server 2008 (configured as
file
/ webserver, 16 GB memory, 4 dual core processors, 6 internal disks + 15
disc ISCSI array with overall almost 3 TB space) to use filestream and ran
into the following problem:

We extracted binaries from our 'original' database and imported it into
the
new 08 database using filestream - so far, so good.

In the beginning everything looked pretty good, we extracted / imported
over
100.000 rows per hour but over time the performance degraded significantly
(down to a roughly 4500 per hour). At this point, we have about 1 million
files in our filesteam filegroup (which means in one directory on our
binary
array) but there are 500.000 more to go - and at this speed this will take
days (not even thinking about what it might mean for performance
afterwards).

Searching for reasons, we realized that the inserts into the new table
took
most of the time (over 80%), so we first stopped and rebuilt the only
index
on that one (primary key) and, after realizing that this did not help at
all, we dropped it completely (since uniqueness is assured through the
original table anyways) - same result, no change, still way too slow.

Did anybody run into a similar problem so far or is it in the end not even
a
SQL server but a Windows / File System problem (since through
filestreaming
millions of files go into a single folder) ?!

Any ideas or tips would be appreciated ... thanks in advance !

Kind regards

Andy
Jun 27 '08 #3
Well, that was an idea we also had ... by the 'standard' setup there is
only one 'file' (in this case path) in the filegroup - so we tried to
add one and had to realize that this is not possible (so far ?!) ... you
get a simple error message that filestream filegroups are not allowed to
contain more than one path ...

Dan Guzman wrote:
>Did anybody run into a similar problem so far or is it in the end not
even a
SQL server but a Windows / File System problem (since through
filestreaming
millions of files go into a single folder) ?!

How many paths in your FILESTREAM filegroup? If you have only a single
path, I'm curious if adding additional paths will improve performance.
Jun 27 '08 #4
At the moment we're using a 'straight forward' insert after extracting
the data from our original table(s) ... is there an advantage in using
OpenSqlFilestream instead (if so we would certainly 'restrucure' the
extract / import) ?!

Erland Sommarskog wrote:
Andreas Zimmermann (so*******@nurfuerspam.de) writes:
>we just set up a SQL Server 2008 on Windows Server 2008 (configured as
file / webserver, 16 GB memory, 4 dual core processors, 6 internal disks
+ 15 disc ISCSI array with overall almost 3 TB space) to use filestream
and ran into the following problem:

We extracted binaries from our 'original' database and imported it into
the new 08 database using filestream - so far, so good.

In the beginning everything looked pretty good, we extracted / imported
over 100.000 rows per hour but over time the performance degraded
significantly (down to a roughly 4500 per hour). At this point, we have
about 1 million files in our filesteam filegroup (which means in one
directory on our binary array) but there are 500.000 more to go - and at
this speed this will take days (not even thinking about what it might
mean for performance afterwards).

Searching for reasons, we realized that the inserts into the new table
took most of the time (over 80%), so we first stopped and rebuilt the
only index on that one (primary key) and, after realizing that this did
not help at all, we dropped it completely (since uniqueness is assured
through the original table anyways) - same result, no change, still way
too slow.

I don't have any answer for you, so I've forwarded your post to my MVP
colleagues, to see if they have anything to add. But I have one question:
how do you write the data? Through INSERT, or do you use the
OpenSqlFilestream to get a file handle for the files?

Jun 27 '08 #5
Andreas Zimmermann (so*******@nurfuerspam.de) writes:
At the moment we're using a 'straight forward' insert after extracting
the data from our original table(s) ... is there an advantage in using
OpenSqlFilestream instead (if so we would certainly 'restrucure' the
extract / import) ?!
As I understand it, OpenSqlFilstream is more or less the whole point
with FILESTREAM: you can write data through the Win32 file API, which
should be faster than going through the needles eye of SQL Server.

However, it's not likely to help you with the issue of number of files
in the same directory getting larger than NTFS can cope with. I will have
to say that this issue exists at all. It should be a no-brainer to spot
this problem in the design.

There seems to be a workaround, though. I don't know if Jimi2Cool is your alter ego, if he just a person with the same problem as you. In any case,
he was able to diagnose the root of the problem and also able to address
it. Whether his fix is supported remains to see:
http://forums.microsoft.com/MSDN/Sho...iteID=1&mode=1

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Jun 27 '08 #6
Well, that was an idea we also had ... by the 'standard' setup there is
only one 'file' (in this case path) in the filegroup - so we tried to add
one and had to realize that this is not possible (so far ?!) ... you get a
simple error message that filestream filegroups are not allowed to contain
more than one path ...
My bad - I didn't consider separate filegroups are required for separate
FILESTREAM folders. Another workaround might to create separate FILESTREAM
filegroups and parition the FILESTREAM portion of the table so that data is
distributed evenly in those folders. This is a bit of a kluge, though. I
think the workaround Erland found is a less kludgey kludge ;-)

--
Hope this helps.

Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/

"Andreas Zimmermann" <so*******@nurfuerspam.dewrote in message
news:g2**********@aioe.org...
Well, that was an idea we also had ... by the 'standard' setup there is
only one 'file' (in this case path) in the filegroup - so we tried to add
one and had to realize that this is not possible (so far ?!) ... you get a
simple error message that filestream filegroups are not allowed to contain
more than one path ...

Dan Guzman wrote:
>>Did anybody run into a similar problem so far or is it in the end not
even a
SQL server but a Windows / File System problem (since through
filestreaming
millions of files go into a single folder) ?!

How many paths in your FILESTREAM filegroup? If you have only a single
path, I'm curious if adding additional paths will improve performance.
Jun 27 '08 #7
In fact, he is a collegue of mine ... and we found about that particular
workaround on friday (and it solved the problem for now it seems) - I
just did not have the time to post here ... but thanks for your heads up !

Erland Sommarskog wrote:
Andreas Zimmermann (so*******@nurfuerspam.de) writes:
>At the moment we're using a 'straight forward' insert after extracting
the data from our original table(s) ... is there an advantage in using
OpenSqlFilestream instead (if so we would certainly 'restrucure' the
extract / import) ?!

As I understand it, OpenSqlFilstream is more or less the whole point
with FILESTREAM: you can write data through the Win32 file API, which
should be faster than going through the needles eye of SQL Server.

However, it's not likely to help you with the issue of number of files
in the same directory getting larger than NTFS can cope with. I will have
to say that this issue exists at all. It should be a no-brainer to spot
this problem in the design.

There seems to be a workaround, though. I don't know if Jimi2Cool is your alter ego, if he just a person with the same problem as you. In any case,
he was able to diagnose the root of the problem and also able to address
it. Whether his fix is supported remains to see:
http://forums.microsoft.com/MSDN/Sho...iteID=1&mode=1
Jun 27 '08 #8
Thanks ... we might have a look at this - although you're right, it
sounds like a kludge :)

Dan Guzman wrote:
>Well, that was an idea we also had ... by the 'standard' setup there
is only one 'file' (in this case path) in the filegroup - so we tried
to add one and had to realize that this is not possible (so far ?!)
... you get a simple error message that filestream filegroups are not
allowed to contain more than one path ...

My bad - I didn't consider separate filegroups are required for separate
FILESTREAM folders. Another workaround might to create separate
FILESTREAM filegroups and parition the FILESTREAM portion of the table
so that data is distributed evenly in those folders. This is a bit of a
kluge, though. I think the workaround Erland found is a less kludgey
kludge ;-)
Jun 27 '08 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: Jorge | last post by:
Hi, I'm developing a web page application that runs 2 processeses. These two process create postcript files, the first one creates a map and the second one overlays data over that map. ...
4
by: Serban | last post by:
Hi I have two applications(client and server) that pass an XML document but I have a hard time reading the XML document on the server side. So the client application creates the XML document...
0
by: Henrique Berenguel | last post by:
Hello folks. How do I solve the problem below? when my program executes the line below : Package.WriteXml(Server.MapPath(strID + ".XML")); Ireceive the following error message: ...
2
by: FusionGuy | last post by:
I've created a file uploading handler, implemented as an httpHandler. Each time I attempt to upload a file, or files, my HttpContext.Request.Files property never contains the files that were...
9
by: CGW | last post by:
I asked the question yesterday, but know better how to ask it, today: I'm trying to use the File.Copy method to copy a file from a client to server (.Net web app under IIS ). It looks to me that...
9
by: craig.overton | last post by:
All, I am currently developing an FTP class in VB.NET. It's kid tested, mother approved when trying to access an FTP Server on a Windows box meaning I can connect, run commands, upload and...
1
by: dawson | last post by:
Hello, when running the code below, it returns the error message: "The requested URI is invalid for this FTP command.". I have confirmed the FTP URI is correct and working through a standalone...
5
by: vidyu | last post by:
hai everybody, can anyone explain me or give some idea abt uloading of a file. thanq vidyu
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.