Connecting Tech Pros Worldwide Forums | Help | Site Map

SQL Server 2008 - Filestream

Andreas Zimmermann
Guest
 
Posts: n/a
#1: Jun 27 '08
Hi,

we just set up a SQL Server 2008 on Windows Server 2008 (configured as file
/ webserver, 16 GB memory, 4 dual core processors, 6 internal disks + 15
disc ISCSI array with overall almost 3 TB space) to use filestream and ran
into the following problem:

We extracted binaries from our 'original' database and imported it into the
new 08 database using filestream - so far, so good.

In the beginning everything looked pretty good, we extracted / imported over
100.000 rows per hour but over time the performance degraded significantly
(down to a roughly 4500 per hour). At this point, we have about 1 million
files in our filesteam filegroup (which means in one directory on our binary
array) but there are 500.000 more to go - and at this speed this will take
days (not even thinking about what it might mean for performance
afterwards).

Searching for reasons, we realized that the inserts into the new table took
most of the time (over 80%), so we first stopped and rebuilt the only index
on that one (primary key) and, after realizing that this did not help at
all, we dropped it completely (since uniqueness is assured through the
original table anyways) - same result, no change, still way too slow.

Did anybody run into a similar problem so far or is it in the end not even a
SQL server but a Windows / File System problem (since through filestreaming
millions of files go into a single folder) ?!

Any ideas or tips would be appreciated ... thanks in advance !

Kind regards

Andy




Erland Sommarskog
Guest
 
Posts: n/a
#2: Jun 27 '08

re: SQL Server 2008 - Filestream


Andreas Zimmermann (souldiver@nurfuerspam.de) writes:
Quote:
we just set up a SQL Server 2008 on Windows Server 2008 (configured as
file / webserver, 16 GB memory, 4 dual core processors, 6 internal disks
+ 15 disc ISCSI array with overall almost 3 TB space) to use filestream
and ran into the following problem:
>
We extracted binaries from our 'original' database and imported it into
the new 08 database using filestream - so far, so good.
>
In the beginning everything looked pretty good, we extracted / imported
over 100.000 rows per hour but over time the performance degraded
significantly (down to a roughly 4500 per hour). At this point, we have
about 1 million files in our filesteam filegroup (which means in one
directory on our binary array) but there are 500.000 more to go - and at
this speed this will take days (not even thinking about what it might
mean for performance afterwards).
>
Searching for reasons, we realized that the inserts into the new table
took most of the time (over 80%), so we first stopped and rebuilt the
only index on that one (primary key) and, after realizing that this did
not help at all, we dropped it completely (since uniqueness is assured
through the original table anyways) - same result, no change, still way
too slow.
I don't have any answer for you, so I've forwarded your post to my MVP
colleagues, to see if they have anything to add. But I have one question:
how do you write the data? Through INSERT, or do you use the
OpenSqlFilestream to get a file handle for the files?


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Dan Guzman
Guest
 
Posts: n/a
#3: Jun 27 '08

re: SQL Server 2008 - Filestream


Did anybody run into a similar problem so far or is it in the end not even
Quote:
a
SQL server but a Windows / File System problem (since through
filestreaming
millions of files go into a single folder) ?!
How many paths in your FILESTREAM filegroup? If you have only a single
path, I'm curious if adding additional paths will improve performance.

--
Hope this helps.

Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/

"Andreas Zimmermann" <souldiver@nurfuerspam.dewrote in message
news:g29p93$o8u$2@aioe.org...
Quote:
Hi,
>
we just set up a SQL Server 2008 on Windows Server 2008 (configured as
file
/ webserver, 16 GB memory, 4 dual core processors, 6 internal disks + 15
disc ISCSI array with overall almost 3 TB space) to use filestream and ran
into the following problem:
>
We extracted binaries from our 'original' database and imported it into
the
new 08 database using filestream - so far, so good.
>
In the beginning everything looked pretty good, we extracted / imported
over
100.000 rows per hour but over time the performance degraded significantly
(down to a roughly 4500 per hour). At this point, we have about 1 million
files in our filesteam filegroup (which means in one directory on our
binary
array) but there are 500.000 more to go - and at this speed this will take
days (not even thinking about what it might mean for performance
afterwards).
>
Searching for reasons, we realized that the inserts into the new table
took
most of the time (over 80%), so we first stopped and rebuilt the only
index
on that one (primary key) and, after realizing that this did not help at
all, we dropped it completely (since uniqueness is assured through the
original table anyways) - same result, no change, still way too slow.
>
Did anybody run into a similar problem so far or is it in the end not even
a
SQL server but a Windows / File System problem (since through
filestreaming
millions of files go into a single folder) ?!
>
Any ideas or tips would be appreciated ... thanks in advance !
>
Kind regards
>
Andy
>
>
>
Andreas Zimmermann
Guest
 
Posts: n/a
#4: Jun 27 '08

re: SQL Server 2008 - Filestream


Well, that was an idea we also had ... by the 'standard' setup there is
only one 'file' (in this case path) in the filegroup - so we tried to
add one and had to realize that this is not possible (so far ?!) ... you
get a simple error message that filestream filegroups are not allowed to
contain more than one path ...

Dan Guzman wrote:
Quote:
Quote:
>Did anybody run into a similar problem so far or is it in the end not
>even a
>SQL server but a Windows / File System problem (since through
>filestreaming
>millions of files go into a single folder) ?!
>
How many paths in your FILESTREAM filegroup? If you have only a single
path, I'm curious if adding additional paths will improve performance.
>
Andreas Zimmermann
Guest
 
Posts: n/a
#5: Jun 27 '08

re: SQL Server 2008 - Filestream


At the moment we're using a 'straight forward' insert after extracting
the data from our original table(s) ... is there an advantage in using
OpenSqlFilestream instead (if so we would certainly 'restrucure' the
extract / import) ?!

Erland Sommarskog wrote:
Quote:
Andreas Zimmermann (souldiver@nurfuerspam.de) writes:
Quote:
>we just set up a SQL Server 2008 on Windows Server 2008 (configured as
>file / webserver, 16 GB memory, 4 dual core processors, 6 internal disks
>+ 15 disc ISCSI array with overall almost 3 TB space) to use filestream
>and ran into the following problem:
>>
>We extracted binaries from our 'original' database and imported it into
>the new 08 database using filestream - so far, so good.
>>
>In the beginning everything looked pretty good, we extracted / imported
>over 100.000 rows per hour but over time the performance degraded
>significantly (down to a roughly 4500 per hour). At this point, we have
>about 1 million files in our filesteam filegroup (which means in one
>directory on our binary array) but there are 500.000 more to go - and at
>this speed this will take days (not even thinking about what it might
>mean for performance afterwards).
>>
>Searching for reasons, we realized that the inserts into the new table
>took most of the time (over 80%), so we first stopped and rebuilt the
>only index on that one (primary key) and, after realizing that this did
>not help at all, we dropped it completely (since uniqueness is assured
>through the original table anyways) - same result, no change, still way
>too slow.
>
I don't have any answer for you, so I've forwarded your post to my MVP
colleagues, to see if they have anything to add. But I have one question:
how do you write the data? Through INSERT, or do you use the
OpenSqlFilestream to get a file handle for the files?
>
>
Erland Sommarskog
Guest
 
Posts: n/a
#6: Jun 27 '08

re: SQL Server 2008 - Filestream


Andreas Zimmermann (souldiver@nurfuerspam.de) writes:
Quote:
At the moment we're using a 'straight forward' insert after extracting
the data from our original table(s) ... is there an advantage in using
OpenSqlFilestream instead (if so we would certainly 'restrucure' the
extract / import) ?!
As I understand it, OpenSqlFilstream is more or less the whole point
with FILESTREAM: you can write data through the Win32 file API, which
should be faster than going through the needles eye of SQL Server.

However, it's not likely to help you with the issue of number of files
in the same directory getting larger than NTFS can cope with. I will have
to say that this issue exists at all. It should be a no-brainer to spot
this problem in the design.

There seems to be a workaround, though. I don't know if Jimi2Cool is your alter ego, if he just a person with the same problem as you. In any case,
he was able to diagnose the root of the problem and also able to address
it. Whether his fix is supported remains to see:
http://forums.microsoft.com/MSDN/Sho...iteID=1&mode=1

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Dan Guzman
Guest
 
Posts: n/a
#7: Jun 27 '08

re: SQL Server 2008 - Filestream


Well, that was an idea we also had ... by the 'standard' setup there is
Quote:
only one 'file' (in this case path) in the filegroup - so we tried to add
one and had to realize that this is not possible (so far ?!) ... you get a
simple error message that filestream filegroups are not allowed to contain
more than one path ...
My bad - I didn't consider separate filegroups are required for separate
FILESTREAM folders. Another workaround might to create separate FILESTREAM
filegroups and parition the FILESTREAM portion of the table so that data is
distributed evenly in those folders. This is a bit of a kluge, though. I
think the workaround Erland found is a less kludgey kludge ;-)

--
Hope this helps.

Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/

"Andreas Zimmermann" <souldiver@nurfuerspam.dewrote in message
news:g2bic1$2sj$1@aioe.org...
Quote:
Well, that was an idea we also had ... by the 'standard' setup there is
only one 'file' (in this case path) in the filegroup - so we tried to add
one and had to realize that this is not possible (so far ?!) ... you get a
simple error message that filestream filegroups are not allowed to contain
more than one path ...
>
Dan Guzman wrote:
Quote:
Quote:
>>Did anybody run into a similar problem so far or is it in the end not
>>even a
>>SQL server but a Windows / File System problem (since through
>>filestreaming
>>millions of files go into a single folder) ?!
>>
>How many paths in your FILESTREAM filegroup? If you have only a single
>path, I'm curious if adding additional paths will improve performance.
>>
Andreas Zimmermann
Guest
 
Posts: n/a
#8: Jun 27 '08

re: SQL Server 2008 - Filestream


In fact, he is a collegue of mine ... and we found about that particular
workaround on friday (and it solved the problem for now it seems) - I
just did not have the time to post here ... but thanks for your heads up !

Erland Sommarskog wrote:
Quote:
Andreas Zimmermann (souldiver@nurfuerspam.de) writes:
Quote:
>At the moment we're using a 'straight forward' insert after extracting
>the data from our original table(s) ... is there an advantage in using
>OpenSqlFilestream instead (if so we would certainly 'restrucure' the
>extract / import) ?!
>
As I understand it, OpenSqlFilstream is more or less the whole point
with FILESTREAM: you can write data through the Win32 file API, which
should be faster than going through the needles eye of SQL Server.
>
However, it's not likely to help you with the issue of number of files
in the same directory getting larger than NTFS can cope with. I will have
to say that this issue exists at all. It should be a no-brainer to spot
this problem in the design.
>
There seems to be a workaround, though. I don't know if Jimi2Cool is your alter ego, if he just a person with the same problem as you. In any case,
he was able to diagnose the root of the problem and also able to address
it. Whether his fix is supported remains to see:
http://forums.microsoft.com/MSDN/Sho...iteID=1&mode=1
>
Andreas Zimmermann
Guest
 
Posts: n/a
#9: Jun 27 '08

re: SQL Server 2008 - Filestream


Thanks ... we might have a look at this - although you're right, it
sounds like a kludge :)

Dan Guzman wrote:
Quote:
Quote:
>Well, that was an idea we also had ... by the 'standard' setup there
>is only one 'file' (in this case path) in the filegroup - so we tried
>to add one and had to realize that this is not possible (so far ?!)
>... you get a simple error message that filestream filegroups are not
>allowed to contain more than one path ...
>
My bad - I didn't consider separate filegroups are required for separate
FILESTREAM folders. Another workaround might to create separate
FILESTREAM filegroups and parition the FILESTREAM portion of the table
so that data is distributed evenly in those folders. This is a bit of a
kluge, though. I think the workaround Erland found is a less kludgey
kludge ;-)
>
Closed Thread