
Script to combine multiple rows into 1 single row

Hi,

I'm working on a system migration and I need to combine data from multiple
rows (with the same ID) into one comma separated string. This is how the
data is at the moment:

Company_ID Material
0x00C00000000053B86 Lead
0x00C00000000053B86 Sulphur
0x00C00000000053B86 Concrete

I need it in the following format:
Company_ID Material
0x00C00000000053B86 Lead, Sulphur, Concrete

There is no definite number of materials per Company.

I have read the part of
http://www.sommarskog.se/arrays-in-sql.html#iterative that talks about 'The
Iterative Method' but my knowledge of SQL is very limited and I don't know
how to use this code to get what I need.

Can anyone help me?
Dec 22 '06 #1


Mintyman (mi******@ntlworld.com) writes:
> I'm working on a system migration and I need to combine data from multiple
> rows (with the same ID) into one comma separated string. This is how the
> data is at the moment:
>
> Company_ID Material
> 0x00C00000000053B86 Lead
> 0x00C00000000053B86 Sulphur
> 0x00C00000000053B86 Concrete
>
> I need it in the following format:
> Company_ID Material
> 0x00C00000000053B86 Lead, Sulphur, Concrete
>
> There is no definite number of materials per Company.
>
> I have read the part of
> http://www.sommarskog.se/arrays-in-sql.html#iterative that talks about
> 'The Iterative Method' but my knowledge of SQL is very limited and I
> don't know how to use this code to get what I need.

And that article covers the opposite process - unpacking the list.

Composing the list is less fun, because it produces a result that
violates a basic principle of relational databases: no repeating groups.
That is not to say that it's a stupid thing to ask for; this format is a
common request in reporting. I do get a little nervous when you say that
you are working on a system migration, because that means that someone
will have to handle the comma-separated list on the other side, and that
is no fun at all. But I assume that you don't have control over that.

Anyway, to give a good answer to the question, I would need to know a
few more things:
o Which version of SQL Server?
o What is a reasonable upper limit for the length of the comma-separated
  string? You can determine the current maximum with this query:

     -- The + 2 allows for the ', ' separator that follows each item.
     SELECT MAX(listlen), AVG(listlen)
     FROM   (SELECT SUM(len(Material) + 2) AS listlen
             FROM   tbl
             GROUP  BY Company_ID) AS a

o What is the datatype of Material? That is, is it varchar or nvarchar?

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Dec 22 '06 #2

Hi Erland,

I hope it's not too late to get help on this one!

Here are the answers you are looking for:

1) I'm using SQL 2000
2) 40
3) nvarchar

To clarify the column names: the column is 'material_name' rather than
'material', and 'to_company' rather than 'company_id'.

Thanks!

Mintyman


Jan 4 '07 #3

Mintyman (mi******@ntlworld.com) writes:
> I hope it's not too late to get help on this one!
>
> Here are the answers you are looking for:
>
> 1) I'm using SQL 2000
> 2) 40
> 3) nvarchar

If the longest list is 40 characters, there are not that many materials
per company. Since you said there was no definite number, I was afraid
that you could exceed the limit of 4000 characters for an nvarchar. In
that case you would have been in dire straits, unless you had been on
SQL 2005, where this would have been much simpler.
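
For readers who are on SQL 2005 or later, here is a minimal sketch of one
commonly used approach there, FOR XML PATH(''). It is not necessarily the
method the poster had in mind, and it does not run on SQL 2000. The table
and column names (Material__Bridge, to_company, material_name) are the ones
that appear elsewhere in this thread; the output column name "materials" is
made up for the example.

```sql
-- Sketch for SQL Server 2005 or later only; it will not run on SQL 2000.
-- "materials" is a hypothetical name for the output column.
SELECT c.to_company,
       -- STUFF removes the leading ', ' from the concatenated string.
       STUFF((SELECT ', ' + m.material_name
              FROM   Material__Bridge m
              WHERE  m.to_company = c.to_company
              ORDER  BY m.material_name
              FOR XML PATH('')), 1, 2, '') AS materials
FROM   (SELECT DISTINCT to_company FROM Material__Bridge) AS c
ORDER  BY c.to_company
```

Note that in this plain form characters such as & and < in a material name
come back as XML entities (&amp;amp; and &amp;lt;), so check the data before
relying on it.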

Here is an example of a query that runs in Northwind. First run:

select max(cnt) from
(select OrderID, cnt = COUNT(*)
from [Order Details]
group by OrderID) s

(but translated to your database). This gives the length of the longest
list as a number of elements. For Northwind the query returns 25, which is
a tad many. With a maximum of 40 characters per list, a maximum of seven
elements seems reasonable. Using that number, here is a query for Northwind
that returns one comma-separated list per order:

SELECT OD.OrderID,
       MAX(CASE OD.rowno WHEN 1 THEN P.ProductName END) +
       coalesce(MAX(CASE OD.rowno WHEN 2 THEN ', ' + P.ProductName END), '') +
       coalesce(MAX(CASE OD.rowno WHEN 3 THEN ', ' + P.ProductName END), '') +
       coalesce(MAX(CASE OD.rowno WHEN 4 THEN ', ' + P.ProductName END), '') +
       coalesce(MAX(CASE OD.rowno WHEN 5 THEN ', ' + P.ProductName END), '') +
       coalesce(MAX(CASE OD.rowno WHEN 6 THEN ', ' + P.ProductName END), '') +
       coalesce(MAX(CASE OD.rowno WHEN 7 THEN ', ' + P.ProductName END), '')
FROM   (SELECT a.OrderID, a.ProductID,
               rowno = (SELECT COUNT(*)
                        FROM   [Order Details] b
                        WHERE  b.OrderID = a.OrderID
                          AND  b.ProductID <= a.ProductID)
        FROM   [Order Details] a) AS OD
JOIN   Products P ON P.ProductID = OD.ProductID
GROUP  BY OD.OrderID
ORDER  BY OD.OrderID

If your maximum number is 8, you will need to add one more line.

Caveat: the performance of this is not fantastic. The big culprit is
the SELECT that computes the row number. If you have millions and millions
of rows in that table, you may have to find a different way to compute
the row number. One way would be to bounce the data over a temp table
with an IDENTITY column. But before you go that route, try a query like
the one above.
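
The temp-table route could look roughly like the sketch below. It is an
illustration under assumptions, not a tested solution: the temp table
#numbered and its column ident are made-up names, the column types are
guesses (the thread only says that material_name is nvarchar), and it relies
on INSERT ... SELECT ... ORDER BY assigning IDENTITY values in the ORDER BY
order, which is the documented behaviour as far as I know.

```sql
-- Sketch only. #numbered and ident are hypothetical names, and the column
-- types are guesses; match them to the real table.
CREATE TABLE #numbered (ident         int IDENTITY(1, 1) NOT NULL,
                        to_company    varbinary(16)      NOT NULL,
                        material_name nvarchar(40)       NOT NULL)

-- The ORDER BY makes each company receive a consecutive run of ident values.
INSERT #numbered (to_company, material_name)
   SELECT to_company, material_name
   FROM   Material__Bridge
   ORDER  BY to_company, material_name

-- Row number within a company = offset from that company's first ident.
SELECT n.to_company, n.material_name,
       rowno = n.ident - f.first_ident + 1
FROM   #numbered n
JOIN   (SELECT to_company, first_ident = MIN(ident)
        FROM   #numbered
        GROUP  BY to_company) AS f ON f.to_company = n.to_company
```

The last SELECT can then take the place of the derived table that computes
rowno with COUNT(*), which avoids the correlated subquery for every row.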

If you need to compose many of these queries, I would suggest that you
look into the third-party tool RAC, http://www.rac4sql.net/ which can
help you to generate such queries.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Jan 4 '07 #4

Hi Erland,

Thanks for the script. The difference between the Northwind database and
mine is that all the data I want to access is in one table (unlike
Northwind, where it is spread over [Order Details] and [Products]). I tried
modifying the script but it doesn't work:

SELECT to_company,
MAX(CASE OD.rowno WHEN 1 THEN Material_Name END) +
coalesce(MAX(CASE OD.rowno WHEN 2 THEN ', ' + Material_Name END), '') +
coalesce(MAX(CASE OD.rowno WHEN 3 THEN ', ' + Material_Name END), '') +
coalesce(MAX(CASE OD.rowno WHEN 4 THEN ', ' + Material_Name END), '') +
coalesce(MAX(CASE OD.rowno WHEN 5 THEN ', ' + Material_Name END), '') +
coalesce(MAX(CASE OD.rowno WHEN 6 THEN ', ' + Material_Name END), '') +
coalesce(MAX(CASE OD.rowno WHEN 7 THEN ', ' + Material_Name END), '')
FROM Material__Bridge AS OD
GROUP BY OD.to_company
ORDER BY OD.to_company

It says there is an invalid column name 'rowno' - I guess this is right
because there is no column with that name in my database! However, when I
check in Northwind, there isn't one called that there either!

Any ideas?

Jan 5 '07 #5

Mintyman (mi******@ntlworld.com) writes:
> Thanks for the script. The difference between the Northwind database and
> mine is that all the data I want to get access to is in one table (unlike
> Northwind where it is spread over [Order Details] and [Products]).

I could have done the script with product ids instead of product names
but that seemed boring.

> It says there is an invalid column name 'rowno' - I guess this is right
> because there is no column with that name in my database! However, when I
> check in Northwind, there isn't one called that there either!

The column rowno is defined in the derived table. I suggest that you study
my query a little closer, and try to understand what it's actually doing.

It might be that you want to be spoon-fed a solution, but I have this funny
idea that I like to help people to help themselves. That is, when I post a
solution, I hope that people do not only use it, but also try to understand
how it works, so that the next time they run into a similar problem, they
now have something in their toolbox that they can apply.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Jan 5 '07 #6

Hi Erland,

I totally agree with not being spoon fed! I'm sorry I came across as wanting
to be. I'll try and work out what your script is doing :o) Thanks for your
help!

Mintyman


Jan 5 '07 #7

Mintyman (mi******@ntlworld.com) writes:
> I totally agree with not being spoon fed! I'm sorry I came across as
> wanting to be. I'll try and work out what your script is doing :o)

There is one thing I should have pointed out. In my query there was this
part:

   (SELECT a.OrderID, a.ProductID,
           rowno = (SELECT COUNT(*)
                    FROM   [Order Details] b
                    WHERE  b.OrderID = a.OrderID
                      AND  b.ProductID <= a.ProductID)
    FROM   [Order Details] a) AS OD

That is a *derived table*. Logically, a derived table is a temp table
within the query, so to speak, but it is not materialised, and the actual
computation order can be different as long as the result is the same.
Derived tables are an enormously powerful tool for building complex
queries, and they save you from using real temp tables.
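
As an illustration of how the derived table carries over to a single-table
case, here is a sketch using the names from the modified script earlier in
the thread (Material__Bridge, to_company, material_name). It assumes that a
material name occurs at most once per company; with duplicates, the
COUNT(*) numbering produces ties and the list comes out wrong.

```sql
-- Sketch under the assumption that material_name is unique per to_company.
SELECT OD.to_company,
       MAX(CASE OD.rowno WHEN 1 THEN OD.material_name END) +
       coalesce(MAX(CASE OD.rowno WHEN 2 THEN ', ' + OD.material_name END), '') +
       coalesce(MAX(CASE OD.rowno WHEN 3 THEN ', ' + OD.material_name END), '') +
       coalesce(MAX(CASE OD.rowno WHEN 4 THEN ', ' + OD.material_name END), '') +
       coalesce(MAX(CASE OD.rowno WHEN 5 THEN ', ' + OD.material_name END), '') +
       coalesce(MAX(CASE OD.rowno WHEN 6 THEN ', ' + OD.material_name END), '') +
       coalesce(MAX(CASE OD.rowno WHEN 7 THEN ', ' + OD.material_name END), '')
FROM   (SELECT a.to_company, a.material_name,
               -- rowno: how many names in the same company sort at or before this one
               rowno = (SELECT COUNT(*)
                        FROM   Material__Bridge b
                        WHERE  b.to_company = a.to_company
                          AND  b.material_name <= a.material_name)
        FROM   Material__Bridge a) AS OD
GROUP  BY OD.to_company
ORDER  BY OD.to_company
```

The outer query can refer to OD.rowno only because the FROM clause names the
derived table OD; selecting straight from Material__Bridge gives the
"invalid column name" error reported above.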
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Jan 5 '07 #8
