Bytes | Software Development & Data Engineering Community

TDS and character encoding

I've seen a dump of the TDS traffic going from my webserver to the SQL
Server database and it seems encoded in Unicode (it has two bytes per
char). Seems it would have a huge impact on performance if it
travelled in one byte. Why might this be?

rj

Aug 30 '07 #1
(ra************ ***@yahoo.com) writes:
> I've seen a dump of the TDS traffic going from my webserver to the SQL
> Server database and it seems encoded in Unicode (it has two bytes per
> char). Seems it would have a huge impact on performance if it
> travelled in one byte. Why might this be?

I have never eavesdropped on TDS, but Unicode is indeed the character
set of SQL Server. You are perfectly able to name your tables in
Cyrillic or Hindi characters if you feel like it. And of course character
strings may include all sorts of characters. So a batch of SQL statements
that is sent over the wire must be Unicode. That is beyond dispute.

However, you don't encode something in Unicode. Unicode is the character
set, and there are several encodings available, of which the most popular
are UTF-16 and UTF-8. In UTF-16 each character in the base plane takes up
2 bytes, and characters beyond that take up 4 bytes. (The base plane
covers the vast majority of living languages.) In UTF-8, ASCII characters
take up one byte, other characters in the Latin, Greek and Cyrillic
scripts take two bytes, and Chinese and Japanese characters take up three
bytes.
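
To check the byte counts for yourself, here is a small Python sketch. It only uses Python's built-in codecs and has nothing to do with TDS itself; the sample characters are arbitrary picks, and "utf-16-le" is used so that no byte-order mark is counted.

```python
# Bytes needed per character in UTF-8 and UTF-16.
# The sample characters are arbitrary illustrations, not taken from any TDS dump.
samples = {
    "ASCII 'A'": "A",
    "Latin 'é'": "é",
    "Cyrillic 'Ж'": "Ж",
    "Chinese '中'": "中",
    "Beyond the base plane '𝄞'": "𝄞",
}


def encoded_sizes(ch):
    """Return (UTF-8 bytes, UTF-16 bytes) for one character, without a BOM."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


for label, ch in samples.items():
    utf8, utf16 = encoded_sizes(ch)
    print(f"{label}: UTF-8 = {utf8} bytes, UTF-16 = {utf16} bytes")
```

Only the ASCII character is smaller in UTF-8; the Chinese character is larger, and the rest come out the same size.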

SQL Server uses UTF-16 exclusively. It is true that for network traffic
in the western world, it would be more efficient if TDS used UTF-8, but
as you can see, that is not necessarily the case in the Far East. And had
TDS used UTF-8, both ends of the wire would have had to convert to and from
UTF-16, so any reduced network traffic could be eaten up by extra CPU
time.
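
As a rough illustration of that trade-off, the sketch below compares the encoded size of an all-ASCII query batch with that of a short Chinese string. The table name, column names and sample strings are made up for the example, and the numbers only describe the text itself, not the TDS framing around it.

```python
def wire_sizes(text):
    """Encoded size of a piece of text in bytes, per encoding (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


# Hypothetical samples: an ASCII-only query batch and a Chinese string value.
western = "SELECT name FROM customers WHERE city = 'Stockholm'"
chinese = "北京市朝阳区建国门外大街"

for label, text in (("western", western), ("chinese", chinese)):
    sizes = wire_sizes(text)
    print(f"{label}: {len(text)} chars, "
          f"UTF-8 = {sizes['utf-8']} bytes, UTF-16 = {sizes['utf-16']} bytes")
```

The ASCII batch doubles in size under UTF-16, while the Chinese string is one third smaller in UTF-16 than in UTF-8.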
--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Aug 30 '07 #2
Snooping into the TDS would be the very last place I would look when trying
to improve performance. It would be like polishing a clean mirror to remove
one's zits.

--
_______________ _______________ ______
William (Bill) Vaughn
Author, Mentor, Consultant, Dad, Grandpa
Microsoft MVP
INETA Speaker
www.betav.com
www.betav.com/blog/billva
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
_______________ _______________ ____
Visit www.hitchhikerguides.net to get more information on my latest book:
Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition)
and Hitchhiker's Guide to SQL Server 2005 Compact Edition (EBook)
Aug 30 '07 #3
Well William, that is clearly not the case when you have a REAL
database with REAL traffic. By REAL, I mean a 25Mbps stream
between the IIS servers and SQL Server... Getting rid of about
10Mbps of unneeded traffic does not seem like polishing to me...
I can guarantee you that this is having a serious impact on performance,
and when you really dig into it (things like TCP/IP slow
starts...), you get to see why it has a huge impact on the
client, the DB server and performance.
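
A back-of-envelope check on those figures, as a sketch only: it assumes the text in the stream is all in the ASCII range (so UTF-8 would halve it) and that everything else in the stream (integers, dates, protocol overhead) would stay the same size. The function names are made up for the example.

```python
def utf8_saving_mbps(total_mbps, text_fraction):
    """Bandwidth saved if the ASCII-only text share of the stream were UTF-8.

    Assumes that share is currently sent as UTF-16, i.e. 2 bytes per character.
    """
    text_mbps = total_mbps * text_fraction
    return text_mbps / 2  # one byte per character instead of two


def text_fraction_needed(total_mbps, saving_mbps):
    """Share of the stream that must be ASCII-range text to reach a given saving."""
    return 2 * saving_mbps / total_mbps


print(utf8_saving_mbps(25, 0.8))     # saving on a 25 Mbps stream that is 80% text
print(text_fraction_needed(25, 10))  # text share needed to save 10 of 25 Mbps
```

Under those assumptions, saving 10Mbps out of 25Mbps requires about 80% of the bytes on the wire to be character data.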
rj

Aug 31 '07 #4
Given that SQL Server has the highest TPC-E benchmarks in the industry,
don't you think that the SQL Server team has made the TDS stream as
efficient as possible? IMHO, it's not the line protocol or the lowest layers
of the interface that should be the focus of performance tuning, but the
applications, database designs and query methodologies that should dominate
your attempts to improve throughput and scalability. Reducing the volume of
traffic on the TDS channel will go a long way toward improving performance if
you really are moving enough data over the wire for it to make a difference.

SQL Server Holds Record for TPC-E Database Benchmark
by Brian Moran, br***@solidqual itylearning.com

SQL Server now holds every conceivable world record for the TPC-E database
benchmark. That news would be slightly more impressive if TPC-E scores
existed for any database besides SQL Server, but heck, winning a race with
just one runner doesn't mean that runner did a bad job. I first wrote about
TPC-E, the latest benchmark from the Transaction Processing Performance
Council, in my commentary "TPC's New Benchmark Strives for Realism," October
2006, InstantDoc ID 93955.

Microsoft became the first database vendor to have a published TPC-E result
when Unisys published a TPC-E score on July 12 using SQL Server 2005 on a
dual-core 16-processor ES7000. IBM followed suit with a dual-core
2-processor server two weeks later, and Dell posted a dual-core 4-processor
result on August 24. Both IBM's and Dell's results used SQL Server, so SQL
Server is currently the only database vendor listed, meaning SQL Server
currently holds all the top scores. Sane vendors don't post TPC-E scores
that make them look bad, but I suspect it's only a matter of time before IBM
and Oracle post TPC-E scores for their database products that leapfrog the
latest SQL Server scores, which will in turn be bested by Microsoft in the
never-ending game of benchmark leapfrog.
Read the full article at:
http://lists.sqlmag.com/t?ctl=642B5:...B50D3688BDE645

--
_______________ _______________ ______
William (Bill) Vaughn
Aug 31 '07 #5
On 31 Aug, 18:54, "William \(Bill\) Vaughn"
<billvaRemoveT. ..@betav.com> wrote:
> Given that SQL Server has the highest TPC-E benchmarks in the industry,
> don't you think that the SQL Server team has made the TDS stream as
> efficient as possible? IMHO, it's not the line protocol or the lowest layers
> of the interface that should be the focus of performance tuning, but the
> applications, database designs and query methodologies that should dominate
> your attempts to improve throughput and scalability. Reducing the volume of
> traffic on the TDS channel will go a long way toward improving performance if
> you really are moving enough data over the wire for it to make a difference.

Don't know a lot about TPC-E benchmarks. Are they measured over a
network?

rj

Sep 1 '07 #6
> Don't know a lot about TPC-E benchmarks. Are they measured over a
> network?

Database benchmarks are typically done with a dedicated database server and
remote client(s). You can download the results disclosure reports from
http://www.tpc.org/tpce/tpce_perf_results.asp to get details of the actual
configurations used. Looking at the specs of the network gear, it doesn't
look to me like the benchmark sponsors were too concerned about network
performance.

I agree with the others in this thread that the application and database
design are by far the biggest contributing factors to overall performance.
A little common sense, like filtering data on the server rather than the
client, goes a long way towards improving scalability and performance.
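
To illustrate the server-side filtering point, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for SQL Server, and the orders table and its contents are invented for the example; the same idea applies to any client API.

```python
import sqlite3

# In-memory stand-in database with a made-up "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)",
    [("open",) if i % 100 == 0 else ("closed",) for i in range(10_000)],
)

# Client-side filtering: every row crosses the wire, then most are discarded.
all_rows = conn.execute("SELECT id, status FROM orders").fetchall()
open_client = [row for row in all_rows if row[1] == "open"]

# Server-side filtering: only the matching rows are returned.
open_server = conn.execute(
    "SELECT id, status FROM orders WHERE status = ?", ("open",)
).fetchall()

print(len(all_rows), "rows fetched for client-side filtering")
print(len(open_server), "rows fetched with server-side filtering")
```

Both approaches end up with the same rows, but the first one moves a hundred times as many across the connection.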
--
Hope this helps.

Dan Guzman
SQL Server MVP

Sep 1 '07 #8
Erland Sommarskog (es****@sommars kog.se) writes:
(ra************ ***@yahoo.com) writes:
>>And, yes, while you would have seen a gross cut if TDS was UTF-8 on
the wire and not UTF-16, a Chinese user would have seen an increase
instead.

Wouldn't it be great to have an option?

If you think so, submit the suggestion at
http://connect.microsoft.com/SqlServer/Feedback.
Personally, I don't think it is worth the pain, as it would also
require changes in the client APIs. And all it would affect is query
batches sent to SQL Server and metadata sent back. If the query batches
sent to SQL Server are killing your network, maybe you should look into
using stored procedures.
Thinking about it, rather than an option to select the character encoding,
it would be better to have an option to compress the network traffic in general.

But I find it difficult to believe that this would be a good option
for the traffic between a web server and an SQL Server that are on
the same LAN. It could possibly be an option if you are on a slow connection
over VPN. In general, I have a feeling that the network considerations
for SQL Server are for LAN connections, because that is surely the most
common scenario.
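
To get a feel for what general compression would do, here is a small sketch using Python's standard zlib module. This is not something TDS offers; the query batch is invented, and because it repeats the same statement many times it compresses far better than real traffic would.

```python
import zlib


def compare_sizes(batch):
    """Raw and zlib-compressed sizes of a text batch in UTF-16 and UTF-8."""
    raw_utf16 = batch.encode("utf-16-le")
    raw_utf8 = batch.encode("utf-8")
    return {
        "utf-16 raw": len(raw_utf16),
        "utf-8 raw": len(raw_utf8),
        "utf-16 compressed": len(zlib.compress(raw_utf16)),
        "utf-8 compressed": len(zlib.compress(raw_utf8)),
    }


# Hypothetical, highly repetitive batch; real traffic will compress less well.
batch = "SELECT id, name, city FROM customers WHERE city = 'Stockholm';\n" * 50

for label, size in compare_sizes(batch).items():
    print(f"{label}: {size} bytes")
```

The compression costs CPU time on both ends, which is the same trade-off as with an encoding conversion.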
--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se
Sep 1 '07 #9
> In addition to agreement with others' statements about refactoring the
> application to reduce network traffic

Sure. I don't disagree. I could arrange to get 5 records back from one SP
query instead of 1 record back per query.
And it may well be that 5 records fit in a network packet.
That is 1 round-trip rather than 5 round-trips.
But it could be that I can squeeze in 10 records, or 20 records, etc.
How do we determine this without endlessly going round some design cycle
trying magic numbers, which immediately change if the fields in the query
change?
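
One way to at least estimate it is simple arithmetic, sketched below. The 4096-byte packet size and 8-byte packet header are assumptions about a default setup, the 200-byte row is a made-up figure, and the estimate ignores column metadata and other tokens in the result stream, so treat it as a starting point and measure the real traffic.

```python
import math

# Assumed values, not measured: check them against your own configuration.
ASSUMED_PACKET_BYTES = 4096
ASSUMED_HEADER_BYTES = 8


def rows_per_packet(row_bytes, packet_bytes=ASSUMED_PACKET_BYTES,
                    header_bytes=ASSUMED_HEADER_BYTES):
    """Whole rows that fit in the payload of one packet."""
    if row_bytes <= 0:
        raise ValueError("row_bytes must be positive")
    return (packet_bytes - header_bytes) // row_bytes


def packets_needed(row_count, row_bytes, packet_bytes=ASSUMED_PACKET_BYTES,
                   header_bytes=ASSUMED_HEADER_BYTES):
    """Packets needed for a result set, treating the rows as one byte stream."""
    if row_bytes <= 0:
        raise ValueError("row_bytes must be positive")
    payload = packet_bytes - header_bytes
    return math.ceil(row_count * row_bytes / payload)


print(rows_per_packet(200))      # rows of 200 bytes per packet
print(packets_needed(5, 200))    # packets for 5 such rows
print(packets_needed(100, 200))  # packets for 100 such rows
```

Remember that character columns sent as two bytes per character double their share of the row size.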

Thanks

Stephen Howe
Sep 1 '07 #10
