I'm working on a project that makes heavy use of domain names such as
"www.yahoo.com."
Domain names preserve case but are considered equal if the names are the same
and only the case differs.
I know I can store these names as keys in a Hashtable with a case-insensitive
comparer and a CaseInsensitiveHashCodeProvider. That works fine. The benefit
is that I can store the domain name without worrying about case and return the
user-supplied case without storing any extra state. However, this comes at a
cost, because all string compare operations (EndsWith, etc.) must now be
case-insensitive. If everything were lower case, for example, string compares
would be very fast, and if the strings were interned, really fast. I could
store the domain name in all lower case and then store a BitArray that tells
me which chars were upper case. However, that seems like a pain and still
requires at least 32 bytes for a 255-char domain name, or 3 bytes for a
20-char name. I could also store both the original-case string and a
lower-case version that is used for all compare, EndsWith, hash, etc.
operations. However, this doubles the storage needed, although string
interning can offset that for duplicates. That is my most attractive option in
terms of performance, I think, but I was wondering what others think. Cheers!
--
William Stacey, MVP
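The lower-case-plus-bitmask option from the post can be sketched as follows. This is a Python illustration rather than .NET code; `pack_case` and `unpack_case` are hypothetical helper names, and ASCII-only names are assumed. It also checks the size estimate in the post: one bit per character rounds up to 32 bytes for 255 characters and 3 bytes for 20.

```python
def pack_case(name):
    """Split a name into its lower-case form and a bitmask of upper-case positions."""
    mask = bytearray((len(name) + 7) // 8)  # one bit per character, rounded up
    for i, ch in enumerate(name):
        if "A" <= ch <= "Z":
            mask[i // 8] |= 1 << (i % 8)
    return name.lower(), bytes(mask)


def unpack_case(lowered, mask):
    """Rebuild the originally supplied case from the lower-case form and the mask."""
    return "".join(
        ch.upper() if mask[i // 8] & (1 << (i % 8)) else ch
        for i, ch in enumerate(lowered)
    )


lowered, mask = pack_case("WWW.Yahoo.com.")
print(lowered)                     # www.yahoo.com.
print(unpack_case(lowered, mask))  # WWW.Yahoo.com.
print(len(pack_case("a" * 255)[1]), len(pack_case("a" * 20)[1]))  # 32 3
```

All comparisons would then run against the lower-case form, and the mask is only consulted when the original spelling has to be returned.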
Go back a couple of days and look up that IndexOf I helped a user with.
You should probably write your own custom string operations for this one
that give you maximum speed, with the trade-off that you won't be culture
aware. In this case, your case-insensitive work does not need to be culture
aware; it simply has to follow the RFCs for domain names, which are fairly
strict.
--
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers
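Justin's suggestion of custom, culture-unaware string operations might look something like the sketch below. It is written in Python for illustration and is not the IndexOf code he refers to, which does not appear in this thread. `ascii_lower` and `ends_with_ignore_ascii_case` are hypothetical names, and folding only A-Z is an assumption about what domain-name comparison needs.

```python
def ascii_lower(ch):
    """Fold only ASCII A-Z to a-z; every other character is left untouched."""
    return chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch


def ends_with_ignore_ascii_case(name, suffix):
    """Culture-unaware EndsWith: compare the tail of name against suffix."""
    if len(suffix) > len(name):
        return False
    offset = len(name) - len(suffix)
    return all(
        ascii_lower(name[offset + i]) == ascii_lower(suffix[i])
        for i in range(len(suffix))
    )


print(ends_with_ignore_ascii_case("WWW.Yahoo.COM.", "yahoo.com."))   # True
print(ends_with_ignore_ascii_case("www.yahoo.com.", "google.com."))  # False
```

Because no culture tables are consulted and no temporary lower-cased string is allocated, a routine of this shape avoids most of the overhead of a general-purpose case-insensitive compare.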
Why do you have to return the domain to the user in the case it was entered?
(It would actually make more sense to correct it to lower case, I think,
because all domains are in lower case and anything else they enter is
probably mistyped.)
You could even correct the domain by doing a reverse DNS lookup and storing
the result.
> think, because all domains are in lower case and anything else they enter
> is probably mistyped).
That would be nice, but it is not allowed by RFCs 1034 and 1035. You must
preserve the case of domain names and labels.
But, but... why?
Internet Explorer doesn't, for a start. Enter a URL in messed-up case, and
it'll correct the domain et al when it displays the page.
That is IE (the application) that downcases it. The resolver and the DNS
servers preserve the case that was entered when the RR was created. Utilities
like dig and nslookup can be used to see that case is preserved. IE should
not be the benchmark in this case.
--
William Stacey, MVP
"John Wood" <sp**@isannoying.com> wrote:
> but but... why?
> Internet explorer doesn't for a start. Enter a URL in messed up case, and
> it'll correct the domain et al when it displays the page.
Well, I'm not saying that's the wrong thing to do... I'm just interested in
why it's so important. Surely it's more important to reflect the intent of the
company/person hosting the site than that of the person who entered the site
name?
That's a bit like someone mispronouncing your name, and you continuing that
mispronunciation, rather than either correcting them, or ignoring them and
continuing with the correct pronunciation.
If you do an AXFR, for example, you will see all the RRs in the zone in the
case you entered them.
If you do "dig abc.test.com", the server will return "abc.test.com" even if
the case on the server is "ABC.test.com."
If you do "dig abC.test.com", the server will return "abC.test.com", i.e. the
same case as your question. The match is case-insensitive. I'm not sure how
to comment other than to say that is how it works currently. I think the
important point is that the server maintains case but does case-insensitive
matching, so it does not matter what case the QName is sent in. Cheers,
--
William Stacey, MVP
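The behaviour William describes for AXFR and dig can be modelled roughly as below. This is a toy Python sketch, not how any particular DNS server is implemented; the `Zone` class, its method names, and the sample address are made up for illustration, and ASCII-only names are assumed.

```python
class Zone:
    """Toy zone: keeps owner names as entered, matches them case-insensitively."""

    def __init__(self):
        self._records = {}  # lower-cased name -> (name as entered, record data)

    def add(self, name, data):
        self._records[name.lower()] = (name, data)

    def transfer(self):
        """Like an AXFR listing: names come back in the case they were entered."""
        return [(name, data) for name, data in self._records.values()]

    def query(self, qname):
        """Match ignoring case, and answer using the case of the question."""
        hit = self._records.get(qname.lower())
        return None if hit is None else (qname, hit[1])


zone = Zone()
zone.add("ABC.test.com.", "192.0.2.1")  # 192.0.2.1 is a documentation address
print(zone.transfer())              # [('ABC.test.com.', '192.0.2.1')]
print(zone.query("abC.test.com."))  # ('abC.test.com.', '192.0.2.1')
```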
Hi William,
First of all, I would like to confirm my understanding of your issue. From
your description, I understand that you need to know the best way to do
case-insensitive comparison and storage. If there is any misunderstanding,
please feel free to let me know.
As far as I know, it's hard to optimize for both time and space at once.
When we gain performance, we usually give up storage; when we use less
memory, we usually give up some performance.
So I think whether to favor time or space depends on the project. When the
server is very fast and not many users are accessing the service
simultaneously, we can save the URL in its original case and compare with
CaseInsensitiveHashCodeProvider. If users are doing comparisons frequently,
we can try saving the text in two versions and comparing against the
lower-cased version.
HTH. If anything is unclear, please feel free to reply to the post.
Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
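A rough sketch of the second option Kevin mentions, keeping two versions of each name. It is written in Python for illustration (the thread itself is about .NET); `DomainName` and its fields are hypothetical, and `sys.intern` stands in for .NET string interning. With the lower-cased copy interned, equality can be an identity check and suffix tests are plain ordinal comparisons.

```python
import sys


class DomainName:
    """Keeps the name as entered plus an interned lower-case copy for comparisons."""

    __slots__ = ("original", "key")

    def __init__(self, name):
        self.original = name                 # returned to the user unchanged
        self.key = sys.intern(name.lower())  # shared by every spelling of the name

    def same_name(self, other):
        # Interned equal strings are the same object, so identity is sufficient.
        return self.key is other.key

    def is_under(self, suffix):
        # Plain case-sensitive suffix test on the lower-case copy.
        return self.key.endswith(suffix.lower())


a = DomainName("WWW.Yahoo.com.")
b = DomainName("www.YAHOO.com.")
print(a.same_name(b))            # True
print(a.is_under("Yahoo.com."))  # True
print(a.original, b.original)    # WWW.Yahoo.com. www.YAHOO.com.
```

The cost is the one raised in the original post: each distinct spelling is stored once in its original case, on top of the shared lower-case copy.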
Agreed. I decided on the case-insensitive hash code provider and storing the
domain names as entered, in whatever case they are. This does mean I can't
intern them and do quick Object.ReferenceEquals testing or simple
String.Equals(a) testing. However, once you factor in that each request would
require downcasing (slow) plus the time to intern the string or fetch its
reference from the intern pool, those two steps take more time than the
case-insensitive compare. Hope you get what I mean. Cheers!
--
William Stacey, MVP
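The approach William settles on in his last reply, keeping each name exactly as entered and making only the hashing and comparison case-insensitive, can be sketched like this. It is a Python analogue, not the .NET Hashtable with CaseInsensitiveHashCodeProvider itself; `DomainKey` is a hypothetical wrapper and ASCII-only names are assumed.

```python
class DomainKey:
    """Wraps a name as entered; hashing and equality ignore case."""

    __slots__ = ("name",)

    def __init__(self, name):
        self.name = name  # the original case is the only string stored

    def __hash__(self):
        return hash(self.name.lower())  # folded on demand, not kept

    def __eq__(self, other):
        return isinstance(other, DomainKey) and self.name.lower() == other.name.lower()


table = {}
key = DomainKey("WWW.Yahoo.com.")
table[key] = key.name

found = table[DomainKey("www.yahoo.COM.")]
print(found)  # WWW.Yahoo.com.
```

Only one string per name is held, at the price of folding case on every hash and comparison, which is the trade-off weighed throughout the thread.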