String Comparison - DB2 Database

Michel Esber

Hello,

DB2 V8 FP 11.

Given two strings, I need a UDF that compares them and returns the
percentage of matching characters.

For example:

ABCDEFGHIJ
ACBDEFGHIJ

The strings are 80% alike.

I can think of an easy UDF that compares the strings byte by byte and
returns the share of differing characters as a percentage. I am just
wondering whether there is a built-in function for this, or a better way.
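A minimal sketch of the byte-by-byte comparison described above, written in Python only for illustration (it is not DB2 UDF code). The function name `positional_match_percent` and the choice to divide by the longer string's length are assumptions, not something stated in the post.

```python
def positional_match_percent(a: str, b: str) -> float:
    """Percentage of positions at which a and b hold the same character."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # assumption: two empty strings count as identical
    # zip() stops at the shorter string, so any extra tail of the longer
    # string is never matched and therefore counts as mismatches.
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / longest


print(positional_match_percent("ABCDEFGHIJ", "ACBDEFGHIJ"))  # 80.0
```

One limitation of this approach: a single inserted or deleted character shifts every later position, so "ABCDEFGHIJ" and "XABCDEFGHIJ" score 0% even though they differ by one character.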

Thanks in advance,

May 12 '06 #1


shenanwei

You can use DIFFERENCE to compare the sound of two strings.

May 12 '06 #2

Michel Esber

Thanks for the hint.

I saw this function before posting. Unfortunately, it does not meet my
requirements. For example, according to the docs:

SELECT EMPNO, LASTNAME FROM EMPLOYEE
WHERE SOUNDEX(LASTNAME) = SOUNDEX('Loucesy')

EMPNO LASTNAME
------ ---------------
000110 LUCCHESSI

But for my application, 'Loucesy' and 'LUCCHESSI' are very different
strings, even though their sounds are similar.

Any other ideas?

Thanks

May 12 '06 #3

Dave Hughes


I *think* what you're looking for is the "distance" between strings;
i.e. the number of changes one must make to get from one string to
another. The Levenshtein Distance algorithm provides a way to calculate
this. See:

Levenshtein Distance Article
http://en.wikipedia.org/wiki/Levenshtein_distance

Example implementations in several languages
http://en.wikisource.org/wiki/Levenshtein_distance

This algorithm returns 2 for the distance between "ABCDEFGHIJ" and
"ACBDEFGHIJ" (indicating that 2 alterations, for example an insertion
and a deletion, have to be made to get from one to the other). There are
refinements of the Levenshtein Distance algorithm, such as
Damerau-Levenshtein distance, that count swapping two adjacent
characters as a single operation and would return 1 for the distance.
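As a rough sketch of the algorithm described above, here is a dynamic-programming version in Python (chosen for brevity, not as DB2 UDF code). The function name `edit_distance` and the `transpositions` flag are illustrative assumptions; with the flag set, the function computes the restricted "optimal string alignment" variant, in which swapping two adjacent characters costs 1.

```python
def edit_distance(a: str, b: str, transpositions: bool = False) -> int:
    """Levenshtein distance between a and b.

    With transpositions=True, swapping two adjacent characters also
    counts as one edit (the optimal string alignment variant).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = distance between the first i chars of a and first j chars of b
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                # adjacent swap, e.g. "BC" -> "CB"
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(edit_distance("ABCDEFGHIJ", "ACBDEFGHIJ"))                       # 2
print(edit_distance("ABCDEFGHIJ", "ACBDEFGHIJ", transpositions=True))  # 1
```

The full matrix keeps the sketch easy to follow; it uses memory proportional to the product of the two string lengths, which may matter for long column values.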

To get a percentage similarity you could do something fairly crude like
comparing the distance to the length of the string, e.g.:

100 * (len - distance) / len

Which in this case would give 80%.
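The formula above can be sketched in Python as follows. The names `levenshtein` and `similarity_percent` are illustrative, and using the longer string's length as `len` is an assumption for the case where the two strings differ in length (the example strings are both 10 characters, so the question does not arise there).

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance, keeping only two rows of the table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (or match)
            ))
        prev = curr
    return prev[-1]


def similarity_percent(a: str, b: str) -> float:
    """100 * (len - distance) / len, with len taken as the longer length."""
    length = max(len(a), len(b))  # assumption: normalise by the longer string
    if length == 0:
        return 100.0  # assumption: two empty strings count as identical
    return 100.0 * (length - levenshtein(a, b)) / length


print(similarity_percent("ABCDEFGHIJ", "ACBDEFGHIJ"))  # 80.0
```

Because the distance can never exceed the longer string's length, the result always stays between 0 and 100.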

Unfortunately, looking at the implementations, the algorithm is
probably quite hard to implement efficiently in an SQL UDF. You'd
likely be better off implementing it as an external UDF in C or Java
(there are C++ and Java implementations at the link above, as well as
Lisp, Python, Ruby, Perl, Haskell, etc.).

HTH,

Dave.

May 12 '06 #4

Michel Esber

Dave, that is exactly what I needed.

Thanks a lot.

May 12 '06 #5
