levenshtein for large strings

hi folks,

i need an algorithm for comparing two strings for equivalence, but
levenshtein only works for strings up to 255 characters, and i have to
process longer strings (let's say, about 10 kB).

does anybody know a *simple* algorithm that gives me a distance like
levenshtein (maybe not as exact)?

what do You think about using the cost function of levenshtein to
align the strings, and then applying levenshtein to each partitioned substring?

tia walter

Oct 30 '08 #1
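
As a rough illustration of the partitioning idea in the question: cutting both strings into fixed-size blocks and summing the per-block distances always gives an upper bound on the true Levenshtein distance, because the per-block edit scripts concatenate into one valid edit script. It can overestimate badly, though, when an early insertion shifts every later block. A minimal sketch in Python (the thread is about PHP, but the loops translate directly; the names `lev` and `chunked_upper_bound` and the default block size of 255 are assumptions for illustration, not anything from the thread):

```python
def lev(a, b):
    """Plain Levenshtein distance, keeping only the previous row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def chunked_upper_bound(a, b, size=255):
    """Sum of block-wise distances; never below the true distance."""
    total = 0
    for start in range(0, max(len(a), len(b)), size):
        # Blocks past the end of the shorter string are compared with "".
        total += lev(a[start:start + size], b[start:start + size])
    return total
```

For two strings that differ only by one extra leading character, the true distance is 1, but every block boundary is misaligned, so the chunked sum grows with the number of blocks. That makes this usable as a cheap "at most this different" check, not as a replacement for the real distance.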
Walter Kerelitsch wrote:
> i need an algorithm for comparing two strings for equivalence, but
> levenshtein only works for strings up to 255 characters, and i have to
> process longer strings (let's say, about 10 kB).
The Levenshtein algorithm itself has no length limit. Your problem is probably
that PHP's built-in function limits the length. Just implement the algorithm
yourself. You can see how it works on Wikipedia, for example:
http://en.wikipedia.org/wiki/Levenshtein_distance
Oct 31 '08 #2
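
For reference, a hand-written version along the lines suggested above might look like the following. This is a minimal sketch in Python, not PHP, and the function name `levenshtein` is simply chosen for illustration; it uses the standard dynamic-programming recurrence but keeps only two rows of the table, so memory stays proportional to the shorter string and no fixed length limit applies.

```python
def levenshtein(s, t):
    """Edit distance between s and t (insert, delete, substitute cost 1)."""
    if len(s) < len(t):
        s, t = t, s                       # keep the shorter string along the row
    previous = list(range(len(t) + 1))    # distances from "" to t[:j]
    for i, sc in enumerate(s, start=1):
        current = [i]                     # distance from s[:i] to ""
        for j, tc in enumerate(t, start=1):
            current.append(min(
                previous[j] + 1,               # delete sc
                current[j - 1] + 1,            # insert tc
                previous[j - 1] + (sc != tc),  # substitute, or free match
            ))
        previous = current
    return previous[-1]
```

Running time is proportional to `len(s) * len(t)`, so two strings of about 10 kB each mean roughly 10^8 cell updates. That is slow in an interpreted language but needs very little memory.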

"Boris Stumm" <st***@informatik.uni-kl.de> wrote in message
news:ge**********@news.uni-kl.de...
> The Levenshtein algorithm has no limit in length. Your problem is probably
> that PHP limits the length. Just implement the algorithm yourself. How it
yes, exactly, PHP limits...
thanks a lot for your hint

greetings walter

Oct 31 '08 #3
