Comparing text strings

I run a monthly safety slogan competition, which requires checking each
entry against a list of already submitted slogans. This takes forever
to do. I have 2 lists: this month's slogans and a master list of all
slogans.

Here is an example:
S1Master: An unsafe behavior can bring you down
S1Submitted: Unsafe behaviors can bring you down

I am thinking of removing the plural "s" and then counting the number
of words that appear in both sentences, as a percentage of the total
number of words. A percentage greater than, say, 90% would be listed
as a match.

6 words are the same in both sentences = 12 of 13 words = 92.31%,
therefore this is a matched pair.
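
The arithmetic above can be sketched in a few lines. This is only an illustration of the counting logic, written in Python rather than Access; the function names `normalize` and `match_percentage` are made up for the example, and the plural rule (drop one trailing "s" from words longer than 3 letters that do not end in "ss") is a naive assumption, not a real stemmer.

```python
def normalize(word):
    """Lowercase a word, trim punctuation, and drop one trailing plural 's'."""
    word = word.lower().strip(".,;:!?\"'")
    # Naive plural rule (assumption): keep short words and words ending in "ss".
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    return word


def match_percentage(master, submitted):
    """Shared words, counted once per sentence, over the combined word count."""
    master_words = [normalize(w) for w in master.split()]
    submitted_words = [normalize(w) for w in submitted.split()]
    shared = set(master_words) & set(submitted_words)
    in_master = sum(1 for w in master_words if w in shared)
    in_submitted = sum(1 for w in submitted_words if w in shared)
    total = len(master_words) + len(submitted_words)
    return 100.0 * (in_master + in_submitted) / total if total else 0.0


score = match_percentage(
    "An unsafe behavior can bring you down",
    "Unsafe behaviors can bring you down",
)
print(round(score, 2))  # 92.31
```

With the 90% threshold mentioned above, this pair would be flagged as a match. Word order is ignored, so two slogans that use the same words in a different order would score the same.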

I am trying to keep it simple but accurate enough; it's not
terrible if a few slip by. Will the above method work, and can I
achieve it without VBA? Please guide me.

Thanks
Steve
Nov 13 '05 #1

I don't know how you can do it without VBA. Also, do you want the
computer to figure this out for you or do you want to figure it out? To
keep it simple, some human interaction would be better.

For example, the 2 key words I saw in your example were "unsafe
behavior". You could run a query to find all master records that have
"unsafe behavior". Or you could create a query that finds all records
that contain "unsafe" and "behavior". See, here you are getting rid of
plurals. You control the keywords to search on.

Let's say the master table has a key value and the phrase. You have a
form called Form1 for entering up to 6 keywords, in text boxes called
Key1..Key6. The keywords that you type in have a default value of
Null. You could create a query to select the phrase.

SELECT Phrase,
IIf(InStr([Phrase], IIf(Not IsNull(Forms!Form1!Key1), Forms!Form1!Key1, Chr(0))) > 0, 1, 0) AS Key1Cnt,
IIf(InStr([Phrase], IIf(Not IsNull(Forms!Form1!Key2), Forms!Form1!Key2, Chr(0))) > 0, 1, 0) AS Key2Cnt,
....etc.

What this does is it looks for the Key1 word in the phrase if there is a
Key1 value. Same for Key2. Does this for Key1 to Key6. If the key
word is found in the phrase, the value of the column is 1, if not found
or if the keyword to search is blank, the value is 0.

Save this as query1.

Now create another query. The first column is the phrase. The second
column is Key1Cnt + Key2Cnt + Key3Cnt...Key6Cnt. Click Show to off.
Set the criteria to >0. Sort in descending value. Save as query2. Run
this query.

This will exclude all records from the master table where no keywords
matched and present those that did match, ordered by the number of
keywords matched. This method does not account for misspelled words.
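
To make the two-query logic easier to follow, here is a rough sketch of the same idea outside Access, in Python. The names `keyword_hits` and `rank_matches` are made up for the illustration, the second and third slogans are invented sample data, and the substring test is case-insensitive by assumption.

```python
def keyword_hits(phrase, keywords):
    """Count how many non-blank keywords occur in the phrase (1 per keyword)."""
    text = phrase.lower()
    return sum(1 for k in keywords if k and k.lower() in text)


def rank_matches(master_phrases, keywords):
    """Return (hits, phrase) pairs with at least one hit, most hits first."""
    scored = [(keyword_hits(p, keywords), p) for p in master_phrases]
    matched = [pair for pair in scored if pair[0] > 0]
    return sorted(matched, key=lambda pair: pair[0], reverse=True)


master = [
    "An unsafe behavior can bring you down",
    "Safety starts with you",         # hypothetical sample slogan
    "Report every unsafe condition",  # hypothetical sample slogan
]
# Blank entries play the role of the Null form fields Key3..Key6.
keys = ["unsafe", "behavior", None, None, None, None]
for hits, phrase in rank_matches(master, keys):
    print(hits, phrase)
```

Because the match is a plain substring test, the keyword "behavior" also finds "behaviors", which is how this approach sidesteps plurals.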
Nov 13 '05 #2
Browsing these forums some more, I see that it might be important
to point out that I am using Access 2000. Also, it seems like the Split
function, to break the string into an array, is what I need, but I
don't know how to use it.
Thanks,
Nov 13 '05 #3
You actually might need to take a few steps back.... natural language
processing is simply NOT going to be this simple, no matter how simple it
appears to be nor how much you try to "dumb down" the approach.

As this is a pure programming task, the version does not matter quite as
much, but developing a language parser is a monumental task, the kind of
thing that might get you a doctorate from the Dept. of Brain and Cognitive
Sciences at MIT if you could get in there and could actually prove a theory
enough to have such a successful applied model. To do so in VBA is a
herculean task, worthy of only a masochist beyond the level of the
aforementioned doctoral candidate.

For more info on why you are moving into an area that will definitely
"stretch" your knowledge and probably your sanity, I would recommend "The
Language Instinct" by Steven Pinker. He will show you that once you think
you have plural forms figured out, that there are many exceptions to the "s"
suffix pluralization rules (ones not as easy to detect). And then he will
show you how other markers, such as those for case and tense, can be used to
confound your parsing efforts, as can reversals of the typical SVO order of
English that are commonly done with case markers, especially in slogans
which are meant to be catchy.
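
A quick way to see the plural problem is to run the naive "drop the s" rule over a few word pairs. This is a small Python sketch, not anything from the book; `strip_plural_s` is a made-up name and the word pairs are chosen only for illustration.

```python
def strip_plural_s(word):
    """Naive rule: drop one trailing 's'."""
    return word[:-1] if word.endswith("s") else word


# (singular, plural) pairs; the rule should map both forms to the same string.
pairs = [
    ("behavior", "behaviors"),  # regular plural: the rule works
    ("injury", "injuries"),     # -ies plural
    ("box", "boxes"),           # -es plural
    ("foot", "feet"),           # irregular plural
    ("glass", "glasses"),       # singular already ends in 's'
]
for singular, plural in pairs:
    same = strip_plural_s(singular) == strip_plural_s(plural)
    print(f"{singular:10} {plural:10} match={same}")
```

Only the first pair comes out as a match; every other pair is treated as two different words.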

Summary -- this is way too big of a job for anything less than a team of
people, with an actual PhD on the team.
--
MichKa [MS]
(armchair linguist)
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies
Windows International Division

This posting is provided "AS IS" with
no warranties, and confers no rights.
Nov 13 '05 #4