473,324 Members | 2,254 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,324 software developers and data experts.

Find out if a word in a text field is in a table

Hello, my friends, I hope this is not an already asked (and resolved) question:

Is it possible to find out which word or words in a text field (an address for example) are in another table? and get the position in the string, I know I can use the InStr function but i have no idea how to compare each of these words against a table

Thanks and I hope I made myself clear
Oct 2 '07 #1
6 2213
NeoPa
32,556 Expert Mod 16PB
I get the general sort of thing you're after but the answer depends on the specifics.
Try to give some examples so that we can picture what you mean (or explain more clearly if that's easier).
Oct 3 '07 #2
Hi Neopa and everyone, I will do my best to be clear

I have a column with records like:

Mz 10 Lt 5 Urb Los Trabajadores
Mza 53-B Lte 5 Coop San Ignacio
Jr Gran Chimu 143 Lima

These are address which are not normalized

I have created another table where i put the common words used in addresses

Mz Mz
Lt Lt
Mza Mz
Lte Lt
Coop Coop

What I'm tryin to do is to split these string every time the above words are found

so I could get: Mz 10 // Lt 5 // Urb Los Trabajadores
Mza 53-B // Lte 5 // Coop. San Ignacio
Jr Gran Chimu 153
As you may notice these are spanish address (Lima Peru to be specific) and I'm trying to normalize them

Thanks in advanced for your help
Oct 5 '07 #3
NeoPa
32,556 Expert Mod 16PB
I'm not ignoring this. I'll just need to give it some more thought.
Bump it if you hear nothing from me by Monday.
Oct 5 '07 #4
HI, NeoPa, I was wondering if you gave it a thought about my post, I know it is been awhile, but I'm trying this again, thanks in advance
Nov 7 '07 #5
NeoPa
32,556 Expert Mod 16PB
So really you're trying to take plain text fields and extract data embedded within them out into separate fields depending on the keywords found within the text?
Nov 7 '07 #6
Jim Doherty
897 Expert 512MB
HI, NeoPa, I was wondering if you gave it a thought about my post, I know it is been awhile, but I'm trying this again, thanks in advance

Some serious consideration and rules would have to go into something like this. I am not a native of Peru so am not familiar with any rules that might be available for well formatting of addresses over there.

Normalising address data from denormalised data which ever way you look at it would have to follow a set of rules in order to be successful. When examining a 'string' of data like that 'any' set based logic would programatically need to know when to parse and when to wrap and when to add carriage return or line break characters and so on. You have already identified the necessity of it by the requirement of the posting.

This can be beset with problems of course, unless you have a mature set of rules to work by because given the computer will do exactly what it is told how would it know for instance 'when' to concatenate one word to another AND in the proper context or conversely leave the currently examined word alone and proceed onto the next word and examine that until you get to the end of the string. This can be achieved eventually but invariably is only done so when all the possible permutations for anomolies that 'might' occur have been listed so that comparitively speaking any data can be adjusted on the fly to suit any output display

I am sure anyone viewing the examples posted will appreciate the mechanics of what I speak of. The startpoint for this it seems to me is to break what you have into its key elements namely separate words firstly parsing them out to see in actual fact in what context they can be put back together successfully again in order to normalise to properly structured columns that contain the 'correct' type of data.

Now this is not the answer but what follows is at least an illustration of what I am eluding to. You will need to replicate this your end in order to see what I mean.

1) Create a table called tblAddress with a single field called 'Address' datasize text 100

2) Populate it with the example data you posted

3) Create these functions in a standard module

Expand|Select|Wrap|Line Numbers
  1. Function CountWords(s) As Integer
  2. '
  3. ' Counts words in a string separated by 1 or more spaces
  4. '
  5.     Dim WC As Integer, i As Integer, OnASpace As Integer
  6.     If VarType(s) <> 8 Or Len(Trim(s)) = 0 Then
  7.         CountWords = 0
  8.         Exit Function
  9.     End If
  10.     WC = 0
  11.     OnASpace = True
  12.     For i = 1 To Len(s)
  13.         If Mid(s, i, 1) = " " Then
  14.             OnASpace = True
  15.         Else
  16.             If OnASpace Then
  17.                 OnASpace = False
  18.                 WC = WC + 1
  19.             End If
  20.         End If
  21.     Next i
  22.     CountWords = WC
  23. End Function
  24.  
  25. Function GetWord(s, Indx As Integer)
  26. '
  27. ' Extracts a word in text where words are separated by 1 or more spaces
  28. '
  29.     Dim i As Integer, WC As Integer, Count As Integer, SPos As Integer, EPos As Integer, OnASpace As Integer
  30.     WC = CountWords(s)
  31.     If Indx < 1 Or Indx > WC Then
  32.         GetWord = Null
  33.         Exit Function
  34.     End If
  35.     Count = 0
  36.     OnASpace = True
  37.     For i = 1 To Len(s)
  38.         If Mid(s, i, 1) = " " Then
  39.             OnASpace = True
  40.         Else
  41.             If OnASpace Then
  42.                 OnASpace = False
  43.                 Count = Count + 1
  44.                 If Count = Indx Then
  45.                     SPos = i
  46.                     Exit For
  47.                 End If
  48.             End If
  49.         End If
  50.     Next i
  51.     EPos = InStr(SPos, s, " ") - 1
  52.     If EPos <= 0 Then EPos = Len(s)
  53.     GetWord = Mid(s, SPos, EPos - SPos + 1)
  54. End Function
  55.  
4) Create a new query and in the SQL window paste the following SQL and save the query as qryStringManipulation

Expand|Select|Wrap|Line Numbers
  1.  
  2. SELECT tblAddressTest.Address, CountSWords([Address],Space(1)) AS WordCount, GetWord([Address],1) AS Word1, GetWord([Address],2) AS Word2, GetWord([Address],3) AS Word3, GetWord([Address],4) AS Word4, GetWord([Address],5) AS Word5, GetWord([Address],6) AS Word6, GetWord([Address],7) AS Word7
  3. FROM tblAddressTest;
  4.  
5) Run the query.

You will see a matrix display of the addresses as posted to the right of which will be a count of the words in each address and to the right of that seven columns of data with one word in each.

Now the logical question is this!

Which words can be brought back together to identify the physical individual entity aspects properly.? ie: can data contained in column (word2) be treated as house number? ...no it has both numeric data (the figure 10) below which is a combination of numeric and alpha (53-B) and below that the word 'Gran' Now we could invent all sorts of logic per row to say things like if word2 is numeric then combine with word1 to become the house number and so on but this is sophistication that only your data can determine what route you are obliged to take.

IMHO opinion it would be indeed by a superstar who could give you a solution that 'practically' solves this for you taking into account all possible combinations of all of your live data but reponding in a manner in this thread to only three lines of data as an example on which to base a solution. (data cleansing software is commercially available, big bucks too..... as are superstars, but I hasten to humble myself to not being one of them :)))

I know this is not much help and your table lookup is a direction that you are considering but even that will be working around the same logic pattern I have described here so the point remains the same in effect.

I hope this helps you to some extent or at least gives you a different angle to consider!

Regards

Jim :)
Nov 7 '07 #7

Sign in to post your reply or Sign up for a free account.

Similar topics

11
by: JoshuaF | last post by:
hello everyone i am very new to postgresql 7.3.3 and am using it on my mac G4 OS X and i am having some trouble which i hope you can help me out with some answers to my questions: 1.) How can I...
5
by: ivan | last post by:
Trying to figure out the best way to accomplish this... I have a Purchase Order form that we need to place an item "imprint" into with formatted text, so that the manufacturer will see the exact...
2
by: Daniel | last post by:
I use an Access database to basically take data exports, import them, manipulate the data, and then turn them into exportable reports. I do this using numerous macros, and queries to get the data...
41
by: Ruby Tuesday | last post by:
Hi, I was wondering if expert can give me some lite to convert my word table into access database. Note: within each cell of my word table(s), some has multi-line data in it. In addition, there...
8
by: brian kaufmann | last post by:
Hi, I'm new to Access and this may be a basic question but I would appreciate it if you could let me know how to do this: I've created an Access table and would like to insert a column with...
8
by: jquest | last post by:
Hi Again; I have had help from this group before and want to thank everyone, especially PCDatasheet. My database includes a field called HomePhone, it uses the (xxx)xxx-xxx format to include...
8
by: Darryl Kerkeslager | last post by:
I hope that although this is 25% Access and 75% Word, that someone will know ... The whole problem here arises because 1) Microsoft acknowledges an 'issue' wherein TextInput type FormFields are...
1
by: rlntemp-gns | last post by:
Re: Access 2003/Word 2003 I have about 100 motivational documents that I am wanting to import into a small database and put a front end on it that would allow the user to select one, read it and...
0
by: alivip | last post by:
I write code to get most frequent words in the file I won't to implement bigram probability by modifying the code to do the following: How can I get every Token (word) and ...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: ArrayDB | last post by:
The error message I've encountered is; ERROR:root:Error generating model response: exception: access violation writing 0x0000000000005140, which seems to be indicative of an access violation...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
1
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.