473,322 Members | 1,425 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,322 software developers and data experts.

parsing names and businesses with asterisk

Hello. I'm trying to parse out a large list of names and businesses I have but am running into trouble. I'd like to filter it into First, Middle and Last name fields, and if it has an asterisk in front of it, remove the asterisk and enter everything following it into the last name field. I've had some mild success doing this in excel but am having problems in MS Access. Here is an example of the data:

Expand|Select|Wrap|Line Numbers
  1. John Smith
  2. *Acme Paint
  3. John Lee Smith
  4. John D Van Gogh
  5. *Google
  6. *Random Business
  7.  
Can anybody lend a helping hand?
Jan 25 '13 #1
16 1543
Rabbit
12,516 Expert Mod 8TB
What have you tried so far?
Jan 25 '13 #2
NeoPa
32,556 Expert Mod 16PB
Your example data illustrates that you understand that you don't even know what question you're asking. I suggest that would be the place to start.

What logic are you hoping to implement here? I assume you appreciate that hoping for a computer to recognise commonly used surnames in the way a human would is a forlorn hope. What, then, are you thinking would be the logic that you are asking us to help you to implement?
Jan 26 '13 #3
zmbd
5,501 Expert Mod 4TB
James Armstrong:
Why for this project? Sounds a lot like a homework issue given by the Demon CompSci Profs.

(I had really evil CompSci Profs, good guys, just vary devious - and this was a real pain in FORTRAN!).

However, it seems that most people don't have Demons for CompSci professors, let's just think this thing thru:

Using the data in your example, the parsing will be a pain.

Just some pseudo/air code:
Lets start with the "*" that will be simple enough.
Pull the LEFT 1 of the string compare for "*" and if true then MID string from the 2nd position thru the length and as you want all of this in the last name field - well, put it there.

As for 3 and 4... here I'd locate the spaces.
Logic behind normal English based names is that he first space following the leading string is the deliminator separating between the first and either the middle or last names. SO, now you have a logical seperation for the first name thus LEFT pull thru to the space (or MID up to you) and stuff into the First name field.

Now look for a second separation within the context. A second separation will more than likely be the separation between a middle initial and the last name. Take the length of the MID string between the first separation and the second separation. A single character will normally be a middle initial - Put it there and take the remaining string following the second separation and parse to last name.

HOWEVER, as in line 3 of the example you have more than one character we'll have to make a few more guesses... check for any more spaces (ie line 4) in which case, the MID string between the first and second spaces is highly likely to be a middle name - compound middle names are not very common; however, they do happen so look for another space. If found then we're dealing with compound middle name. Taking a small leap of faith: as a majority of compound middle names from a Englished base data-set will have only one space we can then use the MID string between the first space and the third spaces and push that data into the middle name field with the remainder pushed into the last name field.

Now two things:
You will have to prove that this isn't a homework assignment
You will have to show what you have for code.
Once done, I'll go dig my old FORTRAN code out (that may have been in the box that the fire destroyed... I hope not, it was a really brilliant piece of code - it got me out of the mid-term exam!) and translate it to VBA - otherwise, you should have enough to write the code.
Jan 26 '13 #4
ADezii
8,834 Expert 8TB
I would take a slightly different approach, such as:
Expand|Select|Wrap|Line Numbers
  1. Public Function fAnalyzeName(strName As String) As String
  2. Dim varName As Variant
  3. Dim intNumOfSpaces As Integer
  4.  
  5. varName = Split(strName, " ")
  6.  
  7. If Left$(varName(0), 1) = "*" Then
  8.   'If Name begins with a '*' remove Asterisk and enter as Last Name
  9.   fAnalyzeName = Replace(varName(0), "*", "")
  10. Else
  11.   'Determine the Number of Spaces in the Name
  12.   intNumOfSpaces = UBound(varName)
  13.  
  14.   'Do the parsing depending on the Number of Spaces
  15.   Select Case intNumOfSpaces
  16.     Case 0      'No Space in Name
  17.     Case 1      '1 Space in Name, First Name is varName(0),
  18.                 'Last Name will be in varName(1)
  19.     Case 2
  20.                 '1 Space in Name, First Name is varName(0),
  21.                 'MI is in varName(1), Last Name will be in varName(2)
  22.     Case 3
  23.     Case 4
  24.     Case 5
  25.     Case 6
  26.     Case Else
  27.   End Select
  28. End If
  29. End Function
Jan 29 '13 #5
NeoPa
32,556 Expert Mod 16PB
That doesn't deal with the fundamental problem inherent in the question though ADezii. Data of "Edwin Van Gogh" would be mistreated in your code. The point I made was that the question itself hasn't been properly considered. To apply logic, that logic must first be defined and/or understood. That is not the case here.
Jan 29 '13 #6
Hi guys, thanks for all of your input. I am pretty sure I know what I'm asking. With a little help I believe I have a working excel macro that does account for names such as "Edwin Van Gogh". Sadly, I need this to work in access and am having a devil of a time translating it. Below is the code:

Expand|Select|Wrap|Line Numbers
  1. Sub Macro1()
  2.  
  3. startrow = 1
  4. startcol = 1
  5.  
  6. While Cells(startrow, startcol) <> ""
  7.     work_name = Cells(startrow, startcol)
  8.     space1 = ""
  9.     space2 = ""
  10.     space3 = ""
  11.     space4 = ""
  12.     first_name = ""
  13.     middle_name = ""
  14.     last_name = ""
  15.     If Mid(work_name, 1, 1) <> "*" Then
  16.         On Error GoTo -1
  17.         On Error GoTo no_more_spaces
  18.         space1 = InStr(1, work_name, " ")
  19.         If space1 <> 0 And space1 <> "" Then
  20.             space2 = InStr(space1 + 1, work_name, " ")
  21.         End If
  22.         If space2 <> 0 And space2 <> "" Then
  23.             space3 = InStr(space2 + 1, work_name, " ")
  24.         End If
  25.         If space3 <> 0 And space3 <> "" Then
  26.             space4 = InStr(space3 + 1, work_name, " ")
  27.         End If
  28.     End If
  29.  
  30. no_more_spaces:
  31.  
  32.         If space4 > 0 And space4 <= 99 Then
  33.                         last_name = Mid(work_name, space2 + 1, Len(work_name) - space2)
  34.                         first_name = Left(work_name, space1 - 1)
  35.                         middle_name = Mid(work_name, space1 + 1, space2 - space1 - 1)
  36.         Else
  37.         If space3 > 0 And space3 <= 99 Then
  38.                         last_name = Mid(work_name, space2 + 1, Len(work_name) - space2)
  39.                         first_name = Left(work_name, space1 - 1)
  40.                         middle_name = Mid(work_name, space1 + 1, space2 - space1 - 1)
  41.         Else
  42.         If space2 > 0 And space2 <= 99 Then
  43.                         last_name = Mid(work_name, space2 + 1, Len(work_name) - space2)
  44.                         first_name = Left(work_name, space1 - 1)
  45.                         middle_name = Mid(work_name, space1 + 1, space2 - space1 - 1)
  46.         Else
  47.         If space1 > 0 And space1 <= 99 Then
  48.                         last_name = Mid(work_name, space1 + 1, Len(work_name) - space1)
  49.                         first_name = Left(work_name, space1 - 1)
  50.         Else
  51.                         last_name = work_name
  52.                         middle_name = ""
  53.                         first_name = ""
  54.         End If
  55.         End If
  56.         End If
  57.         End If
  58.  
  59.         Cells(startrow, 2) = first_name
  60.         Cells(startrow, 3) = middle_name
  61.         Cells(startrow, 4) = last_name
  62.  
  63.  
  64.     startrow = startrow + 1
  65. Wend
  66.  
  67.  
  68.  
  69. End Sub
Jan 29 '13 #7
Definately not a homework assignment, more of a challenge I am facing to make my life easier at work. More of a self teaching moment.
Jan 29 '13 #8
NeoPa
32,556 Expert Mod 16PB
James:
I am pretty sure I know what I'm asking. ... Below is the code:
We need you to specify the question James. Dropping a bunch of code in and expecting us to work it out for you is not acceptable. Not very respectful TBF. Please specify your question properly as you have already been requested to do.

I appreciate that it is probably not a homework question. That should no longer be an issue. We still need a properly posed question to work from though. I'll leave that with you.
Jan 29 '13 #9
Rabbit
12,516 Expert Mod 8TB
You can simplify this to:
Expand|Select|Wrap|Line Numbers
  1. Dim last_name As String
  2. Dim middle_name As String
  3. Dim first_name As String
  4. Dim arrNames
  5.  
  6. If Left(work_name, 1) = "*" Then
  7.    last_name = work_name 
  8. Else
  9.    arrNames = Split(work_name, " ")
  10.    first_name = arrNames(0)
  11.  
  12.    If UBound(arrNames) = 1 Then
  13.       last_name = arrNames(1)
  14.    Else
  15.       middle_name = arrNames(1)
  16.       last_name = arrNames(2)
  17.  
  18.       If UBound(arrNames) = 3 Then
  19.          last_name = arrNames(2) & " " & arrNames(3)
  20.       End If
  21.    End If
  22. End If
Jan 29 '13 #10
I definately don't mean to be rude. I apologize if I have been. I have limited understanding of code that prevents me from explaining further than what you require. I will rephrase in light of the excel code I posted.

How do I take the code I posted above and modify it to work in an ms access table? Assuming I have tables of Unparsed - FirstName - MiddleName - LastName.

Sorry if I'm not being very clear.

Rabbit what is arrNames in your code referring to?
Jan 29 '13 #11
Rabbit
12,516 Expert Mod 8TB
It holds the array from the split function.
Jan 29 '13 #12
zmbd
5,501 Expert Mod 4TB
@James:
First the devil of a time you are having between excel and access is probably along the lines of how they relate. One very simple way to think of the relationship is as: Worksheets = tables; Rows = records; Columns = Fields; cells = a specific field within a record.
The next mindwarp is that where excel tends to be focused on the cell level up... access is focused from the table level down.
With this in mind along with the logic in my post #4, followed by both very good outlines given by ADezii and Rabbit - you should be able to write the VBA code.
Jan 29 '13 #13
Thanks for all of your input guys, I really appreciate it. I'm going to give it a go with all of the information provided and see what I can come up with. I'll update this thread if I can get it working.
Jan 29 '13 #14
NeoPa
32,556 Expert Mod 16PB
James:
I definately don't mean to be rude. I apologize if I have been. I have limited understanding of code that prevents me from explaining further than what you require.
You can't say fairer than that, and I accept what you say at face value. Unfortunate, possibly, but that's as good a reason as you could want.

I won't chime in now, as the other chaps have already much for you to be working with. I will if required though.

PS. For what it's worth, your Excel code (Post #7) is fairly basic and treats the separate words in the text as discrete items (I'm not trying to disparage the code, but merely translate the routine into an explanation for you). It would also not handle a surname of "Van Gogh". When you say that this routine works for "Edwin Van Gogh", I assume you mean it separates the items out, but it would certainly not identify the surname as "Van Gogh" (which is, after all, the accurate position). I read your post to indicate this had been correctly handled in this routine. This is the point I was trying to bring your attention to. Until it is addressed you will have a routine that only approximates to your actual stated requirement.
Jan 30 '13 #15
zmbd
5,501 Expert Mod 4TB
I did a quick search against the census database for European last names (well, I live in the US; thus, I don't see alot of European names :) ) and it appears that "Van" and "de" are the most common parts of surnames and as such are easily included in the logic given in my first reply when testing the midstring section.
Jan 30 '13 #16
NeoPa
32,556 Expert Mod 16PB
They don't even appear in English names Z, and we have one or two that are spelled without a hyphen too you know (although rare). Nevertheless, it all boils down to what logic is "good enough" for the OP, and as they are struggling to specify that deeply we may never know. What you suggest may well be adequate.
Jan 30 '13 #17

Sign in to post your reply or Sign up for a free account.

Similar topics

8
by: Gerrit Holl | last post by:
Posted with permission from the author. I have some comments on this PEP, see the (coming) followup to this message. PEP: 321 Title: Date/Time Parsing and Formatting Version: $Revision: 1.3 $...
3
by: sarmin | last post by:
Hi Pythoners... It seems to me the Asterisk (*) sign in python means many things... if a python code line looks like this: def__init__(self, func, *param): bla bla bla... and then u have...
10
by: george young | last post by:
For each run of my app, I have a known set of (<100) wafer names. Names are sometimes simply integers, sometimes a short string, and sometimes a short string followed by an integer, e.g.: 5, 6,...
5
by: LFM | last post by:
I have a table I'm importing from our SQL accounting database. The Employee Name is in one field. For the purpose of my needs, I need to extract the last name, first name and middle initial. ...
1
by: Mike in Paradise | last post by:
Is there a more effcient way of removing the spaces from the names for a Enumerated value that has several values when you split it)??? When you do a toString it puts ,<SPACE> between the entries...
5
by: Wavemaker | last post by:
I was wondering if there is way in the .NET framework to test if a string represents a valid identifier name for the C# language. I want to write a program that will generate code based on user...
3
by: Robert M | last post by:
Hello, I am having a problem when I enter a new row using data grid footer and the data contains asterisk ('). I can use replace function, and it will not crash the program but it shows 2...
3
by: aspineux | last post by:
My goal is to write a parser for these imaginary string from the SMTP protocol, regarding RFC 821 and 1869. I'm a little flexible with the BNF from these RFC :-) Any comment ? tests= def...
13
by: Chris Carlen | last post by:
Hi: Having completed enough serial driver code for a TMS320F2812 microcontroller to talk to a terminal, I am now trying different approaches to command interpretation. I have a very simple...
1
by: striker07 | last post by:
Someone help me how can i display a diamond shaped asterisk correctly pls. correct my program this is my code: Dim ctr, space, asterisk As Integer For ctr = 1 To 9 Step 2 For space = (9...
0
by: ryjfgjl | last post by:
ExcelToDatabase: batch import excel into database automatically...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
0
by: ArrayDB | last post by:
The error message I've encountered is; ERROR:root:Error generating model response: exception: access violation writing 0x0000000000005140, which seems to be indicative of an access violation...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
1
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
1
by: Shællîpôpï 09 | last post by:
If u are using a keypad phone, how do u turn on JavaScript, to access features like WhatsApp, Facebook, Instagram....
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.