Regular expressions and Python

1 New Member

Hello,

I am working with Regular Expressions in Python.

I have a text file (authors.txt) that contains an author's first and last name separated by a space, followed by another space and the book title:

Peter Smith The Lobster story
Christine Bower In the closet
Tom Martin How to paint your furniture

My questions:

1) I want to transform the name into a string like this:

Peter Smith => psmith

I tried to get the first character of the first group, then to concatenate it with the second group and transform the whole string into lower case.

2) Then I want to transform the book title into a string like this:

The Lobster story => the_lobster_story

I guess I just have to replace the spaces with an underscore '_' and convert the whole thing to lowercase, but I don't know which function to use or how to use it.

Then I wrote this script.py:

import re
import string

rgx = re.compile("(([A-Z])+\w+)[ ](([A-Z])+\w+)[ ]([^:]+)")

inf = open('authors.txt', 'r')
outf= open('authors2.txt', 'w')

for row in inf.readlines():
    corr = rgx.search(row)
    #these variables are false
    first_name = corr.group(0)
    last_name = corr.group(1)
    name = corr.lower(first_name[0])+corr.lower(last_name)
    title = corr.group(2)
    title2 = title.lower(title.replace(' ', '_'))

    if corr != None:
        str = name ,"@bookstore.com" + " " + http://www.bookstore.org/", title2
        outf.write(str)

inf.close()
outf.close()

Can anyone help?

Mar 23 '08 #1

bvdet

2,851

Recognized Expert Moderator Specialist

Please use code tags when posting code (see Posting Guidelines - How to ask a question).
You do not need regular expressions for this. Create an empty list to hold the results, split each line on whitespace, and append the processed pieces to the list.

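# Collect [user_name, title] pairs, one pair per line of the input file.
output = []
f = open('authors.txt')
for line in f:
    lineList = line.split()
    if len(lineList) < 3:
        continue  # skip blank or incomplete lines
    # First initial + last name, then the title words joined with underscores.
    output.append(['%s%s' % (lineList[0][0].lower(), lineList[1].lower()),
                   '_'.join([word.lower() for word in lineList[2:]])])
f.close()

for item in output:
    print(item)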

>>> ['psmith', 'the_lobster_story']
['cbower', 'in_the_closet']
['tmartin', 'how_to_paint_your_furniture']
>>>
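
To get from these pairs to the output file the original script was trying to write, something like the following might work. It is only a sketch: the line format (address, a space, then the URL) is my reading of the `str = ...` line in the question, and `format_record` and `convert` are names made up for this example.

```python
def format_record(line):
    """Return 'user@bookstore.com http://www.bookstore.org/title', or None."""
    words = line.split()
    if len(words) < 3:
        return None  # blank or incomplete line
    name = words[0][0].lower() + words[1].lower()
    title = '_'.join(word.lower() for word in words[2:])
    return '%s@bookstore.com http://www.bookstore.org/%s' % (name, title)


def convert(in_path, out_path):
    """Write one formatted record per usable input line."""
    with open(in_path) as inf, open(out_path, 'w') as outf:
        for line in inf:
            record = format_record(line)
            if record is not None:
                outf.write(record + '\n')


print(format_record('Peter Smith The Lobster story'))
# psmith@bookstore.com http://www.bookstore.org/the_lobster_story
```

Calling `convert('authors.txt', 'authors2.txt')` would then write one such line per author.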

Mar 23 '08 #2

bvdet

2,851

Recognized Expert Moderator Specialist

I modified your regex pattern somewhat and added names to the groups. Using names, it is easier to read the structure of the pattern, and you can access the matched substrings with the MatchObject.groupdict() method. The rest is almost the same code as the non-regex solution.

 
import re

# Each part of the pattern is a named group; the nested *_initial groups
# capture the leading capital letter of the name.
rgx = re.compile(r'%s %s %s' % (r'(?P<first_name>(?P<first_initial>[A-Z])?\w+)',
                                r'(?P<last_name>(?P<last_initial>[A-Z])?\w+)',
                                r'(?P<book_title>.+)'))

fn = 'authors.txt'
f = open(fn)
output = []
for line in f:
    m = rgx.search(line)
    if m is None:
        continue  # skip lines that do not match the pattern
    dd = m.groupdict()
    output.append(['%s%s' % (dd['first_initial'].lower(), dd['last_name'].lower()),
                   '_'.join([word.lower() for word in dd['book_title'].split()])])
f.close()

for item in output:
    print(item)

>>> ['psmith', 'the_lobster_story']
['cbower', 'in_the_closet']
['tmartin', 'how_to_paint_your_furniture']
>>>
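
As for why the variables in the original script came out wrong: group(0) is always the entire match, and numbered groups are counted by their opening parenthesis. With the nested groups in that pattern, the last name is group 3 and the title is group 5. Also, lower() and replace() are string methods, not methods of the match object. Here is a small sketch using the original pattern on one sample line from the question:

```python
import re

# The pattern from the original script, written as a raw string.
rgx = re.compile(r"(([A-Z])+\w+)[ ](([A-Z])+\w+)[ ]([^:]+)")

m = rgx.search("Peter Smith The Lobster story")

whole = m.group(0)       # the entire match, not the first name
first_name = m.group(1)  # outer group 1
last_name = m.group(3)   # group 2 is the nested initial of the first name
title = m.group(5)       # group 4 is the nested initial of the last name

# lower() and replace() are called on the strings themselves.
name = first_name[0].lower() + last_name.lower()
title2 = title.replace(' ', '_').lower()

print(name, title2)
# psmith the_lobster_story
```

When the line comes from a file, `[^:]+` also matches the trailing newline, so it is worth calling strip() on the title before replacing the spaces.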

Mar 23 '08 #3
