question regarding regular expression

Kelie

Hello,

I'm trying to analyze some AutoLISP code with Python. In the file to
be analyzed there are many functions. Each function begins with a
"defun" statement. Before that, there may or may not be comment
line(s), which begin with ";". My goal is to export each function
into a separate file, with its comments, if there are any. Below is the code
that I'm struggling with:

path = "C:\\AutoCAD\\LSP\\Sub.lsp"
string = file(path, 'r').read()

import re
pat = "\\;+.+\\n\\(DEFUN"
p = re.compile(pat,re.I)

iterator = p.finditer(string)
spans = [match.span() for match in iterator]

for i in range(min(15, len(spans))):
    print string[spans[i][0]:spans[i][1]]

The code above runs fine. But it only takes care of the situation in
which there is exactly one comment line above the "defun" statement.
How do I repeat the sub-pattern "\\;+.+\\n" here?
For example if I want to repeat this pattern 0 to 10 times, I know
"\\;+.+\\n{0:10}\\(DEFUN" does not work. But don't know where to put
"{0:10}". As a work around, I tried to use
pat = "|".join(["\\;+.+\\n"*i+ "\\(DEFUN" for i in range(11)]), and it
turned out to be very slow. Any help?

Thank you.

Kelie

Apr 14 '06 #1
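
For reference, the repetition asked about above can be expressed by wrapping the whole comment-line sub-pattern in a non-capturing group and putting the quantifier after the group. Python's re syntax uses a comma, "{0,10}", not a colon, and a quantifier placed right after "\\n" would repeat only the newline. The following is a minimal sketch in Python 3, not code from the thread; the sample text is made up, and the "^" anchors with re.M are an added assumption so that a match cannot start in the middle of a line.

```python
import re

# The non-capturing group (?:...) wraps one whole comment line, so the
# quantifier {0,10} repeats the line rather than just the "\n".
# re.M lets ^ match at the start of every line; re.I ignores case.
pattern = re.compile(r"(?:^;.*\n){0,10}^\(defun", re.I | re.M)

# Made-up sample text standing in for the contents of the .lsp file.
sample = (
    ";; first comment\n"
    ";; second comment\n"
    "(defun foo ()\n"
    "  (princ))\n"
    "(DEFUN bar ()\n"
    "  (princ))\n"
)

matches = [m.group() for m in pattern.finditer(sample)]
for text in matches:
    print(repr(text))
```

Replacing "{0,10}" with "*" would allow any number of comment lines; the upper bound of 10 is kept here only because the question uses it.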

Kent Johnson

Kelie wrote:

Hello,

I'm trying to analyze some AutoLISP code with Python. In the file to
be analyzed there are many functions. Each function begins with a
"defun" statement. Before that, there may or may not be comment
line(s), which begin with ";". My goal is to export each function
into a separate file, with its comments, if there are any. Below is the code
that I'm struggling with:

path = "C:\\AutoCAD\\LSP\\Sub.lsp"
string = file(path, 'r').read()

import re
pat = "\\;+.+\\n\\(DEFUN"
p = re.compile(pat,re.I)

iterator = p.finditer(string)
spans = [match.span() for match in iterator]

for i in range(min(15, len(spans))):
    print string[spans[i][0]:spans[i][1]]

The code above runs fine. But it only takes care of the situation in
which there is exactly one comment line above the "defun" statement.

ISTM you don't need regex here, a simple line processor will work.
Something like this (untested):

path = "C:\\AutoCAD\\LSP\\Sub.lsp"
lines = open(path).readlines()

# Find the starts of all the functions
starts = [i for i, line in enumerate(lines) if line.startswith('(DEFUN')]

# Check for leading comments
for i, start in starts:
    while start > 0 and lines[start-1].startswith(';'):
        starts[i] = start = start-1

# Now starts should be a list of line numbers for the start of each function

Kent

Apr 14 '06 #2

BartlebyScrivener

Kent,

Running

path = "d:/emacs files/emacsinit.txt"
lines = open(path).readlines()
# my defun lines are lowercase,
# next two lines are all on one
starts = [i for i, line in enumerate(lines) if
          line.startswith('(defun')]
for i, start in starts:
    while start > 0 and lines[start-1].startswith(';'):
        starts[i] = start = start-1
print starts

I get

File "D:\Python\findlines.py", line 7, in __main__
for i, start in starts:
TypeError: unpack non-sequence

Also, I don't understand the "i for i", but I don't understand a lot of
things yet :)

thanks,

rick

Apr 14 '06 #3

Felipe Almeida Lessa

On Fri, 2006-04-14 at 07:47 -0700, BartlebyScrivener wrote:

starts = [i for i, line in enumerate(lines) if
          line.startswith('(defun')]

This line makes a list of integers. enumerate gives you an iterator that
yields tuples of the form (integer, object), and with "i for i, line"
you unpack each tuple into "(i, line)" and keep just "i".

for i, start in starts:

Here you try to unpack each element of the list "starts" into "(i,
start)", but as we saw above the list contains only integers, so an
exception is raised.

I don't know what you want, but...

starts = [(i, line) for i, line in enumerate(lines) if
          line.startswith('(defun')]

or

starts = [x for x in enumerate(lines) if x[1].startswith('(defun')]

...may (or may not) solve your problem.
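
To make the difference concrete, here is a small sketch in Python 3 (the snippets in the thread are Python 2). The sample lines are made up for illustration. It builds both the list of plain indices and the list of (index, line) tuples, then shows that unpacking the plain integers raises TypeError.

```python
# Made-up sample lines standing in for open(path).readlines().
lines = ["; comment\n", "(defun a ()\n", ")\n", "(defun b ()\n"]

# Keeps only the index from each (index, line) tuple.
indices = [i for i, line in enumerate(lines) if line.startswith("(defun")]

# Keeps the whole tuple; note the parentheses around "i, line".
pairs = [(i, line) for i, line in enumerate(lines) if line.startswith("(defun")]

# Unpacking a plain integer into two names fails with TypeError.
error = None
try:
    for i, start in indices:
        pass
except TypeError as exc:
    error = exc

print(indices)
print(pairs)
print(type(error).__name__)
```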

--
Felipe.

Apr 14 '06 #4

Kent Johnson

BartlebyScrivener wrote:

Kent,

Running

path = "d:/emacs files/emacsinit.txt"
lines = open(path).readlines()
# my defun lines are lowercase,
# next two lines are all on one
starts = [i for i, line in enumerate(lines) if
          line.startswith('(defun')]
for i, start in starts:
    while start > 0 and lines[start-1].startswith(';'):
        starts[i] = start = start-1
print starts

I get

File "D:\Python\findlines.py", line 7, in __main__
for i, start in starts:
TypeError: unpack non-sequence

Sorry, should be
for i, start in enumerate(starts):

start is a specific start line, i is the index of that start line in the
starts array (so the array can be modified in place).

Kent

Apr 14 '06 #5

BartlebyScrivener

That's it. Thank you! Very instructive.

Final:

path = "d:/emacs files/emacsinit.txt"
lines = open(path).readlines()
# next two lines all on one
starts = [i for i, line in enumerate(lines) if
          line.startswith('(defun')]
for i, start in enumerate(starts):
    while start > 0 and lines[start-1].startswith(';'):
        starts[i] = start = start-1
print starts
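
With the start positions in hand, the original goal of separating each function (together with its leading comments) is a matter of slicing between consecutive starts. The following is a rough sketch in Python 3 under a few assumptions: the helper names find_starts and split_functions are hypothetical, the case-insensitive "(defun" test is an addition to the thread's code, and the chunks are returned as strings rather than written to files so the example stays self-contained.

```python
def find_starts(lines, keyword="(defun"):
    """Return the line index where each function, including any
    comment lines directly above it, begins."""
    # Lower-casing is an assumption so that "(DEFUN" also matches.
    starts = [i for i, line in enumerate(lines)
              if line.lower().startswith(keyword)]
    for k, start in enumerate(starts):
        # Walk upwards over the comment lines directly above the defun.
        while start > 0 and lines[start - 1].startswith(";"):
            start -= 1
        starts[k] = start
    return starts


def split_functions(lines):
    """Return one string per function, sliced between consecutive starts."""
    starts = find_starts(lines)
    ends = starts[1:] + [len(lines)]
    return ["".join(lines[s:e]) for s, e in zip(starts, ends)]


# Made-up sample lines standing in for open(path).readlines().
sample = [
    "; helper\n",
    "(defun a ()\n",
    "  (princ))\n",
    ";; two comment\n",
    ";; lines\n",
    "(DEFUN b ()\n",
    "  (princ))\n",
]
chunks = split_functions(sample)
for chunk in chunks:
    print(chunk)
```

Anything that appears before the first function start is dropped by this sketch, and each chunk runs up to the next function's first comment line.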

Apr 14 '06 #6

Scott David Daniels

BartlebyScrivener wrote:

That's it. Thank you! Very instructive.

Final:

path = "d:/emacs files/emacsinit.txt"
lines = open(path).readlines()
# next two lines all on one
starts = [i for i, line in enumerate(lines) if
          line.startswith('(defun')]
for i, start in enumerate(starts):
    while start > 0 and lines[start-1].startswith(';'):
        starts[i] = start = start-1
print starts

If you don't want to hold the whole file in memory, this gets the
starts a result at a time:

def starts(source):
    prelude = None
    for number, line in enumerate(source): # read and number a line
        if line[0] == ';':
            if prelude is None:
                prelude = number # Start of commented region
            # else: this line just extends previous prelude
        else:
            if line.startswith('(defun'):
                # You could append to a result here, but yield lets
                # the first found one get out straightaway.
                if prelude is None:
                    yield number
                else:
                    yield prelude
            prelude = None

path = "d:/emacs files/emacsinit.txt"
source = open(path)
try:
    for line in starts(source):
        print line,
    # could just do: print list(starts(source))
finally:
    source.close()
print
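
For readers on Python 3, the same streaming idea can be tried without a file on disk. This is a sketch, not the code from the post: the name function_starts is hypothetical, io.StringIO stands in for the open file, and the sample text is made up. It assumes the comment block is forgotten whenever a line that is neither a comment nor a defun appears.

```python
import io


def function_starts(source):
    """Yield the line number where each defun, or the comment block
    directly above it, begins, reading one line at a time."""
    prelude = None
    for number, line in enumerate(source):
        if line.startswith(";"):
            if prelude is None:
                prelude = number  # first line of a comment block
        else:
            if line.startswith("(defun"):
                yield number if prelude is None else prelude
            # Any non-comment line ends the current comment block.
            prelude = None


# Made-up sample text; io.StringIO iterates line by line like a file.
sample = io.StringIO(
    "; init\n"
    "(defun a ()\n"
    ")\n"
    "\n"
    "; stray comment\n"
    "(setq x 1)\n"
    "(defun b ()\n"
    ")\n"
)
found = list(function_starts(sample))
print(found)
```

The stray comment above "(setq x 1)" is not attached to the second function because a non-comment line separates them.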

--
-Scott David Daniels
sc***********@acm.org

Apr 14 '06 #7

BartlebyScrivener

This is very helpful.

I wasn't the OP. I'm just learning, but I'm on the verge of making my
own file searching scripts. This will be a huge help. Thanks for
posting, and especially thanks for the comments in the code. Big help!

rick

Apr 14 '06 #8

Kelie

Thanks to both of you, Kent and Scott.

Apr 15 '06 #9
