Question regarding regular expression

Hello,

I'm trying to analyze some autolisp code with python. In the file to
be analyzed there are many functions. Each function begins with a
"defun" statement. And before that, there may or may not be comment
lines, each beginning with ";". My goal is to export each function
into a separate file, with its comments, if there are any. Below is the code
that I'm struggling with:

path = "C:\\AutoCAD\\LSP\\Sub.lsp"
string = file(path, 'r').read()

import re
pat = "\\;+.+\\n\\(DEFUN"
p = re.compile(pat,re.I)

iterator = p.finditer(string)
spans = [match.span() for match in iterator]

for i in range(min(15, len(spans))):
    print string[spans[i][0]:spans[i][1]]

The code above runs fine. But it only takes care of the situation in
which there is exactly one comment line above the "defun" statement.
How do I repeat the sub-pattern "\\;+.+\\n" here?
For example, if I want to repeat this pattern 0 to 10 times, I know
"\\;+.+\\n{0:10}\\(DEFUN" does not work, but I don't know where to put
"{0:10}". As a workaround, I tried
pat = "|".join(["\\;+.+\\n"*i+ "\\(DEFUN" for i in range(11)]), and it
turned out to be very slow. Any help?

Thank you.

Kelie

Apr 14 '06 #1
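On the regex question itself: the quantifier has to apply to the whole comment-line sub-pattern, so that sub-pattern needs to be wrapped in a group, and the range is written with a comma ({0,10}), not a colon. A minimal sketch, assuming Python 3 and assuming that both the comment lines and "(defun" start at the beginning of a line; the name find_function_headers and the sample text are made up for illustration:

```python
import re

# (?: ... ) is a non-capturing group, so {0,10} repeats the whole
# comment line rather than just the trailing "\n".
# re.MULTILINE makes ^ match at the start of every line.
PATTERN = re.compile(r"(?:^;.*\n){0,10}^\(defun", re.IGNORECASE | re.MULTILINE)


def find_function_headers(source):
    """Return the text of each match: leading comment lines plus '(defun'."""
    return [source[m.start():m.end()] for m in PATTERN.finditer(source)]


# Hypothetical sample input, not taken from the original file.
sample = (
    "; first comment\n"
    "; second comment\n"
    "(defun foo ()\n"
    "  (princ))\n"
    "\n"
    "(DEFUN bar ()\n"
    "  (princ))\n"
)

headers = find_function_headers(sample)
for header in headers:
    print(repr(header))
```

If there is no upper limit on the number of comment lines, * can be used in place of {0,10}.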
Kelie wrote:
The code above runs fine. But it only takes care of the situation in
which there is exactly one comment line above the "defun" statement.


ISTM you don't need regex here, a simple line processor will work.
Something like this (untested):

path = "C:\\AutoCAD\\LSP\\Sub.lsp"
lines = open(path).readlines()

# Find the starts of all the functions
starts = [i for i, line in enumerate(lines) if line.startswith('(DEFUN')]

# Check for leading comments
for i, start in starts:
    while start > 0 and lines[start-1].startswith(';'):
        starts[i] = start = start-1

# Now starts should be a list of line numbers for the start of each function

Kent
Apr 14 '06 #2
Kent,

Running

path = "d:/emacs files/emacsinit.txt"
lines = open(path).readlines()
# my defun lines are lowercase,
# next two lines are all on one
starts = [i for i, line in enumerate(lines) if
line.startswith('(defun')]
for i, start in starts:
    while start > 0 and lines[start-1].startswith(';'):
        starts[i] = start = start-1
print starts

I get

File "D:\Python\findlines.py", line 7, in __main__
for i, start in starts:
TypeError: unpack non-sequence

Also, I don't understand the "i for i", but I don't understand a lot of
things yet :)

thanks,

rick

Apr 14 '06 #3
On Fri, 2006-04-14 at 07:47 -0700, BartlebyScrivener wrote:
starts = [i for i, line in enumerate(lines) if
line.startswith('(defun')]
This line makes a list of integers. enumerate gives you an iterator that
yields tuples of the form (integer, object), and with "i for i, line"
you unpack each tuple into "(i, line)" and keep just "i".
for i, start in starts:


Here you try to unpack the elements of the list "starts" into "(i,
start)", but as we saw above the list contains just "i", so an exception
is raised.

I don't know what you want, but...

starts = [(i, line) for i, line in enumerate(lines) if
line.startswith('(defun')]

or

starts = [x for x in enumerate(lines) if x[1].startswith('(defun')]

...may (or may not) solve your problem.

--
Felipe.

Apr 14 '06 #4
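The unpacking behaviour described above can be reproduced on a small list. This is only an illustrative sketch, assuming Python 3; the sample lines and the names indexes, raised, and fixed are invented for the example:

```python
# Hypothetical sample data standing in for the lines of a Lisp file.
lines = ["; comment\n", "(defun foo ()\n", ")\n", "(defun bar ()\n"]

# enumerate yields (index, item) tuples; "i for i, line" unpacks each
# tuple and keeps only the index, so the result is a list of integers.
indexes = [i for i, line in enumerate(lines) if line.startswith("(defun")]

# Iterating over plain integers and trying to unpack each one into two
# names fails, because an integer is not a sequence.
raised = False
try:
    for i, start in indexes:
        pass
except TypeError:
    raised = True

# Wrapping the list in enumerate supplies the (position, value) pairs
# that the two-name loop expects.
fixed = [(i, start) for i, start in enumerate(indexes)]

print(indexes, raised, fixed)
```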
BartlebyScrivener wrote:
File "D:\Python\findlines.py", line 7, in __main__
for i, start in starts:
TypeError: unpack non-sequence


Sorry, should be
for i, start in enumerate(starts):

start is a specific start line, i is the index of that start line in the
starts array (so the array can be modified in place).

Kent
Apr 14 '06 #5
That's it. Thank you! Very instructive.

Final:

path = "d:/emacs files/emacsinit.txt"
lines = open(path).readlines()
# next two lines all on one
starts = [i for i, line in enumerate(lines) if
line.startswith('(defun')]
for i, start in enumerate(starts):
    while start > 0 and lines[start-1].startswith(';'):
        starts[i] = start = start-1
print starts

Apr 14 '06 #6
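Once the start lines are known, the original goal of exporting each function with its comments comes down to slicing the list of lines between consecutive starts. A rough sketch, assuming Python 3; the helper names function_starts and split_functions are hypothetical, and the lowercase comparison is an assumption meant to mirror the re.I flag in the first post:

```python
def function_starts(lines, keyword="(defun"):
    """Line numbers where each function begins, leading comments included."""
    # Assumption: match the keyword case-insensitively.
    starts = [i for i, line in enumerate(lines)
              if line.lower().startswith(keyword)]
    for i, start in enumerate(starts):
        # Walk backwards over the comment lines directly above the defun.
        while start > 0 and lines[start - 1].startswith(";"):
            start -= 1
        starts[i] = start
    return starts


def split_functions(lines):
    """Return one string per function, running up to the next function's start."""
    starts = function_starts(lines)
    ends = starts[1:] + [len(lines)]
    return ["".join(lines[a:b]) for a, b in zip(starts, ends)]


# Hypothetical sample input.
sample_lines = [
    "; c1\n", "(defun a ()\n", ")\n", "\n",
    "; c2\n", "; c3\n", "(DEFUN b ()\n", ")\n",
]
chunks = split_functions(sample_lines)
for chunk in chunks:
    print(chunk, end="---\n")
```

Each chunk could then be written to its own file; how the files are named is left open here.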
If you don't want to hold the whole file in memory, this yields the
starts one at a time:

def starts(source):
    prelude = None
    for number, line in enumerate(source): # read and number a line
        if line[0] == ';':
            if prelude is None:
                prelude = number # Start of commented region
            # else: this line just extends previous prelude
        else:
            if line.startswith('(defun'):
                # You could append to a result here, but yield lets
                # the first found one get out straightaway.
                if prelude is None:
                    yield number
                else:
                    yield prelude
            prelude = None

path = "d:/emacs files/emacsinit.txt"
source = open(path)
try:
    for line in starts(source):
        print line,
    # could just do: print list(starts(source))
finally:
    source.close()
print

--
-Scott David Daniels
sc***********@acm.org
Apr 14 '06 #7
This is very helpful.

I wasn't the OP. I'm just learning, but I'm on the verge of making my
own file searching scripts. This will be a huge help. Thanks for
posting, and especially thanks for the comments in the code. Big help!

rick

Apr 14 '06 #8
Thanks to both of you, Kent and Scott.

Apr 15 '06 #9
