Regular expression for file name

Hello All,

In a configuration file there can be ID's and filename tokens.
The file names have a known suffix (.o or .mls) and I need to get a regular
expression that will catch filename but not an ID.

Currently:
ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"

However if I have the filename "Sources/kernel/rom_kernel.mls" then
"Source" is interrupted as ID and "s/kernel/rom_kernel.mls" is interrupted
as file name.

Any way to do better?

BTW: I'm using PLY (http://systems.cs.uchicago.edu/ply/) for parsing.
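The split described above can be reproduced with plain `re`, outside PLY. The sketch below assumes the tokenizer tries ID first and then continues with FILENAME on the remainder; that ordering is an assumption about the setup, not something confirmed for PLY. It suggests the cause is backtracking: `\w+` gives back one character so that the negative lookahead no longer sees a slash.

```python
import re

ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"

text = "Sources/kernel/rom_kernel.mls"

# ID is tried at position 0. \w+ first consumes "ources", but the lookahead
# rejects the "/" that follows, so \w+ backtracks to "ource". The next
# character is then "s", which the lookahead accepts.
id_match = re.match(ID, text)

# FILENAME is then tried on whatever ID left behind.
rest = text[id_match.end():]
file_match = re.match(FILENAME, rest)

print(id_match.group())    # Source
print(file_match.group())  # s/kernel/rom_kernel.mls
```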

Bye.
--
------------------------------------------------------------------------
Miki Tebeka <mi*********@zoran.com>
http://tebeka.spymac.net
The only difference between children and adults is the price of the toys
Jul 18 '05 #1
On Sun, 18 Jul 2004 14:21:14 +0200, "Miki Tebeka" <mi*********@zoran.com> wrote:
> Hello All,
>
> In a configuration file there can be ID's and filename tokens.
> The file names have a known suffix (.o or .mls) and I need to get a regular
> expression that will catch filename but not an ID.
>
> Currently:
> ID = r"[a-zA-Z\.]\w+(?![/\\])"
> FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"
>
> However if I have the filename "Sources/kernel/rom_kernel.mls" then
> "Source" is interrupted as ID and "s/kernel/rom_kernel.mls" is interrupted
> as file name.

ITYM s/interrupted/interpreted/ ;-)

> Any way to do better?

If you want to prioritize matching among several patterns that share a
common prefix, UIAM alternatives joined with | get tried left to right.
I haven't checked your patterns themselves, but here is a possible way
to give priority to the FILENAME pattern:
>>> import re
>>> ID = r"[a-zA-Z\.]\w+(?![/\\])"
>>> FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"
>>> COMBINED = '(?P<file>%s)|(?P<id>%s)' % (FILENAME, ID)
>>> rxo = re.compile(COMBINED)
>>> filename = "Sources/kernel/rom_kernel.mls"
>>> rxo.search(filename).groupdict()
{'id': None, 'file': 'Sources/kernel/rom_kernel.mls'}

Try it with an id:
>>> rxo.search('no_slashes_in_this').groupdict()
{'id': 'no_slashes_in_this', 'file': None}

Of course you can mess with the result, e.g.,
>>> result = rxo.search('no_slashes_in_this').groupdict()
>>> result['id']
'no_slashes_in_this'
>>> result['file']
>>> result['file'] is None
True
>>> result['id'], result['file']
('no_slashes_in_this', None)
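The session above can also be collected into a small script. This is only a sketch of the same combined-pattern idea: the helper names `classify` and `tokens` are made up for illustration, and it assumes tokens on a line are separated by whitespace.

```python
import re

ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"

# FILENAME comes first in the alternation, so it is tried before ID
# at every position.
COMBINED = re.compile("(?P<file>%s)|(?P<id>%s)" % (FILENAME, ID))


def kind_and_text(m):
    """Return (group name, matched text) for whichever named group matched."""
    for kind in ("file", "id"):
        if m.group(kind) is not None:
            return kind, m.group(kind)
    return None


def classify(token):
    """Classify a single token (hypothetical helper)."""
    m = COMBINED.match(token)
    return kind_and_text(m) if m else None


def tokens(line):
    """Classify every token found in a line (hypothetical helper)."""
    return [kind_and_text(m) for m in COMBINED.finditer(line)]


print(classify("Sources/kernel/rom_kernel.mls"))
print(classify("no_slashes_in_this"))
print(tokens("Sources/kernel/rom_kernel.mls rom_id main.o"))
```

Checking the named groups explicitly keeps the result independent of the unnamed groups nested inside FILENAME.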

No guarantees, but HTH

Regards,
Bengt Richter
Jul 18 '05 #2
On Sun, 18 Jul 2004, Miki Tebeka wrote:
> In a configuration file there can be ID's and filename tokens.
> The file names have a known suffix (.o or .mls) and I need to get a regular
> expression that will catch filename but not an ID.
>
> Currently:
> ID = r"[a-zA-Z\.]\w+(?![/\\])"
> FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"
>
> However if I have the filename "Sources/kernel/rom_kernel.mls" then
> "Source" is interrupted as ID and "s/kernel/rom_kernel.mls" is interrupted
> as file name.


I'm not familiar with PLY, but my guess is that you get those results
because it tries to match ID first, and then FILENAME. The best way to
solve this is to add another constraint to your RE, namely the delimiter
at the end of the pattern (presumably whitespace):

ID = r"[a-zA-Z\.]\w+(?=\s)"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))(?=\s)"

I'm not sure if PLY supports (?=...) or not, but I assume it does, since
you used its complement ((?!...)) in your original REs.
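Here is a quick check of those two patterns with plain `re`, trying ID before FILENAME as guessed above. One caveat it brings out: `(?=\s)` needs a whitespace character to follow, so a token at the very end of the input matches neither pattern. The `_EOF` variants and the `first_token` helper are hypothetical additions for illustration, assuming end of input should also count as a delimiter.

```python
import re

# Patterns from the post above.
ID = r"[a-zA-Z\.]\w+(?=\s)"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))(?=\s)"

# Hypothetical variants that also accept end of input as a delimiter.
ID_EOF = r"[a-zA-Z\.]\w+(?=\s|$)"
FILENAME_EOF = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))(?=\s|$)"


def first_token(text, id_pattern, file_pattern):
    """Try ID first, then FILENAME, at the start of text."""
    for kind, pattern in (("ID", id_pattern), ("FILENAME", file_pattern)):
        m = re.match(pattern, text)
        if m:
            return kind, m.group()
    return None


name = "Sources/kernel/rom_kernel.mls"

# Followed by whitespace: ID can no longer stop at "Source".
print(first_token(name + " rom_id", ID, FILENAME))

# At end of input, (?=\s) has nothing to match, so both patterns fail.
print(first_token(name, ID, FILENAME))

# The end-of-input variants accept the same token.
print(first_token(name, ID_EOF, FILENAME_EOF))
print(first_token("rom_id", ID_EOF, FILENAME_EOF))
```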

Jul 18 '05 #3
