Regular expression for file name

Miki Tebeka

Hello All,

In a configuration file there can be ID's and filename tokens.
The file names have a known suffix (.o or .mls) and I need to get a regular
expression that will catch filename but not an ID.

Currently:
ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"

However if I have the filename "Sources/kernel/rom_kernel.mls" then
"Source" is interrupted as ID and "s/kernel/rom_kernel.mls" is interrupted
as file name.
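
To see the split outside the lexer, here is a minimal sketch using plain `re`. It assumes the lexer tries ID before FILENAME at each position; that ordering is a guess about how the rules end up sorted, not something confirmed for PLY.

```python
import re

ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"

text = "Sources/kernel/rom_kernel.mls"

# ID is tried at position 0. \w+ first grabs "ources", but the next
# character is "/", so the negative lookahead fails. The engine gives
# back one character and succeeds with "Source" (next char is "s").
id_match = re.compile(ID).match(text)
print(id_match.group(0))    # Source

# The rest of the input, starting where ID stopped, still looks like a file name.
file_match = re.compile(FILENAME).match(text, id_match.end())
print(file_match.group(0))  # s/kernel/rom_kernel.mls
```

So the negative lookahead does not reject the token; it only makes the ID match one character shorter.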

Any way to do better?

BTW: I'm using PLY (http://systems.cs.uchicago.edu/ply/) for parsing.

Bye.
--
------------------------------------------------------------------------
Miki Tebeka <mi*********@zoran.com>
http://tebeka.spymac.net
The only difference between children and adults is the price of the toys

Jul 18 '05 #1


Bengt Richter

On Sun, 18 Jul 2004 14:21:14 +0200, "Miki Tebeka" <mi*********@zoran.com> wrote:

Hello All,

In a configuration file there can be ID's and filename tokens.
The file names have a known suffix (.o or .mls) and I need to get a regular
expression that will catch filename but not an ID.

Currently:
ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"

However if I have the filename "Sources/kernel/rom_kernel.mls" then
"Source" is interrupted as ID and "s/kernel/rom_kernel.mls" is interrupted
as file name.

ITYM s/interrupted/interpreted/ ;-)

Any way to do better?

If you want to prioritize matching amongst several
patterns with some leading commonality, UIAM or'ed terms
(alternatives joined with |) get tried left to right. I haven't
checked your terms, but here's a possible way to give priority
to the FILENAME pattern:

>>> import re
>>> ID = r"[a-zA-Z\.]\w+(?![/\\])"
>>> FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"
>>> COMBINED = '(?P<file>%s)|(?P<id>%s)' % (FILENAME, ID)
>>> rxo = re.compile(COMBINED)
>>> filename = "Sources/kernel/rom_kernel.mls"
>>> rxo.search(filename).groupdict()
{'id': None, 'file': 'Sources/kernel/rom_kernel.mls'}

Try it with an id:
>>> rxo.search('no_slashes_in_this').groupdict()
{'id': 'no_slashes_in_this', 'file': None}

Of course you can mess with the result, e.g.,
>>> result = rxo.search('no_slashes_in_this').groupdict()
>>> result['id']
'no_slashes_in_this'
>>> result['file']
>>> result['file'] is None
True
>>> result['id'], result['file']
('no_slashes_in_this', None)
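
The same combined pattern can be run over a whole line instead of a single token. What follows is only a sketch: the `tokenize` helper and the sample line are made up for illustration and have nothing to do with PLY's own machinery.

```python
import re

ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"

# FILENAME is listed first, so it wins whenever both could match.
COMBINED = re.compile('(?P<file>%s)|(?P<id>%s)' % (FILENAME, ID))

def tokenize(line):
    """Return (kind, text) pairs for every ID or file name found in line."""
    tokens = []
    for m in COMBINED.finditer(line):
        kind = 'file' if m.group('file') is not None else 'id'
        tokens.append((kind, m.group(0)))
    return tokens

# Hypothetical config line, made up for illustration.
tokens = tokenize("Sources/kernel/rom_kernel.mls my_id main.o")
print(tokens)

# With the alternatives swapped, ID is tried first and the original
# problem comes back.
SWAPPED = re.compile('(?P<id>%s)|(?P<file>%s)' % (ID, FILENAME))
swapped_text = SWAPPED.match("Sources/kernel/rom_kernel.mls").group(0)
print(swapped_text)  # Source
```

The swapped version shows that the order of the alternatives is doing the work here, not the patterns themselves.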

No guarantees, but HTH

Regards,
Bengt Richter

Jul 18 '05 #2

Christopher T King

On Sun, 18 Jul 2004, Miki Tebeka wrote:

In a configuration file there can be ID's and filename tokens.
The file names have a known suffix (.o or .mls) and I need to get a regular
expression that will catch filename but not an ID.

Currently:
ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"

However if I have the filename "Sources/kernel/rom_kernel.mls" then
"Source" is interrupted as ID and "s/kernel/rom_kernel.mls" is interrupted
as file name.

I'm not familiar with PLY, but my guess is that you get those results
because it tries to match ID first, and then FILENAME. The best way to
solve this is to add another constraint to your REs: the delimiter at
the end of the pattern (presumably whitespace):

ID = r"[a-zA-Z\.]\w+(?=\s)"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))(?=\s)"

I'm not sure if PLY supports (?=...) or not, but I assume it does, since
you used its complement ((?!...)) in your original REs.
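
A quick check of the delimiter idea with plain `re`, as a sketch only. The `|$` inside the lookahead is an addition not in the patterns above, so that a token at the very end of the input (no trailing whitespace) still matches. The `classify` helper is hypothetical and deliberately tries ID first, the order suspected of causing the problem.

```python
import re

# Same patterns as above, except the lookahead also accepts end of input.
ID = r"[a-zA-Z\.]\w+(?=\s|$)"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))(?=\s|$)"

def classify(token):
    """Try ID first, then FILENAME; return (kind, matched_text) or None."""
    for kind, pattern in (("ID", ID), ("FILENAME", FILENAME)):
        m = re.match(pattern, token)
        if m:
            return kind, m.group(0)
    return None

print(classify("Sources/kernel/rom_kernel.mls"))
# ('FILENAME', 'Sources/kernel/rom_kernel.mls')
print(classify("rom_kernel"))
# ('ID', 'rom_kernel')
```

Because ID must now end at a delimiter, it can no longer stop after "Source", so the order in which the two patterns are tried stops mattering for this input.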

Jul 18 '05 #3
