Regular expression for file name

Hello All,

In a configuration file there can be ID's and filename tokens.
The file names have a known suffix (.o or .mls) and I need to get a regular
expression that will catch filename but not an ID.

Currently:
ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"

However if I have the filename "Sources/kernel/rom_kernel.mls" then
"Source" is interrupted as ID and "s/kernel/rom_kernel.mls" is interrupted
as file name.
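
The split described above can be reproduced with the standard `re` module alone. This is a minimal sketch, assuming the lexer tries ID before FILENAME at each position; how PLY actually orders the two rules depends on how they are defined, which the post does not show. The `tokenize_id_first` helper is hypothetical and only stands in for the lexer.

```python
import re

ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"


def tokenize_id_first(text):
    """Hypothetical helper: at each position try ID first, then FILENAME."""
    rules = [("ID", re.compile(ID)), ("FILENAME", re.compile(FILENAME))]
    pos, tokens = 0, []
    while pos < len(text):
        for name, rx in rules:
            m = rx.match(text, pos)
            if m:
                tokens.append((name, m.group()))
                pos = m.end()
                break
        else:
            pos += 1  # skip characters no rule matches (e.g. whitespace)
    return tokens


tokens = tokenize_id_first("Sources/kernel/rom_kernel.mls")
print(tokens)
```

The negative lookahead only forbids a slash right after the match, so `\w+` backtracks by one character and ID still succeeds on "Source".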

Any way to do better?

BTW: I'm using PLY (http://systems.cs.uchicago.edu/ply/) for parsing.

Bye.
--
------------------------------------------------------------------------
Miki Tebeka <mi*********@zo ran.com>
http://tebeka.spymac.net
The only difference between children and adults is the price of the toys
Jul 18 '05 #1
On Sun, 18 Jul 2004 14:21:14 +0200, "Miki Tebeka" <mi*********@zo ran.com> wrote:
Hello All,

In a configuration file there can be ID's and filename tokens.
The file names have a known suffix (.o or .mls) and I need to get a regular
expression that will catch filename but not an ID.

Currently:
ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"

However if I have the filename "Sources/kernel/rom_kernel.mls" then
"Source" is interrupted as ID and "s/kernel/rom_kernel.mls" is interrupted
as file name.

ITYM s/interrupted/interpreted/ ;-)

Any way to do better?

If you want to prioritize matching among several patterns that
share a common leading part, UIAM alternatives joined with | get
tried left to right, and the first one that matches wins. I haven't
checked your terms, but I think this is a possible way to give
priority to the FILENAME pattern:
>>> import re
>>> ID = r"[a-zA-Z\.]\w+(?![/\\])"
>>> FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"
>>> COMBINED = '(?P<file>%s)|(?P<id>%s)' % (FILENAME, ID)
>>> rxo = re.compile(COMBINED)
>>> filename = "Sources/kernel/rom_kernel.mls"
>>> rxo.search(filename).groupdict()
{'id': None, 'file': 'Sources/kernel/rom_kernel.mls'}

Try it with an id:
>>> rxo.search('no_slashes_in_this').groupdict()
{'id': 'no_slashes_in_this', 'file': None}

Of course you can mess with the result, e.g.,
>>> result = rxo.search('no_slashes_in_this').groupdict()
>>> result['id']
'no_slashes_in_this'
>>> result['file']
>>> result['file'] is None
True
>>> result['id'], result['file']
('no_slashes_in_this', None)
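
The same combined pattern can classify every token on a line in one pass. This is a rough sketch rather than anything PLY does internally: the `classify` helper and the sample line are made up for illustration, and it relies on `finditer` together with the match object's `lastgroup` attribute, which names the alternative that matched.

```python
import re

ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"
# FILENAME comes first so it wins wherever both alternatives could match.
COMBINED = re.compile("(?P<file>%s)|(?P<id>%s)" % (FILENAME, ID))


def classify(text):
    """Return (kind, lexeme) pairs for every token found in text."""
    return [(m.lastgroup, m.group()) for m in COMBINED.finditer(text)]


line = "start Sources/kernel/rom_kernel.mls boot.o"  # made-up sample line
pairs = classify(line)
print(pairs)
```

Because the file alternative is listed first, a path is consumed whole before the id alternative gets a chance to match part of it.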

No guarantees, but HTH

Regards,
Bengt Richter
Jul 18 '05 #2
On Sun, 18 Jul 2004, Miki Tebeka wrote:
In a configuration file there can be ID's and filename tokens.
The file names have a known suffix (.o or .mls) and I need to get a regular
expression that will catch filename but not an ID.

Currently:
ID = r"[a-zA-Z\.]\w+(?![/\\])"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))"

However if I have the filename "Sources/kernel/rom_kernel.mls" then
"Source" is interrupted as ID and "s/kernel/rom_kernel.mls" is interrupted
as file name.


I'm not familiar with PLY, but my guess is that it gives you those
results because it tries to match ID first, and only then FILENAME.
The best way to solve this is to add another constraint to your REs,
namely the delimiter that ends the token (presumably whitespace):

ID = r"[a-zA-Z\.]\w+(?=\s)"
FILENAME = r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))(?=\s)"

I'm not sure if PLY supports (?=...) or not, but I assume it does, since
you used its complement ((?!...)) in your original REs.
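
One caveat with `(?=\s)` is that a token at the very end of the input has no trailing whitespace, so the lookahead fails there. The sketch below assumes that case matters and widens the lookahead to `(?=\s|$)`; the `|$` part is an addition for illustration, not something from the patterns above, and the check uses plain `re` rather than PLY.

```python
import re

# Variant of the patterns above: the lookahead also accepts end of input.
# The "|$" part is an assumption added for illustration.
ID = re.compile(r"[a-zA-Z\.]\w+(?=\s|$)")
FILENAME = re.compile(r"([a-zA-Z]:)?[\w./\\]+\.((mls)|(o))(?=\s|$)")

path = "Sources/kernel/rom_kernel.mls"

id_match = ID.match(path)           # ID tried first, as in the failing case
file_match = FILENAME.match(path)   # the whole path, up to end of input
name_match = ID.match("rom_kernel")  # a plain identifier still matches
print(id_match, file_match.group(), name_match.group())
```

With the delimiter required, ID can no longer stop in the middle of a path, so the order in which the two rules are tried stops mattering for this input.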

Jul 18 '05 #3
