Trivial string substitution/parser

Hi,

How would you implement a simple parser for the following string:

---
In this string $variable1 is substituted, while \$variable2 is not.
---

I know how to write a parser, but I am looking for an elegant (and lazy)
way. Any idea?

-Samuel
Jun 17 '07 #1
Samuel <ne********@debain.org> wrote:
> Hi,
>
> How would you implement a simple parser for the following string:
>
> ---
> In this string $variable1 is substituted, while \$variable2 is not.
> ---
>
> I know how to write a parser, but I am looking for an elegant (and lazy)
> way. Any idea?

The elegant and lazy way would be to change your specification so that $
characters are escaped by $$ not by backslashes. Then you can write:
>>> from string import Template
>>> t = Template("In this string $variable1 is substituted, while $$variable2 is not.")
>>> t.substitute(variable1="hello", variable2="world")
'In this string hello is substituted, while $variable2 is not.'

If you must insist on using backslash escapes (which introduces the
question of how you get backslashes into the output: do they have to be
escaped as well?) then use string.Template with a custom pattern.

Jun 17 '07 #2
On Sun, 17 Jun 2007 11:00:58 +0000, Duncan Booth wrote:
> The elegant and lazy way would be to change your specification so that $
> characters are escaped by $$ not by backslashes. Then you can write:
> >>> from string import Template
> ...

Thanks. However, it turns out my specification of the problem was
incomplete: in addition, the variable names are not known at compilation
time.
This is how I did it, and it already looks fairly easy:

-------------------
import re

def variable_sub_cb(match):
    prepend = match.group(1)   # character before the $ (or empty at start)
    varname = match.group(2)
    value = get_variable(varname)
    return prepend + value

string_re = re.compile(r'(^|[^\\])\$([a-z][\w_]+\b)', re.I)

input = r'In this string $variable1 is substituted,'
input += r' while \$variable2 is not.'

print(string_re.sub(variable_sub_cb, input))
-------------------

-Samuel
Jun 17 '07 #3
Samuel wrote:
> On Sun, 17 Jun 2007 11:00:58 +0000, Duncan Booth wrote:
>> The elegant and lazy way would be to change your specification so that $
>> characters are escaped by $$ not by backslashes. Then you can write:
>> >>> from string import Template
>> ...
>
> Thanks, however, turns out my specification of the problem was
> incomplete: In addition, the variable names are not known at compilation
> time.

You mean at edit-time.

>> >>> t.substitute(variable1="hello", variable2="world")

Can be replaced by...

>>> t.substitute(**vars)

...as per the standard **kwargs passing semantics.

- Josiah
Jun 18 '07 #4
Samuel wrote:
> Thanks, however, turns out my specification of the problem was
> incomplete: In addition, the variable names are not known at compilation
> time.
> I just did it that way, this looks fairly easy already:
>
> -------------------
> import re
>
> def variable_sub_cb(match):
>     prepend = match.group(1)
>     varname = match.group(2)
>     value = get_variable(varname)
>     return prepend + value
>
> string_re = re.compile(r'(^|[^\\])\$([a-z][\w_]+\b)', re.I)
>
> input = r'In this string $variable1 is substituted,'
> input += r' while \$variable2 is not.'
>
> print(string_re.sub(variable_sub_cb, input))
> -------------------

It gets easier:

import re

def variable_sub_cb(match):
    return get_variable(match.group(1))

# (?<!\\) is a negative lookbehind: the $ must not follow a backslash
string_re = re.compile(r'(?<!\\)\$([A-Za-z]\w+)')

def get_variable(varname):
    return globals()[varname]

variable1 = 'variable 1'

input = r'In this string $variable1 is substituted,'
input += r' while \$variable2 is not.'

print(string_re.sub(variable_sub_cb, input))

or even

import re

def variable_sub_cb(match):
    return globals()[match.group(1)]

variable1 = 'variable 1'
input = (r'In this string $variable1 is substituted,'
         r' while \$variable2 is not.')

print(re.sub(r'(?<!\\)\$([A-Za-z]\w+)', variable_sub_cb, input))
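
For anyone who would rather not read from globals(), here is a minimal Python 3 sketch of the same lookbehind idea with the values passed in explicitly. The names VARIABLE_RE and substitute and the sample dict are made up for illustration and are not from the thread. The pattern is the one above, so single-letter names still do not match.

```python
import re

# Same pattern as above: a "$" not preceded by a backslash, then a name
# of at least two characters (a letter followed by one or more \w).
VARIABLE_RE = re.compile(r'(?<!\\)\$([A-Za-z]\w+)')

def substitute(text, values):
    """Replace unescaped $name placeholders using the `values` mapping."""
    # A missing name raises KeyError, which surfaces typos early.
    return VARIABLE_RE.sub(lambda m: str(values[m.group(1)]), text)

text = r'In this string $variable1 is substituted, while \$variable2 is not.'
result = substitute(text, {'variable1': 'hello'})
```

Note that the backslash in front of the escaped name is left in the output, because the regular expression never touches it.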
Graham

Jun 18 '07 #5
Josiah Carlson <jo************@sbcglobal.net> wrote:
> Samuel wrote:
>> On Sun, 17 Jun 2007 11:00:58 +0000, Duncan Booth wrote:
>>> The elegant and lazy way would be to change your specification so
>>> that $ characters are escaped by $$ not by backslashes. Then you can
>>> write:
>>>
>>> >>> from string import Template
>>> ...
>>
>> Thanks, however, turns out my specification of the problem was
>> incomplete: In addition, the variable names are not known at
>> compilation time.
>
> You mean at edit-time.
>
>>> >>> t.substitute(variable1="hello", variable2="world")
>
> Can be replaced by...
>
> >>> t.substitute(**vars)
>
> ...as per the standard **kwargs passing semantics.

You don't even need to do that. substitute will accept a dictionary as a
positional argument:

t.substitute(vars)

If you use both forms then the keyword arguments take priority.
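
A small Python 3 sketch of the two call forms, in case it helps. The `values` dict is made-up sample data standing in for whatever is built at run time.

```python
from string import Template

t = Template("In this string $variable1 is substituted, while $$variable2 is not.")

# Hypothetical data; in practice this would be assembled at run time.
values = {"variable1": "hello"}

# Mapping passed as a positional argument.
from_mapping = t.substitute(values)

# Mapping plus a keyword argument for the same name: the keyword wins.
from_both = t.substitute(values, variable1="goodbye")
```

Here `from_mapping` contains "hello" and `from_both` contains "goodbye", and in both the `$$` collapses to a single `$`.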

Also, of course, vars just needs to be something which quacks like a dict:
it can do whatever it needs to do such as looking up a database or querying
a server to generate the value only when it needs it, or even evaluating
the name as an expression; in the OP's case it could call get_variable.

Anyway, the question seems to be moot since the OP's definition of 'elegant
and lazy' includes regular expressions and reinvented wheels.

... and in another message Graham Breed wrote:
> def get_variable(varname):
>     return globals()[varname]

Doesn't the mere thought of creating global variables with unknown names
make you shudder?

Jun 18 '07 #6
Duncan Booth wrote:
> Also, of course, vars just needs to be something which quacks like a dict:
> it can do whatever it needs to do such as looking up a database or querying
> a server to generate the value only when it needs it, or even evaluating
> the name as an expression; in the OP's case it could call get_variable.

And in case that sounds difficult, the code is

class VariableGetter:
    def __getitem__(self, key):
        return get_variable(key)
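
Put together with string.Template, that might look like the following Python 3 sketch. The `_VALUES` dict and this `get_variable` are placeholders I made up to stand in for the OP's real lookup, which the thread never shows.

```python
from string import Template

# Hypothetical backing store; the OP's real get_variable() is not shown.
_VALUES = {"variable1": "hello"}

def get_variable(name):
    return _VALUES[name]

class VariableGetter:
    """Dict-like adapter: substitute() only ever asks for obj[name]."""
    def __getitem__(self, key):
        return get_variable(key)

t = Template("In this string $variable1 is substituted, while $$variable2 is not.")
result = t.substitute(VariableGetter())
```

The lookup only happens for names that actually appear in the template, which is what makes the lazy database or server query mentioned above possible.
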

> Anyway, the question seems to be moot since the OP's definition of 'elegant
> and lazy' includes regular expressions and reinvented wheels.

Your suggestion of subclassing string.Template will also require a
regular expression -- and a fairly hairy one as far as I can work out
from the documentation. There isn't an example and I don't think it's
the easiest way of solving this problem. But if Samuel really wants
backslash escaping it'd be easier to do a replace('$$','$$$$') and
replace('\\$', '$$') (or replace('\\$','\\$$') if he really wants the
backslash to persist) before using the template.

Then, if he really does want to reject single letter variable names,
or names beginning with a backslash, he'll still need to subclass
Template and supply a regular expression, but a simpler one.
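
As a rough Python 3 sketch of the two-step replace idea (the variant that drops the backslash), with `to_template_syntax` being a helper name invented here:

```python
from string import Template

def to_template_syntax(text):
    """Convert backslash escapes to the $$ escapes string.Template expects."""
    # Order matters: first double any existing "$$" so it survives as "$$",
    # then turn each "\$" into the Template escape "$$".
    return text.replace("$$", "$$$$").replace("\\$", "$$")

source = r"In this string $variable1 is substituted, while \$variable2 is not."
converted = to_template_syntax(source)
result = Template(converted).substitute(variable1="hello")
```

After conversion the text is in ordinary Template syntax, so the stock class can be used without any custom pattern.
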

> ... and in another message Graham Breed wrote:
>> def get_variable(varname):
>>     return globals()[varname]
>
> Doesn't the mere thought of creating global variables with unknown names
> make you shudder?

Not at all. It works, it's what the shell does, and it's easy to test
interactively. Obviously the application code wouldn't look like
that.
Graham

Jun 19 '07 #7
Duncan Booth wrote:
> If you must insist on using backslash escapes (which introduces the
> question of how you get backslashes into the output: do they have to be
> escaped as well?) then use string.Template with a custom pattern.

If anybody wants this, I worked out the following regular expression
which seems to work:

(?P<escaped>\\)\$                    | # backslash escape pattern
\$(?:
  (?P<named>[_a-z][_a-z0-9]*)        | # delimiter and Python identifier
  {(?P<braced>[_a-z][_a-z0-9]*)}     | # delimiter and braced identifier
  (?P<invalid>)                        # Other ill-formed delimiter exprs
)

The clue is string.Template.pattern.pattern, which shows the default
regular expression.

So you compile that with verbose and case-insensitive flags and set it
to "pattern" in a string.Template subclass. (In fact you don't have
to compile it, but that behaviour's undocumented.) Something like
>>> import re
>>> from string import Template
>>> regexp = """
... (?P<escaped>\\\\)\\$ | # backslash escape pattern
... \$(?:
... (?P<named>[_a-z][_a-z0-9]*) | # delimiter and identifier
... {(?P<braced>[_a-z][_a-z0-9]*)} | # ... and braced identifier
... (?P<invalid>) # Other ill-formed delimiter exprs
... )
... """
>>> class BackslashEscape(Template):
...     pattern = re.compile(regexp, re.I | re.X)
...
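
For reference, a Python 3 sketch of the same subclass in use. This is an adaptation, not the session above: the pattern is given as a plain raw string, which string.Template compiles itself with re.IGNORECASE (its default `flags`) plus re.VERBOSE, so the sketch does not depend on how a precompiled pattern is handled.

```python
from string import Template

class BackslashEscape(Template):
    # Same four named groups as the default pattern; only the escape
    # alternative differs: a backslash followed by "$" instead of "$$".
    pattern = r"""
    (?P<escaped>\\)\$                |  # backslash escape
    \$(?:
      (?P<named>[_a-z][_a-z0-9]*)    |  # delimiter and identifier
      {(?P<braced>[_a-z][_a-z0-9]*)} |  # delimiter and braced identifier
      (?P<invalid>)                     # other ill-formed delimiter exprs
    )
    """

t = BackslashEscape(
    r"In this string $variable1 is substituted, while \$variable2 is not."
)
result = t.substitute(variable1="hello")
```

When the escaped group matches, substitute() emits the bare delimiter, so the backslash is consumed and the output contains a plain `$variable2`. With this pattern a doubled `$$` is no longer an escape and is reported as an invalid placeholder.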
Graham

Jun 19 '07 #8
