Reverse string-formatting (maybe?)

Is there any builtin function or module with a function similar to my
made-up, not-written deformat function as follows? I can't imagine it
would be too easy to write, but possible...
>>> template = 'I am %s, and he %s last %s.'
>>> values = ('coding', 'coded', 'week')
>>> formatted = template % values
>>> formatted
'I am coding, and he coded last week.'
>>> deformat(formatted, template)
('coding', 'coded', 'week')

expanded (for better visual):
>>> deformat('I am coding, and he coded last week.', 'I am %s, and he %s last %s.')
('coding', 'coded', 'week')

It would return a tuple of strings, since it has no way of telling what
the original type of each item was.
Any input? I've looked through the documentation of the string module
and re module, did a search of the documentation and a search of this
group, and come up empty-handed.

Oct 14 '06 #1
Simple, but unreliable:
>>> import re
>>> template = "I am %s, and he %s last %s."
>>> values = ("coding", "coded", "week")
>>> formatted = template % values
>>> def deformat(formatted, template):
...     r = re.compile("(.*)".join(template.split("%s")))
...     return r.match(formatted).groups()
...
>>> deformat(formatted, template)
('coding', 'coded', 'week')

Peter
Oct 14 '06 #2

Yes, in the trivial case you provide, it can be done fairly
easily using the re module:
>>> import re
>>> template = 'I am %s, and he %s last %s.'
>>> values = ('coding', 'coded', 'week')
>>> formatted = template % values
>>> unformat_re = re.escape(template).replace('%s', '(.*)')
>>> # unformat_re = unformat_re.replace('%i', '([0-9]+)')
>>> r = re.compile(unformat_re)
>>> r.match(formatted).groups()
('coding', 'coded', 'week')
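
A possible variation on the above, sketched under a few assumptions: the template contains only plain '%s' placeholders, and the whole formatted string should fit the template. Depending on the Python version, re.escape may also put a backslash in front of the '%', in which case replacing '%s' after escaping finds nothing. Splitting on '%s' first and escaping each literal piece separately avoids that. The name deformat is just the hypothetical function from the question.

```python
import re

def deformat(formatted, template):
    """Recover, as strings, the values substituted for each '%s'."""
    # Escape every literal piece on its own, then join with a capture group.
    literal_parts = [re.escape(part) for part in template.split("%s")]
    pattern = "(.*)".join(literal_parts)
    match = re.fullmatch(pattern, formatted, re.DOTALL)
    if match is None:
        raise ValueError("formatted string does not fit the template")
    return match.groups()

print(deformat("I am coding, and he coded last week.",
               "I am %s, and he %s last %s."))   # ('coding', 'coded', 'week')
print(deformat("Cost (USD): 42?", "Cost (USD): %s?"))   # ('42',)
```

Raising ValueError on a mismatch is a design choice of this sketch; returning None would work just as well.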

Things get crazier when you have things like
>>> answer = 'format values into a string'
>>> template = 'The formatting string %%s is used to %s' % answer
or
>>> template = 'The value is %0*.*f'
>>> values = (10, 4, 3.14159)
>>> formatted = template % values
>>> formatted
'The value is 00003.1416'

or
>>> template = ('Dear %(name)s, Thank you for the %(gift)s. It '
...     'was very %(adj)s.') % {'name': 'Grandma', 'gift': 'sweater',
...     'adj': 'nice'}
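
For the '%%' case at least, one way out might be to tokenize the template instead of doing a blind replace. This is only a sketch: SPEC, GROUPS and template_to_regex are made-up names, and it recognizes nothing beyond %%, %s, %d and %i. Widths, precisions, '*' and %(name)s keys are deliberately left out and would be treated as literal text.

```python
import re

# Only %%, %s, %d and %i are recognized here (an assumption of this sketch).
SPEC = re.compile(r"%([%sdi])")
GROUPS = {"s": "(.*)", "d": r"([-+]?\d+)", "i": r"([-+]?\d+)"}

def template_to_regex(template):
    """Build a compiled regex that reverses a simple %-style template."""
    pieces = []
    pos = 0
    for m in SPEC.finditer(template):
        # Literal text before the specifier is escaped as-is.
        pieces.append(re.escape(template[pos:m.start()]))
        kind = m.group(1)
        # '%%' formats to a single literal '%'; the rest become capture groups.
        pieces.append("%" if kind == "%" else GROUPS[kind])
        pos = m.end()
    pieces.append(re.escape(template[pos:]))
    return re.compile("".join(pieces), re.DOTALL)

template = "The formatting string %%s is used to %s"
formatted = template % "format values into a string"
print(template_to_regex(template).fullmatch(formatted).groups())
# ('format values into a string',)
```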

Additionally, things get a little tangled when the replacement
values duplicate text from the template. Should the unformatting
of "I am tired, and he didn't last last All Saint's Day" be
parsed as ('tired', "didn't last", "All Saint's Day") or
('tired', "didn't", "last All Saint's Day")? The /intent/ is
likely the former, but getting a computer to understand intent is
a non-trivial task ;)
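
To make the ambiguity concrete, here is a small sketch (pattern_for is a hypothetical helper, and the sentence gets the trailing period the template would add). Greedy and non-greedy groups each pick one of the two readings; neither knows which was intended.

```python
import re

template = "I am %s, and he %s last %s."
formatted = "I am tired, and he didn't last last All Saint's Day."

def pattern_for(template, group):
    # Escape the literal pieces and join them with the given capture group.
    return group.join(re.escape(part) for part in template.split("%s"))

greedy = re.fullmatch(pattern_for(template, "(.*)"), formatted).groups()
lazy = re.fullmatch(pattern_for(template, "(.*?)"), formatted).groups()

print(greedy)  # ('tired', "didn't last", "All Saint's Day")
print(lazy)    # ('tired', "didn't", "last All Saint's Day")
```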

Just a few early-morning thoughts...

-tkc


Oct 14 '06 #3

Trying to figure out the 'unreliable' part of your statement...

I'm sure 2 '%s' characters in a row would be a bad idea, and if you
have similar expressions for the '%s' characters within as well as in
the neighborhood of the '%s', that would cause difficulty. Is there any
other reason it might not work properly?

My template outside of the '%s' characters contains only commas and
spaces, and within, neither commas nor spaces. Given that information,
is there any reason it might not work properly?
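
Two ways the simple version can go wrong, as a rough sketch. naive_deformat is a renamed copy of the split/join approach above; the sample templates are made up for illustration.

```python
import re

def naive_deformat(formatted, template):
    # Unescaped template, greedy groups: the "simple, but unreliable" approach.
    pattern = "(.*)".join(template.split("%s"))
    return re.match(pattern, formatted).groups()

# 1. Adjacent placeholders: the boundary between the values is lost.
print(naive_deformat("%s%s" % ("12", "34"), "%s%s"))   # ('1234', '')

# 2. Regex metacharacters in the template are read as regex syntax,
#    so the parentheses here become an extra capture group.
print(naive_deformat("Total (5)", "Total (%s)"))       # ('(5)', '(5)')
```

With a template made of only commas and spaces around values that contain neither, neither of these problems should arise, as long as no two placeholders are adjacent.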

Oct 15 '06 #5
Given this new (key) information along with the assumption that
you're doing straight string replacement (not dictionary
replacement of the form "%(key)s" or other non-string types such
as "%05.2f"), then yes, a reversal is possible. To make it more
explicit, one would do something like
>>> template = '%s, %s, %s'
>>> values = ('Tom', 'Dick', 'Harry')
>>> formatted = template % values
>>> import re
>>> unformat_string = template.replace('%s', '([^, ]+)')
>>> unformatter = re.compile(unformat_string)
>>> extracted_values = unformatter.search(formatted).groups()
using '[^, ]+' to mean "one or more characters that aren't a
comma or a space".

-tkc


Oct 15 '06 #6

Thanks.

One more thing (I forgot to mention this other situation earlier):
the values substituted for %s are ints, and the text outside them can be
anything except digits. I do have one case of '%s%s%s', but I can change it to
'%s' and convert the result into the needed output, so that's not
important. Think something along the lines of "abckdaldj iweo%s
qwierxcnv !%sjd".
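
For that situation, something along these lines might do. It is a sketch that assumes non-negative ints, no adjacent placeholders, and no digits in the literal text; deformat_ints is a made-up name.

```python
import re

def deformat_ints(formatted, template):
    """Reverse a '%s' template whose values are all non-negative ints."""
    # Escape the literal pieces; each placeholder must be a run of digits.
    parts = [re.escape(part) for part in template.split("%s")]
    match = re.fullmatch(r"(\d+)".join(parts), formatted)
    if match is None:
        raise ValueError("formatted string does not fit the template")
    return tuple(int(group) for group in match.groups())

template = "abckdaldj iweo%s qwierxcnv !%sjd"
print(deformat_ints(template % (12, 7), template))   # (12, 7)
```

Because the literal text contains no digits, each digit run can only end where the next literal piece begins, so there is no ambiguity to resolve.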

Oct 15 '06 #7

That was written in haste. All the information is true. The question:
I've already created a function to do this, using your original
deformat function. Is there any way in which it might go wrong?

Oct 15 '06 #8

Again, haste. I used Peter's deformat function.

Oct 15 '06 #9
On 14 Oct 2006 05:35:02 -0700,
"Dustan" <Du**********@gmail.com> wrote:
Is there any builtin function or module with a function similar to my
made-up, not-written deformat function as follows? I can't imagine it
would be too easy to write, but possible...
[ snip ]
Any input? I've looked through the documentation of the string module
and re module, did a search of the documentation and a search of this
group, and come up empty-handed.
Track down pyscanf. (Google is your friend, but I can't find any sort
of licensing/copyright information, and the web addresses in the source
code aren't available, so I hesitate to post my ancient copy.)

HTH,
Dan

--
Dan Sommers
<http://www.tombstonezero.net/dan/>
"I wish people would die in alphabetical order." -- My wife, the genealogist
Oct 15 '06 #10
Only you know what anomalies will be found in your data-sets. If
you know/assert that

-the only stuff in the formatting string is one set of characters

-that stuff in the replacement-values can never include any of
your format-string characters

-that you're not using funky characters/formatting in your format
string (such as "%%" possibly followed by an "s" to get the
resulting text of "%s" after formatting, or trying to use other
formatters such as the aforementioned "%f" or possibly "%i")

then you should be safe. It could also be possible (with my
original replacement of "(.*)") if your values will never include
any substring of your format string. If you can't guarantee
these conditions, you're trying to make a cow out of hamburger.
Or a pig out of sausage. Or a whatever out of a hotdog. :)

Conventional wisdom would tell you to create a test-suite of
format-strings and sample values (preferably worst-case funkiness
in your expected format-strings/values), and then have a test
function that will assert that the unformatting of every
formatted string in the set returns the same set of values that
went in. Something like

import re

tests = {
    'I was %s but now I am %s': [
        ('hot', 'cold'),
        ('young', 'old'),
    ],
    'He has 3 %s and 2 %s': [
        ('brothers', 'sisters'),
        ('cats', 'dogs'),
    ],
}

for format_string, values in tests.items():
    # build the unformatting regex from the format string
    unformatter = re.compile(format_string.replace('%s', '(.*)'))
    for value_tuple in values:
        formatted = format_string % value_tuple
        unformatted = unformatter.search(formatted).groups()
        # report any value tuple that does not survive the round trip
        if unformatted != value_tuple:
            print("%s doesn't match %s when unformatting %s" % (
                unformatted,
                value_tuple,
                format_string))

-tkc


Oct 15 '06 #11
Thanks for all your help. I've gotten the idea.

Oct 15 '06 #12
