Insert characters into string based on re?

I am attempting to reformat a string, inserting newlines before certain
phrases. For example, in formatting SQL, I want to start a new line at
each JOIN condition. Noting that strings are immutable, I thought it
best to split the string at the key points, then join with '\n'.

Regexps seem the best way to identify the points in the string
('LEFT.*JOIN' to cover 'LEFT OUTER JOIN' and 'LEFT JOIN'), since I need
to identify multiple locations in the string. However, the re.split
method returns the list without the split phrases, and re.findall does
not seem useful for this operation.

Suggestions?

Oct 12 '06 #1

Matt wrote:
> I am attempting to reformat a string, inserting newlines before certain
> phrases. For example, in formatting SQL, I want to start a new line at
> each JOIN condition. Noting that strings are immutable, I thought it
> best to split the string at the key points, then join with '\n'.
>
> Regexps seem the best way to identify the points in the string
> ('LEFT.*JOIN' to cover 'LEFT OUTER JOIN' and 'LEFT JOIN'), since I need
> to identify multiple locations in the string. However, the re.split
> method returns the list without the split phrases

Not without some minor effort on your part :-)
See below.

> and re.findall does
> not seem useful for this operation.
>
> Suggestions?

Read the fine manual:
"""
split( pattern, string[, maxsplit = 0])

Split string by the occurrences of pattern. If capturing parentheses
are used in pattern, then the text of all groups in the pattern are
also returned as part of the resulting list. If maxsplit is nonzero, at
most maxsplit splits occur, and the remainder of the string is returned
as the final element of the list. (Incompatibility note: in the
original Python 1.5 release, maxsplit was ignored. This has been fixed
in later releases.)
>>> re.split('\W+', 'Words, words, words.')
['Words', 'words', 'words', '']

# Now see what happens when you use capturing parentheses:
>>> re.split('(\W+)', 'Words, words, words.')
['Words', ', ', 'words', ', ', 'words', '.', '']
>>> re.split('\W+', 'Words, words, words.', 1)
['Words', 'words, words.']
"""

HTH,
John

Oct 12 '06 #2
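
The capturing-parentheses behaviour quoted from the manual can be applied to the original question roughly as follows. This is only a sketch: the pattern, the name JOIN_SPLIT_RE and the function break_before_joins are made up for illustration and are not from the thread, and the pattern covers just a few JOIN variants.

```python
import re

# Hypothetical pattern for illustration: LEFT/RIGHT/FULL with an optional
# OUTER, or INNER, followed by JOIN. The outer parentheses are the capturing
# group that makes re.split keep the phrase; the leading \s+ is discarded.
JOIN_SPLIT_RE = re.compile(
    r"\s+((?:(?:LEFT|RIGHT|FULL)(?:\s+OUTER)?|INNER)\s+JOIN)\b",
    re.IGNORECASE,
)

def break_before_joins(sql):
    """Return sql with a newline inserted before each matched JOIN phrase."""
    parts = JOIN_SPLIT_RE.split(sql)
    # With one capturing group, parts alternates:
    # text, phrase, text, phrase, ..., text
    lines = [parts[0]]
    for phrase, rest in zip(parts[1::2], parts[2::2]):
        lines.append(phrase + rest)
    return "\n".join(lines)

sql = "select * from a LEFT OUTER JOIN b on a.id = b.id left join c on b.id = c.id"
print(break_before_joins(sql))
```

The inner groups are written as (?:...) so that only the whole phrase is captured; any extra capturing group would add further items to the list returned by split.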

Matt wrote:
> I am attempting to reformat a string, inserting newlines before certain
> phrases. For example, in formatting SQL, I want to start a new line at
> each JOIN condition. Noting that strings are immutable, I thought it
> best to split the string at the key points, then join with '\n'.
>
> Regexps seem the best way to identify the points in the string
> ('LEFT.*JOIN' to cover 'LEFT OUTER JOIN' and 'LEFT JOIN'), since I need
> to identify multiple locations in the string. However, the re.split
> method returns the list without the split phrases, and re.findall does
> not seem useful for this operation.
>
> Suggestions?

I think that re.sub is a more appropriate method than split and join.

trivial example (non SQL):
>>> import re
>>> addnlre = re.compile(r'LEFT\s.*?\s*JOIN|RIGHT\s.*?\s*JOIN', re.DOTALL + re.IGNORECASE).sub
>>> addnlre(lambda x: x.group() + '\n', '... LEFT JOIN x RIGHT OUTER join y')
'... LEFT JOIN\n x RIGHT OUTER join\n y'

Oct 13 '06 #3
ha***********@informa.com wrote:
> Matt wrote:
>> I am attempting to reformat a string, inserting newlines before
>> certain phrases. For example, in formatting SQL, I want to start a
>> new line at each JOIN condition. Noting that strings are immutable, I
>> thought it best to split the string at the key points, then join
>> with '\n'.
>
> I think that re.sub is a more appropriate method rather than split and
> join
>
> trivial example (non SQL):
> >>> addnlre = re.compile(r'LEFT\s.*?\s*JOIN|RIGHT\s.*?\s*JOIN',
> ...                      re.DOTALL + re.IGNORECASE).sub
> >>> addnlre(lambda x: x.group() + '\n',
> ...         '... LEFT JOIN x RIGHT OUTER join y')
> '... LEFT JOIN\n x RIGHT OUTER join\n y'

Quite apart from the original requirement being to insert newlines before
rather than after the phrase, I wouldn't have said re.sub was appropriate.
>>> addnlre(lambda x: x.group() + '\n',
...         "select * from whatever where action in ['user left site', 'user joined site']")
"select * from whatever where action in ['user left site', 'user join\ned site']"

or with the newline before the pattern:
>>> addnlre(lambda x: '\n'+x.group(),
...         "select * from whatever where action in ['user left site', 'user joined site']")
"select * from whatever where action in ['user \nleft site', 'user joined site']"

Oct 13 '06 #4
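
Both objections above (newline on the wrong side, and matches that start at 'left' and run on into 'joined') can be reduced with a tighter pattern. The following is a minimal sketch, not anything posted in the thread: JOIN_RE and newline_before_joins are hypothetical names, and the pattern assumes that only whitespace and an optional OUTER may sit between the keyword and JOIN, with \b word boundaries on both ends.

```python
import re

# Assumption for illustration: a JOIN phrase is LEFT/RIGHT/FULL with an
# optional OUTER, or INNER, followed by JOIN as a whole word. The leading
# \s* swallows the whitespace that the newline replaces.
JOIN_RE = re.compile(
    r"\s*\b((?:(?:LEFT|RIGHT|FULL)(?:\s+OUTER)?|INNER)\s+JOIN)\b",
    re.IGNORECASE,
)

def newline_before_joins(sql):
    """Insert a newline before each matched JOIN phrase."""
    return JOIN_RE.sub(lambda m: "\n" + m.group(1), sql)

tricky = "select * from whatever where action in ['user left site', 'user joined site']"
print(newline_before_joins(tricky) == tricky)   # the tricky input is left alone
print(newline_before_joins("... LEFT JOIN x RIGHT OUTER join y"))
```

This only avoids the accidental 'left ... joined' match. A literal that really contains the words LEFT JOIN would still be altered, so it is a narrower pattern and not a SQL parser.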
Matt wrote:
> I am attempting to reformat a string, inserting newlines before certain
> phrases. For example, in formatting SQL, I want to start a new line at
> each JOIN condition. Noting that strings are immutable, I thought it
> best to split the string at the key points, then join with '\n'.
>
> Regexps seem the best way to identify the points in the string
> ('LEFT.*JOIN' to cover 'LEFT OUTER JOIN' and 'LEFT JOIN'), since I need
> to identify multiple locations in the string. However, the re.split
> method returns the list without the split phrases, and re.findall does
> not seem useful for this operation.
>
> Suggestions?

Matt,

You may want to try this solution:
>>> import SE
>>> Formatter = SE.SE (' "~(?i)(left|inner|right|outer).*join~=\n=" ')
# Details explained below the dotted line
>>> print Formatter ('select id, people.* from ids left outer join people where ...\nSELECT name, job from people INNER JOIN jobs WHERE ...;\n')
select id, people.* from ids
left outer join people where ...
SELECT name, job from people
INNER JOIN jobs WHERE ...;

You may add other substitutions as required one by one, interactively
tweaking each one until it does what it is supposed to do:
>>> Formatter = SE.SE ('''
"~(?i)(left|inner|right|outer).*join~=\n ="   # Add an indentation
"where=\n where" "WHERE=\n WHERE"             # Add a newline also before 'where'
";\n=;\n\n"                                   # Add an extra line feed
"\n=;\n\n"                                    # And add any missing semicolon
# etc.
''')
>>> print Formatter ('select id, people.* from ids left outer join people where ...\nSELECT name, job from people INNER JOIN jobs WHERE ...;\n')
select id, people.* from ids
left outer join people
where ...;

SELECT name, job from people
INNER JOIN jobs
WHERE ...;
http://cheeseshop.python.org/pypi?:a...SE&version=2.3
Frederic
----------------------------------------------------------------------------------------------------------------------

The anatomy of a replacement definition
>>> Formatter = SE.SE (' "~(?i)(left|inner|right|outer).*join~=\n=" ')

target=substitute: the first '=' separates the target from its substitute
=: each following '=' stands for the matched target
~ ~: contain a regular expression
" ": contain a definition containing white space

Oct 14 '06 #5
Frederic Rentsch wrote:
> Matt wrote:
>> I am attempting to reformat a string, inserting newlines before certain
>> phrases. For example, in formatting SQL, I want to start a new line at
>> each JOIN condition. Noting that strings are immutable, I thought it
>> best to split the string at the key points, then join with '\n'.
>>
>> Regexps seem the best way to identify the points in the string
>> ('LEFT.*JOIN' to cover 'LEFT OUTER JOIN' and 'LEFT JOIN'), since I need
>> to identify multiple locations in the string. However, the re.split
>> method returns the list without the split phrases, and re.findall does
>> not seem useful for this operation.
>>
>> Suggestions?
>
> Matt,
>
> You may want to try this solution:
> >>> import SE
> .... snip
>
> http://cheeseshop.python.org/pypi?:a...SE&version=2.3

For reasons unknown, the new download for SE is on the old page:
http://cheeseshop.python.org/pypi/SE/2.2%20beta.

Frederic
Oct 15 '06 #6
Hi,
initially I had the same idea, before I started writing a SQL formatter.
I was sure that coding a few "change" commands in a script would
reformat my SQL statements. But step by step I realized that SQL
statements cannot be formatted by regular expressions. Why not?
Because there is a risk that you change, e.g., values in literals, and
that changes the result of a query!
Example:

--Select pieces where status like "Join with master piece"

Inserting line breaks before joins using a "change" command would
change the SQL statement into

--Select pieces where status like "\nJoin with master piece"

The new select statement no longer works in the same way as the
original one.

In the meantime, the "script" has about 80 pages of code .....

Regards
GuidoMarcel

Oct 19 '06 #7

You can test it here: http://www.sqlinform.com

Oct 27 '06 #8
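
The literal problem described in post #7 can be reduced, though not removed, by letting the regular expression match quoted literals as well and handing them back unchanged. What follows is a rough sketch under stated assumptions: literals are delimited by plain single or double quotes, SQL comments and dialect-specific quoting are ignored, and the names LITERAL, JOIN, TOKEN_RE and format_joins are invented for this example. It is not a substitute for a real SQL tokenizer.

```python
import re

# Assumption: a literal is anything between a pair of single or double quotes.
# A doubled quote inside a literal ('it''s') simply reads as two adjacent
# literals, which are both protected.
LITERAL = r"'[^']*'" + "|" + r'"[^"]*"'

# Assumption: a JOIN phrase is JOIN as a whole word, optionally preceded by
# LEFT/RIGHT/FULL [OUTER] or INNER.
JOIN = r"\s*\b((?:(?:(?:LEFT|RIGHT|FULL)(?:\s+OUTER)?|INNER)\s+)?JOIN)\b"

# Group 1 is a quoted literal, group 2 is a JOIN phrase outside any literal.
TOKEN_RE = re.compile("(" + LITERAL + ")|" + JOIN, re.IGNORECASE)

def format_joins(sql):
    """Insert a newline before JOIN phrases, leaving quoted literals untouched."""
    def repl(match):
        if match.group(1) is not None:
            return match.group(1)          # literal: return it unchanged
        return "\n" + match.group(2)       # JOIN phrase: break before it
    return TOKEN_RE.sub(repl, sql)

sql = 'Select pieces from a join b on a.id = b.id where status like "Join with master piece"'
print(format_joins(sql))
```

Because the literal alternative is tried first and consumes everything up to the closing quote, a JOIN inside the literal is never offered to the second alternative. An unterminated quote or a comment containing a quote would defeat this, which is the kind of case that pushes a formatter toward a proper tokenizer.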
