Regexp Neg. set of chars HowTo?

Hi!

I want to replace some sequences in an HTML document.
For example:
a-
b
should become ab

but:
xxx -
b
must stay unchanged, because it is not a split word.

I want to search and replace with re, but I don't know how to negate this
set: ['\ \n\t'].

For now I use the full set without these chars, but a negated set would be
better and shorter.

OK, I can use [^\s], but I want to know how to negate a set of chars.
sNorm1= '([^[\ \t\n]]{1})\-\<br\ \/\>\n' - this is not working.

Thanks for the help:
dd

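import re
import sys

# Template: one allowed character, then "-<br />" and a newline.
sNorm1 = r'([%s])-<br />\n'
# Build the character class from every byte value except
# space (32), CR (13), LF (10) and tab (9).
c = list(range(0, 256))
c.remove(32)
c.remove(13)
c.remove(10)
c.remove(9)
s = ["\\x%02x" % v for v in c]  # two-digit hex escapes, e.g. \x41
sNorm1 = sNorm1 % "".join(s)
print(sNorm1)

def Normalize(Text):
    rx = re.compile(sNorm1)
    def replacer(match):
        # keep only the character before the hyphen
        return match.group(1)
    return rx.sub(replacer, Text)

print(Normalize('a -<br />\nb'))
print(Normalize('a-<br />\nb'))
sys.exit()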

Dec 20 '06 #1
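
For reference, the attempted pattern '([^[\ \t\n]]{1})...' fails because a character set cannot be nested: the set ends at the first ], so the regexp reads it as "any char except [, space, tab, newline" followed by a literal ]. A negated set is written by placing ^ directly after the opening bracket. A minimal sketch of the de-hyphenation with a negated set follows; the names HYPHEN_BREAK and normalize are made up for this illustration, and \r is excluded as well to mirror the c.remove(13) above.

```python
import re

# Negated set: any single character except space, tab, CR or LF.
# The ^ goes right after the opening bracket; no inner brackets are needed.
HYPHEN_BREAK = re.compile(r'([^ \t\r\n])-<br />\n')

def normalize(text):
    # Keep the character before the hyphen, drop the hyphen and the break.
    return HYPHEN_BREAK.sub(r'\1', text)

print(repr(normalize('a -<br />\nb')))  # a space precedes the hyphen, so no change
print(repr(normalize('a-<br />\nb')))   # the split word is joined
```

This only covers the exact '<br />' plus newline form used in the post.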

It looks like you are trying to de-hyphenate words that have been
broken across line breaks.

Well, this isn't a regexp solution, it uses pyparsing instead. But
I've added a number of other test cases which may be problematic for an
re.

-- Paul

from pyparsing import makeHTMLTags, Literal, Word, alphas, Suppress

brTag, brEndTag = makeHTMLTags("br")
hyphen = Literal("-")
hyphen.leaveWhitespace()  # don't skip whitespace before matching this

collapse = Word(alphas) + Suppress(hyphen) + Suppress(brTag) \
    + Word(alphas)
# define action to replace expression with the word before hyphen
# concatenated with the word after the <BR> tag
collapse.setParseAction(lambda toks: toks[0] + toks[1])

print(collapse.transformString('a -<br />\nb'))
print(collapse.transformString('a-<br />\nb'))
print(collapse.transformString('a-<br/>\nb'))
print(collapse.transformString('a-<br>\nb'))
print(collapse.transformString('a- <BR clear=all>\nb'))

Dec 21 '06 #2
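
The extra test cases above (<br/>, <br>, upper case, attributes, stray whitespace) can also be covered with re, at least roughly. The following is only a sketch under assumptions: BR_BREAK and dehyphenate are hypothetical names, "word" is taken to mean ASCII letters, and the tag is assumed to contain no ">" inside an attribute value. It is not a general HTML parser.

```python
import re

# Hyphen directly after a letter, optional whitespace, any <br ...> tag
# (case-insensitive), optional whitespace, then another letter.
BR_BREAK = re.compile(
    r'(?<=[A-Za-z])-\s*<br\b[^>]*>\s*(?=[A-Za-z])',
    re.IGNORECASE,
)

def dehyphenate(text):
    # The lookbehind/lookahead keep the letters; only the hyphen,
    # the tag and the surrounding whitespace are removed.
    return BR_BREAK.sub('', text)

samples = [
    'a -<br />\nb',          # not a word split: left alone
    'a-<br />\nb',
    'a-<br/>\nb',
    'a-<br>\nb',
    'a- <BR clear=all>\nb',
]
for sample in samples:
    print(repr(dehyphenate(sample)))
```

The \b after "br" keeps the pattern from matching other tags that merely start with those letters.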
Hi!

Thanks for this! I'll use that!

I also found a regexp solution to my question:
import re
testtext = " minion battalion nation dion sion wion alion"
m = re.compile("[^t^l]ion")
print(m.findall(testtext))

I search for all text that is not lion and tion.

dd

Dec 22 '06 #3
durumdara wrote:
I also found a regexp solution to my question:
import re
testtext = " minion battalion nation dion sion wion alion"
m = re.compile("[^t^l]ion")
print(m.findall(testtext))

I search for all text that is not lion and tion.

And ^ion. The first ^ in that character group "negates" the group; the
second is a literal ^, so I guess you meant "[^tl]ion".

Ciao,
Marc 'BlackJack' Rintsch
Dec 22 '06 #4
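
The difference between the two character groups can be checked with a small sketch. The test string is the one from the post with a "^ion" appended, which is an addition made here so the literal-^ case shows up; the variable names are made up for this illustration.

```python
import re

# Test string from the post, plus "^ion" (added for this illustration).
testtext = " minion battalion nation dion sion wion alion ^ion"

# Only the first ^ negates; the second one is a literal ^,
# so this set excludes three characters: t, ^ and l.
with_extra_caret = re.findall(r"[^t^l]ion", testtext)

# This set excludes only t and l.
intended = re.findall(r"[^tl]ion", testtext)

print(with_extra_caret)  # ['nion', 'dion', 'sion', 'wion']
print(intended)          # ['nion', 'dion', 'sion', 'wion', '^ion']
```

Both patterns skip "lion" and "tion"; they differ only in whether "^ion" is accepted.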
