Bytes IT Community

How to replace an attribute with empty string using regex?

Expert 100+
P: 542
I am parsing an XML file and want to replace an attribute's value with an empty string.

Every node has an attribute, something like this:


I am using the following regex pattern, but it's not working:

d ='(id="([a-z0-9A-Z]+)")*',text)
Nov 2 '10 #1
2 Replies

Expert Mod 2.5K+
P: 2,851
In what way is it not working?

Are you parsing the XML with an XML parser such as minidom? If not, you should consider doing so.

If parsing the string directly, have you tried parsing it line by line?

This seems to work:
    >>> import re
    >>> patt = re.compile(r'id="[a-z0-9A-Z]+"')
    >>> s = 'id="1hyx36uhpi780iq8oiu355"   xxxxx xxxxx id="46fhrt5976jkfjhrh"'
    >>> s1 = patt.sub('id=""', s)
    >>> s1
    'id=""   xxxxx xxxxx id=""'
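
If you go the XML-parser route suggested above, a minimal sketch with minidom might look like the following. The sample document, the attribute name `id`, and the helper `blank_ids` are assumptions made up for illustration, since the original XML was not posted.

```python
from xml.dom import minidom

# Hypothetical sample; the poster's real XML was not shown.
xml_text = '<root id="abc123"><item id="x9Y8"/><item name="keep"/></root>'

doc = minidom.parseString(xml_text)

def blank_ids(node):
    """Recursively set every existing id attribute to an empty string."""
    if node.nodeType == node.ELEMENT_NODE:
        if node.hasAttribute("id"):
            node.setAttribute("id", "")
        for child in node.childNodes:
            blank_ids(child)

blank_ids(doc.documentElement)
result = doc.documentElement.toxml()
print(result)
```

Because the parser works on attributes rather than raw text, it does not matter which quote style the file uses or what characters the value contains.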
Nov 2 '10 #2

Expert 100+
P: 983

It looks like you've got a problematic indefinite repetition operator in your statement:

    d ='(id="([a-z0-9A-Z]+)")*',text)
    HERE --------------------^
I think the trailing * lets the whole group match zero times, which will cause undesired zero-length matches.
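
To see the effect, here is a small sketch comparing the starred pattern with the same pattern without the star. The sample text and the variable names are assumptions for illustration only.

```python
import re

# Hypothetical sample text with a single id attribute.
text = 'a id="x1" b'

# The pattern from the question: the outer group may repeat zero times.
loose = re.compile(r'(id="([a-z0-9A-Z]+)")*')
# The same pattern without the trailing star.
strict = re.compile(r'id="[a-z0-9A-Z]+"')

loose_matches = [m.group(0) for m in loose.finditer(text)]
strict_matches = [m.group(0) for m in strict.finditer(text)]

# Count how many of the starred pattern's matches are empty strings.
empty_count = loose_matches.count("")

print(loose_matches)
print(strict_matches)
print(empty_count)
```

With the star, the pattern also succeeds at positions where there is no attribute at all, so a substitution would insert the replacement text in those places too.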

Also, I don't know what your data stream looks like, but you may need to check for single quotes (') enclosing the attribute value as well.
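
One way to allow either quote style, sketched under the assumption that the opening and closing quotes always match, is to capture the quote and reuse it with a backreference. The sample string is made up for illustration.

```python
import re

# Capture the opening quote in group 1 and require the same quote to close.
patt = re.compile(r'''id=(["'])[a-z0-9A-Z]+\1''')

# Hypothetical input mixing double- and single-quoted values.
s = "id=\"1hyx36\" xxxxx id='46fhrt'"

# Emit the captured quote twice so each attribute keeps its own quote style.
s1 = patt.sub(r'id=\1\1', s)
print(s1)
```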

Nov 2 '10 #3
