Bytes IT Community

String Replacement

P: 19
Hello everyone, I've got a simple one today. I have a string and I want to remove every newline ('\n') that sits between a character from [ACGU] and another character from [ACGU], while preserving all the other newlines. For example:

'Musmusculuslet-7gstem-loop\nCCAGGCUGAGGUAGUAGUUUGUACAGUUUGAGGGUCUAUGAUACCACCCGGUACAGGAGA\nUAACUGUACAGGCCACUGCCUUGCCAGG\n
Musmusculuslet-7istem-loop\nCUGGCUGAGGUAGUAGUUUGUGCUGUUGGUCGGGUUGUGACAUUGCCCGCUGUGGAGAUA\nACUGCGCAAGCUACUGCCUUGCUAG\n'

Edited string:

'Musmusculuslet-7gstem-loop\nCCAGGCUGAGGUAGUAGUUUGUACAGUUUGAGGGUCUAUGAUACCACCCGGUACAGGAGAUAACUGUACAGGCCACUGCCUUGCCAGG\n
Musmusculuslet-7istem-loop\nCUGGCUGAGGUAGUAGUUUGUGCUGUUGGUCGGGUUGUGACAUUGCCCGCUGUGGAGAUAACUGCGCAAGCUACUGCCUUGCUAG\n'

I tried the following code but I got a syntax error:

for a in chunk3:
    if ([ACGU]'\n'[ACGU]):
        chunk3 = chunk3.replace('\n','')

Thanks,

Mark
Aug 8 '07 #1
5 Replies


ilikepython
Expert 100+
P: 844
Try this:
import re

s = "Musmusculuslet-7gstem-loop\nCCAGGCUGAGGUAGUAGUUUGUACAGUUUGAGGGUCUAUGAUACCACCCGGUACAGGAGA\nUAACUGUACAGGCCACUGCCUUGCCAGG\n"

# Match a base, a newline, then another base
patt = re.compile(r"[ACGU]\n[ACGU]")
matches = patt.findall(s)

# Each match is three characters long; keep the two bases, drop the newline
for m in matches:
    s = s.replace(m, m[0] + m[2])

Aug 8 '07 #2

bvdet
Expert Mod 2.5K+
P: 2,851
It looks like you are trying to implement a regex solution without understanding how it works. This seems to do what you want:
import re
patt = re.compile(r'[ACGU]\n[ACGU]')
s = 'Musmusculuslet-7gstem-loop\nCCAGGCUGAGGUAGUAGUUUGUACAGUUUGAGGGUCUAUGAUACCACCCGGUACAGGAGA\nUAACUGUACAGGCCACUGCCUUGCCAGG\nMusmusculuslet-7istem-loop\nCUGGCUGAGGUAGUAGUUUGUGCUGUUGGUCGGGUUGUGACAUUGCCCGCUGUGGAGAUA\nACUGCGCAAGCUACUGCCUUGCUAG\n'
s1 = s
for item in patt.findall(s):
    # Replace each matched substring with the same text minus its newline
    s1 = s1.replace(item, (item.replace('\n', '')))
print(s1)
Output:
Musmusculuslet-7gstem-loop
CCAGGCUGAGGUAGUAGUUUGUACAGUUUGAGGGUCUAUGAUACCACCCGGUACAGGAGAUAACUGUACAGGCCACUGCCUUGCCAGG
Musmusculuslet-7istem-loop
CUGGCUGAGGUAGUAGUUUGUGCUGUUGGUCGGGUUGUGACAUUGCCCGCUGUGGAGAUAACUGCGCAAGCUACUGCCUUGCUAG

Step by step:
1. Import the 're' module
2. Define and compile the pattern to match substrings in your string: a character in the set 'ACGU' followed by '\n' followed by a character in the set 'ACGU'.
3. Use the compiled pattern object to find all occurrences of the matched patterns.
>>> patt.findall(s)
['A\nU', 'A\nA']
>>>
4. Use the string method replace to swap each matched substring for a copy of itself with the '\n' removed.
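As a possible alternative to the steps above (not something posted in this thread), the find-then-replace loop could probably be collapsed into a single re.sub call that uses a lookbehind and a lookahead, so that only the newline itself is matched. This is a minimal sketch: the names join_sequence_lines, NEWLINE_INSIDE_SEQUENCE and sample are made up for illustration, and the sample string is a shortened, invented stand-in for the real data.

```python
import re

# Lookbehind/lookahead match only the newline itself, so the neighbouring
# bases stay in place and nothing has to be stitched back together.
# (Hypothetical name, introduced for this sketch.)
NEWLINE_INSIDE_SEQUENCE = re.compile(r'(?<=[ACGU])\n(?=[ACGU])')


def join_sequence_lines(text):
    """Remove newlines that sit between two characters from the set ACGU."""
    return NEWLINE_INSIDE_SEQUENCE.sub('', text)


# Invented, shortened sample in the same shape as the data in the question.
sample = 'let-7g stem-loop\nCCAGG\nUAACU\nlet-7i stem-loop\nCUGGC\nACUGC\n'
result = join_sequence_lines(sample)
print(result)
```

Like the pattern in the steps above, this assumes a header line never begins with A, C, G or U; if one did, the newline before it would be removed as well.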
HTH
Aug 8 '07 #3

bvdet
Expert Mod 2.5K+
P: 2,851
You beat me to it, ilikepython! :(
Aug 8 '07 #4

P: 19
Thanks for the guidance. My question is: why are you creating a second string (s1) and doing the replacements in that string instead of in s? I'm still working my way around understanding regexes.

Mark
Aug 9 '07 #5

bvdet
Expert Mod 2.5K+
P: 2,851
No reason other than to allow further manipulations at the interactive prompt.
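To illustrate the point: Python strings are immutable, so replace returns a new string either way, and the loop gives the same result whether it builds s1 or rebinds a single name. The sketch below uses an invented, shortened sample string, and the name t is only there for the comparison.

```python
import re

patt = re.compile(r'[ACGU]\n[ACGU]')

# Invented, shortened sample in the same shape as the thread's data.
s = 'header-one\nACGU\nUGCA\nheader-two\nGGCC\nAAUU\n'

# Variant 1: keep the original in s and build the result in s1.
s1 = s
for item in patt.findall(s):
    s1 = s1.replace(item, item.replace('\n', ''))

# Variant 2: rebind one name; findall has already collected the matches
# before the loop body starts changing what the name points to.
t = s
for item in patt.findall(t):
    t = t.replace(item, item.replace('\n', ''))

print(s1 == t)
```

Keeping s untouched simply means the original is still around to compare against or to retry with a different pattern.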
Aug 9 '07 #6
