Hi,
I have a string, e.g. 'C6 H12 O6', that I wish to split up to give 2
strings,
'C H O' and '6 12 6'. I have played with string.split() and the re module,
but can't quite get there.
Any help would be greatly appreciated.
Thanks,
Mark.
Mark Light wrote: Hi, I have a string e.g. 'C6 H12 O6' that I wish to split up to give 2 strings 'C H O' and '6 12 6'. I have played with string.split() and the re module - but can't quite get there.
Any help would be greatly appreciated.
Thanks,
Mark.
I'm assuming that these are chemical compounds, so you're not limited to
one-character symbols.
Here's how I'd do it:
import re
re_pat = re.compile(r'([A-Z]+)(\d+)')
text = 'C6 H12 O6'
# find each component; returns a list of tuples, e.g. [('C', '6'), ...]
component = re_pat.findall(text)
# split into separate sequences of symbols and counts
symbols, counts = zip(*component)
# create the strings
symbols = ' '.join(symbols)
counts = ' '.join(counts)
--Andy
that works great - many thanks.
Mark Light wrote: Hi, I have a string e.g. 'C6 H12 O6' that I wish to split up to give 2 strings 'C H O' and '6 12 6'. I have played with string.split() and the re module - but can't quite get there.
Any help would be greatly appreciated.
import re
# lazily match the symbol, then the trailing digits
molecule_re = re.compile(r"(.+?)([0-9]+)")
def processMolecule(molecule):
    elements = []
    numbers = []
    for item in molecule.split():
        # findall returns [(symbol, count)] for a token such as 'H12'
        element, number = molecule_re.findall(item)[0]
        elements.append(element)
        numbers.append(number)
    elements = ' '.join(elements)
    numbers = ' '.join(numbers)
    return (elements, numbers)
print(processMolecule('C6 H12 O6'))
trp: I'm assuming that these are chemical compounds, so you're not limited to one-character symbols.
The problem is underspecified. Usually 2-character symbols (or 3-character
ones for some elements with high atomic number, if you don't assume the newer
IUPAC names like "Dubnium", which was also called Unnilpentium (Unp) or,
depending on your political persuasion, Joliotium (Jl) or Hahnium (Ha)) have
the first letter capitalized and the rest in lower case.
re_pat = re.compile('([A-Z]+)(\d+)')
So this should be written ([A-Z][A-Za-z]*)(\d+), where I explicitly allow
both lower and upper case trailing letters to be more accepting. (In some
systems, "CU" is "1 carbon + 1 uranium" and in others it's an alternate way
to write "1 copper", though I suspect it's not allowed in the OP's problem.)
Andrew da***@dalkescientific.com
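For illustration, here is a minimal sketch of the suggested pattern dropped into the earlier findall/zip approach. The function name split_formula and the 'Na2 S1 O4' input are made up for this example, and it assumes every symbol is followed by an explicit count. With the original ([A-Z]+) pattern the 'Na2' token would not match at all, because the lower-case 'a' sits between the capital and the digits.

```python
import re

# Pattern from the post above: one capital letter, optional trailing
# letters of either case, then the count.
ELEMENT_RE = re.compile(r'([A-Z][A-Za-z]*)(\d+)')

def split_formula(text):
    """Return (symbols, counts) as two space-separated strings.

    Hypothetical helper; assumes each symbol carries an explicit count.
    """
    pairs = ELEMENT_RE.findall(text)
    if not pairs:
        # zip(*[]) cannot be unpacked into two names, so handle this early
        return '', ''
    symbols, counts = zip(*pairs)
    return ' '.join(symbols), ' '.join(counts)

print(split_formula('C6 H12 O6'))   # ('C H O', '6 12 6')
print(split_formula('Na2 S1 O4'))   # ('Na S O', '2 1 4')
```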
Anton Vredegoor: The issue seems to be resolved already, but I haven't seen the split and strip combination:
from string import letters,digits
Use "ascii_letters" instead of "letters". The latter is based on the locale,
so it might not work on some machines where "C" (or rather, byte 67) isn't
a letter in the local alphabet.
Andrew da***@dalkescientific.com
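The split-and-strip code itself is not shown in this thread, so the following is only a guess at what such a combination could look like, using ascii_letters as recommended above. The function name split_formula_strip is hypothetical. It assumes tokens are separated by whitespace and that each token is letters followed by digits.

```python
from string import ascii_letters, digits

def split_formula_strip(text):
    """Split 'C6 H12 O6' into ('C H O', '6 12 6') without regular expressions.

    Assumed reconstruction, not the code from the original post.
    """
    symbols = []
    counts = []
    for token in text.split():
        # stripping trailing digits leaves the symbol
        symbols.append(token.rstrip(digits))
        # stripping leading letters leaves the count
        counts.append(token.lstrip(ascii_letters))
    return ' '.join(symbols), ' '.join(counts)

print(split_formula_strip('C6 H12 O6'))   # ('C H O', '6 12 6')
```

A token with no count (a bare 'S', say) would yield an empty count string here, so the joined counts would no longer line up with the symbols. As a side note, ascii_letters is also the only one of the two names that still exists in Python 3.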