Hello all,
I have this line of numbers:
04242005 18:20:42-0.000002, 271.1748608, [-4.119873046875,
3.4332275390625, 105.062255859375], [0.093780517578125, 0.041015625,
-0.960662841796875], [0.01556396484375, 0.01220703125,
0.01068115234375]
repeated several times in a text file and I would like each element to
be part of a vector. How do I do this? I am not very good with regexps,
as you can see.
Thanks in advance,
Jake.
"se*******@gmail.com" <se*******@gmail.com> writes:
> Hello all,
> I have this line of numbers:
> 04242005 18:20:42-0.000002, 271.1748608, [-4.119873046875, 3.4332275390625, 105.062255859375], [0.093780517578125, 0.041015625, -0.960662841796875], [0.01556396484375, 0.01220703125, 0.01068115234375]
> repeated several times in a text file and I would like each element to be part of a vector. how do I do this ? I am not very capable in using regexp as you can see.
You don't need a regexp to do that.
Use the split string method. It will split on spaces by default. If you want
to keep the values inside "[]" together, remove the spaces before splitting or
split on the "[" char first and then split the first item using spaces as a
separator.
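A minimal sketch of the second suggestion (split on "[" first, then split the pieces on whitespace), assuming each record sits on a single line and always has the layout shown in the question. The names line, pieces, head and groups are illustrative and do not come from the thread.

```python
line = ("04242005 18:20:42-0.000002, 271.1748608, "
        "[-4.119873046875, 3.4332275390625, 105.062255859375], "
        "[0.093780517578125, 0.041015625, -0.960662841796875], "
        "[0.01556396484375, 0.01220703125, 0.01068115234375]")

# Splitting on "[" gives the date/time/value header first,
# followed by one piece per bracketed group.
pieces = line.split("[")

# Header: turn commas into spaces, then split on whitespace.
head = pieces[0].replace(",", " ").split()

# Groups: drop "]" and "," the same way and convert each value to float.
groups = [
    [float(v) for v in piece.replace("]", " ").replace(",", " ").split()]
    for piece in pieces[1:]
]
```

Here head keeps the date, the time field and the single value as strings, while groups is a list of three lists of floats.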
Be seeing you,
--
Jorge Godoy <go***@ieee.org>
Hello,
I am not understanding your answer, but I probably asked the wrong
question :-)
I want to remove the commas and the square brackets ([ and ]) and
rewrite this whole line (and all the ones following it in the text file)
so that only spaces act as delimiters. How do I do this?
I have tried this:
import re
f = open(name3, 'r')
r = r"\d+\.\d*"
for line in f:
    cols = line.split()
    data1 = re.findall(r, line)
and then I don't know what to do with either cols or data1.
Jake.
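One way to do what is asked here, sketched under the assumption that commas and square brackets never carry meaning in the file. The helper names to_space_delimited and rewrite_file are hypothetical, as are the file paths passed to them.

```python
def to_space_delimited(line):
    """Replace commas and square brackets with spaces and collapse whitespace."""
    for ch in ",[]":
        line = line.replace(ch, " ")
    # split() with no argument drops runs of whitespace (and the newline);
    # joining with a single space leaves exactly one delimiter between fields.
    return " ".join(line.split())


def rewrite_file(src_path, dst_path):
    """Write a space-delimited copy of src_path to dst_path, line by line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(to_space_delimited(line) + "\n")
```

Note that the time and the value glued to it ("18:20:42-0.000002") contain no comma or bracket, so they stay together as one field; separating them would need an extra split on "-".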
On Wed, 27 Apr 2005 07:56:11 -0700, se*******@gmail.com wrote:
> Hello all,
> I have this line of numbers:
> 04242005 18:20:42-0.000002, 271.1748608, [-4.119873046875, 3.4332275390625, 105.062255859375], [0.093780517578125, 0.041015625, -0.960662841796875], [0.01556396484375, 0.01220703125, 0.01068115234375]
> repeated several times in a text file and I would like each element to be part of a vector. how do I do this ? I am not very capable in using regexp as you can see.
I think, based on the responses you've gotten so far, that perhaps you
aren't being clear enough.
Some starter questions:
* Is that all on one line in your file?
* Are there ever variable numbers of the [] fields?
* What do you mean by "vectors"?
If the line format is stable (the number of fields never varies), and
especially if that is all on one line, then given that you are not
familiar with regexps I wouldn't muck about with them. (Even for me, it's
borderline whether I'd use one here.) Instead, follow along with the
session below; it'll probably help, though since I don't know precisely
what you're asking I can't give a complete solution:
Python 2.3.5 (#1, Mar 3 2005, 17:32:12)
[GCC 3.4.3 (Gentoo Linux 3.4.3, ssp-3.4.3-0, pie-8.7.6.6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> x = "04242005 18:20:42-0.000002, 271.1748608, [-4.119873046875, 3.4332275390625, 105.062255859375], [0.093780517578125, 0.041015625, -0.960662841796875], [0.01556396484375, 0.01220703125, 0.01068115234375]"
>>> x.split(',', 2)
['04242005 18:20:42-0.000002', ' 271.1748608', ' [-4.119873046875, 3.4332275390625, 105.062255859375], [0.093780517578125, 0.041015625, -0.960662841796875], [0.01556396484375, 0.01220703125, 0.01068115234375]']
>>> splitted = x.split(',', 2)
>>> splitted[2]
' [-4.119873046875, 3.4332275390625, 105.062255859375], [0.093780517578125, 0.041015625, -0.960662841796875], [0.01556396484375, 0.01220703125, 0.01068115234375]'
>>> import re
>>> safetyChecker = re.compile(r"^[-\[\]0-9,. ]*$")
>>> if safetyChecker.match(splitted[2]):
...     eval(splitted[2], {}, {})
...
([-4.119873046875, 3.4332275390625, 105.062255859375], [0.093780517578125, 0.041015625, -0.960662841796875], [0.01556396484375, 0.01220703125, 0.01068115234375])
>>> splitted[0].split()
['04242005', '18:20:42-0.000002']
>>> splitted[0].split()[1].split('-')
['18:20:42', '0.000002']
I'd like to STRONGLY EMPHASIZE that using "eval" is very dangerous if you
can't trust the source; *any* Python code will be run. That is why I am
extra paranoid and double-check that the expression contains only the
characters listed in that simple regex. (Anyone who can construct a
malicious string out of those characters will get my sincere admiration.)
You may do as you please, of course, but I believe it is not helpful to
suggest security holes on comp.lang.python :-) The coincidence of that
part of your data, which is also the most challenging to parse, exactly
matching Python syntax is too much to pass up.
This should give you some good ideas; if you post more detailed questions
we can probably be of more help.
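On later Python versions (2.6 and up), the standard library's ast.literal_eval is a swapped-in alternative to the guarded eval above: it accepts only Python literals (numbers, strings, lists, tuples and the like) and raises an exception for anything else. It was not available in the 2.3.5 interpreter shown in the session. A minimal sketch, where tail stands in for splitted[2]:

```python
import ast

# Stand-in for splitted[2] from the session above.
tail = (" [-4.119873046875, 3.4332275390625, 105.062255859375],"
        " [0.093780517578125, 0.041015625, -0.960662841796875],"
        " [0.01556396484375, 0.01220703125, 0.01068115234375]")

# strip() removes the leading space left over from the split;
# the comma-separated lists then parse as a tuple of three lists of floats.
triplets = ast.literal_eval(tail.strip())
```

Because literal_eval never executes code, the character whitelist becomes an optional extra check rather than the only line of defence.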
Jake -
If regexp's give you pause, here is a pyparsing version that, while
verbose, is fairly straightforward. I made some guesses at what some
of the data fields might be, but that doesn't matter much.
Note the use of setResultsName() to give different parse fragments
names so that they are directly addressable in the results, instead of
having to count out "the 0'th group is the date, the 1'st group is the
time...". Also, there is a commented-out conversion action, to
automatically convert strings to floats during parsing.
Download pyparsing at http://pyparsing.sourceforge.net.
Good luck,
-- Paul
data = """04242005 18:20:42-0.000002, 271.1748608, [-4.119873046875,
3.4332275390625, 105.062255859375], [0.093780517578125, 0.041015625,
-0.960662841796875], [0.01556396484375, 0.01220703125,
0.01068115234375]"""
from pyparsing import *
# punctuation is matched but suppressed from the results
COMMA = Literal(",").suppress()
LBRACK = Literal("[").suppress()
RBRACK = Literal("]").suppress()
# define a two-digit integer, we'll need a lot of them
int2 = Word(nums, exact=2)
month = int2
day = int2
yr = Combine("20" + int2)
date = Combine(month + day + yr)
hr = int2
min = int2
sec = int2
tz = oneOf("+ -") + Word(nums) + "." + Word(nums)
time = Combine(hr + ":" + min + ":" + sec + tz)
realNum = Combine(Optional("-") + Word(nums) + "." + Word(nums))
# uncomment the next line and reals will be converted from strings to
# floats during parsing
#realNum.setParseAction( lambda s,l,t: float(t[0]) )
triplet = Group(LBRACK + realNum + COMMA + realNum + COMMA + realNum +
                RBRACK)
entry = Group(date.setResultsName("date") +
              time.setResultsName("time") + COMMA +
              realNum.setResultsName("temp") + COMMA +
              Group(triplet + COMMA + triplet + COMMA + triplet
                    ).setResultsName("coords"))
dataFormat = OneOrMore(entry)
results = dataFormat.parseString(data)
for d in results:
    print(d.date)
    print(d.time)
    print(d.temp)
    print(d.coords[0].asList())
    print(d.coords[1].asList())
    print(d.coords[2].asList())
returns:
04242005
18:20:42-0.000002
271.1748608
['-4.119873046875', '3.4332275390625', '105.062255859375']
['0.093780517578125', '0.041015625', '-0.960662841796875']
['0.01556396484375', '0.01220703125', '0.01068115234375']
> safetyChecker = re.compile(r"^[-\[\]0-9,. ]*$")
...doesn't the dot (.) in your character class mean that you are allowing
EVERYTHING (except newline?)
(you would probably want \. instead)
/Simon
Simon Dahlbacka wrote:
> safetyChecker = re.compile(r"^[-\[\]0-9,. ]*$")
> ...doesn't the dot (.) in your character class mean that you are allowing EVERYTHING (except newline?)
The re docs clearly say this is not the case:
'''
[]
Used to indicate a set of characters. Characters can be listed
individually, or a range of characters can be indicated by giving two
characters and separating them by a "-". Special characters are not
active inside sets.
'''
Note the last sentence in the above quotation...
-Peter
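A quick check of the point from the docs, using the same pattern as earlier in the thread. The variable names and the two test strings are made up for illustration.

```python
import re

safety_checker = re.compile(r"^[-\[\]0-9,. ]*$")

# Inside [...] the dot is a literal ".", so digits, dots, commas,
# brackets, hyphens and spaces are the only characters accepted.
accepted = safety_checker.match("[0.5, -1.25]") is not None

# If the dot were a wildcard, letters would slip through. They do not.
rejected = safety_checker.match("[0.5, abc]") is None
```

Both accepted and rejected come out True, which is what the quoted documentation predicts.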
On Thu, 28 Apr 2005 20:53:14 -0400, Peter Hansen wrote:
> The re docs clearly say this is not the case:
> '''
> []
> Used to indicate a set of characters. Characters can be listed
> individually, or a range of characters can be indicated by giving two
> characters and separating them by a "-". Special characters are not
> active inside sets.
> '''
> Note the last sentence in the above quotation...
> -Peter
Aren't regexes /fun/?
Also from that passage, Simon, note the "-" at the very front of
[-\[\]0-9,. ]: placed first, it is a literal hyphen rather than a range
separator. That's another one that's tripped me up more than once.
Wheeee!
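To see why the position of "-" inside a set matters, here is a small sketch with two made-up patterns that contain the same three characters in a different order.

```python
import re

# "-" placed first: a literal hyphen, a comma, or a zero.
literal = re.compile(r"[-,0]")

# "-" between two characters: the range from "," to "0",
# which in ASCII is , - . / 0
ranged = re.compile(r"[,-0]")

slash_in_literal = literal.match("/") is not None
slash_in_ranged = ranged.match("/") is not None
```

slash_in_literal is False while slash_in_ranged is True: moving the hyphen silently widens the set to include "." and "/".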
"Some people, when confronted with a problem, think ``I know, I'll use
regular expressions.'' Now they have two problems." - jwz http://www.jwz.org/hacks/marginal.html