An Odd Delimiter

For some strange reason, the fields in my file are separated by an unusual character (it does not display here). I didn't choose the character, and it is, at the moment, unlikely to change. In Unix the character appears as ^[ inside a file, but any time I attempt to run a command on the line to split on the character, it deletes the character after the ^[ and fails to even find a ^[.

A line appears as follows, with ^[ representing the character above.

123456^[SomeString^[AnotherString^[0.00000 sec

I am attempting to get the time before the "sec", but so far my commands have failed because the character tends to delete the first digit before the '.'.
My string ends up as:

123456omeStringnotherString.00000 sec

The strings may contain * characters, spaces, and underscores, which makes the line difficult to split, and I cannot retrieve the correct number of seconds.
Standard Unix awk, grep, and other commands have been attempted.
Attempts to use a Perl regular expression on just the ^ return no results:

    @cutline = split(/\^/, $line, 5);

Splitting on \[ also produces no results.
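
A note on why both attempts find nothing: vi displays the single control character ESC (ASCII 27, 0x1B) as ^[, so the line holds no literal ^ or [ for the pattern to match. The sketch below assumes the delimiter really is ESC, which the thread does not confirm, and splits on that character directly.

```perl
use strict;
use warnings;

# Assumption: the delimiter is the single ESC character (0x1B), shown by vi as ^[.
my $line = "123456\x1BSomeString\x1BAnotherString\x1B0.00000 sec";

# Split on the control character itself rather than on '^' or '['.
my @cutline = split /\x1B/, $line;

# The last field holds the time followed by " sec".
my ($seconds) = $cutline[-1] =~ /^(\d+\.\d+) sec$/;

print "$seconds\n";
```

Because the split is on one specific character, the *, spaces, and underscores inside the string fields do not interfere.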

If anyone can offer some help, it would be appreciated.
(I've noticed the character doesn't appear at all in the forum either... how fun. It looks a bit like this: <- )
Feb 5 '08 #1
Post some of the lines from the file or, better, attach some of them to a post.
Feb 5 '08 #2
As requested, here are a couple of lines of the file as seen in WordPad:

000000*XXX_XXXXX_XXXX0.305687 sec
000003xxx*XXXX_XXXX_XXXX_XXXX0.046740 sec

As seen in vi
000000^[*^[XXX_XXXXX_XXXX^[0.305687 sec

Replace the X's with whatever you'd like; each field can be any length and may include spaces. I'm trying to retrieve the time in seconds, which can range from less than 1 second to more than 10, so I can't know exactly how many digits to retrieve before the space.

Once again, the delimiter that is used (which shows in WordPad) will probably not be seen here. It is a left-pointing arrow, and I haven't been able to find its ASCII value yet.
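
One way to pin down an invisible delimiter is to print the numeric code of every character that falls outside printable ASCII. This is a minimal sketch: `odd_chars` is a hypothetical helper name, and the sample line uses ESC (0x1B) only as a stand-in for the unknown character.

```perl
use strict;
use warnings;

# Hypothetical helper: return the hex code of each character in the line
# that is not printable ASCII (space through tilde).
sub odd_chars {
    my ($line) = @_;
    return map { sprintf('0x%02X', ord($_)) }
           grep { !/[\x20-\x7E]/ }
           split //, $line;
}

# Sample line built with ESC (0x1B) standing in for the unknown delimiter.
my $sample = "000000\x1B*\x1BXXX_XXXXX_XXXX\x1B0.305687 sec";

print join(' ', odd_chars($sample)), "\n";
```

On a real file, chomp each line before passing it in; the code that keeps repeating is the delimiter, and it can then be used in a pattern as \xNN.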
Feb 5 '08 #3
Since I cannot see the actual delimiter, here is an example using [ as the delimiter. The regex I am using is greedy and can be fine-tuned for your needs. In effect, it finds the delimiter that is directly followed by a digit (the last one in these samples), then captures everything from that digit to the end of the string.

    use strict;
    use warnings;

    my @sec_array;
    my @data = ('000000^[some string here^[000.00 sec',
                '000000^[some string here^[111.11 sec',
                '000000^[some string here^[222.22 sec',
                '000000^[some string here^[333.33 sec');

    # Capture from the first digit after a '[' through the end of the line.
    for (@data) {
        push @sec_array, $_ =~ /\[(\d+.*)$/g;
    }

    print join("\n", @sec_array), "\n";

Output:

    000.00 sec
    111.11 sec
    222.22 sec
    333.33 sec

Feb 6 '08 #4
I have been able to get the sequence after a delimiter in the past. The problem in this case is that I cannot find an expression that matches the delimiter, which doesn't seem possible at the moment (if I can't get the blasted thing to show). Is there a way to retrieve the time going backwards? What I would like is, starting from the final space before 'sec', to retrieve X digits, a '.', and then 1 or 2 more digits, all going in reverse. Or, if that isn't feasible, find the first digit, then step back and pick up the floating-point number up to the space. I've been looking through my books and online for the subject, but keep coming up short. I appreciate the help.
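
The "going backwards" idea can be sketched by anchoring the pattern at the end of the line, so the delimiter never has to be matched at all. `seconds_at_end` is a hypothetical helper name, and the sample lines use ESC (0x1B) as an assumed stand-in for the delimiter.

```perl
use strict;
use warnings;

# Hypothetical helper: return the number that sits directly before a
# trailing " sec", or undef if the line does not end that way.
sub seconds_at_end {
    my ($line) = @_;
    # \s*$ tolerates a trailing newline or carriage return.
    my ($seconds) = $line =~ /(\d+\.\d+) sec\s*$/;
    return $seconds;
}

# Sample lines; \x1B is only an assumed stand-in for the real delimiter.
my @lines = (
    "000000\x1B*\x1BXXX_XXXXX_XXXX\x1B0.305687 sec",
    "000003\x1Bxxx\x1B*\x1BXXXX_XXXX_XXXX_XXXX\x1B12.046740 sec\r\n",
);

print seconds_at_end($_), "\n" for @lines;
```

The pattern places no limit on the number of digits on either side of the '.', so it copes with times below 1 second and above 10.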
Feb 6 '08 #5
Using the code I posted above, you can set a minimum and maximum number of digits for \d if you wish.

Excerpt from perlre:

    {n}    Match exactly n times
    {n,}   Match at least n times
    {n,m}  Match at least n but not more than m times

As an example, if you change the regex above to the following, you can tell it the minimum and maximum number of digits to allow:

    push @sec_array, $_ =~ /\[(\d{1,3}\.\d{2}).*$/g;
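
One caveat worth seeing in action: \d{2} captures exactly two decimals, while the sample lines earlier in the thread carry six. The sketch below compares the bounded form with an open-ended one, again using '[' as the stand-in delimiter; the variable names are illustrative only.

```perl
use strict;
use warnings;

# Sample taken from the thread, with '[' standing in for the delimiter.
my $line = '000000^[XXX_XXXXX_XXXX^[0.305687 sec';

# Exactly two decimals: only the first two digits after the point are captured.
my ($fixed) = $line =~ /\[(\d{1,3}\.\d{2}).*$/;

# One or more decimals: the whole fraction is captured.
my ($open) = $line =~ /\[(\d{1,3}\.\d+).*$/;

print "$fixed\n";   # 0.30
print "$open\n";    # 0.305687
```

If the full precision matters, the {n,} or + form is the safer choice.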
Feb 6 '08 #6
Thanks for the help. I've found that the delimiter is detected if I just have it detect everything other than a character or digit. Your code came in handy. Thanks again.
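
The exact pattern used is not shown in the thread. One plausible reading, offered only as a sketch, is to treat any single character that is neither a letter nor a digit as the delimiter in front of the time. `time_after_delimiter` is a hypothetical name, and ESC (0x1B) is again an assumed stand-in.

```perl
use strict;
use warnings;

# Hypothetical helper: capture the number that follows any one character
# that is neither a letter nor a digit and precedes a trailing " sec".
sub time_after_delimiter {
    my ($line) = @_;
    my ($seconds) = $line =~ /[^A-Za-z0-9](\d+\.\d+) sec\s*$/;
    return $seconds;
}

# \x1B is only an assumed stand-in for the real delimiter.
my $line = "000000\x1B*\x1BXXX_XXXXX_XXXX\x1B0.305687 sec";

print time_after_delimiter($line), "\n";
```

The negated class matches the delimiter without naming it, which is why this works even when the character cannot be typed or displayed.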
Feb 6 '08 #7
