An Odd Delimiter

P: 4
For some strange reason my file is delimited by an odd control character. I didn't choose the character, and it is, at the moment, unlikely to change. In Unix the character appears as ^[ inside a file, but any time I attempt to run a command on the line to split on the character, it deletes the character after the ^[ and fails to even find a ^[.

The line appears as follows, with ^[ representing the character above.

123456^[SomeString^[AnotherString^[0.00000 sec

I am attempting to get the time before the sec, but so far my commands have failed because the character tends to delete the first digit before the decimal point.
My string ends up as:

123456omeStringnotherString.00000 sec

The strings may contain * characters, spaces, and underscores, which makes them difficult to split, and I cannot retrieve the correct number of seconds.
Standard Unix awk, grep, and other commands have been attempted.
Attempts to use a Perl regular expression on just the ^ return no results:
    @cutline = split(/\^/, $line, 5);
Splitting on \[ also produces no results.

If anyone can offer some help, it would be appreciated.
Edit: I've noticed the character doesn't appear at all in the forum either... how fun. It looks a bit like this: <-
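
A note for readers following along: in vi, ^[ is caret notation for a single control byte, ESC (0x1B), rather than a caret followed by a bracket, which would explain why splitting on \^ or \[ finds nothing. A terminal also tends to treat ESC plus the next character as an escape sequence, which could account for the characters that go missing. The following is a minimal sketch, assuming the delimiter really is ESC (the thread never confirms the byte value) and using a sample line shaped like the one above.

```perl
use strict;
use warnings;

# Assumption: the delimiter is the ESC control character (0x1B),
# which vi displays as ^[. "\e" is Perl's escape for that byte.
my $line = "123456\eSomeString\eAnotherString\e0.00000 sec";

# Split on the single ESC byte rather than on a literal ^ or [.
my @fields = split /\e/, $line;

# The last field holds the time; strip the trailing " sec".
(my $seconds = $fields[-1]) =~ s/\s*sec\s*$//;

print "$seconds\n";
```

Because the split is on one specific byte, any * characters, spaces, or underscores inside the fields are left alone.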
Feb 5 '08 #1
6 Replies


KevinADC
Expert 2.5K+
P: 4,059
Post some of the lines from the file or, better, attach some of them to a post.
Feb 5 '08 #2

P: 4
As requested, here are a couple of lines from the file as seen in WordPad:

000000*XXX_XXXXX_XXXX0.305687 sec
000003xxx*XXXX_XXXX_XXXX_XXXX0.046740 sec

As seen in vi
000000^[*^[XXX_XXXXX_XXXX^[0.305687 sec
000003^[xxx*^[XXXX_XXXX_XXXX_XXXX^[0.046740

Replace the X's with whatever you'd like; each field can be any length and may include spaces. I'm trying to retrieve the time in seconds, which can range from less than 1 second to more than 10, so I can't know exactly how many digits to retrieve before the space.

Once again, the delimiter that is used (which shows in WordPad) will probably not be seen here. It is a left-pointing arrow, and I haven't been able to find an ASCII value for it yet.
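
One way to pin down the byte value of a character that will not display is to print the code of every character outside printable ASCII. The sketch below is only illustrative: the sample line is hypothetical, and "\e" (ESC, 0x1B) is used as a stand-in on the assumption that vi's ^[ means ESC. In practice $line would be read from the real file.

```perl
use strict;
use warnings;

# Hypothetical sample line; "\e" stands in for the unknown delimiter
# on the assumption that it is ESC (what vi shows as ^[).
my $line = "000000\e*\eXXX_XXXXX_XXXX\e0.305687 sec";

# Count every character that falls outside printable ASCII.
my %seen;
for my $char (split //, $line) {
    my $code = ord $char;
    $seen{$code}++ if $code < 0x20 || $code > 0x7E;
}

# Report each unusual character as a hex code with its count.
for my $code (sort { $a <=> $b } keys %seen) {
    printf "0x%02X appears %d time(s)\n", $code, $seen{$code};
}
```

Whatever hex value is reported can then be used directly in a pattern, for example as \x1B.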
Feb 5 '08 #3

eWish
Expert 100+
P: 971
Since I cannot see the actual delimiter, here is an example using [ as the delimiter. The regex I am using is greedy and can be fine-tuned for your needs. Essentially, the regex finds a delimiter that is immediately followed by a digit (here, the last one on the line), then captures everything from that digit to the end of the string.

    use strict;
    use warnings;

    my @sec_array;
    my @data = ('000000^[some string here^[000.00 sec',
                '000000^[some string here^[111.11 sec',
                '000000^[some string here^[222.22 sec',
                '000000^[some string here^[333.33 sec');

    for (@data) {
        # Capture from the first digit after a literal [ to the end of the line.
        push @sec_array, $_ =~ /\[(\d+.*)$/g;
    }

    print join("\n", @sec_array);
Prints
000.00 sec
111.11 sec
222.22 sec
333.33 sec

--Kevin
Feb 6 '08 #4

P: 4
I have been able to get the sequence after a delimiter in the past. The problem in this case is that I cannot find a matching expression for the delimiter, and that doesn't seem possible at the moment (if I can't even get the blasted thing to show). Is there a way to retrieve the time going backwards? What I would like, then, is to start from the final space before 'sec' and, working in reverse, retrieve X many digits, a '.', and then 1 or 2 more digits. Or, if that isn't feasible, find the first digit, then step back and pick up the double until the space. I've been looking through my books and online for the subject, but keep coming up short. I appreciate the help.
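
Working "backwards" from the end of the line can be sketched with a pattern anchored at the end of the string, so the delimiter never has to be named. This is a minimal illustration, not a tested solution for the real file: the sample lines are hypothetical, and "\e" is used as a stand-in for the unseen delimiter.

```perl
use strict;
use warnings;

# Hypothetical sample lines; "\e" stands in for the unseen delimiter.
my @lines = (
    "000000\e*\eXXX_XXXXX_XXXX\e0.305687 sec",
    "000003\exxx*\eXXXX XXXX_XXXX\e12.046740 sec",
);

my @times;
for my $line (@lines) {
    # Anchor at the end of the line: digits, a dot, digits,
    # whitespace, then "sec". The delimiter is never mentioned.
    if ($line =~ /(\d+\.\d+)\s+sec\s*$/) {
        push @times, $1;
    }
}

print join("\n", @times), "\n";
```

Because \d+ accepts any number of digits on either side of the dot, the same pattern covers times below 1 second and above 10 seconds.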
Feb 6 '08 #5

eWish
Expert 100+
P: 971
Using the code I posted above, you can set the \d (digits) to a minimum and maximum count if you wish.

Excerpt from perlre:

    {n}    Match exactly n times
    {n,}   Match at least n times
    {n,m}  Match at least n but not more than m times
For example, if you change the regex above to this, you can specify the minimum and maximum number of digits to allow:

    push @sec_array, $_ =~ /\[(\d{1,3}\.\d{2}).*$/g;
--Kevin
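
The sample lines posted earlier in the thread carry six decimal places, so the bounds would likely need widening for that data. A small sketch of the same quantifier idea, using hypothetical lines with a literal [ as the stand-in delimiter, as in the example above:

```perl
use strict;
use warnings;

# Hypothetical lines shaped like the earlier example, with a literal [
# as the stand-in delimiter and six decimal places.
my @data = ('000000^[some string here^[0.305687 sec',
            '000003^[some string here^[10.046740 sec');

my @sec_array;
for (@data) {
    # 1 to 3 digits, a dot, then 1 to 6 digits, followed by " sec".
    push @sec_array, /\[(\d{1,3}\.\d{1,6})\s+sec$/;
}

print join("\n", @sec_array), "\n";
```

The upper bounds of 3 and 6 are assumptions chosen to fit these sample values; they should be adjusted to whatever the real data contains.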
Feb 6 '08 #6

P: 4
Thanks for the help. I've found that the delimiter is detected if I just have it detect everything other than a character or digit. Your code came in handy. Thanks again.
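
The final code was not posted, so the following is only one possible reading of that approach: match any non-word character (\W) in place of the delimiter, then capture the number that follows. The sample line is hypothetical, and "\e" stands in for the unseen delimiter.

```perl
use strict;
use warnings;

# Hypothetical sample line; "\e" stands in for the unseen delimiter.
my $line = "000003\exxx*\eXXXX XXXX_XXXX\e0.046740 sec";

# \W matches anything that is not a letter, digit, or underscore,
# so it matches the delimiter without naming it.
my ($time) = $line =~ /\W(\d+\.\d+)\s+sec\s*$/;

print "$time\n";
```

Note that \W also matches spaces and * characters, so the pattern leans on the trailing "sec" anchor to pick out the right number.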
Feb 6 '08 #7
