extract repeating text segments

I have an archive file (PIDATA) that contains multiple (>30) segments of text like this:

    Archive[0]: d:\archives\piarch.012  (500MB, Used: 9.0%)
            PIarcfilehead[$Workfile: piarfile.cxx $ $Revision: 114 $]::
              Version: 5 Path: d:\archives\piarch.012
              State: 4 Type: 0 (fixed) Write Flag: 1 Shift Flag: 1
              Record Size: 1024 Count: 512000  Add Rate/Hour: 4118.3
              Offsets: Primary: 25853/128000 Overflow: 491596/512000
                   Start Time: 1-Apr-08 22:02:38
                     End Time: Current Time
                  Backup Time: 2-Apr-08 02:01:07
The program repeats this over and over, naming each segment "Archive[1,2,3..]". I need to extract the bold sections (the Archive, Start Time, and End Time lines) and print them on one line. For example, I'd like this:

Archive[0]: d:\archives\piarch.012 (500MB, Used: 9.0%) ..... Start Time: 1-Apr-08 22:02:38 ..... End Time: Current Time

ALL on one line.

Here's my Perl script, but it doesn't seem to work.

    #!/usr/bin/perl
    while(<PIDATA>){

            if (m/Archive.[\d+].*/) {
                    $m1 =~ "$MATCH";
    }
            if (m/Start\sTime.*/) {
                    $m2 =~ "$MATCH";
    }
            if (m/End\sTime.*/) {
                    $m3 =~ "$MATCH";
    }
    print "$m1 \s $m2 \s $m3\n\n";

    }

I tried to loop over the text file, and redirect the output, but the file is empty after running this.

HELP!
Apr 3 '08 #1
nithinpes
The square brackets inside the pattern need to be escaped; otherwise they are treated as a character class. Also, what is the $MATCH that you are matching against after the pattern has already matched?
From your description, I think all you need is to print out those emphasised lines. Try this:
    while(<PIDATA>){
         print $_ if((/^\s*Archive\[\d+\].*/)||(/^\s*Start\s+Time.*/)||(/^\s*End\s+Time.*/)) ;
    }

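To see what escaping the brackets changes, here is a small sketch comparing the two patterns. The helper names `loose_match` and `strict_match` and the `ArchiveX+` test line are made up for illustration. Note that the unescaped pattern still matches a real header line, because `.` happens to consume the `[`, so escaping mainly stops it from accepting lines it should not.

```perl
use strict;
use warnings;

# Pattern from the question: [\d+] is a character class that matches
# exactly one character, either a digit or a literal '+'.
sub loose_match  { return $_[0] =~ /Archive.[\d+]/ ? 1 : 0 }

# Escaped brackets: a literal '[', one or more digits, then a literal ']'.
sub strict_match { return $_[0] =~ /^\s*Archive\[\d+\]/ ? 1 : 0 }

# One real header line and one hypothetical line that is not a header.
for my $line ('Archive[0]: d:\archives\piarch.012', 'ArchiveX+ not a header') {
    printf "%-40s loose=%d strict=%d\n",
        $line, loose_match($line), strict_match($line);
}
```

Both patterns accept the real header; only the loose one accepts the second line.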
Apr 3 '08 #2
I've removed my code and placed yours into the file piarchive.pl.... NO errors, but when I redirect output, the file is empty. Something's wrong...
I run this:

perl piarchive.pl > output

And I get the file "output" but it's empty.
Apr 3 '08 #3
nithinpes
410 Expert 256MB
For the given sample data, I got the desired output.
    open(PIDATA,"data.txt") or die "open failed:$!";
    while(<PIDATA>){
     print $_ if((/^\s*Archive\[\d+\].*/)||(/^\s*Start\s+Time.*/)||(/^\s*End\s+Time.*/)) ;
    }

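The `open` call is the piece the earlier scripts did not have, and it is the likely reason the output file came out empty: reading from a filehandle that was never opened returns nothing, so the loop body never runs. A minimal sketch of that behaviour follows; the sub name `read_unopened` is hypothetical, and the `$SIG{__WARN__}` handler is only there to capture the warning that `use warnings` produces instead of letting it go to STDERR.

```perl
use strict;
use warnings;

# Read from PIDATA without ever opening it and collect any warnings
# Perl raises while doing so.
sub read_unopened {
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    my $count = 0;
    while (defined(my $line = <PIDATA>)) { $count++ }
    return ($count, @warnings);
}

my ($count, @warnings) = read_unopened();
print "lines read: $count\n";
print "warning: $_" for @warnings;
```

Running a script with `use warnings` (or `perl -w`) is a cheap way to catch this kind of silent failure.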
On the command line, I executed the following:
    perl archive.pl > C:\\output.txt

The file output.txt contained:
    Archive[0]: d:\archives\piarch.012  (500MB, Used: 9.0%)
                   Start Time: 1-Apr-08 22:02:38
                     End Time: Current Time

If you want this on one line per archive, modify
    while(<PIDATA>){
     print $_ if((/^\s*Archive\[\d+\].*/)||(/^\s*Start\s+Time.*/)||(/^\s*End\s+Time.*/)) ;
    }

to:

    while(<PIDATA>){
     chomp;
     s/^\s+//;    # drop the leading indentation
     print "$_ ..... " if /^Archive\[\d+\]/ || /^Start\s+Time/;
     print "$_\n" if /^End\s+Time/;    # End Time finishes the line for this archive
    }

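For one output line per Archive[n] segment, here is a self-contained sketch that can be run without the archive file. It assumes every segment lists its Archive line first, then Start Time, then End Time, as in the sample; the sub name `summarize_archives` is hypothetical, and the sample is read from an in-memory filehandle purely for the demo.

```perl
use strict;
use warnings;

# Collapse each Archive[n] segment into a single summary line.
sub summarize_archives {
    my ($fh) = @_;
    my (@summaries, $current);
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/^\s+|\s+$//g;    # trim indentation and trailing blanks
        if ($line =~ /^Archive\[\d+\]:/) {
            $current = $line;       # a new segment starts here
        }
        elsif (defined $current && $line =~ /^Start\s+Time:/) {
            $current .= " ..... $line";
        }
        elsif (defined $current && $line =~ /^End\s+Time:/) {
            push @summaries, "$current ..... $line";
            undef $current;         # ignore everything until the next Archive line
        }
    }
    return @summaries;
}

# Sample segment copied from the question.
my $sample = <<'END';
Archive[0]: d:\archives\piarch.012  (500MB, Used: 9.0%)
        PIarcfilehead[$Workfile: piarfile.cxx $ $Revision: 114 $]::
          Version: 5 Path: d:\archives\piarch.012
               Start Time: 1-Apr-08 22:02:38
                 End Time: Current Time
              Backup Time: 2-Apr-08 02:01:07
END

open(my $fh, '<', \$sample) or die "open failed: $!";
print "$_\n" for summarize_archives($fh);
close($fh);
```

To use it on the real data, open the archive listing with `open(my $fh, '<', 'data.txt')` instead of the in-memory sample.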
Apr 4 '08 #4
nithinpes
Alternatively, you can write to the output file from within the script:
    open(PIDATA, '<', "data.txt") or die "open failed: $!";
    open(OUT, '>', "output.txt") or die "create failed: $!";    # '>' opens the file for writing
    while(<PIDATA>){
     chomp;
     s/^\s+//;    # drop the leading indentation
     print OUT "$_ ..... " if /^Archive\[\d+\]/ || /^Start\s+Time/;
     print OUT "$_\n" if /^End\s+Time/;    # End Time finishes the line for this archive
    }
    close(OUT) or die "close failed: $!";

Apr 4 '08 #5
Thanks for the help!
Apr 7 '08 #6
