  #1  
August 6th, 2008, 12:48 AM
Newbie
 
Join Date: Aug 2008
Posts: 3
Erasing or Skipping lines in a data file

Hi there,
I just started programming in Perl and am trying to put together my first little data manipulation program. I am working on a Mac with OS X.

I have a data file with the following header that has been created on a Windows XP machine:

Quote:
------ Begin Next Fly'm ------


"Begin Fly'm (Thu Jul 24 01:22:02 2008)"
"Ions Flown Separately, Comp Quality(100)"
"Number of Ions to Fly = 200000"
"Changes:","Mass","Charge","X","Y","Z","KE","Azm", "Elv","Time of Birth"
" ","YES","NO ","NO ","NO ","NO ","YES","YES","YES","NO "


"Ion N","Events","TOF","Mass","X","KE","KE Error"

1,1,0,2.01402,9.53,1,1.28051e-009
1,4,0.725546,2.01402,178,3181.59,0.542952


My goal is to get rid of the header and the empty lines to finally have an output file with only the number entries.

I've put together some lines that -IMO- should work, but they don't, and I don't really understand why.


  1. # read file that is given in STDIN
  2. $dat = shift or die "Need input file with data. \n";
  3.  
  4. # open files for read and for write
  5. open DAT,"< $dat" or die "Cannot read $dat\n";
  6. open OUT, "> output.txt" or die "Cannot open file!\n";
  7.  
  8.  
  9. # loop for reading input file
  10. while($dat = <DAT>){
  11.  
  12. # skip lines beginning with ", - and empty
  13.      next if $dat =~ /^\s*$|\"|-/;
  14.  
  15.     # next if $dat =~ /^\"/; 
  16.     # s/^-//;
  17.     # s/^"//;
  18.     #s/^\s*//;
  19.      chomp($dat);   
  20.  
  21. print OUT <DAT>; #print the contents of the input file into the ouput file
  22. }
  23.  
  24. # close files for reading and writing
  25. close DAT;
  26. close OUT;

Line 13 is the line where the skipping/erasing should take place. I tried several different combinations of the command, put m/ before the expression and /i after, and even split it into three separate commands that should skip ", - and whitespace:

    next if $dat =~ /^\"/;

for just skipping the ", and the same for - and whitespace.

My output file looks like this when I run the program:

Quote:


"Begin Fly'm (Thu Jul 24 01:22:02 2008)"
"Ions Flown Separately, Comp Quality(100)"
"Number of Ions to Fly = 200000"
"Changes:","Mass","Charge","X","Y","Z","KE","Azm", "Elv","Time of Birth"
" ","YES","NO ","NO ","NO ","NO ","YES","YES","YES","NO "


"Ion N","Events","TOF","Mass","X","KE","KE Error"

1,1,0,2.01402,9.53,1,1.28051e-009
1,4,0.725546,2.01402,178,3181.59,0.542952
There are some empty lines in the beginning and then the usual header minus the very first line.

I even tried to completely erase the header using the commands in line 16 and 17 of my code, but usually I get the following error for that attempt:

Use of uninitialized value in substitution (s///) at ./foo.pl line 16, <DAT> line 1.
Use of uninitialized value in substitution (s///) at ./foo.pl line 17, <DAT> line 1.


The skipping of lines worked to some extent with a simple little textfile I made up (except for getting rid of the empty lines), but it more or less fails when I try the same on my datafile.

Hopefully someone can help me with this problem.

Thanks a lot.
  #2  
August 6th, 2008, 12:56 AM
Newbie
 
Join Date: Aug 2008
Posts: 3

I just saw that the very first line of my data file gets erased whenever I run the program. No idea how this can happen.
  #3  
August 6th, 2008, 05:30 AM
nithinpes
Expert
 
Join Date: Dec 2007
Age: 24
Posts: 365

Quote:
Originally Posted by BibI
I just saw that the very first line of my data file gets erased whenever I run the program. No idea how this can happen.

Modify the while() loop as below:
    while($dat = <DAT>){
        # skip lines beginning with ", - and empty
        next if $dat =~ /(^\s*$)|(^\")|(^-)/;
        print OUT $dat; # print the current line to the output file
    }
The regex is changed to suit your need (skip blank lines and lines beginning with " or -). The regex you used would skip blank lines and any line containing " or - anywhere, not just at the beginning.
The chomp() line was removed so that each line keeps its trailing newline and the output stays one entry per line. If you want all the number entries on a single output line, you can put it back.
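To see the difference between the two patterns, here is a small sketch that runs both against a few lines copied from the data file in the question. The array names are made up for illustration. Note that the first data line ends in 1.28051e-009, which contains a -, so the unanchored pattern drops it as well.

```perl
use strict;
use warnings;

# A few lines from the data file in the question, one element per line.
my @lines = split /^/, <<'END';
------ Begin Next Fly'm ------

"Ion N","Events","TOF","Mass","X","KE","KE Error"
1,1,0,2.01402,9.53,1,1.28051e-009
1,4,0.725546,2.01402,178,3181.59,0.542952
END

# Unanchored: drops any line that contains " or - anywhere (or is blank).
my @kept_unanchored = grep { !/^\s*$|\"|-/ } @lines;

# Anchored: drops only blank lines and lines that begin with " or -.
my @kept_anchored = grep { !/(^\s*$)|(^\")|(^-)/ } @lines;

print "unanchored keeps: ", scalar(@kept_unanchored), " line(s)\n";  # 1
print "anchored keeps:   ", scalar(@kept_anchored),   " line(s)\n";  # 2
```

With the anchored pattern both number entries survive; with the unanchored one only the second does.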
The first line of data was getting removed because of this:

    print OUT <DAT>;
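Since you have already read a line from <DAT> into $dat, using <DAT> again does not print that line; it reads further input. Because print evaluates <DAT> in list context, it reads and prints all the remaining lines at once, so only the line already held in $dat goes missing. You should be using $dat in this line.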
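A minimal sketch of that behaviour, using an in-memory filehandle and made-up sample text so that it needs no file on disk: reading in scalar context returns one line, while reading in list context (which is what print gives its arguments) returns everything that is left.

```perl
use strict;
use warnings;

my $text = "first\nsecond\nthird\n";

# In-memory filehandle: reads from the string instead of a file.
open my $fh, '<', \$text or die "Cannot open in-memory file: $!\n";

my $line = <$fh>;   # scalar context: reads exactly one line
my @rest = <$fh>;   # list context: reads ALL remaining lines

close $fh;

print "scalar read: $line";                          # first
print "list read:   ", scalar(@rest), " line(s)\n";  # 2
```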
  #4  
August 6th, 2008, 05:32 AM
KevinADC
Expert
 
Join Date: Jan 2007
Posts: 3,639

    while(my $dat = <DAT>) {
       next if ($dat =~ /^([-"])|^\s*$/);
       print OUT $dat;
    }
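For reference, the same loop could be wrapped up with strict, warnings and lexical filehandles roughly as follows. This is only a sketch: the helper name copy_data_lines and the sample input are made up for illustration, and in-memory handles stand in for the real files. Because \s also matches a carriage return, the blank-line test still works if the Windows-created file has \r\n line endings.

```perl
use strict;
use warnings;

# Hypothetical helper: copy only the data lines from one handle to another.
sub copy_data_lines {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        # Skip blank lines and lines that begin with " or -.
        next if $line =~ /^[-"]|^\s*$/;
        print {$out} $line;
    }
    return;
}

# Demonstration on in-memory handles; a real script would instead do
#   open my $in,  '<', $path        or die "Cannot read $path: $!\n";
#   open my $out, '>', 'output.txt' or die "Cannot write output.txt: $!\n";
my $input = <<'END';
------ Begin Next Fly'm ------

"Number of Ions to Fly = 200000"

1,1,0,2.01402,9.53,1,1.28051e-009
1,4,0.725546,2.01402,178,3181.59,0.542952
END

my $result = '';
open my $in,  '<', \$input  or die "Cannot open input: $!\n";
open my $out, '>', \$result or die "Cannot open output: $!\n";
copy_data_lines($in, $out);
close $in;
close $out;

print $result;   # prints only the two number entries
```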
  #5  
August 6th, 2008, 08:22 PM
Newbie
 
Join Date: Aug 2008
Posts: 3

Hey thanks very much.

That totally solved my problem. :)