
Erasing or Skipping lines in a data file

P: 3
Hi there,
I just started programming in Perl and am trying to put together my first little data manipulation program. I am working on a Mac with OS X.

I have a data file with the following header that has been created on a Windows XP machine:

------ Begin Next Fly'm ------


"Begin Fly'm (Thu Jul 24 01:22:02 2008)"
"Ions Flown Separately, Comp Quality(100)"
"Number of Ions to Fly = 200000"
"Changes:","Mass","Charge","X","Y","Z","KE","Azm", "Elv","Time of Birth"
" ","YES","NO ","NO ","NO ","NO ","YES","YES","YES","NO "


"Ion N","Events","TOF","Mass","X","KE","KE Error"

1,1,0,2.01402,9.53,1,1.28051e-009
1,4,0.725546,2.01402,178,3181.59,0.542952


My goal is to get rid of the header and the empty lines to finally have an output file with only the number entries.

I've put together some lines that, IMO, should work, but they don't, and I don't really understand why.


  1. # read file that is given in STDIN
  2. $dat = shift or die "Need input file with data. \n";
  3.  
  4. # open files for read and for write
  5. open DAT,"< $dat" or die "Cannot read $dat\n";
  6. open OUT, "> output.txt" or die "Cannot open file!\n";
  7.  
  8.  
  9. # loop for reading input file
  10. while($dat = <DAT>){
  11.  
  12. # skip lines beginning with ", - and empty
  13.      next if $dat =~ /^\s*$|\"|-/;
  14.  
  15.     # next if $dat =~ /^\"/; 
  16.     # s/^-//;
  17.     # s/^"//;
  18.     #s/^\s*//;
  19.      chomp($dat);   
  20.  
  21. print OUT <DAT>; #print the contents of the input file into the output file
  22. }
  23.  
  24. # close files for reading and writing
  25. close DAT;
  26. close OUT;

Line 13 (the next if statement) is where the skipping/erasing should take place. I tried several different combinations of the command, put m/ before the expression and /i after it, and even split it into three separate commands to skip ", - and whitespace:

    next if $dat =~ /^\"/;
for just skipping lines that begin with ", and the same for - and whitespace.

My output file looks like this when I run the program:



"Begin Fly'm (Thu Jul 24 01:22:02 2008)"
"Ions Flown Separately, Comp Quality(100)"
"Number of Ions to Fly = 200000"
"Changes:","Mass","Charge","X","Y","Z","KE","Azm", "Elv","Time of Birth"
" ","YES","NO ","NO ","NO ","NO ","YES","YES","YES","NO "


"Ion N","Events","TOF","Mass","X","KE","KE Error"

1,1,0,2.01402,9.53,1,1.28051e-009
1,4,0.725546,2.01402,178,3181.59,0.542952
There are some empty lines at the beginning and then the usual header minus the very first line.

I even tried to completely erase the header using the commands in lines 16 and 17 of my code, but I usually get the following error for that attempt:

Use of uninitialized value in substitution (s///) at ./foo.pl line 16, <DAT> line 1.
Use of uninitialized value in substitution (s///) at ./foo.pl line 17, <DAT> line 1.


The skipping of lines worked to some extent with a simple little textfile I made up (except for getting rid of the empty lines), but it more or less fails when I try the same on my datafile.

Hopefully someone can help me with this problem.

Thanks a lot.
Aug 5 '08 #1
4 Replies


P: 3
I just saw that the very first line of my data file gets erased whenever I run the program. No idea how this can happen.
Aug 5 '08 #2

nithinpes
Expert 100+
P: 410
I just saw that the very first line of my data file gets erased whenever I run the program. No idea how this can happen.

Modify the while() loop as below:
    while($dat = <DAT>){
        # skip lines beginning with ", - and empty
        next if $dat =~ /(^\s*$)|(^\")|(^-)/;
        print OUT $dat; #print the contents of the input file into the output file
    }
The regex is changed to suit your need (skip blank lines and lines beginning with " or -). The regex you used would skip blank lines and any line containing " or - anywhere, not just lines beginning with them.
The chomp() line was removed so that each line keeps its newline in the output. If you want all the number lines joined into a single line of output, you can put it back.
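To make the difference between the two patterns concrete, here is a small sketch that runs both against a few lines copied from the data file in the question. The sub names skipped_unanchored and skipped_anchored are made up for this illustration. Note the second-to-last sample line: it is a data row, but it contains a minus sign in the exponent (1.28051e-009), so the unanchored pattern would throw it away.

```perl
use strict;
use warnings;

# Sample lines copied from the data file shown in the question.
my @lines = (
    "------ Begin Next Fly'm ------\n",
    "\n",
    "\"Ion N\",\"Events\",\"TOF\",\"Mass\",\"X\",\"KE\",\"KE Error\"\n",
    "1,1,0,2.01402,9.53,1,1.28051e-009\n",
    "1,4,0.725546,2.01402,178,3181.59,0.542952\n",
);

# Original pattern: only the first alternative is anchored, so a
# double quote or a minus sign ANYWHERE in the line causes a skip.
sub skipped_unanchored {
    my ($line) = @_;
    return $line =~ /^\s*$|\"|-/ ? 1 : 0;
}

# Corrected pattern: every alternative is anchored with ^, so only
# blank lines and lines STARTING with " or - are skipped.
sub skipped_anchored {
    my ($line) = @_;
    return $line =~ /(^\s*$)|(^\")|(^-)/ ? 1 : 0;
}

# Columns: unanchored result, anchored result, then the line itself.
for my $line (@lines) {
    printf "%d %d  %s", skipped_unanchored($line), skipped_anchored($line), $line;
}
```

Only the anchored pattern reports 0 (keep) for both data rows.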
The first line of data was getting removed because of this:

    print OUT <DAT>;
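When you have already assigned <DAT> to $dat, using <DAT> again reads further input from the file instead of printing the line you already hold in $dat. Because print evaluates <DAT> in list context, it actually reads and prints all of the remaining lines at once, unfiltered. You should be using $dat in this line.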
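One detail worth noting is what a second read from the handle does inside print. The sketch below reproduces the loop shape from the question using in-memory filehandles instead of real files, so it can be run anywhere; run_buggy_loop is a hypothetical name chosen for this example. Since print supplies list context, the readline operator returns every remaining line, so the loop body runs only once and everything after the first line is copied without any filtering.

```perl
use strict;
use warnings;

# Reproduce the loop from the question on an in-memory "file".
# Returns the text that was printed and the number of loop iterations.
sub run_buggy_loop {
    my ($text) = @_;
    my $printed = '';
    open my $in,  '<', \$text    or die "Cannot open in-memory input: $!";
    open my $ofh, '>', \$printed or die "Cannot open in-memory output: $!";

    my $iterations = 0;
    while (my $line = <$in>) {   # scalar context: reads exactly one line
        $iterations++;
        print {$ofh} <$in>;      # list context: reads ALL remaining lines
    }

    close $in;
    close $ofh;
    return ($printed, $iterations);
}

my ($printed, $iterations) = run_buggy_loop("first\nsecond\nthird\n");
print "iterations: $iterations\n";
print $printed;
```

With the three-line sample input, the loop runs once and the printed text is the second and third lines; the first line is consumed by the while condition and never written.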
Aug 6 '08 #3

KevinADC
Expert 2.5K+
P: 4,059
    while(my $dat = <DAT>) {
        next if ($dat =~ /^([-"])|^\s*$/);
        print OUT $dat;
    }
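For reference, a complete script built around this loop might look like the sketch below. It is only one possible arrangement, not the original poster's final program: the sub name filter_data is made up here, and the use of strict, warnings, lexical filehandles and three-argument open is a stylistic choice that the thread itself does not discuss. Stripping a trailing carriage return is likewise an assumption, based on the file having been created on Windows and read on a Mac, where chomp would remove only the newline.

```perl
use strict;
use warnings;

# Copy only the numeric data rows from $in_path to $out_path.
# (filter_data is a hypothetical helper name for this sketch.)
sub filter_data {
    my ($in_path, $out_path) = @_;

    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!\n";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!\n";

    while (my $line = <$in>) {
        # Assumption: the file may have Windows CRLF line endings,
        # so remove "\n" or "\r\n" rather than relying on chomp.
        $line =~ s/\r?\n\z//;

        # Skip lines that start with - or ", and blank lines.
        next if $line =~ /^[-"]|^\s*$/;

        print {$out} "$line\n";
    }

    close $in;
    close $out or die "Cannot close $out_path: $!\n";
    return;
}

# Same calling convention as the script in the question:
# input file as the first argument, result written to output.txt.
filter_data($ARGV[0], 'output.txt') if @ARGV;
```

This also bears on the "Use of uninitialized value in substitution" warnings from the question: a bare s/^-//; operates on $_, which is never set in a loop that reads each line into $dat. Binding the substitution to the variable, as in $dat =~ s/^-//;, avoids the warning, although the substitution would only strip the first character rather than drop the whole line.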
Aug 6 '08 #4

P: 3
Hey thanks very much.

That totally solved my problem. :)
Aug 6 '08 #5
