Skipping null/empty fields caught by split()

sicarie
Expert Mod 2.5K+
P: 4,677
I am attempting to parse a CSV, but am not allowed to install the CSV parsing module because of "security reasons" (what a joke), so I'm attempting to use 'split' to break up a comma-delimited file.

My issue is that as soon as an "empty" field comes up (two commas in a row), split seems to think the line is done and goes to the next one.

Everything I've read online says that split will return a null field, but I don't know how to get it to go to the next element and not just skip to the next line.

    while (<INFILE>) {
        # use 'split' to avoid module-dependent functionality
        # split line on commas, OS info in [3] (4th group, but
        # counting starts first element at 0)

        # line = <textonly>,<text+num>,<ip>,<whatIwant>,
        chomp($_);
        @a_splitLine = split (/,/, $_);

        # move OS info out of string to avoid accidentally
        # parsing over stuff
        $s_info = $a_splitLine[3];
    }

Could anyone see either a better way to accomplish what I'm trying to do, or help get split to capture all the elements?

I was thinking I could substitute a known placeholder string for each empty field before parsing (something ridiculous that will never show up in my data, like &^%$#), then split, and when printing, output some sort of whitespace whenever the current item matches the placeholder. That doesn't sound like the best method to me, though; it feels like I'm overcomplicating it.
Jun 19 '09 #1
5 Replies


Expert Mod 100+
P: 588
> My issue is that as soon as an "empty" field comes up (two commas in a row), split seems to think the line is done and goes to the next one.

No, it doesn't. You have a flawed impression of what's happening.

    C:\TEMP>type test.pl
    #!/usr/bin/perl

    use strict;
    use warnings;
    use Data::Dumper;

    my $str = 'a,,,b,,,,6,,';
    my @fields = split /,/, $str;
    print Dumper @fields;

    C:\TEMP>test.pl
    $VAR1 = 'a';
    $VAR2 = '';
    $VAR3 = '';
    $VAR4 = 'b';
    $VAR5 = '';
    $VAR6 = '';
    $VAR7 = '';
    $VAR8 = '6';

    C:\TEMP>perldoc -f split
        split /PATTERN/,EXPR,LIMIT
        split /PATTERN/,EXPR
        split /PATTERN/
        split   Splits the string EXPR into a list of strings and returns that
                list. By default, empty leading fields are preserved, and empty
                trailing ones are deleted. (If all fields are empty, they are
                considered to be trailing.)
    ....

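The perldoc excerpt also explains why the test string, which has nine commas, produces eight fields rather than ten: the two trailing empty fields are dropped. If those trailing fields matter, a negative LIMIT keeps them. A minimal sketch, reusing the same test string (the variable names @default and @all are just for illustration):

```perl
use strict;
use warnings;

my $str = 'a,,,b,,,,6,,';

# Default behaviour: trailing empty fields are deleted.
my @default = split /,/, $str;

# Negative LIMIT: trailing empty fields are kept.
my @all = split /,/, $str, -1;

print scalar(@default), " fields by default\n";    # 8
print scalar(@all),     " fields with LIMIT -1\n"; # 10
```

Whether you need the -1 depends on whether anything downstream counts fields or rebuilds the line.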
Jun 19 '09 #2

sicarie
Expert Mod 2.5K+
P: 4,677
Interesting, so then how would I access the b or the 6?

    #!/bin/perl

    use strict;
    use warnings;
    use Data::Dumper;

    my $str = 'a,,,b,,,,6,,';
    my @fields = split /,/, $str;
    my $n = 0;
    print Dumper @fields;
    while ($fields[$n]) {
       print "$n: $fields[$n]\n";
       $n++;
    }
    print "done!\n";

    $ ./splitTest.pl
    $VAR1 = 'a';
    $VAR2 = '';
    $VAR3 = '';
    $VAR4 = 'b';
    $VAR5 = '';
    $VAR6 = '';
    $VAR7 = '';
    $VAR8 = '6';
    0: a
    done!

In the above, my attempt to print with a while loop stops as soon as the first empty field is reached. I'm guessing I'd have to check each one to see which are valid and which are not, but what am I looking for - null?
Jun 19 '09 #3

Expert Mod 100+
P: 588
If you know which field/index you want, then simply print that field.
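As a small sketch of that, using the same test string: the empty fields still occupy a slot, so 'b' sits at index 3 and '6' at index 7. The names $letter and $number are only for illustration.

```perl
use strict;
use warnings;

my $str    = 'a,,,b,,,,6,,';
my @fields = split /,/, $str;

# Indexes start at 0, and empty fields still count.
print "field 3: $fields[3]\n";   # b
print "field 7: $fields[7]\n";   # 6

# An array slice pulls out both at once.
my ($letter, $number) = @fields[3, 7];
print "$letter and $number\n";
```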

If you want/need to loop over the array elements, then use a for or foreach loop, not a while loop.
    for my $i ( 0..$#fields ) {
        # only print fields that have a value
        print "index $i = '$fields[$i]'\n" if length $fields[$i];
    }

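The reason the while loop stopped early is that while ($fields[$n]) tests whether the element is true, and Perl treats the empty string as false. The string '0' is false as well, so a field containing a lone zero would end that loop too, which is why the loop above tests length instead. A rough sketch of the difference, assuming a made-up sample line:

```perl
use strict;
use warnings;

# Hypothetical sample line: one empty field and one field holding '0'.
my @fields = split /,/, 'a,,0,b';

my @report;
for my $i ( 0 .. $#fields ) {
    my $value  = $fields[$i];
    my $truthy = $value         ? 'true'      : 'false';
    my $filled = length($value) ? 'non-empty' : 'empty';
    push @report, "$i: '$value' is $truthy and $filled";
}
print "$_\n" for @report;
```

Field 2 comes out as false but non-empty, so a plain truth test would silently skip real data.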
Jun 19 '09 #4

numberwhun
Expert Mod 2.5K+
P: 3,503
I have to agree with Ron. Since this is a CSV file, you should already know which field is which, so all you have to do is reference the one you want by its index. Otherwise, you can use the code above to iterate over the fields and pull out the ones that are not empty.

Regards,

Jeff
Jun 20 '09 #5

sicarie
Expert Mod 2.5K+
P: 4,677
Cool, thanks. I am really only interested in one of those fields, but once I edit it I have to make sure all the other fields are joined back onto the line, so I will play around with that.
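For anyone finding this later, here is a rough sketch of that edit-and-rejoin step. The sample record and the uc() edit are made up for illustration; the layout just mirrors my <textonly>,<text+num>,<ip>,<whatIwant>, lines. The -1 LIMIT is there so the trailing empty field, and with it the trailing comma, survives the round trip.

```perl
use strict;
use warnings;

# Hypothetical record with a trailing comma, like the lines in the first post.
my $line = 'host,web01,10.0.0.5,linux,';

# LIMIT of -1 keeps trailing empty fields, so no commas are lost.
my @fields = split /,/, $line, -1;

# Edit only the field of interest (index 3); uc() stands in for the real edit.
$fields[3] = uc $fields[3];

# Put the line back together with every other field untouched.
my $rebuilt = join ',', @fields;
print "$rebuilt\n";   # host,web01,10.0.0.5,LINUX,
```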

Thanks again!
Jun 23 '09 #6
