473,385 Members | 1,693 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

Skipping null/empty fields caught by split()

sicarie
4,677 Expert Mod 4TB
I am attempting to parse a CSV, but am not allowed to install the CSV parsing module because of "security reasons" (what a joke), so I'm attempting to use 'split' to break up a comma-delimited file.

My issue is that as soon as an "empty" field comes up (two commas in a row), split seems to think the line is done and goes to the next one.

Everything I've read online says that split will return a null field, but I don't know how to get it to go to the next element and not just skip to the next line.

Expand|Select|Wrap|Line Numbers
  1. while (<INFILE>) {
  2.    # use 'split' to avoid module-dependent functionality
  3.    # split line on commas, OS info in [3] (4th group, but
  4.    # counting starts first element at 0)
  5.  
  6.    # line = <textonly>,<text+num>,<ip>,<whatIwant>,
  7.    chomp($_); 
  8.    @a_splitLine = split (/,/, $_);
  9.  
  10.    # move OS info out of string to avoid accidentally
  11.    # parsing over stuff
  12.    $s_info = $a_splitLine[3];
  13.  
Could anyone see either a better way to accomplish what I'm trying to do, or help get split to capture all the elements?

I was thinking I could run a simple substitution before parsing of a known string (something ridiculous that'll never show up in my data - like &^%$#), then split, and then when printing, if that matches the current item, just print some sort of whitespace, but that doesn't sound like the best method to me - like I'm overcomplicating it.
Jun 19 '09 #1
5 11896
RonB
589 Expert Mod 512MB
My issue is that as soon as an "empty" field comes up (two commas in a row), split seems to think the line is done and goes to the next one.
No it doesn't. You have a flawed impression of what's happening.

Expand|Select|Wrap|Line Numbers
  1. C:\TEMP>type test.pl
  2. #!/usr/bin/perl
  3.  
  4. use strict;
  5. use warnings;
  6. use Data::Dumper;
  7.  
  8. my $str = 'a,,,b,,,,6,,';
  9. my @fields = split /,/, $str;
  10. print Dumper @fields;
  11.  
Expand|Select|Wrap|Line Numbers
  1. C:\TEMP>test.pl
  2. $VAR1 = 'a';
  3. $VAR2 = '';
  4. $VAR3 = '';
  5. $VAR4 = 'b';
  6. $VAR5 = '';
  7. $VAR6 = '';
  8. $VAR7 = '';
  9. $VAR8 = '6';
  10.  
Expand|Select|Wrap|Line Numbers
  1. C:\TEMP>perldoc -f split
  2.     split /PATTERN/,EXPR,LIMIT
  3.     split /PATTERN/,EXPR
  4.     split /PATTERN/
  5.     split   Splits the string EXPR into a list of strings and returns that
  6.             list. By default, empty leading fields are preserved, and empty
  7.             trailing ones are deleted. (If all fields are empty, they are
  8.             considered to be trailing.)
  9. ....
  10. ....
  11. ....
  12.  
Jun 19 '09 #2
sicarie
4,677 Expert Mod 4TB
Interesting, so then how would I access the b or the 6?

Expand|Select|Wrap|Line Numbers
  1. #!/bin/perl
  2.  
  3. use strict;
  4. use warnings;
  5. use Data::Dumper;
  6.  
  7. my $str = 'a,,,b,,,,6,,';
  8. my @fields = split /,/, $str;
  9. my $n = 0;
  10. print Dumper @fields;
  11. while ($fields[$n]) {
  12.    print "$n: $fields[$n]\n";
  13.    $n++;
  14. }
  15. print "done!\n";
  16.  
Expand|Select|Wrap|Line Numbers
  1. $ ./splitTest.pl
  2. $VAR1 = 'a';
  3. $VAR2 = '';
  4. $VAR3 = '';
  5. $VAR4 = 'b';
  6. $VAR5 = '';
  7. $VAR6 = '';
  8. $VAR7 = '';
  9. $VAR8 = '6';
  10. 0: a
  11. done!
  12.  
In the above, my attempt to print with a while loop stops as soon as the first empty set is reached. I'm guessing I'd have to check each one to see which are valid and which are not, but what am I looking for - null?
Jun 19 '09 #3
RonB
589 Expert Mod 512MB
If you know which field/index you want, then simply print that field.

If you want/need to loop over the array elements, then use a for or foreach loop, not a while loop.
Expand|Select|Wrap|Line Numbers
  1. for my $i ( 0..$#fields ) {
  2.     # only print fields that have a value
  3.     print "induce $i = '$fields[$i]'\n" if length $fields[$i];
  4. }
  5.  
Jun 19 '09 #4
numberwhun
3,509 Expert Mod 2GB
I have to agree with Ron. Since this is a csv file, you should already know which field is what. All you would have to do is reference it by its index. Otherwise, you can use the code above to iterate through each one and pull out the variables with values other than null.

Regards,

Jeff
Jun 20 '09 #5
sicarie
4,677 Expert Mod 4TB
Cool, thanks. I am really only interested in one of those fields, but then have to make sure once I edit that field, I re-append all the others back on, so I will play around with that.

Thanks again!
Jun 23 '09 #6

Sign in to post your reply or Sign up for a free account.

Similar topics

9
by: Chuck Anderson | last post by:
I have a function with 7 inputs. The last three have default values. I want to call that function specifying the first four, skip two and then specify the last. I thought I could write this...
17
by: Douglas Alan | last post by:
Is there a canonical way of iterating over the lines of a file that are null-separated rather than newline-separated? Sure, I can implement my own iterator using read() and split(), etc., but...
3
by: Peter Neumaier | last post by:
Hi! I got a column, with different dates in it (Ddata type "nvarchar")... when running a SELECT on this column, I'm trying to filter those recordsets out, WHERE this column is NULL (I checked...
12
by: AFN | last post by:
I am running the code below to generate XML from a data table. But some fields in the data table are Null for every record. Suppose field5 has a null database value. I would expect to see: ...
19
by: David Logan | last post by:
We need an additional function in the String class. We need the ability to suppress empty fields, so that we can more effectively parse. Right now, multiple whitespace characters create multiple...
6
by: john | last post by:
Last week I posted about making a unique index on multiple fields to prevent importing identical records twice. I still have trouble with the nulls in the index. The only way that I can make it...
23
by: sandy | last post by:
I need (okay, I want) to make a dynamic array of my class 'Directory', within my class Directory (Can you already smell disaster?) Each Directory can have subdirectories so I thought to put these...
3
gingawarrior
by: gingawarrior | last post by:
Hi All, I've written some code to act as a validator for a set of form fields - it accepts a string of '~' delimited item names and then loops through to evaluate if they are either empty or =='999'...
4
by: BibI | last post by:
Hi there, I just started programming with PERL and am trying to put together my first little data manipulation program. I am working on a MAC with OSX. I have a data file with the following...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.