469,365 Members | 1,769 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,365 developers. It's quick & easy.

Perl script to parse Excel file into mysql database

Hi everybody,

I am a real newbie in both perl and relational databases like mysql, and I have been banging my head on the wall trying to understand how to populate a mysql database using an Excel file (.csv).

I constructed a my sql database (called “test”), set up the DBD::mysql module, read a book on perl, but I still cannot figure out how to approach this problem, so I resort to the experts…Could you please help me understand how to approach this?

The database on mysql has tables where each one is related to the other through foreign keys, so for example table_2 is:
Expand|Select|Wrap|Line Numbers
  1. CREATE TABLE table_2 (
  2.   table_2_id int(10) unsigned NOT NULL AUTO_INCREMENT,
  3.   table_1_id int(10) NOT NULL, 
  4.   binary_assign varchar(10) NOT NULL,
  5.   reference tinyint(1) NOT NULL,
  7.   PRIMARY KEY (table_2_id),
  8. );
Now, my Excel file has 4 fields, with the first 2 fields that should go into the table_1, and the next two columns that should BOTH go into table_2 (table_2 is related to table_1 through the foreign key “ tabke_1_id”), but I am also looking for a way to record which column these values came from, by filling in another field in mysql with 0 if they came from the column “REF” and 1 if they came from “ALT”.

-----Table_1----------- ----Table_2---
1234 syd G C
1235 brux C T

The first 2 field go into table_1, and the REF and ALT values go into table_2, but also record whether they came from the column “REF” or from the column “ALT” (if REF then the value of “reference” in mysql table is 0, while if ALT the value of “reference” is 1).

And the issue becomes even more complicated since the next columns contain information of the sample_id’s, one column for each sample_id, and each has a specific value that I need to insert specific for each of these fields…

Anyway if you could help me with the initial part that would be a great start, I am really stuck! Thank you so much!!
This is what I have done until now:
Expand|Select|Wrap|Line Numbers
  1. #!/usr/bin/perl
  2. use strict;
  3. use warnings;
  4. use DBI();
  5. # Declare varaibles
  6. my $dbname = "test";
  7. my $user = "root";
  8. my $pass = "francy";
  10. #Connect to database or die
  11. my $dbh = DBI->connect("DBI:mysql:$dbname", "$user", "$pass") 
  12.             || die "Could not connect to database: $DBI::errstr";
  14. my $insert_table_2= $dbh->prepare(q{INSERT INTO table_2 (location, name) VALUES (?, ?)}) or die $dbh->errstr;
  16. #Open the file using filehandle
  17. my $file = shift(@ARGV);
  18. open (FILE, $file) or die "Couldn't read $file: $!"; 
  20. while (<FILE>)
  21.     {
  22.         chomp;
  23.         my @fields = split(',', $_);
  25.         my $loc = shift(@fields);
  26.         my $name = shift(@fields);
  28.        $insert_table_2->execute($loc, $name) or die $dbh->errstr;
  29.     }
  31. close (FILE);  
  32. $dbh->disconnect();
Feb 23 '11 #1
4 7685
3,503 Expert Mod 2GB
While a lot of people will tell you to probably use the Spreadsheet:ParseExcel module from CPAN, I typically take a different approach.

I like to export the spreadsheet to csv format, then just import that directly into the table I have predefined. Now, this assumes that the fields you have defined in the table match field for field. This is done right through the mysql interface.

Either way you choose, we will do our best to help you.


Feb 23 '11 #2
Hi all,

Thank you Jeff for your reply! Actually with the help of perl experts I have gone further and was able to run a script to parse the first 5 lines, so now I have location and name inserted in table1, and ref and alt in table2! The trick was actually to retrieve the id from table_1 using "selectrow_array", and enter the ref and alt values with 0 and 1 defined! Amazing! But still long way before I get to the end of this. Now I need to insert the values for the next columns, with each column representing a cell in Excel with information on location, name, and ref and alt for each sample_id...
I am open to suggestions of course while I will be trying to figure out this next step (I am still not at the benging my head with this one quite yet:))
Feb 24 '11 #3
(Sorry I meant the first 4 lines of my Excel, not 5)..
Feb 24 '11 #4
1,089 Expert 1GB
I am a real newbie in both perl and relational databases like mysql, and I have been banging my head on the wall trying to understand how to populate a mysql database using an Excel file (.csv).
If you really are using an excel file, then I second numberwhen's thought that you should be using the cpan module Spreadsheet::ParseExcel. There's no good reason why you should reinvent the wheel in order to manually parse such a file.

However, you stated that your file already has the extention .csv, which implies to me that it should be a comma separated values file, not .xls. If it really is such a file, then you should use Text::CSV to parse it.

However, as numberwhun stated, it ultimately is easier to deal with true csv files, so it might be nice if you just opened the file in excel and saved it as csv and then used Text::CSV as stated above.

Good luck,
- Miller
Feb 24 '11 #5

Post your reply

Sign in to post your reply or Sign up for a free account.

Similar topics

2 posts views Thread by sympatico | last post: by
6 posts views Thread by sejal17 | last post: by
1 post views Thread by CARIGAR | last post: by
reply views Thread by suresh191 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.