By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
437,692 Members | 1,998 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 437,692 IT Pros & Developers. It's quick & easy.

How to get a duplicate rows based on the particular cell value in excel using Perl

P: 5
Hi,

I am currently working on excel validation using Perl. I am new to Excel validation but not for Perl. I have a question regarding one of the validation. Could anyone please tell me how to get the number of duplicate rows based on a particular cell value of each these duplicate rows. I mean all the cell values of a row will not be duplicated but a individual columns cell value will be duplicated and I need to create a separate excel with all these duplicates.

Eg :

Name Org Dep Place

RavindraN TCS Development Blore
Ravi TCS Scripting blore
Vimal AOL Development blore
Ram Aol Development blore
Priya aol Scripting blore

Here, I need to get the duplicate rows based on Dep. The output should be :

RavindraN TCS Development Blore
Vimal AOL Development blore
Ram Aol Development blore

and

Ravi TCS Scripting blore
Priya aol Scripting blore

Thanks
Ravi
Nov 10 '08 #1
Share this Question
Share on Google+
4 Replies


nithinpes
Expert 100+
P: 410
What have you tried so far?
Nov 10 '08 #2

P: 5
Hi,

Since I am new to excel validation, I am trying out the how to get the value and display them. I tried to get all the values of a particular cell and store it in a array so that I can try to get a duplicated from that cell.

I need to get the duplicate records based on the column "Dep".

Expand|Select|Wrap|Line Numbers
  1. #!/usr/local/bin/perl
  2.  
  3. use strict;
  4. use Spreadsheet::ParseExcel;
  5.  
  6. my %UniqueCell_Row = ();
  7. my $excel_name; 
  8.  
  9. $excel_name = "D:\\RAVINDRAN\\Excel-Email-sending\\name.xls";
  10. print "Excel Name -> <$excel_name> \n";
  11.  
  12. my $workbook = Spreadsheet::ParseExcel::Workbook->Parse($excel_name);
  13.  
  14. print "Book -> <$workbook> \n";
  15. my $cont_col;
  16. my $cell_value;
  17.  
  18. ### To get the particular columns all the value
  19. foreach my $sheet (@{$workbook->{Worksheet}}) {
  20.     printf("Sheet: %s\n", $sheet->{Name});
  21.     my @cols;
  22.  
  23.     foreach my $col ($sheet->{MinCol} .. $sheet->{MaxCol}) {
  24.         print "Column -> <$col> \n";
  25.         print "$sheet->{Cells}[0][$col]->{Val} \n";
  26.         if ($sheet->{Cells}[0][$col]->{Val} eq "Dep") {
  27.             $cont_col = $col;
  28.             print "Column Number -> $cont_col\n";
  29.             last;
  30.         }
  31.     }
  32.  
  33.     for(
  34.         my $iR = $sheet->{MinRow};
  35.         defined $sheet->{MaxRow} && $iR <= $sheet->{MaxRow}; 
  36.         $iR++
  37.     ){
  38.  
  39.         # fetch columns
  40.         my $cell = $sheet->{Cells}[$iR][$cont_col];
  41.         print "sheet->{Cells}[$iR][$cont_col] \n";
  42.         print "cell->{Val} ----> <$cell->{Val}> \n";
  43.         $cell_value = $cell->{Val};
  44.         push(@cols,$cell_value);
  45.         #push ( @{$UniqueCell_Row{$cell_value}},"$row,$col,$cell_value" );
  46.     }
  47.  
  48.     print "Colllummmn : <@cols> \n";
  49. }
  50.  
  51. ## This is just to print all the columns value
  52. foreach my $sheet (@{$workbook->{Worksheet}}) {
  53.     printf("Sheet: %s\n", $sheet->{Name});
  54.     foreach my $row ($sheet->{MinRow} .. $sheet->{MaxRow}) {
  55.         foreach my $col ($sheet->{MinCol} ..  $sheet->{MaxCol}) {
  56.  
  57.             my $cell = $sheet->{Cells}[$row][$col];
  58.             my $cell_value = $cell->{Val};
  59.  
  60.             if ( $cell ) {
  61.                 printf("( %s , %s ) => %s\n", $row, $col, $cell->{Val});
  62.                 $UniqueCell_Row{$row}{$col} = $cell_value;
  63.             }
  64.             if ( $cell && $cell_value eq "Dep" ) {
  65.                 printf("Department Cell Value is : ( %s , %s ) => %s\n", $row, $col, $cell->{Val});
  66.                 #push ( @{$UniqueCell_Row{$cell_value}},"$row,$col,$cell_value" );
  67.             }
  68.         }
  69.     }
  70. }
  71.  
  72. ## This is just to get the values of the record
  73. my @row_cont = ();
  74. foreach my $row ( sort keys %UniqueCell_Row ) {
  75.     foreach my $cell ( sort keys %{$UniqueCell_Row{$row}} ) {
  76.         print "Rows and Column -----> <$row> | <$cell> | $UniqueCell_Row{$row}{$cell} \n";
  77.     }
  78. }
Could you please help me on this.

Thanks
Nov 11 '08 #3

P: 5
Hi,

I have taken the duplicate records based on the particular cell value. I need to created excel file for all the duplicate records. Here I am facing problem with writing the header only once for all the new excels created here.

Here is the updated code :

Expand|Select|Wrap|Line Numbers
  1. #!/usr/local/bin/perl
  2.  
  3. use strict;
  4. use Spreadsheet::ParseExcel;
  5. use Spreadsheet::WriteExcel;
  6.  
  7. my %UniqueCell_Row = ();
  8. my $excel_name; 
  9.  
  10. $excel_name = "D:\\RAVINDRAN\\Excel-Email-sending\\name.xls";
  11. print "Excel Name -> <$excel_name> \n";
  12.  
  13. my $workbook = Spreadsheet::ParseExcel::Workbook->Parse($excel_name);
  14.  
  15. print "Book -> <$workbook> \n";
  16. my $cont_col;
  17. my $cell_value;
  18. my @cols;
  19.  
  20. ### To get the particular columns all the value
  21. foreach my $sheet (@{$workbook->{Worksheet}}) {
  22.     printf("Sheet: %s\n", $sheet->{Name});
  23.  
  24.     foreach my $col ($sheet->{MinCol} .. $sheet->{MaxCol}) {
  25.         print "Column -> <$col> \n";
  26.         print "$sheet->{Cells}[0][$col]->{Val} \n";
  27.         if ($sheet->{Cells}[0][$col]->{Val} eq "Dep") {
  28.             $cont_col = $col;
  29.             print "Column Number -> $cont_col\n";
  30.             last;
  31.         }
  32.     }
  33.  
  34.     for(
  35.         my $iR = $sheet->{MinRow};
  36.         defined $sheet->{MaxRow} && $iR <= $sheet->{MaxRow}; 
  37.         $iR++
  38.     ){
  39.  
  40.         # fetch columns
  41.         my $cell = $sheet->{Cells}[$iR][$cont_col];
  42.         print "sheet->{Cells}[$iR][$cont_col] \n";
  43.         print "cell->{Val} ----> <$cell->{Val}> \n";
  44.         $cell_value = $cell->{Val};
  45.         push(@cols,$cell_value);
  46.         #push ( @{$UniqueCell_Row{$cell_value}},"$row,$col,$cell_value" );
  47.     }
  48.  
  49.     print "Colllummmn : <@cols> \n";
  50. }
  51.  
  52. ## This is just to print all the columns value
  53. foreach my $sheet (@{$workbook->{Worksheet}}) {
  54.     printf("Sheet: %s\n", $sheet->{Name});
  55.     foreach my $row ($sheet->{MinRow} .. $sheet->{MaxRow}) {
  56.         foreach my $col ($sheet->{MinCol} ..  $sheet->{MaxCol}) {
  57.  
  58.             my $cell = $sheet->{Cells}[$row][$col];
  59.             my $cell_value = $cell->{Val};
  60.  
  61.             if ( $cell ) {
  62.                 printf("( %s , %s ) => %s\n", $row, $col, $cell->{Val});
  63.                 $UniqueCell_Row{$row}{$col} = $cell_value;
  64.             }
  65.             if ( $cell && $cell_value eq "Dep" ) {
  66.                 printf("Department Cell Value is : ( %s , %s ) => %s\n", $row, $col, $cell->{Val});
  67.                 #push ( @{$UniqueCell_Row{$cell_value}},"$row,$col,$cell_value" );
  68.             }
  69.         }
  70.     }
  71. }
  72.  
  73. ## This is just to get the unique record
  74. my @row_cont = ();
  75. foreach my $row ( sort keys %UniqueCell_Row ) {
  76.     foreach my $cell ( sort keys %{$UniqueCell_Row{$row}} ) {
  77.         print "Rows and Column -----> <$row> | <$cell> | $UniqueCell_Row{$row}{$cell} \n";
  78.         #if ( $UniqueCell_Row{$row}{$cont_col}
  79.  
  80.     }
  81. }
  82.  
  83. print "========================================= \n";
  84.  
  85. my @Dup;
  86. my %uniqISU = ();
  87. my $duplicate;
  88. my $unique;
  89.  
  90. foreach my $row ( sort keys %UniqueCell_Row ) {
  91.     foreach my $cell ( sort keys %{$UniqueCell_Row{$row}} ) {
  92.         print "Rows and Column -----> <$row> | <$cell> | $UniqueCell_Row{$row}{$cont_col} \n";
  93.         if ( grep( /$UniqueCell_Row{$row}{$cont_col}/, @Dup ) ) {
  94.             print "Present : <$UniqueCell_Row{$row}{$cont_col}> | <@cols> \n";
  95.             $duplicate = $UniqueCell_Row{$row}{$cont_col};
  96.             $uniqISU{$duplicate}{$row}{$cell} = $UniqueCell_Row{$row}{$cell};
  97.         }
  98.         else {
  99.             print "NOT PRESENT : <$UniqueCell_Row{$row}{$cont_col}> \n";
  100.             push @Dup, $UniqueCell_Row{$row}{$cont_col};
  101.             $duplicate = $UniqueCell_Row{$row}{$cont_col};
  102.             $uniqISU{$duplicate}{$row}{$cell} = $UniqueCell_Row{$row}{$cell};
  103.         }
  104.     }
  105. }
  106.  
  107. print "***************************************************** \n";
  108. foreach my $cell_val ( sort keys %uniqISU ) {
  109.     print "Cell Value ---> <$cell_val> \n";
  110.     foreach my $rows ( sort keys %{$uniqISU{$cell_val}} ) {
  111.         foreach my $cell ( sort keys %{$uniqISU{$cell_val}{$rows}} ) {
  112.             print "Cell_val -> <$cell_val> | rows -> <$rows> | columns -> <$cell> | Col Val -> <$uniqISU{$cell_val}{$rows}{$cell}> \n";
  113.         }
  114.     }
  115. }
  116.  
  117. my $format;
  118. my ($header1,$header2,$header3);
  119. my %header;
  120.  
  121. foreach my $cell_val ( sort keys %uniqISU ) {
  122.     my $header = 1;
  123.     my $workbook;
  124.     my $sheet1;
  125.  
  126.     if ( $cell_val ne "Dep" ) {
  127.         $workbook = Spreadsheet::WriteExcel->new("$cell_val.xls");
  128.         $sheet1 = $workbook->add_worksheet("$cell_val");
  129.  
  130.         # Add a Format
  131.         $format = $workbook->add_format();
  132.         $format->set_bold();
  133.         $format->set_size(12);
  134.         $format->set_color('blue');
  135.         $format->set_align('center');
  136.  
  137.         $sheet1->set_column(0, 0);
  138.     }
  139.  
  140.  
  141.     foreach my $rows ( sort keys %{$uniqISU{$cell_val}} ) {
  142.         foreach my $cell ( sort keys %{$uniqISU{$cell_val}{$rows}} ) {
  143.             if ( $header == 1 && $cell_val eq "Dep" ) {
  144.                 print "Header Detail ************ $rows, $cell, $uniqISU{$cell_val}{$rows}{$cell}\n";
  145.                 $header{$cell_val}{$rows}{$cell} = $uniqISU{$cell_val}{$rows}{$cell};
  146.             }
  147.             elsif ( $cell_val ne "Dep" ) {
  148.                 print "$header{$cell_val},$header{$cell_val}{$rows},$header{$cell_val}{$rows}{$cell} \n";
  149.                 $sheet1->write($header{$cell_val},$header{$cell_val}{$rows},$header{$cell_val}{$rows}{$cell}) if ( $header == 1 );
  150.                 $sheet1->write($rows, $cell, $uniqISU{$cell_val}{$rows}{$cell});
  151.             }
  152.             $header++;
  153.         }
  154.     }
  155. }
Nov 12 '08 #4

P: 1
hi ,

Just want to let you know that the code that you have pasted helped me understand how to solve one of my problem..i was new to perl and wondering where/how to start and it gave me a good starting point..

thanks,
vijay
Mar 27 '09 #5

Post your reply

Sign in to post your reply or Sign up for a free account.