473,372 Members | 1,000 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,372 software developers and data experts.

How to get a duplicate rows based on the particular cell value in excel using Perl

Hi,

I am currently working on excel validation using Perl. I am new to Excel validation but not for Perl. I have a question regarding one of the validation. Could anyone please tell me how to get the number of duplicate rows based on a particular cell value of each these duplicate rows. I mean all the cell values of a row will not be duplicated but a individual columns cell value will be duplicated and I need to create a separate excel with all these duplicates.

Eg :

Name Org Dep Place

RavindraN TCS Development Blore
Ravi TCS Scripting blore
Vimal AOL Development blore
Ram Aol Development blore
Priya aol Scripting blore

Here, I need to get the duplicate rows based on Dep. The output should be :

RavindraN TCS Development Blore
Vimal AOL Development blore
Ram Aol Development blore

and

Ravi TCS Scripting blore
Priya aol Scripting blore

Thanks
Ravi
Nov 10 '08 #1
4 5405
nithinpes
410 Expert 256MB
What have you tried so far?
Nov 10 '08 #2
Hi,

Since I am new to excel validation, I am trying out the how to get the value and display them. I tried to get all the values of a particular cell and store it in a array so that I can try to get a duplicated from that cell.

I need to get the duplicate records based on the column "Dep".

Expand|Select|Wrap|Line Numbers
  1. #!/usr/local/bin/perl
  2.  
  3. use strict;
  4. use Spreadsheet::ParseExcel;
  5.  
  6. my %UniqueCell_Row = ();
  7. my $excel_name; 
  8.  
  9. $excel_name = "D:\\RAVINDRAN\\Excel-Email-sending\\name.xls";
  10. print "Excel Name -> <$excel_name> \n";
  11.  
  12. my $workbook = Spreadsheet::ParseExcel::Workbook->Parse($excel_name);
  13.  
  14. print "Book -> <$workbook> \n";
  15. my $cont_col;
  16. my $cell_value;
  17.  
  18. ### To get the particular columns all the value
  19. foreach my $sheet (@{$workbook->{Worksheet}}) {
  20.     printf("Sheet: %s\n", $sheet->{Name});
  21.     my @cols;
  22.  
  23.     foreach my $col ($sheet->{MinCol} .. $sheet->{MaxCol}) {
  24.         print "Column -> <$col> \n";
  25.         print "$sheet->{Cells}[0][$col]->{Val} \n";
  26.         if ($sheet->{Cells}[0][$col]->{Val} eq "Dep") {
  27.             $cont_col = $col;
  28.             print "Column Number -> $cont_col\n";
  29.             last;
  30.         }
  31.     }
  32.  
  33.     for(
  34.         my $iR = $sheet->{MinRow};
  35.         defined $sheet->{MaxRow} && $iR <= $sheet->{MaxRow}; 
  36.         $iR++
  37.     ){
  38.  
  39.         # fetch columns
  40.         my $cell = $sheet->{Cells}[$iR][$cont_col];
  41.         print "sheet->{Cells}[$iR][$cont_col] \n";
  42.         print "cell->{Val} ----> <$cell->{Val}> \n";
  43.         $cell_value = $cell->{Val};
  44.         push(@cols,$cell_value);
  45.         #push ( @{$UniqueCell_Row{$cell_value}},"$row,$col,$cell_value" );
  46.     }
  47.  
  48.     print "Colllummmn : <@cols> \n";
  49. }
  50.  
  51. ## This is just to print all the columns value
  52. foreach my $sheet (@{$workbook->{Worksheet}}) {
  53.     printf("Sheet: %s\n", $sheet->{Name});
  54.     foreach my $row ($sheet->{MinRow} .. $sheet->{MaxRow}) {
  55.         foreach my $col ($sheet->{MinCol} ..  $sheet->{MaxCol}) {
  56.  
  57.             my $cell = $sheet->{Cells}[$row][$col];
  58.             my $cell_value = $cell->{Val};
  59.  
  60.             if ( $cell ) {
  61.                 printf("( %s , %s ) => %s\n", $row, $col, $cell->{Val});
  62.                 $UniqueCell_Row{$row}{$col} = $cell_value;
  63.             }
  64.             if ( $cell && $cell_value eq "Dep" ) {
  65.                 printf("Department Cell Value is : ( %s , %s ) => %s\n", $row, $col, $cell->{Val});
  66.                 #push ( @{$UniqueCell_Row{$cell_value}},"$row,$col,$cell_value" );
  67.             }
  68.         }
  69.     }
  70. }
  71.  
  72. ## This is just to get the values of the record
  73. my @row_cont = ();
  74. foreach my $row ( sort keys %UniqueCell_Row ) {
  75.     foreach my $cell ( sort keys %{$UniqueCell_Row{$row}} ) {
  76.         print "Rows and Column -----> <$row> | <$cell> | $UniqueCell_Row{$row}{$cell} \n";
  77.     }
  78. }
Could you please help me on this.

Thanks
Nov 11 '08 #3
Hi,

I have taken the duplicate records based on the particular cell value. I need to created excel file for all the duplicate records. Here I am facing problem with writing the header only once for all the new excels created here.

Here is the updated code :

Expand|Select|Wrap|Line Numbers
  1. #!/usr/local/bin/perl
  2.  
  3. use strict;
  4. use Spreadsheet::ParseExcel;
  5. use Spreadsheet::WriteExcel;
  6.  
  7. my %UniqueCell_Row = ();
  8. my $excel_name; 
  9.  
  10. $excel_name = "D:\\RAVINDRAN\\Excel-Email-sending\\name.xls";
  11. print "Excel Name -> <$excel_name> \n";
  12.  
  13. my $workbook = Spreadsheet::ParseExcel::Workbook->Parse($excel_name);
  14.  
  15. print "Book -> <$workbook> \n";
  16. my $cont_col;
  17. my $cell_value;
  18. my @cols;
  19.  
  20. ### To get the particular columns all the value
  21. foreach my $sheet (@{$workbook->{Worksheet}}) {
  22.     printf("Sheet: %s\n", $sheet->{Name});
  23.  
  24.     foreach my $col ($sheet->{MinCol} .. $sheet->{MaxCol}) {
  25.         print "Column -> <$col> \n";
  26.         print "$sheet->{Cells}[0][$col]->{Val} \n";
  27.         if ($sheet->{Cells}[0][$col]->{Val} eq "Dep") {
  28.             $cont_col = $col;
  29.             print "Column Number -> $cont_col\n";
  30.             last;
  31.         }
  32.     }
  33.  
  34.     for(
  35.         my $iR = $sheet->{MinRow};
  36.         defined $sheet->{MaxRow} && $iR <= $sheet->{MaxRow}; 
  37.         $iR++
  38.     ){
  39.  
  40.         # fetch columns
  41.         my $cell = $sheet->{Cells}[$iR][$cont_col];
  42.         print "sheet->{Cells}[$iR][$cont_col] \n";
  43.         print "cell->{Val} ----> <$cell->{Val}> \n";
  44.         $cell_value = $cell->{Val};
  45.         push(@cols,$cell_value);
  46.         #push ( @{$UniqueCell_Row{$cell_value}},"$row,$col,$cell_value" );
  47.     }
  48.  
  49.     print "Colllummmn : <@cols> \n";
  50. }
  51.  
  52. ## This is just to print all the columns value
  53. foreach my $sheet (@{$workbook->{Worksheet}}) {
  54.     printf("Sheet: %s\n", $sheet->{Name});
  55.     foreach my $row ($sheet->{MinRow} .. $sheet->{MaxRow}) {
  56.         foreach my $col ($sheet->{MinCol} ..  $sheet->{MaxCol}) {
  57.  
  58.             my $cell = $sheet->{Cells}[$row][$col];
  59.             my $cell_value = $cell->{Val};
  60.  
  61.             if ( $cell ) {
  62.                 printf("( %s , %s ) => %s\n", $row, $col, $cell->{Val});
  63.                 $UniqueCell_Row{$row}{$col} = $cell_value;
  64.             }
  65.             if ( $cell && $cell_value eq "Dep" ) {
  66.                 printf("Department Cell Value is : ( %s , %s ) => %s\n", $row, $col, $cell->{Val});
  67.                 #push ( @{$UniqueCell_Row{$cell_value}},"$row,$col,$cell_value" );
  68.             }
  69.         }
  70.     }
  71. }
  72.  
  73. ## This is just to get the unique record
  74. my @row_cont = ();
  75. foreach my $row ( sort keys %UniqueCell_Row ) {
  76.     foreach my $cell ( sort keys %{$UniqueCell_Row{$row}} ) {
  77.         print "Rows and Column -----> <$row> | <$cell> | $UniqueCell_Row{$row}{$cell} \n";
  78.         #if ( $UniqueCell_Row{$row}{$cont_col}
  79.  
  80.     }
  81. }
  82.  
  83. print "========================================= \n";
  84.  
  85. my @Dup;
  86. my %uniqISU = ();
  87. my $duplicate;
  88. my $unique;
  89.  
  90. foreach my $row ( sort keys %UniqueCell_Row ) {
  91.     foreach my $cell ( sort keys %{$UniqueCell_Row{$row}} ) {
  92.         print "Rows and Column -----> <$row> | <$cell> | $UniqueCell_Row{$row}{$cont_col} \n";
  93.         if ( grep( /$UniqueCell_Row{$row}{$cont_col}/, @Dup ) ) {
  94.             print "Present : <$UniqueCell_Row{$row}{$cont_col}> | <@cols> \n";
  95.             $duplicate = $UniqueCell_Row{$row}{$cont_col};
  96.             $uniqISU{$duplicate}{$row}{$cell} = $UniqueCell_Row{$row}{$cell};
  97.         }
  98.         else {
  99.             print "NOT PRESENT : <$UniqueCell_Row{$row}{$cont_col}> \n";
  100.             push @Dup, $UniqueCell_Row{$row}{$cont_col};
  101.             $duplicate = $UniqueCell_Row{$row}{$cont_col};
  102.             $uniqISU{$duplicate}{$row}{$cell} = $UniqueCell_Row{$row}{$cell};
  103.         }
  104.     }
  105. }
  106.  
  107. print "***************************************************** \n";
  108. foreach my $cell_val ( sort keys %uniqISU ) {
  109.     print "Cell Value ---> <$cell_val> \n";
  110.     foreach my $rows ( sort keys %{$uniqISU{$cell_val}} ) {
  111.         foreach my $cell ( sort keys %{$uniqISU{$cell_val}{$rows}} ) {
  112.             print "Cell_val -> <$cell_val> | rows -> <$rows> | columns -> <$cell> | Col Val -> <$uniqISU{$cell_val}{$rows}{$cell}> \n";
  113.         }
  114.     }
  115. }
  116.  
  117. my $format;
  118. my ($header1,$header2,$header3);
  119. my %header;
  120.  
  121. foreach my $cell_val ( sort keys %uniqISU ) {
  122.     my $header = 1;
  123.     my $workbook;
  124.     my $sheet1;
  125.  
  126.     if ( $cell_val ne "Dep" ) {
  127.         $workbook = Spreadsheet::WriteExcel->new("$cell_val.xls");
  128.         $sheet1 = $workbook->add_worksheet("$cell_val");
  129.  
  130.         # Add a Format
  131.         $format = $workbook->add_format();
  132.         $format->set_bold();
  133.         $format->set_size(12);
  134.         $format->set_color('blue');
  135.         $format->set_align('center');
  136.  
  137.         $sheet1->set_column(0, 0);
  138.     }
  139.  
  140.  
  141.     foreach my $rows ( sort keys %{$uniqISU{$cell_val}} ) {
  142.         foreach my $cell ( sort keys %{$uniqISU{$cell_val}{$rows}} ) {
  143.             if ( $header == 1 && $cell_val eq "Dep" ) {
  144.                 print "Header Detail ************ $rows, $cell, $uniqISU{$cell_val}{$rows}{$cell}\n";
  145.                 $header{$cell_val}{$rows}{$cell} = $uniqISU{$cell_val}{$rows}{$cell};
  146.             }
  147.             elsif ( $cell_val ne "Dep" ) {
  148.                 print "$header{$cell_val},$header{$cell_val}{$rows},$header{$cell_val}{$rows}{$cell} \n";
  149.                 $sheet1->write($header{$cell_val},$header{$cell_val}{$rows},$header{$cell_val}{$rows}{$cell}) if ( $header == 1 );
  150.                 $sheet1->write($rows, $cell, $uniqISU{$cell_val}{$rows}{$cell});
  151.             }
  152.             $header++;
  153.         }
  154.     }
  155. }
Nov 12 '08 #4
hi ,

Just want to let you know that the code that you have pasted helped me understand how to solve one of my problem..i was new to perl and wondering where/how to start and it gave me a good starting point..

thanks,
vijay
Mar 27 '09 #5

Sign in to post your reply or Sign up for a free account.

Similar topics

13
by: Dixie | last post by:
How do I, in VBA from an access form module, add 5 rows to the top of a spreadsheet called MySpreadsheet.xls. The worksheet is called MyWorksheet and there is already data in the first 5 rows. I...
0
by: Peter | last post by:
I am having a problem reading an Excel file that is XML based. The directory I am reading contains Excel files that can be of two types. Either generic Microsoft based or XML based. I am reading...
0
by: Chris Becker | last post by:
I want to create a TableCell object once and add it to many tables: Dim cell As New TableCell cell.Style.Add("background-color", s) cell.Style.Add("cursor", "pointer") cell.Width =...
24
by: clare at snyder.on.ca | last post by:
I have a SQL query I need to design to select name and email addresses for policies that are due and not renewed in a given time period. The problem is, the database keeps the information for every...
18
by: Frank M. Walter | last post by:
Hello, I have made an small AddIn with udf for excel 2003. I use vs2003. The point of view is the function __T() I call it in excel sheet writing =__T() I am not able to set a value to a...
11
by: Tempo | last post by:
Hello. I am getting the error that is displayed below, and I know exactly why it occurs. I posted some of my program's code below, and if you look at it you will see that the error terminates the...
7
by: deep022in | last post by:
guys, I have written a perl script using Win32::OLe module. I was able to read it completly. but I have to specify the number of rows in excel sheet as command line option. I want to know how...
1
by: Terry | last post by:
Hi Language: VBA (Excel) I am wanting to execute different functions/subs dependent on a value picked up in a cell on Excel. For example if Cell A1 has the value "name" then I want to be able...
9
by: sitko | last post by:
Hi, I have an Order tracking spreadsheet that I need help with. I have a 2 worksheets "Open", and "Closed". I have entries on the "Open" sheet which may or may not be grouped together. I've...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.