473,608 Members | 2,287 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

How to get a duplicate rows based on the particular cell value in excel using Perl

5 New Member
Hi,

I am currently working on excel validation using Perl. I am new to Excel validation but not for Perl. I have a question regarding one of the validation. Could anyone please tell me how to get the number of duplicate rows based on a particular cell value of each these duplicate rows. I mean all the cell values of a row will not be duplicated but a individual columns cell value will be duplicated and I need to create a separate excel with all these duplicates.

Eg :

Name Org Dep Place

RavindraN TCS Development Blore
Ravi TCS Scripting blore
Vimal AOL Development blore
Ram Aol Development blore
Priya aol Scripting blore

Here, I need to get the duplicate rows based on Dep. The output should be :

RavindraN TCS Development Blore
Vimal AOL Development blore
Ram Aol Development blore

and

Ravi TCS Scripting blore
Priya aol Scripting blore

Thanks
Ravi
Nov 10 '08 #1
4 5435
nithinpes
410 Recognized Expert Contributor
What have you tried so far?
Nov 10 '08 #2
ravir81
5 New Member
Hi,

Since I am new to excel validation, I am trying out the how to get the value and display them. I tried to get all the values of a particular cell and store it in a array so that I can try to get a duplicated from that cell.

I need to get the duplicate records based on the column "Dep".

Expand|Select|Wrap|Line Numbers
  1. #!/usr/local/bin/perl
  2.  
  3. use strict;
  4. use Spreadsheet::ParseExcel;
  5.  
  6. my %UniqueCell_Row = ();
  7. my $excel_name; 
  8.  
  9. $excel_name = "D:\\RAVINDRAN\\Excel-Email-sending\\name.xls";
  10. print "Excel Name -> <$excel_name> \n";
  11.  
  12. my $workbook = Spreadsheet::ParseExcel::Workbook->Parse($excel_name);
  13.  
  14. print "Book -> <$workbook> \n";
  15. my $cont_col;
  16. my $cell_value;
  17.  
  18. ### To get the particular columns all the value
  19. foreach my $sheet (@{$workbook->{Worksheet}}) {
  20.     printf("Sheet: %s\n", $sheet->{Name});
  21.     my @cols;
  22.  
  23.     foreach my $col ($sheet->{MinCol} .. $sheet->{MaxCol}) {
  24.         print "Column -> <$col> \n";
  25.         print "$sheet->{Cells}[0][$col]->{Val} \n";
  26.         if ($sheet->{Cells}[0][$col]->{Val} eq "Dep") {
  27.             $cont_col = $col;
  28.             print "Column Number -> $cont_col\n";
  29.             last;
  30.         }
  31.     }
  32.  
  33.     for(
  34.         my $iR = $sheet->{MinRow};
  35.         defined $sheet->{MaxRow} && $iR <= $sheet->{MaxRow}; 
  36.         $iR++
  37.     ){
  38.  
  39.         # fetch columns
  40.         my $cell = $sheet->{Cells}[$iR][$cont_col];
  41.         print "sheet->{Cells}[$iR][$cont_col] \n";
  42.         print "cell->{Val} ----> <$cell->{Val}> \n";
  43.         $cell_value = $cell->{Val};
  44.         push(@cols,$cell_value);
  45.         #push ( @{$UniqueCell_Row{$cell_value}},"$row,$col,$cell_value" );
  46.     }
  47.  
  48.     print "Colllummmn : <@cols> \n";
  49. }
  50.  
  51. ## This is just to print all the columns value
  52. foreach my $sheet (@{$workbook->{Worksheet}}) {
  53.     printf("Sheet: %s\n", $sheet->{Name});
  54.     foreach my $row ($sheet->{MinRow} .. $sheet->{MaxRow}) {
  55.         foreach my $col ($sheet->{MinCol} ..  $sheet->{MaxCol}) {
  56.  
  57.             my $cell = $sheet->{Cells}[$row][$col];
  58.             my $cell_value = $cell->{Val};
  59.  
  60.             if ( $cell ) {
  61.                 printf("( %s , %s ) => %s\n", $row, $col, $cell->{Val});
  62.                 $UniqueCell_Row{$row}{$col} = $cell_value;
  63.             }
  64.             if ( $cell && $cell_value eq "Dep" ) {
  65.                 printf("Department Cell Value is : ( %s , %s ) => %s\n", $row, $col, $cell->{Val});
  66.                 #push ( @{$UniqueCell_Row{$cell_value}},"$row,$col,$cell_value" );
  67.             }
  68.         }
  69.     }
  70. }
  71.  
  72. ## This is just to get the values of the record
  73. my @row_cont = ();
  74. foreach my $row ( sort keys %UniqueCell_Row ) {
  75.     foreach my $cell ( sort keys %{$UniqueCell_Row{$row}} ) {
  76.         print "Rows and Column -----> <$row> | <$cell> | $UniqueCell_Row{$row}{$cell} \n";
  77.     }
  78. }
Could you please help me on this.

Thanks
Nov 11 '08 #3
ravir81
5 New Member
Hi,

I have taken the duplicate records based on the particular cell value. I need to created excel file for all the duplicate records. Here I am facing problem with writing the header only once for all the new excels created here.

Here is the updated code :

Expand|Select|Wrap|Line Numbers
  1. #!/usr/local/bin/perl
  2.  
  3. use strict;
  4. use Spreadsheet::ParseExcel;
  5. use Spreadsheet::WriteExcel;
  6.  
  7. my %UniqueCell_Row = ();
  8. my $excel_name; 
  9.  
  10. $excel_name = "D:\\RAVINDRAN\\Excel-Email-sending\\name.xls";
  11. print "Excel Name -> <$excel_name> \n";
  12.  
  13. my $workbook = Spreadsheet::ParseExcel::Workbook->Parse($excel_name);
  14.  
  15. print "Book -> <$workbook> \n";
  16. my $cont_col;
  17. my $cell_value;
  18. my @cols;
  19.  
  20. ### To get the particular columns all the value
  21. foreach my $sheet (@{$workbook->{Worksheet}}) {
  22.     printf("Sheet: %s\n", $sheet->{Name});
  23.  
  24.     foreach my $col ($sheet->{MinCol} .. $sheet->{MaxCol}) {
  25.         print "Column -> <$col> \n";
  26.         print "$sheet->{Cells}[0][$col]->{Val} \n";
  27.         if ($sheet->{Cells}[0][$col]->{Val} eq "Dep") {
  28.             $cont_col = $col;
  29.             print "Column Number -> $cont_col\n";
  30.             last;
  31.         }
  32.     }
  33.  
  34.     for(
  35.         my $iR = $sheet->{MinRow};
  36.         defined $sheet->{MaxRow} && $iR <= $sheet->{MaxRow}; 
  37.         $iR++
  38.     ){
  39.  
  40.         # fetch columns
  41.         my $cell = $sheet->{Cells}[$iR][$cont_col];
  42.         print "sheet->{Cells}[$iR][$cont_col] \n";
  43.         print "cell->{Val} ----> <$cell->{Val}> \n";
  44.         $cell_value = $cell->{Val};
  45.         push(@cols,$cell_value);
  46.         #push ( @{$UniqueCell_Row{$cell_value}},"$row,$col,$cell_value" );
  47.     }
  48.  
  49.     print "Colllummmn : <@cols> \n";
  50. }
  51.  
  52. ## This is just to print all the columns value
  53. foreach my $sheet (@{$workbook->{Worksheet}}) {
  54.     printf("Sheet: %s\n", $sheet->{Name});
  55.     foreach my $row ($sheet->{MinRow} .. $sheet->{MaxRow}) {
  56.         foreach my $col ($sheet->{MinCol} ..  $sheet->{MaxCol}) {
  57.  
  58.             my $cell = $sheet->{Cells}[$row][$col];
  59.             my $cell_value = $cell->{Val};
  60.  
  61.             if ( $cell ) {
  62.                 printf("( %s , %s ) => %s\n", $row, $col, $cell->{Val});
  63.                 $UniqueCell_Row{$row}{$col} = $cell_value;
  64.             }
  65.             if ( $cell && $cell_value eq "Dep" ) {
  66.                 printf("Department Cell Value is : ( %s , %s ) => %s\n", $row, $col, $cell->{Val});
  67.                 #push ( @{$UniqueCell_Row{$cell_value}},"$row,$col,$cell_value" );
  68.             }
  69.         }
  70.     }
  71. }
  72.  
  73. ## This is just to get the unique record
  74. my @row_cont = ();
  75. foreach my $row ( sort keys %UniqueCell_Row ) {
  76.     foreach my $cell ( sort keys %{$UniqueCell_Row{$row}} ) {
  77.         print "Rows and Column -----> <$row> | <$cell> | $UniqueCell_Row{$row}{$cell} \n";
  78.         #if ( $UniqueCell_Row{$row}{$cont_col}
  79.  
  80.     }
  81. }
  82.  
  83. print "========================================= \n";
  84.  
  85. my @Dup;
  86. my %uniqISU = ();
  87. my $duplicate;
  88. my $unique;
  89.  
  90. foreach my $row ( sort keys %UniqueCell_Row ) {
  91.     foreach my $cell ( sort keys %{$UniqueCell_Row{$row}} ) {
  92.         print "Rows and Column -----> <$row> | <$cell> | $UniqueCell_Row{$row}{$cont_col} \n";
  93.         if ( grep( /$UniqueCell_Row{$row}{$cont_col}/, @Dup ) ) {
  94.             print "Present : <$UniqueCell_Row{$row}{$cont_col}> | <@cols> \n";
  95.             $duplicate = $UniqueCell_Row{$row}{$cont_col};
  96.             $uniqISU{$duplicate}{$row}{$cell} = $UniqueCell_Row{$row}{$cell};
  97.         }
  98.         else {
  99.             print "NOT PRESENT : <$UniqueCell_Row{$row}{$cont_col}> \n";
  100.             push @Dup, $UniqueCell_Row{$row}{$cont_col};
  101.             $duplicate = $UniqueCell_Row{$row}{$cont_col};
  102.             $uniqISU{$duplicate}{$row}{$cell} = $UniqueCell_Row{$row}{$cell};
  103.         }
  104.     }
  105. }
  106.  
  107. print "***************************************************** \n";
  108. foreach my $cell_val ( sort keys %uniqISU ) {
  109.     print "Cell Value ---> <$cell_val> \n";
  110.     foreach my $rows ( sort keys %{$uniqISU{$cell_val}} ) {
  111.         foreach my $cell ( sort keys %{$uniqISU{$cell_val}{$rows}} ) {
  112.             print "Cell_val -> <$cell_val> | rows -> <$rows> | columns -> <$cell> | Col Val -> <$uniqISU{$cell_val}{$rows}{$cell}> \n";
  113.         }
  114.     }
  115. }
  116.  
  117. my $format;
  118. my ($header1,$header2,$header3);
  119. my %header;
  120.  
  121. foreach my $cell_val ( sort keys %uniqISU ) {
  122.     my $header = 1;
  123.     my $workbook;
  124.     my $sheet1;
  125.  
  126.     if ( $cell_val ne "Dep" ) {
  127.         $workbook = Spreadsheet::WriteExcel->new("$cell_val.xls");
  128.         $sheet1 = $workbook->add_worksheet("$cell_val");
  129.  
  130.         # Add a Format
  131.         $format = $workbook->add_format();
  132.         $format->set_bold();
  133.         $format->set_size(12);
  134.         $format->set_color('blue');
  135.         $format->set_align('center');
  136.  
  137.         $sheet1->set_column(0, 0);
  138.     }
  139.  
  140.  
  141.     foreach my $rows ( sort keys %{$uniqISU{$cell_val}} ) {
  142.         foreach my $cell ( sort keys %{$uniqISU{$cell_val}{$rows}} ) {
  143.             if ( $header == 1 && $cell_val eq "Dep" ) {
  144.                 print "Header Detail ************ $rows, $cell, $uniqISU{$cell_val}{$rows}{$cell}\n";
  145.                 $header{$cell_val}{$rows}{$cell} = $uniqISU{$cell_val}{$rows}{$cell};
  146.             }
  147.             elsif ( $cell_val ne "Dep" ) {
  148.                 print "$header{$cell_val},$header{$cell_val}{$rows},$header{$cell_val}{$rows}{$cell} \n";
  149.                 $sheet1->write($header{$cell_val},$header{$cell_val}{$rows},$header{$cell_val}{$rows}{$cell}) if ( $header == 1 );
  150.                 $sheet1->write($rows, $cell, $uniqISU{$cell_val}{$rows}{$cell});
  151.             }
  152.             $header++;
  153.         }
  154.     }
  155. }
Nov 12 '08 #4
kogantivijay
1 New Member
hi ,

Just want to let you know that the code that you have pasted helped me understand how to solve one of my problem..i was new to perl and wondering where/how to start and it gave me a good starting point..

thanks,
vijay
Mar 27 '09 #5

Sign in to post your reply or Sign up for a free account.

Similar topics

13
7773
by: Dixie | last post by:
How do I, in VBA from an access form module, add 5 rows to the top of a spreadsheet called MySpreadsheet.xls. The worksheet is called MyWorksheet and there is already data in the first 5 rows. I simply need to insert 5 blank rows at the top and move the rest of the data down so it starts at row 6. dixie
0
4685
by: Peter | last post by:
I am having a problem reading an Excel file that is XML based. The directory I am reading contains Excel files that can be of two types. Either generic Microsoft based or XML based. I am reading the Microsoft based files with an OleDbDataAdapter. Then filling the contents of the first worksheet into a dataset. However when I try to add the XML based file to my dataset using an XmlTextReader I can never seem to get it to save to a...
0
403
by: Chris Becker | last post by:
I want to create a TableCell object once and add it to many tables: Dim cell As New TableCell cell.Style.Add("background-color", s) cell.Style.Add("cursor", "pointer") cell.Width = Unit.Pixel(CELLDIMENSION) cell.Height = Unit.Pixel(CELLDIMENSION) cell.HorizontalAlign = HorizontalAlign.Center cell.Text = "&nbsp;" cell.Attributes.Add("Title", s)
24
19896
by: clare at snyder.on.ca | last post by:
I have a SQL query I need to design to select name and email addresses for policies that are due and not renewed in a given time period. The problem is, the database keeps the information for every renewal in the history of the policyholder. The information is in 2 tables, policy and customer, which share the custid data. The polno changes with every renewal Renewals in 2004 would be D, 2005 S, and 2006 L. polexpdates for a given customer...
18
8276
by: Frank M. Walter | last post by:
Hello, I have made an small AddIn with udf for excel 2003. I use vs2003. The point of view is the function __T() I call it in excel sheet writing =__T() I am not able to set a value to a given cell. region.Value2="qwe"; //bumm! A exception will be trown. On all PCs with excel. HRESULT 0x800A03EC
11
2550
by: Tempo | last post by:
Hello. I am getting the error that is displayed below, and I know exactly why it occurs. I posted some of my program's code below, and if you look at it you will see that the error terminates the program pre-maturely. Becasue of this pre-mature termination, the program is not able to execute it's final line of code, which is a very important line. The last line saves the Excel spreadsheet. So is there a way to make sure the last line...
7
42654
by: deep022in | last post by:
guys, I have written a perl script using Win32::OLe module. I was able to read it completly. but I have to specify the number of rows in excel sheet as command line option. I want to know how to retrieve number of rows of the excel sheet and number of columns also. i tried to retrieve the properties using the each methode. see below
1
1627
by: Terry | last post by:
Hi Language: VBA (Excel) I am wanting to execute different functions/subs dependent on a value picked up in a cell on Excel. For example if Cell A1 has the value "name" then I want to be able to run the NewName() sub and send to it the value of Cell B1 (which will contain a name). However, if Cell A1 contains the value "Age" then I want to run the NewAge() sub and send it the value of cell B1 (which will contain the age). The reason...
9
16762
by: sitko | last post by:
Hi, I have an Order tracking spreadsheet that I need help with. I have a 2 worksheets "Open", and "Closed". I have entries on the "Open" sheet which may or may not be grouped together. I've called the rows :Parents and Children. There are basically 3 types of rows. Parents with Children, Parents without Children, and Children. I have a reference cell for each row, where I identify what type of row it is, "-1" = Child, "0" = Parent with...
0
8063
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
8496
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
8475
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
8148
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
1
6013
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5475
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4024
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
2474
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
0
1329
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.