Bytes IT Community

Search for duplicates via IPTC field (OS X)

P: 3
I am looking to filter a folder of images by an IPTC value so that I can find the duplicates that riddle my collection. Sometimes five versions of a single image exist, all with different file names. The key IPTC field is the "Headline" (or "IPTC:By-line"), which holds the same value across duplicates. When a duplicate is found, the image is moved to a separate folder, away from the main collection.

The following code relies on a program called ExifTool (http://www.sno.phy.queensu.ca/~phil/exiftool/), which is how it is able to read the IPTC data:

#!/usr/bin/perl -w
use strict;
BEGIN { unshift @INC, '/usr/bin/lib' }
use Image::ExifTool;
print "Using ExifTool version $Image::ExifTool::VERSION\n";

# need a tag name, a destination directory and at least one file
@ARGV > 2 or die "Syntax: script TAG DIR FILE [FILE...]\n";

my $tag = shift;
my $dstdir = shift;

my $exifTool = Image::ExifTool->new;
my $moved = 0;
my ($file, %foundValue);
foreach $file (@ARGV) {
    # read only the requested tag from this file
    my $info = $exifTool->ImageInfo($file, $tag);
    unless (%$info) {
        warn "$tag not found in $file\n";
        next;
    }
    my ($val) = values %$info;
    if ($foundValue{$val}) {
        # duplicate value, so move to destination directory
        print "$tag is the same in $file as $foundValue{$val}\n";
        my $dst = $file;
        $dst =~ s{.*/}{};  # remove directory name
        $dst = "$dstdir/$dst";
        if (-e $dst) {
            warn "$dst already exists!\n";
        } elsif (not rename($file, $dst)) {
            warn "Error moving $file to $dst\n";
        } else {
            print "  --> moved\n";
            ++$moved;
        }
    } else {
        $foundValue{$val} = $file;  # save first file with this value
    }
}
printf "%5d files processed\n", scalar(@ARGV);
printf "%5d files moved\n", $moved;
# end

However, that would still leave many files to sort by hand. I would like the code modified so that each group of images (for example, all the ones with "Henry Ford Clinic" in the Headline field) is placed in a single folder named after the value of the IPTC field, so that folder would be called "Henry Ford Clinic".
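One way the grouping described above might look, as a rough sketch only. It assumes the tag values have already been read into a hash of file path => value (for instance with ImageInfo as in the script above). The helper names folder_name and move_into_groups are made up for illustration and are not part of ExifTool; only core Perl modules are used.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Copy qw(move);
use File::Path qw(make_path);
use File::Spec;

# Hypothetical helper: turn a tag value into a usable folder name.
sub folder_name {
    my ($val) = @_;
    (my $name = $val) =~ s{[/:\0]}{_}g;   # replace characters that are awkward in OS X paths
    $name =~ s/^\s+|\s+$//g;              # trim surrounding whitespace
    return length $name ? $name : 'untitled';
}

# Hypothetical helper: move each file into "$dstdir/<folder named after its value>".
# $value_of is a hash reference mapping file path => tag value.
sub move_into_groups {
    my ($dstdir, $value_of) = @_;
    my $moved = 0;
    for my $file (sort keys %$value_of) {
        my $dir = File::Spec->catdir($dstdir, folder_name($value_of->{$file}));
        make_path($dir);                  # does nothing if the folder already exists
        my $dst = File::Spec->catfile($dir, basename($file));
        if (-e $dst) {
            warn "$dst already exists!\n";
            next;
        }
        if (move($file, $dst)) {
            ++$moved;
        } else {
            warn "Error moving $file to $dst: $!\n";
        }
    }
    return $moved;
}
```

This sketch moves every file it is given, so whether single, non-duplicated images should also get their own folder is left to whoever builds the hash.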

On top of that, some images are the same but have "-framed in mahogany" on the end of the value. These images have a faux frame but are essentially the same image. I thought that if I could restrict the comparison to, say, the first 15 characters, I would catch all of these duplicates as well. This has not been programmed in either.
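To illustrate the "first 15 characters" idea, here is a small sketch. The helper name dup_key and the default length of 15 are assumptions for illustration; the two headline strings are the examples from this post.

```perl
use strict;
use warnings;

# Hypothetical helper: the comparison key is the first $len characters of the value.
sub dup_key {
    my ($val, $len) = @_;
    $len = 15 unless defined $len;
    return substr($val, 0, $len);   # a value shorter than $len is returned whole
}

my @headlines = (
    'Henry Ford Clinic',
    'Henry Ford Clinic-framed in mahogany',
);

# Both lines print "Henry Ford Clin", so the two values compare as equal.
print dup_key($_), "\n" for @headlines;
```

The catch is that two genuinely different headlines sharing their first 15 characters would also be treated as duplicates, so the length probably needs tuning against the real collection.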
Aug 9 '10 #1
4 Replies


numberwhun
Expert Mod 2.5K+
P: 3,503
What you want to do is look through $info, find the IPTC field, and then use a regular expression to grab the first 15 characters of its value. If you could post:

1. a print out of $info
2. what the IPTC entry itself looks like

we can help you. Otherwise, you can try it yourself. Either way, once you have the value you want, you can use it to create a directory name and put the files in there.
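For reference, a minimal sketch of both steps: printing $info with the core Data::Dumper module, and grabbing the first 15 characters with a regular expression. The hash below is made-up sample data standing in for what ImageInfo might return for a single tag; the real contents will depend on the image.

```perl
use strict;
use warnings;
use Data::Dumper;

# Made-up stand-in for the hash reference returned by ImageInfo for one tag.
my $info = { Headline => 'Henry Ford Clinic-framed in mahogany' };

$Data::Dumper::Sortkeys = 1;   # stable key order in the dump
print Dumper($info);           # the "print out of $info"

my ($val) = values %$info;

# Capture up to the first 15 characters; /s lets "." match a newline too.
my ($prefix) = $val =~ /^(.{1,15})/s;
print "$prefix\n";             # prints "Henry Ford Clin"
```

Note that the capture leaves $prefix undefined when the value is an empty string, so a real script would want to check for that before using it as a directory name.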

Regards,

Jeff
Aug 10 '10 #2

P: 3
Thanks for the reply. I was looking into substr to get the first n characters. Would this take the first 15 characters, starting from character 0?

my $oneName = substr($names, 0, 15);

I'm not sure what you mean by the IPTC entry. All of the data in that By-line field will be different, so I probably just don't understand. Do you mean to print the info so as to use it as the folder name?


Thanks.
Aug 10 '10 #3

numberwhun
Expert Mod 2.5K+
P: 3,503
Well, if you are sure that the name is 15 characters in length, the substr() function should do the job. I am just a regex hound so I tend to recommend them.

Test it and see if it works to create the directories you expect, but don't move files.
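A dry run along those lines could be sketched like this. The helper name plan_moves and the sample paths are made up for illustration; nothing is created or moved, the planned destinations are only printed.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: return [source, destination] pairs without touching the disk.
# $value_of maps file path => tag value; $len is how many characters name the folder.
sub plan_moves {
    my ($dstdir, $value_of, $len) = @_;
    my @plan;
    for my $file (sort keys %$value_of) {
        my $folder = substr($value_of->{$file}, 0, $len);
        push @plan, [ $file, "$dstdir/$folder/" . basename($file) ];
    }
    return @plan;
}

my %value_of = (   # made-up sample data
    'pics/a.jpg' => 'Henry Ford Clinic',
    'pics/b.jpg' => 'Henry Ford Clinic-framed in mahogany',
);

# Show what would happen, one "source -> destination" line per file.
printf "%s -> %s\n", @$_ for plan_moves('sorted', \%value_of, 15);
```

Once the printed destinations look right, the same pairs could be handed to the code that actually creates the directories and moves the files.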
Aug 10 '10 #4

P: 3
I'm not entirely sure it will be 15 chars, but of course I can modify it. I got this example, but I'm not sure where to place it in my code:

$ perl -e 'my $names="12345678901234567890"; my $oneName = substr($names, 0, 15); print "$oneName\n";'
123456789012345
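As for where it might go: in the script from the first post, the substr call would probably sit right after `my ($val) = values %$info;`, with %foundValue keyed on the shortened string instead of the full value. The sketch below shows that logic on its own, with a made-up list of [file, value] pairs standing in for the ExifTool lookups; the name find_duplicates is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the files that count as duplicates when only the
# first $len characters of the tag value are compared.
sub find_duplicates {
    my ($len, @pairs) = @_;                # each pair is [file, value]
    my (%foundValue, @dups);
    for my $pair (@pairs) {
        my ($file, $val) = @$pair;
        my $key = substr($val, 0, $len);   # the added line: shorten before comparing
        if (exists $foundValue{$key}) {
            push @dups, $file;             # the real script would move $file here
        } else {
            $foundValue{$key} = $file;     # first file with this key stays put
        }
    }
    return @dups;
}

my @dups = find_duplicates(15,
    [ 'a.jpg', 'Henry Ford Clinic' ],
    [ 'b.jpg', 'Henry Ford Clinic-framed in mahogany' ],
    [ 'c.jpg', 'Something else entirely' ],
);
print "$_\n" for @dups;                    # prints "b.jpg"
```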
Aug 10 '10 #5
