Search for duplicates via IPTC field (OS X)

I want to filter a folder of images by an IPTC value so that I can find the duplicates that riddle my collection. Sometimes five versions of a single image exist, all with different file names. The key IPTC field is "Headline" (or "IPTC:By-line"); duplicates share the same value in it. When a duplicate is found, the image is moved to a separate folder, away from the main collection.

The following code works with a program called ExifTool (http://www.sno.phy.queensu.ca/~phil/exiftool/), which is how the code is able to read the IPTC data:


  
#!/usr/bin/perl -w
use strict;

BEGIN { unshift @INC, '/usr/bin/lib' }
use Image::ExifTool;

print "Using ExifTool version $Image::ExifTool::VERSION\n";

@ARGV > 2 or die "Syntax: script TAG DIR FILE [FILE...]\n";

my $tag = shift;
my $dstdir = shift;
my $exifTool = new Image::ExifTool;
my $moved = 0;
my ($file, %foundValue);

foreach $file (@ARGV) {
    my $info = $exifTool->ImageInfo($file, $tag);
    unless (%$info) {
        warn "$tag not found in $file\n";
        next;
    }
    my ($val) = values %$info;
    if ($foundValue{$val}) {
        # duplicate value, so move to destination directory
        print "$tag is the same in $file as $foundValue{$val}\n";
        my $dst = $file;
        $dst =~ s{.*/}{};  # remove directory name
        $dst = "$dstdir/$dst";
        if (-e $dst) {
            warn "$dst already exists!\n";
        } elsif (not rename($file, $dst)) {
            warn "Error moving $file to $dst\n";
        } else {
            print "  --> moved\n";
            ++$moved;
        }
    } else {
        $foundValue{$val} = $file;  # save first file with this value
    }
}

printf "%5d files processed\n", scalar(@ARGV);
printf "%5d files moved\n", $moved;
# end

However, there would still be many images to sort, so I would like this code modified so that each group of images (for example, all the ones with "Henry Ford Clinic" in the Headline field) is placed in a single folder named after the value of the IPTC field. In that case the folder would be called "Henry Ford Clinic".

On top of that, some images are the same but have "-framed in mahogany" on the end of the value; these images have a faux frame but are essentially the same image. For these, I thought that if I restricted the comparison to, say, the first 15 characters, I would catch all the duplicates. This has not been programmed in either.

Aug 9 '10 #1


numberwhun


For what you are doing, look through $info, find the IPTC field, and then use a regular expression to grab the first 15 characters of its value. If you could post:

1. a printout of $info
2. what the IPTC entry itself looks like

we can help you. Otherwise you can try to do it yourself. Either way, once you have the value you want, you can use it to create a directory name and put the files in there.
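One way to produce the printout asked for above is Data::Dumper, which is a core Perl module. This is only a sketch: the hash reference below is a hard-coded, hypothetical stand-in for what `$exifTool->ImageInfo($file, $tag)` returns in the script, and the tag name and value are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for the hash reference returned by
# $exifTool->ImageInfo($file, $tag); tag name and value are made up.
my $info = { 'By-line' => 'Henry Ford Clinic-framed in mahogany' };

# Sort keys so the dump looks the same from run to run.
$Data::Dumper::Sortkeys = 1;
print Dumper($info);

# The script then picks out the single requested value like this:
my ($val) = values %$info;
print "value: $val\n";
```

In the real script, the same `print Dumper($info);` line could go right after the `ImageInfo` call inside the loop.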

Regards,

Jeff

Aug 10 '10 #2

Tim Wattings

Thanks for the reply. I was looking into substr as a way to take the first n chars. Would this go from char 0 to the 15th?


my $oneName = substr($names, 0, 15);

I'm not sure what you mean by the IPTC entry. All of the data in that By-line field will be different, so I probably just don't understand. Do you mean to print the info so that it can be used as the folder name?

Thanks.

Aug 10 '10 #3

numberwhun


Well, if you are sure that the name is 15 characters in length, the substr() function should do the job. I am just a regex hound so I tend to recommend them.
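For comparison, here is a small sketch of both approaches side by side. The sub names and sample strings are hypothetical. `substr($text, 0, 15)` returns the characters at offsets 0 through 14, and if the string is shorter than 15 characters it simply returns the whole string, so the value does not have to be exactly 15 characters long.

```perl
use strict;
use warnings;

# Return at most the first $n characters of $text using substr.
sub prefix_substr {
    my ($text, $n) = @_;
    return substr($text, 0, $n);
}

# Same result with a regular expression: capture up to $n characters
# from the start of the string (/s lets "." match newlines too).
sub prefix_regex {
    my ($text, $n) = @_;
    my ($prefix) = $text =~ /\A(.{0,$n})/s;
    return $prefix;
}

# Hypothetical sample values: one longer than 15 characters, one shorter.
for my $text ('Henry Ford Clinic-framed in mahogany', 'Short name') {
    printf "substr: '%s'  regex: '%s'\n",
        prefix_substr($text, 15), prefix_regex($text, 15);
}
```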

Test it and see if it works to create the directories you expect, but don't move files.
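A dry run along those lines might look like the sketch below. It only prints where each file would go and never creates a directory or moves anything. The sub names, file names, and the `sorted` destination are hypothetical, and replacing "/" and ":" with "_" in folder names is an assumption about what is safe on OS X, not something from the original script.

```perl
use strict;
use warnings;
use File::Spec;

# Turn an IPTC value into a folder name: keep the first $n characters,
# trim surrounding whitespace, and replace "/" and ":" with "_"
# (the replacement rule is an assumption, see above).
sub folder_name_for {
    my ($value, $n) = @_;
    my $name = substr($value, 0, $n);
    $name =~ s/^\s+|\s+$//g;
    $name =~ tr{/:}{_};
    return $name;
}

# Work out the destination path for one file without touching the disk.
sub planned_destination {
    my ($dstdir, $file, $value, $n) = @_;
    my (undef, undef, $basename) = File::Spec->splitpath($file);
    return File::Spec->catfile($dstdir, folder_name_for($value, $n), $basename);
}

# Hypothetical file names and values, for illustration only.
my %value_for = (
    'pics/a001.jpg' => 'Henry Ford Clinic',
    'pics/b417.jpg' => 'Henry Ford Clinic-framed in mahogany',
);
for my $file (sort keys %value_for) {
    print "$file would move to ",
        planned_destination('sorted', $file, $value_for{$file}, 15), "\n";
}
```

Once the printed paths look right, the print could be swapped for the directory creation and the `rename` call.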

Aug 10 '10 #4

Tim Wattings

I'm not entirely sure it will be 15 chars, but of course I can modify it. I found this example, but I'm not sure where to place it in my code:


 
$ perl -e 'my $names="12345678901234567890"; my $oneName = substr($names, 0, 15); print "$oneName\n";'
123456789012345
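As for where it would go: in the script from the first post, the value is read with `my ($val) = values %$info;`, so shortening it on the next line (for example `$val = substr($val, 0, 15);`) would make `%foundValue` compare prefixes instead of whole values. The sketch below shows the same idea as a standalone grouping step, so it can be tried without ExifTool. The sub name, file names, and values are hypothetical.

```perl
use strict;
use warnings;

# Group file names by the first $n characters of their tag value.
# Takes a hash ref of file => value and returns a hash ref of
# prefix => array ref of files, keeping only prefixes shared by
# two or more files (that is, the likely duplicates).
sub group_by_prefix {
    my ($value_for, $n) = @_;
    my %group;
    for my $file (sort keys %$value_for) {
        my $key = substr($value_for->{$file}, 0, $n);
        push @{ $group{$key} }, $file;
    }
    # Drop prefixes that only a single file has.
    delete $group{$_} for grep { @{ $group{$_} } < 2 } keys %group;
    return \%group;
}

# Hypothetical data standing in for the values ExifTool would return.
my %value_for = (
    'a001.jpg' => 'Henry Ford Clinic',
    'b417.jpg' => 'Henry Ford Clinic-framed in mahogany',
    'c220.jpg' => 'Detroit skyline',
);
my $groups = group_by_prefix(\%value_for, 15);
for my $key (sort keys %$groups) {
    print "$key: @{ $groups->{$key} }\n";
}
```

Each key of the returned hash could then serve as the folder name for the files listed under it.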

Aug 10 '10 #5
