
extracting pdf files

Hi,

I am learning Perl. I have a folder containing 1000 research articles in PDF format, and I want to write a Perl script that extracts the title of each article and renames the PDF file after it. I am using the module PDF::OCR::Thorough:
==============================================
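#!/usr/bin/perl

use strict;
use warnings;

use PDF::OCR::Thorough;

# Path to the PDF to process
my $abs_pdf = 'paper.pdf';

# Create the OCR object for this PDF
my $p = PDF::OCR::Thorough->new($abs_pdf);

# Extract all text from the PDF (the method-call operator is ->, not -->)
my $text = $p->get_text;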


__OUTPUTFILE-CREATED___

doc_data.txt
==============================================

Running the script creates the output file doc_data.txt. If the article is bookmarked, I am able to extract the title exactly from this output and rename the file accordingly. I am able to extract the text, but how can I reliably extract just the title, given that different journals use different formats? Can anyone help?

regards
Suresh
Feb 26 '08 #1
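
A minimal sketch of one possible heuristic for the question above, not a definitive solution. It assumes the title is the first line near the top of the extracted text that has at least three words and does not look like a journal header, URL, DOI, or page number. The helper names guess_title and title_to_filename, the skip patterns, and the sample text are all hypothetical, and real journals will need their own tuning.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: guess a title from text extracted from a PDF.
# Assumption: the title is the first line near the top with at least
# three words that is not an obvious header, URL, DOI or page number.
sub guess_title {
    my ($text) = @_;
    my @lines = grep { /\S/ } split /\r?\n/, $text;   # drop blank lines
    my $limit = @lines < 20 ? $#lines : 19;           # only look near the top
    for my $line (@lines[0 .. $limit]) {
        $line =~ s/^\s+|\s+$//g;                      # trim whitespace
        next if $line =~ /^\d+$/;                     # bare page number
        next if $line =~ m{https?://|www\.|\bdoi\b}i; # URL or DOI line
        next if $line =~ /\b(?:journal|volume|issn|copyright)\b/i;
        my @words = split ' ', $line;
        return $line if @words >= 3;
    }
    return;   # no candidate found
}

# Hypothetical helper: turn a title into a safe file name.
sub title_to_filename {
    my ($title) = @_;
    (my $name = $title) =~ s/[^A-Za-z0-9]+/_/g;   # replace unsafe characters
    $name = substr($name, 0, 80);                 # keep the name reasonably short
    $name =~ s/^_+|_+$//g;                        # trim stray underscores
    return "$name.pdf";
}

# Made-up sample standing in for the text returned by get_text.
my $sample = join "\n",
    'Journal of Example Studies 12 (2008) 1-10',
    'www.example.org/journal',
    '',
    'Extracting titles from scanned research articles',
    'A. Author, B. Author';

my $title = guess_title($sample);
print title_to_filename($title), "\n" if defined $title;
# prints "Extracting_titles_from_scanned_research_articles.pdf"
```

In a real run, the text from $p->get_text would be passed to guess_title in place of $sample, and Perl's built-in rename could then move each PDF to the generated name. Since the guess will sometimes be wrong, it is probably safer to print the old and new names for review before renaming anything.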