Bytes | Software Development & Data Engineering Community

split pdf chapters

Hello,
I am trying to split chapters out of a series of PDF files.
I am using the pyPdf library.
The page range to extract comes from a text file.
My code works fine for a single file, but I don't know how to specify variable input and output filenames.
Sorry if my question is stupid, but I am new to programming.
Here is the code. Thank you for any help.

# import pyPdf and open the input file
from pyPdf import PdfFileWriter, PdfFileReader
output = PdfFileWriter()
input1 = PdfFileReader(file("EEAL1977.pdf", "rb"))

# process the text file with the page ranges
fh = open('pages.txt')
for line in fh:
    line = line.strip()
    a,z = [int(x) for x in line.split()]
    print a,z

# add pages to the output file
for n in range(a, z):
    output.addPage(input1.getPage(n))

# write the file and close
outputStream = file("document.pdf", "wb")
output.write(outputStream)
outputStream.close()
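
One possible way to get variable input and output filenames is to put the input filename on each line of the text file and derive the output name from it. The sketch below assumes a hypothetical line format of `filename start end` (the original pages.txt only holds two numbers per line), and `parse_jobs` and `EEAL1978.pdf` are made-up names for illustration. It only builds the names and does not call pyPdf.

```python
import os

def parse_jobs(lines):
    """Turn lines like 'EEAL1977.pdf 0 12' into (input, start, end, output) tuples.

    The 'filename start end' layout is an assumption, not the format
    used in the original pages.txt.
    """
    jobs = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, start, end = parts[0], int(parts[1]), int(parts[2])
        # strip the directory and the .pdf extension to get a base name
        stem = os.path.splitext(os.path.basename(name))[0]
        out = "%s_%d-%d.pdf" % (stem, start, end)
        jobs.append((name, start, end, out))
    return jobs

# made-up sample lines standing in for the contents of pages.txt
sample = ["EEAL1977.pdf 0 12", "", "EEAL1978.pdf 12 30"]
jobs = parse_jobs(sample)
for job in jobs:
    print(job)
```

Each tuple could then feed the existing steps: open the first element as the input, add pages from `start` to `end`, and write to the last element instead of a fixed "document.pdf".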
Sep 27 '07 #1
Well, I guess nobody was interested in the subject, so I'll answer it myself, just in case someone needs this in the future. The following is the correct code.
Now I only need to make it save with filenames that automatically increase by one. For now I just type the name for each extract I save.

from pyPdf import PdfFileWriter, PdfFileReader

fh = open('pages.txt')
for line in fh:
    linea = line.strip()
    a,z = (int(x) for x in linea.split())
    print a,z
    print 'otro'

    # start a fresh output document for each page range
    output = PdfFileWriter()
    input1 = PdfFileReader(file("EEAL1977.pdf", "rb"))
    for e in range (a,z):
        output.addPage(input1.getPage(e))

#        x = output.addPage(input1.getPage(n))
#        a = 1+a
#        z = 'document' + str(a) + '.pdf'

    # ask for the output filename, then write and close
    outputStream = file(raw_input("save as: "), "wb")
    output.write(outputStream)
    outputStream.close()
    outputStream = None
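
For the filenames that increase by one, a counter from `enumerate` is probably enough. This is only a sketch of the naming step: `output_name` is a hypothetical helper, the "document" prefix is borrowed from the commented-out lines in the post, and the page ranges are made up. It does not touch pyPdf.

```python
def output_name(index, prefix="document", ext=".pdf"):
    """Build a numbered filename such as 'document1.pdf' (hypothetical helper)."""
    return "%s%d%s" % (prefix, index, ext)

# made-up page ranges standing in for the lines of pages.txt
ranges = [(0, 12), (12, 30), (30, 41)]

# enumerate(..., 1) starts the counter at 1 instead of 0
names = [output_name(i) for i, _ in enumerate(ranges, 1)]
for name in names:
    print(name)
```

In the loop from the post, the same idea would mean iterating with `enumerate(fh, 1)` and passing `output_name(i)` to `file(...)` in place of `raw_input("save as: ")`.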
Oct 1 '07 #2
Did you add this module? I don't seem to have it by default. If you downloaded it separately, that's probably why you didn't get any help, since there aren't many people with pyPdf experience. Be aware that this code won't work on other computers with just a standard install of Python. Each computer will need this extra module installed too.
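
If the script has to run on other machines, one way to fail with a clear message is to check for the module up front. A minimal sketch, assuming nothing beyond the standard library; `module_available` is a hypothetical helper name.

```python
import importlib

def module_available(name):
    """Return True if the named module can be imported on this machine."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True

# pyPdf is a separate download, so it may well be missing
have_pypdf = module_available("pyPdf")
if not have_pypdf:
    print("pyPdf is not installed; install it before running the split script")
```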

Anyway I'm glad you found a solution =)
Oct 1 '07 #3
