How to save a list of links as PDFs?

25
I have an HTML file that's a long, long list of links to important articles. I want to save each of those links locally, preferably as a PDF.

Any suggestions on the best approach? Thanks in advance.
Jan 18 '11 #1
Oralloy
985 Expert 512MB
svendok,

If the link is to a file that's in a format other than .PDF, you're going to have to convert the file.

Assuming you're in windoze, what you can do is save each as a "Web Archive, Single File (*.mht)" file.

Then, you can go back over all the files and open them in MSWord.

Finally, you can "Print" each file to Acrobat format. (You may have to install a special print driver to do this step.)

Unfortunately that's a lot of hand-crank work, but not an insurmountable obstacle.

Alternatively, if you're really good with Perl or VBA, you can write a script that walks the page, downloads each link, and then drives Word to perform the final processing.
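For the "downloads each link" part, here is a minimal sketch. It uses Python rather than the Perl or VBA suggested above, purely for illustration. The function names `local_name` and `download`, the 80-character cap, and the 30-second timeout are all assumptions, not anything from this thread.

```python
import hashlib
import re
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def local_name(url: str) -> str:
    """Build a filesystem-safe file name for a URL (hypothetical helper)."""
    parsed = urlparse(url)
    # Replace anything that is not a safe file-name character with "_".
    stem = re.sub(r"[^A-Za-z0-9._-]+", "_", parsed.netloc + parsed.path).strip("_")
    # A short hash of the full URL keeps names distinct when only the query differs.
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:8]
    return f"{stem[:80] or 'page'}-{digest}"


def download(url: str, out_dir: Path, timeout: float = 30.0) -> Path:
    """Fetch one URL and save the raw bytes under out_dir; returns the saved path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / local_name(url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        target.write_bytes(response.read())
    return target
```

This only saves the raw page. Converting it to PDF is still a separate step.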

Good Luck!
Oralloy
Jan 18 '11 #2
svendok
25
Thanks, Oralloy. Doing it manually is definitely out of the question, as there are thousands of links.

What I want to do is save all my Delicious bookmarks to an HTML file and then somehow script each one being opened and saved as PDF. That way I can not only save them, but search their content locally using Google Desktop or whatever.

I'm okay at Perl and VBA, but I'm not familiar with the PDF functionality available in either. Maybe I need to post this question on a Perl and VBA forum and see what people say.

Mainly, I'm trying to see if there are any other, simpler solutions before I go this route.

Thanks.
Jan 21 '11 #3
Oralloy
985 Expert 512MB
svendok,

At two minutes each, you can do about 250 pages in a day.

In actuality, you'll be a lot faster, although you're going to be utterly bored to death.

Perhaps hire a high-school student to grind away?

Ok, 'nuff said.

My approach would be to crack the file using Perl or VBA and create a flat file with a list of URLs.
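A rough sketch of this first step, in Python rather than Perl or VBA (a swap made only for illustration). It assumes the bookmark export is ordinary HTML with `<a href="...">` links. `LinkCollector`, `extract_urls`, and `write_url_list` are made-up names.

```python
from html.parser import HTMLParser
from pathlib import Path


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag that points at an http(s) URL."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...> matches too.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.urls.append(value)


def extract_urls(html_text):
    """Return the http(s) links in html_text, de-duplicated, in first-seen order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return list(dict.fromkeys(parser.urls))


def write_url_list(html_path, list_path):
    """Write one URL per line to list_path; returns how many URLs were written."""
    html_text = Path(html_path).read_text(encoding="utf-8", errors="replace")
    urls = extract_urls(html_text)
    Path(list_path).write_text("\n".join(urls) + ("\n" if urls else ""), encoding="utf-8")
    return len(urls)
```

The one-URL-per-line flat file is then easy to read from whatever does the conversion, including a VBA macro.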

Next step would be to build a program to process each URL in turn. Use Word VBA for this.

Then I'd modify the processor program to write a secondary file for the error cases.

Finally, the brains of the code: open Word as an Application object under VBA, and then create new documents from each URL. [Probably needless to say, but if the document is already a PDF, or some other format Word can't handle, you should either skip it or copy it directly.]
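The skip-or-copy decision in the bracketed note could be sketched like this (Python, for illustration only). The extension list and the name `plan_for` are assumptions. Judging by file extension is only a heuristic, since a URL with no extension can still serve a PDF.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed list of formats to copy as-is instead of loading into Word.
COPY_AS_IS = {".pdf", ".zip", ".jpg", ".jpeg", ".png", ".gif", ".mp3"}


def plan_for(url):
    """Return "copy" for files Word should not open, otherwise "convert"."""
    # Look only at the path part, so a query string like ?dl=1 does not hide the extension.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return "copy" if suffix in COPY_AS_IS else "convert"
```

For example, `plan_for("http://example.com/paper.PDF?dl=1")` gives `"copy"`, while an ordinary `.html` article gives `"convert"`.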

Once the document loads successfully, print it using a PDF printer.

Successes are good.

Failures are logged.
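The overall loop (process each URL in turn, write failures to a secondary file) might look roughly like the sketch below. It is in Python rather than Word VBA, and the actual conversion is left as a `convert` callable you supply, since the Word and PDF-printer part is not shown here. `process_all` and the tab-separated log format are assumptions.

```python
from pathlib import Path


def process_all(list_path, error_path, convert):
    """Run convert(url) for every URL in the flat file at list_path.

    Any exception is recorded in error_path as "url<TAB>ErrorType: message"
    and the loop carries on. Returns (succeeded, failed).
    """
    lines = Path(list_path).read_text(encoding="utf-8").splitlines()
    urls = [line.strip() for line in lines if line.strip()]
    succeeded = 0
    with open(error_path, "w", encoding="utf-8") as errors:
        for url in urls:
            try:
                convert(url)
            except Exception as exc:  # one bad page should not stop the batch
                errors.write(f"{url}\t{type(exc).__name__}: {exc}\n")
            else:
                succeeded += 1
    return succeeded, len(urls) - succeeded
```

Because the error file is itself one URL per line (plus the reason), it is easy to trim down and feed back in as the input list for a retry pass.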

It'll probably take you about a week to get it "right" and functioning smoothly.

The devil, as always, is in the details.

BTW, when you get it done, we'd all appreciate it if you'd post a copy of the script. As you observed, you aren't the only person with these sorts of issues.

Cheers,
Oralloy
Jan 21 '11 #4
