Before transferring to the Kindle, it might be worth tidying up the text file to make it more readable. This is especially easy using the command line in Linux -- I use the 'fmt' command to remove excess whitespace and line-breaks, and 'tr -d' to delete any annoying characters (in my case, asterisks) that one's browser saw fit to scatter throughout the saved text file. This is all taken care of by the single line:
tr -d '*' < oldfile.txt | fmt -u -w 999 > newfile.txt
Perfect!
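To see what that pipeline does, here is a small sketch you can run in an empty directory. It assumes GNU coreutils (the '-u' flag is GNU fmt's uniform-spacing option and may not exist in other versions of fmt), and the file names sample.txt and cleaned.txt are just placeholders I made up for the demonstration.

```shell
# Build a throwaway sample with stray asterisks, ragged spacing,
# and a hard line break in the middle of a sentence.
printf 'This  is *some*   text\nbroken across lines.\n\nSecond   paragraph.\n' > sample.txt

# tr -d '*'  deletes every asterisk.
# fmt -u     collapses runs of spaces between words.
# -w 999     sets a very wide line limit, so each paragraph is rejoined onto one line.
tr -d '*' < sample.txt | fmt -u -w 999 > cleaned.txt

# Inspect the result: paragraphs stay separated by the blank line.
cat cleaned.txt
```

The blank line between paragraphs survives because fmt treats it as a paragraph boundary, so only the line breaks inside a paragraph are removed.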
Update: a few further notes...
(1) Acrobat Reader is sometimes able to "save as text" even scanned-image PDFs. (I guess these must be "image+text" marked-up PDFs, rather than raw scanned images. But they look the same to the naked eye.)
(2) The method listed for JSTOR articles won't work for two-column scans (e.g. of books). However, the Linux script 'unpnup' can convert such files to single-column PDFs. PaperCrop is a more powerful solution that works easily in Windows.
(3) Sometimes a book scan is of such bad quality that OCR just can't interpret it. In this case, one can use PDFread to cut up the images into Kindle-sized pieces, and assemble the images directly into a .prc or .mobi (Kindle-readable) file. This way one can read the scanned images themselves on one's Kindle, without them being shrunk to an illegible size. I've found that this works extremely well.
And just why didn't you buy an iPhone to do all this? Plus, the iPhone allows you to open pdfs from emails - no need to bother with Google's crappy HTML versions...
Can you save the .html version as a .doc file? And then send that to your kindle?
Anlamk - you miss the point. I'm talking about serious reading here, not just skimming a document on the go. The iPhone is far from an ideal reading device. I could read PDFs on my laptop if that was the only issue. But I do a lot of reading (of digitized content -- mostly PDFs of philosophy papers) and the Kindle's e-ink is much more comfortable to read than a backlit computer screen (which in turn is better than a tiny phone screen).
Hi Gil - in my first attempt, I sent the .html file to Amazon to be directly converted into Kindle format (they offer this as a free online service; .doc files require conversion too).
Unfortunately, the formatting was not very good -- half of my Kindle screen ended up being wasted by the wide left margin, so that only a few words would fit on each line. So I think it is better to save the .html file in unformatted (plain text) form instead, so as to get rid of the wide margins.
If you use an iPhone you can simply copy over the PDF for reading at your leisure. I'd disagree that the iPhone is bad for reading. The advantage is that it's always in my pocket so I can read when I want. The Kindle would be nice except it's so large that it's a pain to pack around. Basically I'd rather just cart around a printout or my MacBook rather than a Kindle. (My personal preferences - I'm not making a general argument against Kindle lovers)
Well maybe this is just a difference in preference. I'd still prefer the iPhone to Kindle. I think the screen is as comfortable as any - plus you get to carry one less device. With Kindle, you have to have the phone plus the Kindle. (And Kindle is pretty big, too.)
The only advantage seems to be the e-ink, which avoids the eye-strain you get from looking at a computer screen. As an engineer who lives and breathes computers, that's not an issue for me.
And the iPhone's other features (mp3, phone, other apps, interface) more than compensate for the lack of e-ink.
Well, again, I'm sure the iPhone is great for many purposes. But I don't see how any of this is relevant. You presumably wouldn't use it as your primary means of reading, which is what I'm concerned with here (not just 'on the go', as I said, but studying at home, etc.).
Anyway, it's not for everyone. But for those who like the Kindle, it's great to have a solution to what was previously (again, at least for my purposes) its greatest shortcoming.
I don't know about the iPhone, but my Nokia E90 has been my primary means of reading for the last few years. I personally don't have any difficulty with this, and I love that I always have books and papers to read.
There are plenty of easy ways to convert PDFs into text or HTML; there are open source tools for doing this on your desktop.
Cheers
David
Hi David, it's easy to convert text-based PDFs. Scanned-image PDFs (as from JSTOR) are an altogether different kettle of fish. For that you require Optical Character Recognition software -- and while some open source OCR is available, it isn't particularly easy to use, and my attempts yielded far worse results than Google's. So the new-found ability to use Gmail for OCR is actually pretty significant.
Richard,
I know you've already got the Kindle, but I think the e-reader from Plastic Logic will be a better choice for reading PDFs and the like, rather than ebooks. It's going to be larger, and seems to be geared more toward those who have their own documents than the Kindle, which seems to be geared more toward those who want to buy books from Amazon.
Corey
Wow, yeah that looks great. (May still be a ways off in the future, though...)
For home, wouldn't you be better off reading on a laptop? There are laptops with great screens. My MacBookPro is the best computer screen I've ever used.
I think the backlighting means that reading from a computer is never ideal (compared to paper/digital ink). It's also more comfortable to read from something hand-held. The whole point of the Kindle is that it imitates the comfort advantages of paper books, but for digital content.
Thanks for the tip! I am looking forward to trying this method.