Resizing PDFs in Ubuntu from 145MiB to ~700KiB
Let me state this up front: I hate PDFs. I’ve worked in graphic design and print design, blah blah blah, and I’ve had to deal with every lame application’s specific interpretation of PDF more times than I would like.
PDFs don’t do well in web browsers, are far larger than the component text file, don’t really provide any security, are hard to flow text into (or out of), don’t always work well with screen readers, and very often are awful for professional CMYK printing.
Tonight, I discovered:
- PDFs function roughly the same way in GNU/Linux as in the Windows world (weird)
- FOSS has AMAZINGLY powerful tools for working with PDFs
When I was helping edit the release notes for Hardy Heron (which were rewritten in plain English), I found out that the latest version of Inkscape could edit PDFs natively. I was ecstatic. I have since used Inkscape (which I now love more than life itself) on a spree of projects. Inkscape is FANTASTIC, download it now.
But Inkscape doesn’t handle image-based PDFs as well. Image-based PDFs (there’s probably a better name for them) are basically just a big scanned image, usually a series of documents that have been crammed into a PDF and then emailed back and forth. Working in real estate, I’ve cleaned up an awful lot of these when they finally degrade beyond the point of legibility (at least it’s better than faxes).
Tonight I was forced to deal with an improperly formatted scanned document (PDF) using only Free and Open Source Software! It’s true! And it CAN be done, quite handily I might add.
The document, scanned at 300dpi (and at quite awful quality), was 145MiB for a mere 6 pages of simple text. Being in a hurry, I 7zipped it down to about 9MiB and emailed it to myself at home.
Once home, I unzipped it and started poking at it with Inkscape. Well, Inkscape didn’t quite know what to do with 6 pages, and doesn’t quite handle images natively. So I fired up The GIMP to see what it would do with 145MiB of PDF. The GIMP didn’t even hiccup. It ate that PDF and spat out 6 EPS files at 150dpi like they were nothing. This cut the total file size down to about 18MiB.
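To put a rough number on the 300dpi to 150dpi step: halving the resolution quarters the pixel count per page. The sketch below assumes US Letter pages (8.5 x 11 inches), which the post does not state, and the helper name pixels_per_page is made up for illustration. It only counts pixels; actual file sizes also depend on how each format encodes the image.

```shell
# pixels_per_page DPI: pixel count of one page scanned at DPI.
# Assumption: US Letter (8.5 x 11 in); the post does not give the page size.
pixels_per_page() {
    dpi=$1
    width=$(( 85 * dpi / 10 ))   # 8.5 inches, kept in integer arithmetic
    height=$(( 11 * dpi ))       # 11 inches
    echo $(( width * height ))
}

full=$(pixels_per_page 300)
half=$(pixels_per_page 150)
echo "300dpi: $full pixels, 150dpi: $half pixels, ratio: $(( full / half ))"
# prints: 300dpi: 8415000 pixels, 150dpi: 2103750 pixels, ratio: 4
```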
EPS files are well and good, but I needed a single PDF to send. So I used Ghostscript’s gs (apt-get install ghostscript) to convert and concatenate the six files:
gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=contract.pdf -dBATCH Contract1.eps Contract2.eps Contract3.eps Contract4.eps Contract5.eps Contract6.eps
That left me with a single file called contract.pdf. That file was only 700KiB in size. :D
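Typing out six filenames gets old fast, so here is a minimal sketch of the same gs call wrapped in a function that takes any number of inputs. The function name merge_to_pdf and the GS override variable are hypothetical, added only for illustration; the flags are the ones from the command above, with the output option written in its documented -sOutputFile spelling.

```shell
# GS lets you swap in another binary (or echo, for a dry run); defaults to gs.
GS=${GS:-gs}

# merge_to_pdf OUTPUT INPUT...: convert the inputs and concatenate them,
# in the order given, into a single PDF.
merge_to_pdf() {
    out=$1
    shift
    "$GS" -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile="$out" "$@"
}

# Example (needs Ghostscript and the EPS files from this post):
#   merge_to_pdf contract.pdf Contract?.eps
```

The shell expands Contract?.eps in sorted order, which is fine for six pages. With ten or more, Contract10.eps would sort before Contract2.eps, so zero-padded names would be safer.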
…
There has to be an easier way to do this. But all the same, I’m glad that I learned how to do it this way first. Were I to do these in batches, The GIMP takes scripts very well, and I would likely even find a command-line program (ImageMagick?) that would make this even easier.
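One candidate for that easier way, untested against the contract in this post, is to let Ghostscript downsample the PDF directly and skip the EPS detour. The sketch below uses Ghostscript's -dPDFSETTINGS=/ebook preset, which targets roughly 150dpi images. The function name shrink_pdf, the GS override variable, and the -small.pdf naming are all assumptions for illustration, and how small the result gets will depend on the scan.

```shell
# GS lets you swap in another binary (or echo, for a dry run); defaults to gs.
GS=${GS:-gs}

# shrink_pdf INPUT.pdf...: write INPUT-small.pdf next to each input.
shrink_pdf() {
    for src in "$@"; do
        dst="${src%.pdf}-small.pdf"   # scan.pdf -> scan-small.pdf
        # /ebook is a Ghostscript preset that downsamples images to about 150dpi
        "$GS" -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook \
            -sOutputFile="$dst" "$src"
    done
}

# Example (needs Ghostscript): shrink_pdf *.pdf
```

Because it loops over its arguments, the same function covers the batch case: each input gets its own smaller copy and the originals are left alone.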

