bloovis.com

10/06/2009 (1:42 am)

The Kindle DX and PDF metadata

Filed under: kindle, kindle dx, ruby ::

One of the most common complaints about the Amazon Kindle is its lack of support for “folders”. In Linux terms, this means that the Kindle flattens your “documents” directory tree when it displays the list of your books on its home screen. However, a directory-browsing UI would be much less flexible than a tagging system, because it would require you to impose an arbitrary hierarchy on your documents. Fortunately, there is a workaround for this that implements a kind of pseudo-tagging using annotations. The advantage of this workaround is that it’s performed on the Kindle and doesn’t require a separate computer.

Even without tagging, the Kindle allows your document list to be sorted either by author or title. Then you can jump to authors or titles starting with a particular letter by pressing that letter key and clicking the 5-way controller. I find this shortcut adequate for most purposes.

But PDF documents, which are supported in a limited fashion on the Kindle DX, present special problems. First, the tagging workaround can’t be used because annotations are not supported. Secondly, if the author and title aren’t set correctly in the PDF file’s metadata, the Kindle will fall back to using the filename as the title of the document. Even if the title is set correctly, the Kindle won’t display it in the home screen; it will display the filename instead, and only show the actual title when you highlight the file and then 5-way to the right. And finally, many PDF files, such as sheet music from the IMSLP, will have missing or incorrect metadata. (This is a problem with non-PDF ebooks too, such as those from the Baen free library.)

So in order to sort through large numbers of PDF files on the DX, it’s very important to set the author metadata correctly. The free calibre program supposedly can do this, but I chose not to use it because I wanted to use my own scripts and have more control over my directory structure. So instead, I’m using the pdftk utility to modify the metadata in my PDF files. In particular, I’m setting the Author and Title fields, which appear to be the only ones that the Kindle recognizes. (I had hoped that the Kindle would also recognize Keywords, but that doesn’t appear to be the case.)

However, pdftk is a little clumsy to use in this fashion. It reads and writes metadata in a format that’s not very convenient. You have to prepare a text file that looks like this:

InfoKey: Title
InfoValue: Six Piano Pieces Op. 118
InfoKey: Author
InfoValue: Brahms

Then you pass this text file to pdftk using its update_info subcommand. I found this usage annoying, so I wrote a wrapper script in Ruby called pdfmeta that lets you specify the author and title on a single command line. Using the above example, here’s how I would update the author and title for a piece of sheet music that I downloaded from the IMSLP:

pdfmeta -a Brahms -t "Six Piano Pieces Op. 118" op-118.pdf
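For anyone who wants to roll their own, here is a minimal sketch of what such a wrapper might look like. It is not the actual pdfmeta script. Only the -a and -t options, the InfoKey/InfoValue format, and pdftk’s update_info subcommand come from the description above; the method names and the temporary-file handling are assumptions made for illustration.

```ruby
require 'optparse'
require 'tempfile'
require 'fileutils'

# Build the InfoKey/InfoValue text that pdftk's update_info reads.
def info_text(fields)
  fields.map { |key, value| "InfoKey: #{key}\nInfoValue: #{value}\n" }.join
end

# Build the pdftk argument list: read pdf, apply info_file, write output.
def pdftk_command(pdf, info_file, output)
  ['pdftk', pdf, 'update_info', info_file, 'output', output]
end

# Parse -a/-t and return [metadata fields, remaining file arguments].
def parse_args(argv)
  fields = {}
  parser = OptionParser.new
  parser.on('-a', '--author NAME') { |v| fields['Author'] = v }
  parser.on('-t', '--title TITLE') { |v| fields['Title'] = v }
  rest = parser.parse(argv)
  [fields, rest]
end

# Update each PDF named on the command line (hypothetical driver).
def main(argv)
  fields, files = parse_args(argv)
  files.each do |pdf|
    Tempfile.create('pdfmeta') do |info|
      info.write(info_text(fields))
      info.flush
      updated = "#{pdf}.tmp"
      # Write to a separate file and move it back afterwards, so that
      # pdftk's output is never the same file as its input.
      system(*pdftk_command(pdf, info.path, updated)) or abort("pdftk failed on #{pdf}")
      FileUtils.mv(updated, pdf)
    end
  end
end

main(ARGV) if __FILE__ == $PROGRAM_NAME && !ARGV.empty?
```

Passing the command to system as a list of separate arguments avoids any shell quoting problems with titles that contain spaces.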

There’s one catch with this: If you transfer a PDF file to the Kindle, then discover that you need to change the metadata, you’ll need to delete the old file on the Kindle, then rename the updated file (changing or adding one letter is enough) before copying it to the Kindle. Otherwise, the Kindle will continue to use the old, incorrect metadata; apparently it stores this information in a separate index keyed by filename, and doesn’t delete or update that information when you update the file.

10/04/2009 (3:03 am)

Using an Epson Perfection V30 scanner in Linux

Filed under: linux, linux mint, ubuntu ::

When I was shopping for an inexpensive flatbed scanner, it was not always easy to figure out which ones would work in Linux. Many manufacturers use proprietary protocols in their products and generally ignore Linux. I bought an Epson V30 because it was cheap and because there are drivers available for download here. The drivers work on Linux Mint 6 (Ubuntu 8.10) or later, and on several other Linux variants. Unfortunately source code is not provided, so if you don’t have one of the popular distributions, you may be out of luck.

One of the things I wanted to do with the V30 was scan books and convert them to plain text (for personal use; I’m not a pirate). For this I used Tesseract, an open-source OCR package that was developed at HP in the 90s and is now being maintained by Google. (The package name in Mint / Ubuntu is “tesseract-ocr”.) This program has a slightly funky command line interface (only reads TIFF files; the input filename must end in “.tif”, not “.tiff” or anything else; the output filename must be given without an extension). But it works surprisingly well, and I was able to integrate it into xsane (a decent scan utility) by writing a wrapper shell script that makes its command line interface identical to the gocr program that xsane uses by default.
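The wrapper itself was a shell script and isn’t shown here, but the filename juggling it has to do can be sketched in Ruby. This is a rough illustration only. It assumes that xsane invokes the OCR command with “-i input” and “-o output” options, and that the input is already TIFF data; converting from another image format is left out. The method names are hypothetical.

```ruby
require 'optparse'
require 'fileutils'

# Work out the names tesseract needs for a given input image and the
# text file that the caller expects to find afterwards.
def tesseract_plan(input, output)
  # The input name must end in ".tif", so plan a renamed copy if it doesn't.
  tif = input.end_with?('.tif') ? input : "#{input}.tif"
  # tesseract adds ".txt" itself, so it gets the output name without it.
  base = output.sub(/\.txt\z/, '')
  { tif: tif, base: base, written: "#{base}.txt",
    command: ['tesseract', tif, base] }
end

# gocr-style entry point (assumed option names: -i input, -o output).
def main(argv)
  input = output = nil
  parser = OptionParser.new
  parser.on('-i FILE') { |v| input = v }
  parser.on('-o FILE') { |v| output = v }
  parser.parse(argv)
  abort('usage: ocr-wrapper -i input -o output') unless input && output

  plan = tesseract_plan(input, output)
  FileUtils.cp(input, plan[:tif]) unless plan[:tif] == input
  system(*plan[:command]) or abort('tesseract failed')
  # Move the result to the exact name the caller asked for.
  FileUtils.mv(plan[:written], output) unless plan[:written] == output
end

main(ARGV) if __FILE__ == $PROGRAM_NAME && !ARGV.empty?
```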

I experimented with varying the dots per inch when scanning a book, to see how that affected tesseract’s error rate. At 100 DPI, the output was gibberish. At 200 DPI, the output was nearly perfect, with only about four errors per page that needed to be corrected manually. At 300 DPI, the output was marginally better, with perhaps one or two fewer errors per page. As I mentioned, this seems remarkably good. After I scan a page and convert it to text, I’ll fix the obvious errors and remove any end-of-line hyphenation. Then I’ll run aspell on the text to find additional errors. The most common errors are missing spaces (“ofthe”) and “l” changed to “1”.
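Some of this cleanup could be automated. The following is a small sketch, not something from my actual workflow, of two helpers: one joins words that were hyphenated across a line break, and the other lists words that mix letters with the digit 1 so they can be checked by hand. Both names are made up for this example.

```ruby
# Join words that were split across a line break with a hyphen.
# The second half must start with a lowercase letter.
def dehyphenate(text)
  text.gsub(/(\p{Alpha})-\n(\p{Lower})/, '\1\2')
end

# List words containing both a letter and the digit 1, which are
# candidates for an "l" that was read as "1".
def suspect_ones(text)
  text.scan(/\b(?=\w*[A-Za-z])(?=\w*1)\w+\b/)
end

puts dehyphenate("the ham-\nmer felts\n")
puts suspect_ones('he1lo wor1d 1994 fine').inspect
```

Note that dehyphenate also joins genuine compound words that happen to break at their hyphen, so the result still needs a read-through and a pass with aspell.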

Another approach to OCR is to buy the commercial, closed-source Vuescan. I owned a copy six years ago, when xsane was not quite up to snuff, and it worked beautifully. I’m trying the latest trial version, and it has a number of improvements. The most interesting new feature is the integration of tesseract-ocr. This saves several steps in the process of scanning text, and the speed-up may be worth the price.

09/03/2009 (2:49 am)

Re-gluing hammer felts on a grand piano

Filed under: piano ::

Last year my 1994 Mason & Hamlin BB (a 7-foot grand) developed a very unusual and alarming problem: several of the hammer felts in the mid-bass section came unglued along the front side of the hammer. Thankfully the felts remained glued along the back side; otherwise they would have fallen off completely. My piano technician called the M&H factory to ask for advice, but since the company changed ownership after my piano was built, and the piano was out of warranty, he wasn’t able to get any satisfaction. So he took the action away for a week and re-glued the felts with hide glue. This was a quick and dirty patch job. The ideal solution is to replace all the hammers, but that’s a very expensive, time-consuming operation.

After I moved to Vermont in June, I hired a local piano technician to install a humidity controller in the piano (more on that later). He noticed that several other hammers had come unglued since the last fix. I described the California technician’s solution, and he said that just plain old superglue would do the job and that I could do it myself. He reiterated the warning that the affected hammers were already permanently damaged and that the whole set (all 88 of them!) should be replaced eventually.

So over the next few days I removed the ailing hammers and reglued them. Superglue doesn’t set very quickly in a situation like this. The hammer felts are porous, but are also extremely stiff and dense, and can’t be pressed back into shape by hand. I used a medium sized spring clamp with plastic-coated jaws that I bought at the local hardware store. The spring in the clamp was strong enough to overcome the felt’s stiffness and keep it pressed into place for a few hours. This hack seems to have worked for now; it’ll be interesting to see how long it lasts.

06/11/2009 (1:05 pm)

Printing USPS Click-N-Ship labels in Firefox on Linux

Filed under: linux, software ::

I use the USPS web site to print shipping labels, and each time I upgrade to a new version of Firefox or Linux, I run into the same problem: printing labels doesn’t work. As soon as I click the Pay and Print button, Firefox goes into some kind of infinite loop reading data from the USPS web site, and the PDF file containing the label is never seen.

The fix is to change how PDFs are handled by Firefox so that Adobe Reader is started as a separate process, rather than as an embedded window inside Firefox. Here’s how to do that in version 3.0 of Firefox:

Open Edit / Preferences, then click on the Applications tab. Enter PDF in the search box; a single entry with Content Type of “PDF document” should be displayed. Change the Action to “Use Adobe Reader”; make sure you don’t select “Use Adobe Reader (in Firefox)” accidentally.

06/11/2009 (7:13 am)

Paper: a display technology for the future?

Filed under: Uncategorized ::

What with all the talk about new portable devices such as netbooks and the Amazon Kindle, an older but still promising portable display technology called “paper” has been largely ignored. But that’s a shame, because paper has many advantages that electronic devices still don’t come close to equaling.

At first glance, paper might seem like an awfully crude technology; it’s made from flattened-out wood pulp, after all. But this allows it to take on shapes and sizes only dreamed of by conventional displays. Take newspapers, for example. Their display size and resolution are enormous by today’s standards; equivalent LCDs would cost thousands of dollars. Also, newspapers are light in weight, and can be folded up for portability.

For less temporary content, paper is used in devices known as “books” (if you’re wondering where the term “ebook” came from, now you know). Books consist of large numbers of small sheets of paper bound together at their left edge. A stiff cover, either of thicker paper or cardboard, is usually applied to the front and back of a book for protection. Because they are light and use no power, books can go places where laptops and even netbooks don’t dare: in the bathtub, on long backpacking trips, and to the tops of mountains. As with newspapers, their display size doubles when you unfold them, and they can take a beating. Throw them around or bend them, or even get them damp, and they’ll still be readable, albeit a bit less like new.

Another advantage to paper devices is that they are not encumbered by DRM protection schemes. You’re free to give away or sell books you’ve acquired at any time, though typically the amount you’ll receive for a used book is much less than you paid for it originally. Still, the freedom to use the book as you wish is a huge advantage.

So if paper reading devices are so great, what are the disadvantages? The one I notice the most is the lack of fast searching features. You have to flip through the pages manually and use visual matching to find words. Some non-fiction books have indexes, but these are often incomplete. The problem is most apparent with fiction: if you’re reading a Jane Austen novel and come across a character you don’t remember, it can take a long time to find out where she was first mentioned. On the other hand, flipping through the pages of a paper book often goes a lot faster than on an electronic display.

Another problem with paper is weight and bulk. An individual paper device can be fairly light and slim. But if you acquire a large library of books, lugging them around can be a pain, literally. I became acutely aware of this problem recently as I was packing for a move: my collection of books filled 20 heavy boxes.

So given its significant advantages, paper display technology is likely to stick around for quite a while, though future electronic display devices, as foreshadowed by the still-crude Kindle, are likely to give it a run for its money.

05/02/2009 (2:06 am)

Fixing microphone input on ThinkPad R61 in Ubuntu / Linux Mint

Filed under: linux mint, thinkpad, ubuntu ::

Yesterday I tried using the internal microphone on the ThinkPad R61 for the first time, in an attempt to make a Skype call. Skype kept saying there was an error in the sound configuration. After the usual Google searching and flailing about, I made the following changes to my system to fix the problem. It’s not clear whether all of these changes are necessary, but using them all certainly doesn’t hurt.

  • In Skype’s Options / Sound Devices, change “Sound In” to the raw hardware device; mine was “HDA Intel (hw:Intel:0)”, but it will take some experimentation and some test calls to figure out the correct setting. Do NOT set it to “pulse”; there is a known bug in Ubuntu’s implementation of PulseAudio that causes delays of many seconds on microphone input.
  • Also in Skype’s Options / Sound Devices, set both “Sound Out” and “Ringing” to “pulse”.
  • Also in Skype’s Options / Sound Devices, it may be necessary to uncheck “Allow Skype to automatically adjust my mixer levels”.
  • Right click on the task bar’s volume control (the speaker icon), and select Open Volume Control. Hit the Preferences button, and add the following two controls: Capture (Recording) and Input Source (Options). Then in the Volume Control dialog, in the Recording tab, bring up the Capture level to near full, and in the Options tab, set the Input Source to Internal Mic.
  • Using sudo, edit the file /etc/modprobe.d/alsa-base and append the line:
    options snd-hda-intel model=thinkpad

I didn’t find an easy way to reload the sound modules after the last change, so I had to reboot the system for it to take effect.

Note: these instructions apply to Linux Mint 6, and, presumably, Ubuntu 8.10. I can’t guarantee they’ll work on other versions of Linux or other machines.

03/02/2009 (5:41 am)

Three Bad Designs

Filed under: hardware, rants ::

Most modern computers, and especially laptops, are afflicted with three especially poor design choices. We seem to be stuck with these choices, because the market, in its infinite wisdom, has decided that they are somehow an improvement over the old ways, or at least different. I’ll start with the oldest problem first.

Caps Lock in the wrong location

Up until around the mid-80s all terminals and computer keyboards had a Control key placed where God intended it, next to the A key on the home row. This was extremely convenient when using programs like editors (especially Emacs and its imitators). Even Windows has plenty of keyboard shortcuts that use the Control key. So having that key within easy reach was a big plus.

Life was good until IBM introduced its third-generation PC, called the PS/2. For reasons known only to Big Blue, the perfection of the predecessor AT keyboard was destroyed by having the Control key moved down to the bottom left corner of the keyboard, which required a long pinky stretch to reach. In its place IBM put a Caps Lock key, which was entirely useless, except possibly for writing angry rants on Usenet. The clone market responded by imitating IBM’s keyboard, and now all keyboards and laptops sold today, even those made by Apple, have this tendinitis-inducing Control key placement.

Fortunately, this is the one design blunder of the three I’m describing here that can be easily fixed. In both Linux and Mac OS, the Caps Lock key can be made to function as a Control key through the operating system’s respective GUI control panels. In Windows it’s a little more work that involves editing the Registry manually; a Google search can uncover the method.

Touchpads

I started using laptop computers about ten years ago, and my first was a Toshiba that had a TrackPoint: a pointing device that looks like a nipple placed in between the G, H, and B keys. It took me a week or so to get used to the thing, but once I got past the learning curve, I fell in love with it. It was a huge improvement over earlier laptop pointing devices, which were mostly trackballs. The TrackPoint was brilliant because it required no motion, just pressure; and because it allowed the fingers to stay on the home row. Because of these advantages, it was also a great improvement over conventional mice, which require constant movement of one hand from the home row to the mouse and back again.

But for some reason, trackpads became popular on laptops in the last few years, and now it’s impossible to find a laptop that doesn’t have one. These have the primary disadvantage of conventional mice (the need to take one or both hands off of the home row), coupled with the need to repeatedly flick the finger across the thing to get the mouse cursor to travel from one side of the screen to the other. To make matters worse, trackpads are placed right where the thumbs naturally want to rest, resulting in unintended mouse movements and clicks due to the thumbs inadvertently touching or grazing the pad.

Fortunately Lenovo (formerly IBM) still makes ThinkPads with the TrackPoint, but these also come with touchpads, whose only advantage now is the side-scrolling feature.

Wide Screens

In the last couple of years, wide screen monitors have taken over the market. There appears to be no good reason for this other than the existence of wide-format movies. In every other respect, these screens are a disaster, because they have taken pixels away from the vertical dimension and given them to the horizontal. Screens that used to have 1200 pixels vertically now typically have only 900. This is a problem because most information is presented on screen vertically.

This is terrible for programmers, who like to see as much text as possible when editing. But it’s also bad for ordinary users running web browsers. Take a look at a typical screen. On Linux or Windows, from top down we might see the following visual elements: the browser’s title bar, menu bar, tool/address bar, bookmarks bar, tabs bar, content window, search bar, status bar, and OS task/system bar. Each of these takes up vertical space, ultimately limiting the most important (and only scrollable) element, the content window. Wide screens make this worse by taking away as many as 300 pixels from the content window.

There isn’t much that can be done about this problem if you use a laptop. The various visual elements mentioned above cannot be moved to the side of the screen, for the most part. The OS task bar can be moved, true, but it’s not nearly as useful on the side of the screen, because the labels are no longer readable. The only solution is to grab up conventional displays while they are still available. I found a used ThinkPad R50p on eBay in December. It’s slower and noisier than a new laptop, but it was one of the last ThinkPads made that had the beautiful 1600×1200 FlexView display. Now, alas, all new ThinkPads have wide screens, a truly sad situation that appears to be permanent.

Desktop computer users can sometimes fix this problem by rotating their monitors 90 degrees and setting their graphics drivers to use portrait mode. I’d estimate that at least half the programmers at work have done this. Unfortunately, the last time I checked, Linux graphics drivers didn’t support hardware-assisted acceleration in portrait mode. But I still have a 1600×1200 display, so I don’t have to worry about this problem until the display breaks.

02/24/2009 (8:32 pm)

Upgrading from Rails 2.1 to 2.2

Filed under: rails, ruby ::

I have a small project that I developed at work using Rails 2.1. After installing the latest version of Rails, 2.2.2, I had to do the following to get my project working again:

  • Edit config/environment.rb and change the value of RAILS_GEM_VERSION to '2.2.2'.
  • Edit config/environments/development.rb and comment out the config.action_view.cache_template_extensions line.
  • Run rake rails:update. This added a new file, script/dbconsole, and modified the following files:
    • config/boot.rb
    • public/javascripts/controls.js
    • public/javascripts/dragdrop.js
    • public/javascripts/effects.js
    • public/javascripts/prototype.js
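
For reference, the two configuration edits from the first two steps would look roughly like this. The first line is the form the version setting usually takes in a generated environment.rb. The value shown on the commented-out line is an assumption; your file may have something different.

```ruby
# config/environment.rb: pin the Rails gem version used by the project.
RAILS_GEM_VERSION = '2.2.2' unless defined? RAILS_GEM_VERSION

# config/environments/development.rb: comment out this setting
# (the "= false" value here is assumed, not copied from my project).
# config.action_view.cache_template_extensions = false
```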

02/14/2009 (5:56 am)

Power-off fix for Linux Mint 6 / Ubuntu 8.10

Filed under: linux, linux mint, ubuntu ::

The older computer on which I installed Linux Mint recently wouldn’t power off properly after a shutdown. This used to work on Mandrake 10.2. Apparently the problem is due to the newer Linux kernels requiring ACPI by default for power management, and this machine’s BIOS doesn’t seem to provide a compatible ACPI implementation.

After the usual slogging through Google search results and numerous experiments, the fix was extremely simple: add the following line to /etc/modules:

apm power_off=1

Several forum postings I found with Google suggested other solutions, including boot parameters and commenting out lines in various configuration files, but these weren’t necessary on this machine (which uses a Biostar M7VKQ motherboard).

02/09/2009 (9:46 pm)

Fixing screen resolution in Linux Mint 6 / Ubuntu 8.10

Filed under: linux, linux mint, ubuntu ::

I installed Linux Mint on my parents’ computer today, replacing Mandrake 10.2. (Yes, grandparents can use Linux.) This older computer has a motherboard with a built-in VGA adapter by Trident, connected to an ancient CRT display with a maximum resolution of 1024×768. But for some reason, Linux Mint set the resolution to 800×600, and the Screen Resolution tool in the Control Center would not allow it to be set higher.

After some Google searching, I came across some Ubuntu forum posts that suggested various fixes that did not work, or which required programs that were not available on the live CD. Finally, the thing that worked was quite simple: I edited /etc/X11/xorg.conf, and in the “Monitor” section added the following line:

HorizSync 28 - 60

After restarting X with Ctrl-Alt-Backspace and logging back in, the Screen Resolution tool now showed a number of newly available screen resolutions, including the desired 1024×768. Apparently, the Trident display driver (or some other piece of X) wasn’t able to detect the monitor capabilities automatically (perhaps due to the monitor’s extreme antiquity), and the new line in xorg.conf provided just enough of the required information.
