One of the Flickr engineers was nice enough to blog about their general process for adding capacity to their database backend. It seems the process went from a single, straight 20-hour table alter to a more staggered approach that lets everyone get some sleep.
The newer approach does seem to require a certain amount of spare capacity: some machines have to be taken offline for a pretty long operation while the rest keep the site running.
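To get a rough feel for that spare-capacity requirement, here is a minimal back-of-the-envelope sketch. It is not taken from the Flickr post: the function name `max_offline`, the 80% safe-utilization ceiling, and the assumption that load spreads evenly across identical servers are all made up for illustration.

```python
def max_offline(servers: int, peak_util_pct: int, safe_util_pct: int = 80) -> int:
    """Estimate how many servers can be offline at once.

    Hypothetical model: every server is identical and peak load
    redistributes evenly over whatever servers remain online.
    """
    if servers <= 0:
        raise ValueError("servers must be positive")
    if not 0 <= peak_util_pct <= 100 or not 0 < safe_util_pct <= 100:
        raise ValueError("utilization values must be percentages")

    # Total peak load, in units of "percent of one server".
    total_load = servers * peak_util_pct
    # Servers needed to carry that load without exceeding the safe ceiling
    # (integer ceiling division avoids floating-point rounding surprises).
    needed = -(-total_load // safe_util_pct)
    return max(servers - needed, 0)


if __name__ == "__main__":
    print(max_offline(10, 60))  # 2
    print(max_offline(10, 80))  # 0
```

Under these assumptions, a 10-machine pool that peaks at 60% utilization can spare two machines at a time, while a pool already at the 80% ceiling cannot spare any, which is why the staggered approach only works with some headroom.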