
Interview Done, Here Cometh the List


Tuesday and Wednesday ended up being a write-off in terms of research. First, I had my annual paragraph of letdowns to write for PlanktonTech, which I ended up doing successfully, though it cost some time and emotional investment, reading back over the lofty aspirations and lost time of the past year. And on Wednesday morning came the interview, which went OK, though it left me feeling pretty well exhausted; after one careless mistake too many I decided to call it quits for the day and went home (and actually fell asleep in the middle of the afternoon).

Anyway. It’s been a bit of an uphill struggle regaining momentum after that break of focus, short though it was. Maybe it’s the midnight meowing of our feline houseguest, or the morning runs I am still getting used to, but I’m remarkably exhausted…

In any case. I emailed Sorhannus again and asked if he would send the complete tree file, so that I could calculate patristic distances. This involves figuring out how to read this file into R, and how to monkey with it once it’s there. It’s a .nwk file, which, it turns out, stands for Newick, a standard text format for phylogenetic trees. It’s taxon (node) names hierarchically nested in parentheses, with a number after each colon giving the length of the branch leading to that node. Seems straightforward enough.
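As an aside, for anyone curious what is actually inside such a file: here is a minimal sketch in plain Python (not R, and not how {ape} does it internally) that parses a tiny Newick string and computes patristic distance, i.e. the sum of branch lengths along the path between two tips. The toy tree, the function names, and the `_n0`-style names for unlabeled internal nodes are all made up for illustration; it assumes simple labels with no quoted names or comments, and non-negative branch lengths.

```python
# Toy Newick string (invented for illustration, not the real tree file).
NEWICK = "((A:0.1,B:0.2):0.3,C:0.4);"


def parse_newick(text):
    """Return {node: (parent, branch_length)} for a simple Newick string."""
    text = "".join(text.split())  # drop whitespace and newlines
    parents = {}
    pos = 0
    n_internal = 0

    def read_label_and_length():
        # Read "label:length" up to the next structural character.
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in ",();":
            pos += 1
        label, _, length = text[start:pos].partition(":")
        return label, float(length) if length else 0.0

    def read_node(parent):
        nonlocal pos, n_internal
        if text[pos] == "(":
            # Internal node: give it a placeholder name, then read its children.
            node = f"_n{n_internal}"
            n_internal += 1
            pos += 1  # consume "("
            read_node(node)
            while text[pos] == ",":
                pos += 1  # consume ","
                read_node(node)
            pos += 1  # consume ")"
            _, length = read_label_and_length()  # internal labels are ignored
        else:
            # Tip: the label is the taxon name.
            node, length = read_label_and_length()
        parents[node] = (parent, length)

    read_node(None)
    return parents


def path_to_root(parents, node):
    """Map each ancestor of `node` (itself included) to its distance from `node`."""
    dist = {}
    total = 0.0
    while node is not None:
        dist[node] = total
        parent, length = parents[node]
        total += length
        node = parent
    return dist


def patristic(parents, a, b):
    """Sum of branch lengths on the tree path between tips a and b."""
    from_a = path_to_root(parents, a)
    from_b = path_to_root(parents, b)
    # With non-negative branch lengths, the shared ancestor that minimizes
    # the summed distance is the most recent common ancestor.
    return min(from_a[n] + from_b[n] for n in from_a if n in from_b)


tree = parse_newick(NEWICK)
tips = ["A", "B", "C"]
for i, a in enumerate(tips):
    for b in tips[i + 1:]:
        print(a, b, round(patristic(tree, a, b), 3))
```

On the toy tree, A and B are sisters, so their distance is just their two tip branches added together, while either one's path to C also has to cross the internal branch.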

I managed to do this, calculate the patristic distance (thanks to some more help from Allison) using the cophenetic() function in {ape}, and plot up the result (it’s slightly less well correlated, interestingly, than the direct distance, but not by much):

Since this plot looks similar to the last one I posted, it’s no surprise that the correlation between patristic and “direct” raw distance among sequences is high (r-squared of 0.79):