The Longest Now


Psych statistics wars: new methods are shattering old-guard assumptions
Thursday October 20th 2016, 12:51 pm
Filed under: %a la mod,chain-gang,citation needed,Glory, glory, glory,knowledge,meta,metrics

Recently, statistician Andrew Gelman has been brilliantly breaking down the transformation of psychology (and social psych in particular) through its adoption and creative use of statistical methods, leading to an improved understanding of how statistics can be abused in any field, and of how empirical observations can be [unwittingly and unintentionally] flawed. This led to the concept of p-hacking and other methodological fallacies which can be observed in careless uses of statistics throughout scientific and public analyses. And, as these new tools were used to better understand psychology and improve its methods, existing paradigms and accepted truths have changed rapidly over the past 5 years. This shocks and anguishes researchers who are true believers in “hypotheses vague enough to support any evidence thrown at them”, and have built careers around work supporting those hypotheses.

Here is Gelman’s timeline of transformations in psychology and in statistics, from Paul Meehl’s argument in the 1960s that results in experimental psych may have no predictive power, to PubPeer, Brian Nosek’s reproducibility project, and the current sense that “the emperor has no clothes”.

Here is a beautiful discussion a week later, from Gelman, about how researchers respond to statistical errors or other disproofs of part of their work.  In particular, how co-authors handle such new discoveries, either together or separately.

At the end, one of its examples turns up a striking case of someone taking these sorts of discoveries and updates to their work seriously: Dana Carney’s public CV includes inline notes next to each paper wherever significant methodological or statistical concerns were raised, or significant replications failed.

Carney makes an appearance in his examples because of her most controversially popular research, with Cuddy and Yap, on power posing.  A non-obvious result (that holding certain open physical poses leads to feeling and acting more powerfully) became extremely popular in the media, and has generated a small following of dozens of related extensions and replication studies. Starting in 2015, these were run with large samples and at high power, at which point the effects disappeared.  Interest within social psychology in the phenomenon, as an outlier of “a popular but possibly imaginary effect”, is so great that the journal Comprehensive Results in Social Psychology has an entire issue devoted to power posing coming out this Fall.
Perhaps motivated by Gelman’s blog post, perhaps by knowledge of the results that will be coming out in this dedicated journal issue [which she suggests are negative], she put out a full two-page summary of her changing views on her own work over time, from conceiving of the experiment, to running it with the funds and time available, to now deciding there was no meaningful effect.  My hat is off to her.  We need this sort of relationship to data, analysis, and error to make sense of the world. But it is a pity that she had to publish such a letter alone, and that her co-authors didn’t feel they could sign onto it.

Update: Nosek also wrote a lovely paper in 2012 on Restructuring incentives to promote truth over publishability [with input from the estimable Victoria Stodden] that describes many points at which researchers have incentives to stop research and publish preliminary results as soon as they have something they could convince a journal to accept.



Archiving Web links: Building global layers of caches and mirrors
Sunday June 12th 2016, 4:23 pm
Filed under: international,knowledge,meta,metrics,popular demand,wikipedia

The Web is highly distributed and in flux; the people using it, even more so.  Many projects exist to optimize its use, including:

  1. Reducing storage and bandwidth:  compressing parts of the web; deduplicating files that exist in many places, replacing many with pointers to a single copy of the file [Many browsers & servers, *Box]
  2. Reducing latency and long-distance bandwidth:  caching popular parts of the web locally around the world [CDNs, clouds, &c]
  3. Increasing robustness & permanence of links: caching linked pages (with timestamps or snapshots, for dynamic pages) [Memento, Wayback Machine, perma, amber]
  4. Increasing interoperability of naming schemes for describing or pointing to things on the Web, so that it’s easier to cluster similar things and find copies or versions of them [HvdS’s 15-year overview of advancing interop]

This week I was thinking about the 3rd point. What would a comprehensively backed-up Web of links look like?  How resilient can we make references to all of the failure modes we’ve seen and imagined?  Some threads for a map:

  1. Links should include timestamps; important ones should request archival permalinks.
    • When creating a reference, sites should notify each of the major cache-networks, asking them to store a copy.
    • Robust links can embed information about where to find a cache in the <a> tag that generates the link (and possibly a fuzzy content hash?).
    • Permalinks can use an identifier system that allows searching for the page across any of the nodes of the local network, and across the different cache-networks. (Browsers can know how to attempt to find a copy.)
  2. Sites should have a coat of amber: a local cached snapshot of anything linked from that site, stored on their host or a nearby supernode.  So as long as that site is available, snapshots of what it links to are, too.
    • We can comprehensively track whether sites have signalled they have an amber layer.  If a site isn’t yet caching what they link to, readers can encourage them to do so or connect them to a supernode.
    • Libraries should host amber supernodes: caches for sites that can’t host those snapshots on their host machine.
  3. Snapshots of entire websites should be archived regularly
    • Both public snapshots for search engines and private ones for long-term archives.
  4. A global network of mirrors (a la [C]LOCKSS) should maintain copies of permalink and snapshot databases
    • Consortia of libraries, archives, and publishers should commit to a broad geographic distribution of mirrors.
      • mirrors should be available within any country that has expensive interconnects with the rest of the world;
      • prioritization should lead to a kernel of the cached web that is stored in ‘seed bank’ style archives, in the most secure vaults and other venues
  5. There should be a clear way to scan for fuzzy matches for a broken link. Especially handy for anyone updating a large archive of broken links.
    • Is the base directory there? Is the base URL known to have moved?
    • Are distant-timestamped versions of the file available?  [some robustlink implementations do this already]
    • Are there exact matches elsewhere in the web for a [rare] filename?  Can you find other documents with the same content hash? [if a hash was included in the link]
    • Are there known ways to contact the original owner of the file/directory/site?
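
A minimal sketch of threads 1 and 5 above, in Python, under stated assumptions. The `data-versionurl` and `data-versiondate` attributes follow the Robust Links convention; `data-contenthash` is a hypothetical attribute invented here for the "content hash in the link" idea, and an exact SHA-256 stands in for a fuzzy hash. The Wayback Machine URL pattern is used as one example of a cache-network; the function names are illustrative only, and nothing here touches the network.

```python
import hashlib
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Wayback Machine snapshot URLs have the form <prefix>/<timestamp>/<original url>.
WAYBACK_PREFIX = "https://web.archive.org/web"


def content_hash(body: bytes) -> str:
    """Exact SHA-256 digest of a page body (a stand-in for a fuzzy hash)."""
    return hashlib.sha256(body).hexdigest()


def robust_link(url, text, snapshot_url, snapshot_date, digest=None):
    """Build an <a> tag that carries its own pointer to an archived copy."""
    attrs = {
        "href": url,
        "data-versionurl": snapshot_url,
        "data-versiondate": snapshot_date,
    }
    if digest:
        # Hypothetical attribute: not part of any published convention.
        attrs["data-contenthash"] = digest
    rendered = " ".join(f'{name}="{escape(value)}"' for name, value in attrs.items())
    return f"<a {rendered}>{escape(text)}</a>"


def parent_urls(url):
    """List the enclosing directories of a URL, nearest first, ending at the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    parents = []
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:depth])
        if depth:
            path += "/"
        parents.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return parents


def recovery_candidates(url, link_date):
    """Ordered places to look when `url` is broken; `link_date` is YYYYMMDD."""
    candidates = [("snapshot near link date", f"{WAYBACK_PREFIX}/{link_date}/{url}")]
    candidates += [("enclosing directory", parent) for parent in parent_urls(url)]
    return candidates


if __name__ == "__main__":
    broken = "http://example.org/a/b/page.html"
    print(robust_link(broken, "a page", f"{WAYBACK_PREFIX}/20160612/{broken}", "2016-06-12"))
    for label, candidate in recovery_candidates(broken, "20160612"):
        print(f"{label}: {candidate}")
```

A link checker could walk `recovery_candidates` in order and stop at the first address that responds; searching elsewhere for a rare filename or a matching hash, and contacting the original owner, would come after these cheaper checks.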

Related questions: What other aspects of robustness need consideration? How are people making progress at each layer?  What more is needed to have a mesh of archived links at every scale? For instance, WordPress supports a chunk of the Web; top CDNs cache more than that. What other players can make this happen?  What is needed for them to support this?



Aaron Swartz hackfests this weekend around the world: honoring his work
Friday November 08th 2013, 7:04 pm
Filed under: Aasw,Glory, glory, glory,international,knowledge,meta,metrics,popular demand,wikipedia

Help continue projects Aaron believed in, in person or online.
I’ll be at the Cambridge event and aftermath throughout the long weekend.

Related project summaries:



Cambridge doggerel in celebration of her glorious sunsets
Friday October 18th 2013, 8:01 pm
Filed under: Aasw,Glory, glory, glory,indescribable,meta,Not so popular,poetic justice

140 characters, just like mom’s.

The sunset was pretty
in Cambridge. The ember
of Sun cast the city
in hues to remember.

When I tried to draw Rindge
and Latin, ’twas orange.



Annotation Notes from a recent discussion with this year’s Berkterns
Thursday June 13th 2013, 10:18 pm
Filed under: citation needed,knowledge,meta,popular demand,wikipedia

Anno-notes.  (thanks, piratepad)



One Weird Kernel Trick: from Zero to Stats Hero in only Twelve Days
Tuesday April 09th 2013, 7:35 pm
Filed under: Glory, glory, glory,knowledge,meta,metrics,poetic justice

From the “too good to be true (but it is)” dept: OneWeirdKernelTrick.com

YanZhu



Big Data Maven On Knowledge Topology: 9 Insightful Posts
Saturday March 30th 2013, 3:31 pm
Filed under: Glory, glory, glory,ideonomy,meta

Read the Big Data and the Topologist series, from the “low-dimensional topology” blog, written by 5+ budding topologicians.

They maintain a handy list of open problems they have discussed.
Michael Stone.



One man’s salvation from persistent madness to reasoned satirist
Saturday February 09th 2013, 4:27 am
Filed under: indescribable,meta,Seraphic

96 days of altered consciousness and recovering from a psychotic break. Told with humor and self-awareness, in an epic 18-part tale.

Let’s say that every time I see a yellow car, you actually see what I would call a green dragon, and we’ve just adapted to different driving styles… Now let’s assume we both see an object descended from the Model-T, and not the offspring of a bat fucking an iguana in a wood stove.* Except now I’m secretly attaching the symbol of car to dragon.

* I say natural selection demands that if you did this enough times, something would survive, and I bet that something would be a dragon. If there are any crazy people reading this right now, you have your mission.



Exploring science in ten hundred words or less, and similar gems
Tuesday January 29th 2013, 6:27 pm
Filed under: chain-gang,citation needed,indescribable,knowledge,meta,poetic justice,Uncategorized

try and grok science
try and make a gun
try Sheldrake’s homing dove thought experiments

For dessert, some fraud:
listed, retracted, pharmed, 11-jigen (x6),
chilled(snapshot, comments).



Now I remember the flush of despair: cold crisp inverted insight
Sunday January 27th 2013, 7:30 pm
Filed under: Aasw,knowledge,meta

Larry’s foresight to clear schedules seems fair, from that inverted space.



Mystery Hunting, 2013: Pulling off an epic Coin Heist
Friday January 25th 2013, 7:50 pm
Filed under: Aasw,chain-gang,indescribable,knowledge,meta,Uncategorized,zyzzlvaria

Mystery Hunt 2013 pitted teams against Enigma Valley to rescue the Hunt coins from a vault.

As usual, it was full of some of the best puzzle ideas in the world.



From a sysadmin: the perils of reporting trouble (from MeFi)
Sunday January 13th 2013, 6:10 pm
Filed under: chain-gang,meta,null

As a former sysadmin at MIT, I was very curious about this case and eager for the facts to come out, and I guess they can, but not like this. Definitely not like this. I also had the job of chasing intruders out of a segment of MIT’s network (fairly light duty, actually), and having been there I will state the following publicly, because I am pissed off today. Seriously pissed off.

These over-the-top prosecutions of nuisance intrusions make sysadmins like me highly reluctant to initiate communication with the feds. The threat of criminal prosecution was enough to make Mr. Swartz back off from his actions. That’s why MIT and JSTOR backed off. Someone at DOJ decided to keep going, and he just made life harder for federal investigators in countless other cases, who will not be getting that first phone call from a sysadmin.

When an intruder is on my network, before I call the authorities, I want to know that the authorities will exercise judgement and prosecute accordingly. If he’s a criminal trying to use my resources for crimes, that’s one thing. If he’s a kid or a kook being a nuisance, then the authorities have a duty to exercise precisely enough muscle to scare him off my network and call it a day. If I have reason to think that the authorities will throw the book at someone who is a mild nuisance, then I won’t make the phone call. I will investigate the intrusion myself, kick him off myself, and keep my fucking mouth shut. These prosecutions are a waste of money, and today one of them became a waste of a life.



A personal note from MIT President L. Rafael Reif
Sunday January 13th 2013, 5:40 pm
Filed under: %a la mod,Glory, glory, glory,meta,popular demand

This just went out by email, from MIT President Reif, who was inaugurated president in September:

To the members of the MIT community:

Yesterday we received the shocking and terrible news that on Friday in New York, Aaron Swartz, a gifted young man well known and admired by many in the MIT community, took his own life. With this tragedy, his family and his friends suffered an inexpressible loss, and we offer our most profound condolences. Even for those of us who did not know Aaron, the trail of his brief life shines with his brilliant creativity and idealism.

Although Aaron had no formal affiliation with MIT, I am writing to you now because he was beloved by many members of our community and because MIT played a role in the legal struggles that began for him in 2011.

I want to express very clearly that I and all of us at MIT are extremely saddened by the death of this promising young man who touched the lives of so many. It pains me to think that MIT played any role in a series of events that have ended in tragedy.

I will not attempt to summarize here the complex events of the past two years. Now is a time for everyone involved to reflect on their actions, and that includes all of us at MIT. I have asked Professor Hal Abelson to lead a thorough analysis of MIT’s involvement from the time that we first perceived unusual activity on our network in fall 2010 up to the present. I have asked that this analysis describe the options MIT had and the decisions MIT made, in order to understand and to learn from the actions MIT took. I will share the report with the MIT community when I receive it.

I hope we will all reach out to those members of our community we know who may have been affected by Aaron’s death. As always, MIT Medical is available to provide expert counseling, but there is no substitute for personal understanding and support.

With sorrow and deep sympathy,

L. Rafael Reif



Aaron Swartz, scholar, activist, and Internet hero, is dead.
Saturday January 12th 2013, 3:28 pm
Filed under: Blogroll,knowledge,meta,null,Seraphic,wikipedia

Aaron took his life yesterday. I am still finding it hard to believe.

His ongoing court case overshadows his death, so let me get that out of the way: 
He was living through a two-year federal case which had only become more nightmarish since last year.  (JSTOR stated it did not want a trial, and has steadily been releasing the PD articles in question and more for free public use; yet the prosecution, continuing its outrageous abuse of discretion, declined to settle and tripled their felony charges to cover up to 35 years in prison.)

Friends and family were helping him plan a campaign to spread the word about the unreasonableness and inequity of the trial. Its uncertainty was intensely stressful, even for those of us who lived only the tiniest fraction of it.  As Lessig notes, the prosecutors – Stephen P. Heymann (and at times Scott L. Garland), working in Carmen M. Ortiz’s Cybercrime unit – should be taking a long hard look in the mirror and asking themselves what they are doing with their lives.


Aaron was a dear friend, and one of the most decent men I have known.  The only times I have seen him truly angry were in response to some social wrong; and he actively looked for ways to find and eliminate injustice. He always considered how to act morally – even when this meant being at odds with local social norms – and regularly paused at forks in his life to think about how to live so as to benefit society.

He kindled ideas from those nearby, and freely passed on his own.  Made mistakes often and tried to learn from them, usually publicly. His transparency was a useful meterstick for me. Ages ago, when we first met, I remember him brainstorming ideas about community and wiki design with Zvi and me; about learning and unlearning, society and ideals, civics and collaboration.  Once his curiosity was piqued about a subject he would pursue it until he could write about and explain it.  

~ ~~~ ~

I spent last night with mutual friends who live now in his old apartment, in a room that was once his; remembering the many great projects he started and inspired – especially the little gems, the personal quirks and insights, the inspiring ideas that became single-purpose services, or calls to arms. (We never did start a dog-walking service for data, but the idea abides.) Rereading some of his writings, I remember the many opportunities missed for synthesis, reframing, and clarity – about how life works, and how to live it.

Everyone has idealized dreams — what would you do with an unlimited wish? — about long-term projects worth devoting one’s life to, to transform the world. Dreams cherished but rarely attempted.  Aaron was the only person I felt completely comfortable sharing mine with.  We had a little game: a couple times a year we would meet in a nameless cafe, and he would ask for ‘rabbinical’ advice on moral quandaries, and I would ask for ‘professional’ advice on realizing societal dreams. I don’t know that he needed my advice, but I always looked forward to his. There was usually at least one book suggestion from his endless reading list that answered an open question of mine. And no matter how grandiose the dream, he would understand, clarify, laugh, counterpoint, help tune mental models, and remind me to get to it. And we never had quite enough time.

I miss him very, very, very much.   Part of my own future has gone missing too.

Somewhere, celestials are being taught to tune the cosmos.

 

In Memoriam:
Quinn. TBL. Grimm. Cory. Larry (^2). Cyrus Farivar.

The court case.
Alex Stamos (on the wrongness of the case).
New York Times (front page).
The Guardian (front page + 4 more articles)
The WSJ.

In his own words:
How to work.
How we stopped SOPA.
On feeling low and key limes.

From the Boston Wikipedia Meetup on August 18, 2009, by Sage Ross:



Better knowledge graphs fit for Star Trek computers coming to Google
Monday December 31st 2012, 8:32 pm
Filed under: chain-gang,international,knowledge,meta,metrics,wikipedia

In 2010 Google acquired Metaweb, providing a reliable future to their many projects, including Refine and Freebase.

From earlier this year, here’s a quote from Amit Singhal, Google’s SVP responsible for their Knowledge Graph:

We hope this added intelligence will give you a more complete picture of your interest, provide smarter search results, and pique your curiosity on new topics. We’re proud of our first baby step—the Knowledge Graph—which will enable us to make search more intelligent, moving us closer to the “Star Trek computer” that I’ve always dreamt of building. Enjoy your lifelong journey of discovery, made easier by Google Search, so you can spend less time searching and more time doing what you love.

In the near future, I expect both Google’s knowledge graph, and the increasing awareness of the usefulness of such graphs, to change the structure and scope of industrial-scale knowledge processing. Thanks to all those working on these tools and solutions; see you in 2013!



Chinese Internet discovered to be full of memes: Top 10 Edition
Sunday December 09th 2012, 3:57 am
Filed under: %a la mod,chain-gang,Glory, glory, glory,international,meta,zyzzlvaria

via Global Voices, the Top 10 Chinese Internet Memes of 2012.



Three Copyright Myths and Where to Start to Fix it – a policy brief

A lovely short policy brief on designing a better copyright regime was published on Friday – before being quickly taken offline again.  I’ve reposted it here with light cleanup of its section headings.

If you care at all about copyright and its quirks, this is short and worth reading in full.



