Psych statistics wars: new methods are shattering old-guard assumptions
Recently, statistician Andrew Gelman has been brilliantly breaking down the transformation of psychology (and social psych in particular) through its adoption and creative use of statistical methods, leading to an improved understanding of how statistics can be abused in any field, and of how empirical observations can be [unwittingly] flawed. This led to the concept of p-hacking and other methodological fallacies that can be observed in careless uses of statistics throughout scientific and public analyses. As these new tools were used to better understand psychology and improve its methods, existing paradigms and accepted truths have changed rapidly over the past 5 years. This shocks and anguishes researchers who are true believers in “hypotheses vague enough to support any evidence thrown at them”, and who have built careers around work supporting those hypotheses.
Here is Gelman’s timeline of transformations in psychology and in statistics, from Paul Meehl’s argument in the 1960s that results in experimental psych may have no predictive power, to PubPeer, Brian Nosek’s reproducibility project, and the current sense that “the emperor has no clothes”.
Here is a beautiful discussion a week later, from Gelman, about how researchers respond to statistical errors or other disproofs of part of their work. In particular, how co-authors handle such new discoveries, either together or separately.
At the end, one of his examples turns up a striking instance of someone taking these sorts of discoveries and updates to their work seriously: Dana Carney’s public CV includes inline notes next to each paper wherever significant methodological or statistical concerns were raised, or significant replications failed.
Carney makes an appearance in his examples because of her most controversial and popular research, with Cuddy and Yap, on power posing. A non-obvious result (that holding certain open physical poses leads to feeling and acting more powerfully) became a sensation in the popular media, and generated a small following of dozens of related extensions and replication studies. Starting in 2015, these began to be run with large samples and at high power, at which point the effects disappeared. Interest within social psychology in the phenomenon, as an outlier of “a popular but possibly imaginary effect”, is so great that the journal Comprehensive Results in Social Psychology has an entire issue devoted to power posing coming out this Fall.
Perhaps motivated by Gelman’s blog post, perhaps by knowledge of the results that will be coming out in this dedicated journal issue [which she suggests are negative], she put out a full two-page summary of her changing views on her own work over time, from conceiving of the experiment, to running it with the funds and time available, to now deciding there was no meaningful effect. My hat is off to her. We need this sort of relationship to data, analysis, and error to make sense of the world. But it is a pity that she had to publish such a letter alone, and that her co-authors didn’t feel they could sign onto it.
Update: Nosek also wrote a lovely paper in 2012 on Restructuring incentives to promote truth over publishability [with input from the estimable Victoria Stodden] that describes many points at which researchers have incentives to stop research and publish preliminary results as soon as they have something they could convince a journal to accept.
Reader: Discover the effect of happiness on your health today
“When I was 5 years old, my mother always told me that happiness was the key to life. When I went to school, they asked me what I wanted to be when I grew up. I wrote down happy. They told me I didn’t understand the assignment, and I told them they didn’t understand life.” —Lennon
From the BODYWORLDS exhibit in Amsterdam, full of flayed and preserved human bodies.
Soft, distributed review of public spaces: Making Twitter safe
Successful communities have learned a few things about how to maintain healthy public spaces. We could use a handbook for community designers gathering effective practices. It is a mark of the youth of interpublic spaces that spaces such as Twitter and Instagram [not to mention niche spaces like Wikipedia, and platforms like WordPress] rarely have architects dedicated to designing and refining this aspect of their structure, toolchains, and workflows.
Some say that ‘overly’ public spaces enable widespread abuse and harassment. But the “publicness” of large digital spaces can help make them more welcoming in some ways than physical ones – where it is harder to remove graffiti or eggs from homes or buildings – and niche ones – where clique formation and systemic bias can dominate. For instance, here are a few ‘soft’ (reversible, auditable, post-hoc) tools that let a mixed ecosystem review and maintain their own areas in a broad public space:
Allow participants to change the visibility of comments: Let each control what they see, and promote or flag it for others.
- Allow blacklists and whitelists, in a way that lets people block out harassers or keywords entirely if they wish. Make it easy to see what has been hidden.
- Rating (both average and variance) and tags for abuse or controversy can allow for locally flexible display. Some simple models make this hard to game.
- Allow things to be incrementally hidden from view. Group feedback is more useful when the result is a spectrum.
Increase the efficiency ratio of moderation and distribute it: automate review, filter and slow down abuse.
- Tag contributors by their level of community investment. Many who spam or harass try to cloak themselves in new or fake identities.
- Maintain automated tools to catch and limit abusive input. There’s a spectrum of response: from letting only the poster and moderators see the input (cocooning), to tagging and not showing by default (thresholding), to simply tagging as suspect (flagging).
- Make these and other tags available to the community to use in their own preferences and review tools.
- For dedicated abuse: hook into penalties that make it more costly for those committed to spoofing the system.
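The response spectrum above (cocooning, thresholding, flagging) combines naturally with per-participant visibility controls. Here is a minimal sketch of how that might fit together, assuming a per-comment abuse score and per-viewer thresholds; all names and cutoffs are hypothetical illustrations, not any platform’s actual API:

```python
# Sketch of 'soft' moderation: a per-comment abuse score is mapped onto
# the cocoon / threshold / flag spectrum, and each viewer's own threshold
# controls what they personally see. All numbers are made up for illustration.

from dataclasses import dataclass

COCOON_AT = 0.9   # only the poster and moderators see the comment
HIDE_AT = 0.6     # tagged and hidden by default, viewable on request
FLAG_AT = 0.3     # shown normally, but tagged as suspect

@dataclass
class Comment:
    text: str
    abuse_score: float  # 0.0 (clean) .. 1.0 (clearly abusive)

def moderation_state(c: Comment) -> str:
    """Map an abuse score onto the response spectrum."""
    if c.abuse_score >= COCOON_AT:
        return "cocooned"
    if c.abuse_score >= HIDE_AT:
        return "thresholded"
    if c.abuse_score >= FLAG_AT:
        return "flagged"
    return "visible"

def visible_to(c: Comment, viewer_threshold: float,
               is_moderator: bool = False) -> bool:
    """Each participant controls what they see via their own threshold;
    moderators see everything, and cocooned comments stay private."""
    if is_moderator:
        return True
    if moderation_state(c) == "cocooned":
        return False
    return c.abuse_score < viewer_threshold
```

The key property is that most of these actions are reversible and auditable: nothing is deleted, only scored and selectively displayed, so the community can review and adjust the outcomes.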
You can’t make everyone safe all of the time, but you can dial down behavior that is socially unwelcome (to any significant subgroup) by a couple of orders of magnitude. Of course these ideas are simple and only go so far. For instance, in a society at civil war, where each half is literally threatened by the sober political and practical discussions of the other half, public speech may simply not be safe.
“BRB singularity” : A comic on love, death, and robots
XBRB – stories from the Singularity.
A Blue/Red/Brown production.
Dimple joust as sport: the perfect combination of skill and reflex
Dimple jousting is the purest form of duel. Everyone can play on equal footing, the winner is obvious, and there is no chance involved.
Edit by Edit: an Article Feedback Tool gets firmly tested
One of the Wikipedia projects that has been developing slowly over the past two years is the Article Feedback Tool. In its first incarnation, it let readers rate articles with a star system (1 to 5 stars in each of four areas: Well-Sourced, Complete, Neutral, and Readable).
The latest version of the tool, version 5, shifts the focus of the person giving feedback to leaving a comment, and noting whether or not they found what they were looking for. After some iteration and tweaking, including an additional abuse filter for comments, it has recently been turned on for 10% of the articles on the English Wikipedia.
This is generating roughly 1 comment per minute; or 10/min if it were running on all articles. In comparison, the project gets around 1 edit per second overall. So if turned on for 100% of articles, it would add 15-20% to the editing activity on the site. This is clearly a powerful channel for input, for readers who have something to share but aren’t drawn in by the current ‘edit’ tabs.
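The 15–20% figure follows from simple scaling of the quoted rates; a quick sanity check, using only the rough numbers above:

```python
# Back-of-the-envelope check of the extrapolation above (all rates
# are the rough figures quoted in the text, not measurements).
comments_per_min_at_10_percent = 1            # observed with the tool on 10% of articles
comments_per_min_full = comments_per_min_at_10_percent * 10  # scaled to 100% of articles
edits_per_min = 1 * 60                        # ~1 edit per second site-wide
added_fraction = comments_per_min_full / edits_per_min
print(f"feedback would add ~{added_fraction:.0%} to editing activity")  # ~17%
```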
What is the community’s response? Largely critical so far. The primary criticism is that the ease of commenting encourages short, casual/random/non-useful comments; and that it tends to be one-way communication [because there’s no obvious place to find responses? this isn’t necessarily so; replies could auto-generate a notice on the talk page of the related IP]. Many specific suggestions and rebuttals of the initial implementation have been made, some heard more than others. Overall, the implementation was not very sensitive to the implications for curation and follow-through.
A roadmap that included a timeframe for expanding the tool from 10% to 100% of articles was posted, without a community discussion; so a Request for Comments was started by an interested community member (rather than by the designers). This started in mid-January, and currently has a plurality of respondents asking to turn the tool off until it has addressed some of the outstanding issues.
The impression among the developers, here as with some other large organically-developing feature rollouts, was not that the feature had received thorough and firm testing, but that editors were fighting over every detail, making it hard to communicate about what works and why. Likewise there has been a shortage of good facilitators to take in all varieties of feedback and produce an orderly summary and practical solutions.
So how did things go wrong? Pete gets to the heart of it in his comment, where he asks for a clearer presentation of the project hopes and goals, measures of success, and a framework for community engagement, feedback, and approval:
I think it’s a mere mistake, but it does get frustrating because WMF has made this same mistake in other big technical projects…
What I’m looking for is the kind of basic framework that would encompass possible objections, and establish a useful way of communicating about them…
WMF managed that really well with the Strategic Planning process, and with the TOU rewrite. The organization knows how to do it. I believe if it had been done in this case, things would look very different right now…
It is our technical projects that are most likely to stumble at that stage – sometimes for many months – despite putting significant energy into communication.
Can we do something about it now? Like most of the commenters on the RfC, including those opposing the current implementation, I see a great deal of potential good in this tool, while also seeing why it frustrates many active editors. It seems close to something that could be rolled out with success, to the contentment of commenters and long-time editors alike; but perhaps not through the current process of defining and discussing features / feedback / testing (which invites confrontational challenge-and-response discussions that are draining, time-consuming, and avoid actually resolving the issues raised!).
I’ll write more about this over the coming week.
*.MIT goes down; the Internet sees a Swartzite omen
TechCrunch and others noted that *.mit and the redirect doj.gov (not the treasury.gov website itself) were down for some time tonight, from roughly 7pm to 10pm.
MIT looked into the problem, and some reported a link to a router configuration bug that’s been happening sporadically in recent weeks. This didn’t stop many on the Internet from seeing an omen, an intervention, or a DDoS attack related to Aaron’s death.
But there may be a connection. An hour ago, after access to most of the MIT network was restored, two specific MIT sites (cogen.mit.edu and rledev.mit.edu) were hacked by Anonymous to display a page remembering Aaron. The MIT Tech has the most up-to-date coverage: (“Anonymous Hacks MIT”)
The Anonymous message said, in part:
“We tender apologies to the administrators at MIT for this temporary use of their websites. We understand that it is a time of soul-searching for all those within this great institution as much — perhaps for some involved even more so — than it is for the greater internet community.”
A personal note from MIT President L. Rafael Reif
This just went out by email, from MIT President Reif, who was inaugurated president in September:
To the members of the MIT community:
Yesterday we received the shocking and terrible news that on Friday in New York, Aaron Swartz, a gifted young man well known and admired by many in the MIT community, took his own life. With this tragedy, his family and his friends suffered an inexpressible loss, and we offer our most profound condolences. Even for those of us who did not know Aaron, the trail of his brief life shines with his brilliant creativity and idealism.
Although Aaron had no formal affiliation with MIT, I am writing to you now because he was beloved by many members of our community and because MIT played a role in the legal struggles that began for him in 2011.
I want to express very clearly that I and all of us at MIT are extremely saddened by the death of this promising young man who touched the lives of so many. It pains me to think that MIT played any role in a series of events that have ended in tragedy.
I will not attempt to summarize here the complex events of the past two years. Now is a time for everyone involved to reflect on their actions, and that includes all of us at MIT. I have asked Professor Hal Abelson to lead a thorough analysis of MIT’s involvement from the time that we first perceived unusual activity on our network in fall 2010 up to the present. I have asked that this analysis describe the options MIT had and the decisions MIT made, in order to understand and to learn from the actions MIT took. I will share the report with the MIT community when I receive it.
I hope we will all reach out to those members of our community we know who may have been affected by Aaron’s death. As always, MIT Medical is available to provide expert counseling, but there is no substitute for personal understanding and support.
With sorrow and deep sympathy,
L. Rafael Reif
Wikipedia gets visual editor in time for Christmas
One small step for an editor…
Huge props to the team working on this and the underlying Parsoid. It’s still in alpha, so it’s only on the English Wikipedia this week, and you have to turn it on via user preferences; it wants good feedback, but it makes the old heart-cockles sing.
The White House supports Open Source, sharing Drupal modules it designed
On the power and community of open source, from the WH Blog.
This isn’t written to announce the publication of their Drupal code, which they’ve been doing for some time and will continue to do (though they do announce the creation of their own space within the Drupal community); it’s primarily about how and when open source is awesome, and why it is the way to go for many practices. A great message to send; a small step towards more open tools for society.