“Old Books”
photo by flickr user Iguana Joe, used by permission (CC-by-nc)

Earlier this week, the Harvard Library announced its new open metadata policy, which was approved by the Library Board earlier this year, along with two initial metadata releases. The policy is straightforward:

The Harvard Library provides open access to library metadata, subject to legal and privacy factors. In particular, the Library makes available its own catalog metadata under appropriate broad use licenses. The Library Board is responsible for interpreting this policy, resolving disputes concerning its interpretation and application, and modifying it as necessary.

The first releases under the policy include the metadata in the DASH repository. Though this metadata has been available through open APIs since early in the repository’s history, the open metadata policy makes clear the open licensing terms that the data is provided under.

The release of a huge percentage of the Harvard Library’s bibliographic metadata for its holdings is likely to have a much bigger impact. We’ve provided 12 million records (the vast majority of Harvard’s bibliographic data) describing Harvard’s library holdings in MARC format under a CC0 license that requests adherence to a set of community norms that I think are quite reasonable, primarily calling for attribution to Harvard and our major partners in the release, OCLC and the Library of Congress.

OCLC in particular has praised the effort, saying it “furthers [Harvard’s] mandate from their Library Board and Faculty to make as much of their metadata as possible available through open access in order to support learning and research, to disseminate knowledge and to foster innovation and aligns with the very public and established commitment that Harvard has made to open access for scholarly communication. I’m pleased to say that they worked with OCLC as they thought about the terms under which the release would be made.” We’ve gotten nice coverage from the New York Times, Library Journal, and Boing Boing as well.

Many people have asked what we expect people to do with the data. Personally, I have no idea, and that’s the point. I’ve seen over and over that when data is made openly available with the fewest impediments — legal and technical — people are incredibly creative about finding innovative uses for the data that we never could have predicted. Already, we’re seeing people picking up the data, exploring it, and building on it.

  • The Digital Public Library of America is making the data available through an API that provides data in a much nicer way than the pure MARC record dump that Harvard is making available.
  • Within hours of release, Benjamin Bergstein had already set up his own search interface to the Harvard data using the DPLA API.
  • Carlos Bueno has developed code for the Harvard Library Bibliographic Dataset to parse its “wonky” MARC21 format, and has open-sourced the code.
  • Alf Eaton has documented his own efforts to work with the bibliographic dataset, providing instructions for downloading and extracting the records and putting up all of the code he developed to massage and render the data. He outlines his plans for further extensions as well.

(I’m sure I’ve missed some of the ways people are using the data. Let me know if you’ve heard of others, and I’ll update this list.)

As I’ve said before, “This data serves to link things together in ways that are difficult to predict. The more information you release, the more you see people doing innovative things.” These examples are the first evidence of that potential.

John Palfrey, who was really the instigator of the open metadata project, has been especially interested in getting other institutions to make their own collection metadata publicly available, and the DPLA stands ready to help. They’re running a wiki with instructions on how to add your own institution’s metadata to the DPLA service.

It’s hard to list all the people who make initiatives like this possible, since there are so many, but I’d like to mention a few major participants (in addition to John): Jonathan Hulbert, Tracey Robinson, David Weinberger, and Robin Wendler. Thanks to them and the many others who have helped in various ways.

“Majesty of Law”
Statue in front of the Rayburn House Office Building in Washington, D.C., photo by flickr user NCinDC, used by permission (CC-by-nd)

Here is my written testimony filed in association with my appearance yesterday at the hearing on “Federally Funded Research: Examining Public Access and Scholarly Publication Interests” before the Subcommittee on Investigations and Oversight of the House Committee on Science, Space and Technology. My thanks to Chairman Broun, ranking member Tonko, and the committee for allowing me the opportunity to speak with them today.

[Update 3/30/12: Coverage from Chronicle of Higher Education. Update 4/2/12: Video of the session is available from the House Science Committee as well.]

“Have scientists lost interest again?”

The “Cost of Knowledge” boycott of Elsevier is in its seventh week. The boycott was precipitated by various practices of the journal publisher, most recently its support for the Research Works Act, a bill that would roll back the NIH public access policy and prevent similar policies by other federal funding agencies.

Early on, several hundred researchers a day were signing on to the pledge not to submit to or edit or review for Elsevier journals, but recently that rate had settled down to about a hundred per day. On February 11, I started tracking the daily totals by scraping the site with a simple scraper I set up at ScraperWiki. I’ve plotted the results in the attached graph, showing the raw count of signatories with the blue line (left axis) and the number added since the previous day with the green bars (right axis).
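
For anyone who wants to reproduce the green bars, here is a minimal sketch of turning a series of cumulative totals into per-day increments. It is an illustration only: the dates and counts are made-up placeholders rather than real boycott figures, and `daily_increments` is a hypothetical helper name, not something taken from the scraper mentioned above.

```python
from datetime import date


def daily_increments(totals):
    """Map each recorded day (after the first) to the count added since the previous recorded day.

    `totals` is a dict of {date: cumulative_count}.
    """
    days = sorted(totals)
    return {
        today: totals[today] - totals[previous]
        for previous, today in zip(days, days[1:])
    }


# Hypothetical cumulative totals (placeholders, not real boycott data).
sample_totals = {
    date(2012, 2, 11): 5000,
    date(2012, 2, 12): 5120,
    date(2012, 2, 13): 5300,
}

increments = daily_increments(sample_totals)
for day, added in increments.items():
    print(day.isoformat(), added)
```

Note that if the scraper misses a day, the difference covers the whole gap, so a bar after a gap would overstate that day's activity.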

As you can see from the chart, there seems to be a slight drop in activity around weekends, and Sunday February 26 and Monday February 27 had clearly been the slowest days since I’ve been keeping records, and likely since the effort started. On the 27th (red arrow), Elsevier issued its quasi-recantation of support for RWA. (“While we continue to oppose government mandates in this area, Elsevier is withdrawing support for the Research Work Act itself. We hope this will address some of the concerns expressed….”)

The day after Elsevier’s announcement saw a bit of a bump back to previous levels. Was this an instance of the Streisand effect or was the 26-27 dip an aberration? It’s hard to tell. However, since the 27th, it seems clear that the number of pledges is down considerably. It could well be that Elsevier’s tactical approach has worked and it has stanched the spate of boycott pledges, despite the fact that the community was generally unimpressed with Elsevier’s statement, as Peter Suber has cataloged. Alternatively, the current rate of new pledges may just reflect the natural reductions that had been happening over the last few weeks.

Elsevier has not changed its underlying stance. It still “continue[s] to oppose government mandates” for public access, as per RWA. It strongly opposes FRPAA. Have scientists lost interest again?

A bumpy road
“Note the surges…”

[Update 4/20/2012: Now that a few more weeks have passed, here’s an updated figure of the boycott growth. Note the surges around March 18 and April 10. As near as I can make out, these were the result of widely disseminated coverage in Slashdot and the Guardian, respectively. These surges show that the boycott hasn’t played itself out yet, and that continued discussion of the boycott is likely to lead to a continued steady rise in the number of signatures.

At the current rate, I expect the number of signatories to hit 10,000 around April 27 or so.]

[Update 4/24/2012: Well, my guess was wrong. A big bump of activity in the last few days meant that the boycott broke 10,000 signatures on April 23. I’m not sure who to blame for the renewed interest in the last couple of days. Anyone have any conjectures?]

An efficient journal

March 6th, 2012

“You seem to believe in fairies.”
Photo of the Cottingley Fairies, 1917, by Elsie Wright via Wikipedia.

Aficionados of open access should know about the Journal of Machine Learning Research (JMLR), an open-access journal in my own research field of artificial intelligence, a subfield of computer science concerned with the computational implementation and understanding of behaviors that in humans are considered intelligent. The journal became the topic of some dispute in a conversation that took place a few months ago in the comment stream of the Scholarly Kitchen blog between computer science professor Yann LeCun and scholarly journal publisher Kent Anderson, with LeCun stating that “The best publications in my field are not only open access, but completely free to the readers and to the authors.” He used JMLR as the exemplar. Anderson expressed incredulity:

I’m not entirely clear how JMLR is supported, but there is financial and infrastructure support going on, most likely from MIT. The servers are not “marginal cost = 0” — as a computer scientist, you surely understand the 20-25% annual maintenance costs for computer systems (upgrades, repairs, expansion, updates). MIT is probably footing the bill for this. The journal has a 27% acceptance rate, so there is definitely a selection process going on. There is an EIC, a managing editor, and a production editor, all likely paid positions. There is a Webmaster. I think your understanding of JMLR’s financing is only slightly worse than mine — I don’t understand how it’s financed, but I know it’s financed somehow. You seem to believe in fairies.

Since I have some pretty substantial knowledge of JMLR and how it works, I thought I’d comment on the facts of the matter.

“…the interpersonal processes that a student goes through…”
Harvard students (2008) by E>mar via flickr. Used by permission (CC by-nc-nd)

Is the pot calling the kettle black? Oh sure, journal prices are going up, but so is tuition. How can universities complain about journal price hyperinflation if tuition is hyperinflating too? Why can’t universities use that income stream to pay for the rising journal costs?

There are several problems with this argument, above and beyond the obvious one that two wrongs don’t make a right.

First, tuition fees aren’t the bulk of a university’s revenue stream. So even if it were true that tuition is hyperinflating at the pace of journal prices, that wouldn’t mean that university revenues were keeping pace with journal prices.

Second, a journal is a monopolistic good. If its price hyperinflates, buyers can’t go elsewhere for a substitute; it’s pay or do without. But a college education can be arranged for at thousands of institutions. Students and their families can and do shop around for the best bang for the buck. (Just do a search for “best college values” for the evidence.) In economists’ parlance, colleges are economic substitutes. So even if it were true that tuition at a given college is hyperinflating at the pace of journal prices, individual students can adjust accordingly. As the College Board says in their report on “Trends in College Pricing 2011”:

Neither changes in average published prices nor changes in average net prices necessarily describe the circumstances facing individual students. There is considerable variation in prices across sectors and across states and regions as well as among institutions within these categories. College students in the United States have a wide variety of educational institutions from which to choose, and these come with many different price tags.

Third, a journal article is a pure information good. What you buy is the content. Pure information goods include things like novels and music CDs. They tend to have high fixed costs and low marginal costs, leading to large economies of scale. But a college education is not a pure information good. Sure, you are paying in part to acquire some particular knowledge, say, by listening to a lecture. But far more important are the interpersonal processes that a student participates in: interacting with faculty, other instructional staff, librarians, other students, in their dormitories, labs, libraries, and classrooms, and so forth. It is through the person-to-person hands-on interactions that a college education develops knowledge, skills, and character.

This aspect of college education has high marginal costs. One would not expect it to exhibit the economies of scale of a pure information good. So even if it were true that tuition is hyperinflating at the pace of journal prices, that would not take the journals off the hook; they should be able to operate with much higher economies of scale than a college by virtue of the type of good they are.[1]

Which makes it all the more surprising that the claims about college tuition hyperinflating at the rate of journals are, as it turns out, just plain false.

Let’s look at what the average Harvard College student pays for his or her education.

“…time to switch…”
A very old light switch (2008) by RayBanBro66 via flickr. Used by permission (CC by-nc-nd)

The journal Research in Learning Technology has switched its approach from closed to open access as of New Year’s 2012. Congratulations to the Association for Learning Technology (ALT) and its Central Executive Committee for this farsighted move.

This isn’t the first journal to make the switch. The Open Access Directory lists about 130 of them. In my own research field, the Association for Computational Linguistics (ACL) converted its flagship journal Computational Linguistics to OA as of 2009, and has just announced a new open-access journal, Transactions of the Association for Computational Linguistics. Each such transition is a reminder of the trajectory that journal publishing ought to follow.

The ALT has done lots of things right in this change. They’ve chosen the ideal licensing regime for papers, the Creative Commons Attribution (CC-BY) license. They’ve jettisoned one of the largest commercial subscription journal publishers, and gone with a small but dedicated professional open-access publisher, Co-Action Publishing. They’ve opened access to the journal retrospectively, so that the entire archive, back to 1993, is available from the publisher’s web site.

Here’s hoping that other scholarly societies are inspired by the examples of the ALT and ACL, and join the many hundreds of scholarly societies that publish their journals open access. It’s time to switch.

My friend and ex-colleague Matt Welsh has an interesting post supporting the Research Without Walls pledge, in which he talks about the Harvard open-access policies. He says:

Another way to fight back is for your home institution to require all of your work be made open. Harvard was one of the first major universities to do this. This ambitious effort, spearheaded by my colleague Stuart Shieber, required all Harvard affiliates to submit copies of their published work to the open-access Harvard DASH archive. While in theory this sounds great, there are several problems with this in practice. First, it requires individual scientists to do the legwork of securing the rights and submitting the work to the archive. This is a huge pain and most folks don’t bother. Second, it requires that scientists attach a Harvard-supplied “rider” to the copyright license (e.g., from the ACM or IEEE) allowing Harvard to maintain an open-access copy in the DASH repository. Many, many publishers have pushed back on this. Harvard’s response was to allow its affiliates to get an (automatic) waiver of the open-access requirement. Well, as soon as word got out that Harvard was granting these waivers, the publishers started refusing to accept the riders wholesale, claiming that the scientist could just request a waiver. So the publishers tend to win.

I wrote a response to his post, clarifying some apparent misconceptions about the policy, but it was too long for his blogging platform’s comment system, so I decided to post it here in its entirety. Here it is:

There’s a lot to like about your post, and I agree with much of what you say. But I’d like to clarify some specific issues about the Harvard open-access policies, which are in place at seven of the Harvard schools as well as MIT, Duke, Stanford, and elsewhere.

The policy has two aspects. First, the policy commits faculty to (as you say) “submitting the work to the archive”, that is, providing a copy of the final manuscript of each article, to be deposited into Harvard’s DASH open-access repository. Doing so involves filling out a web form with metadata about the article and uploading a file. But if that is too much trouble, we provide a simpler web form that is tantamount to just uploading the file. Or you can email the file to the OSC. Or one of our “open-access fellows” can make the deposit on your behalf. We also harvest articles from other repositories such as PubMed Central and arXiv. I can’t imagine that providing the articles is “a huge pain”.

Second, by virtue of the policy, Harvard faculty grant a nonexclusive transferable license to the university in all our scholarly articles. This license occurs as soon as copyright vests in the article, so it predates and therefore dominates any later transfer of copyright to a publisher. Since the policy license is transferable, the university can and does transfer it back to the author, so the author automatically retains rights in each article, without having to take any further action. Because of this policy, the “legwork of securing the rights” is actually eliminated. By doing nothing at all, the author retains rights in the article.

You mention attaching a rider to publication agreements. Although we provide an addendum generator to generate such riders, and we recommend that authors use them, attaching an addendum is not required to retain rights. The only point of the addendum is to alert the publisher that the author has already given Harvard non-exclusive rights to the article (though publishers undoubtedly are already aware of the fact; the policy and its license have been widely publicized).

Because we want the policy to work in the interest of faculty and guarantee the free choice of faculty as to the disposition of their works, the license is waivable at the sole discretion of the author. Thus, rights retention moves from an opt-in regime without the policy to an opt-out regime with the policy. The waiver aspect of the policy was not a response to publisher pushback, but has in fact been in the policies from the beginning. The waiver was intended to preserve complete freedom of choice for authors in rights retention.

As is found in many areas (organ donation, 401(k) participation), participation tends to be much higher with opt-out than opt-in systems, and that holds for rights retention as well. We have found that the waiver rate is extraordinarily low, contra your assumption. For FAS, we estimate it at perhaps 5% of articles. In total, the number of waivers we have issued is in the very low hundreds, out of the many thousands of articles that have been published by Harvard faculty since the policy has been in force. MIT has tracked the waiver rate more accurately, and has reported a 1.5% waiver rate. So for well over 90% of articles, authors are retaining broad rights to use their articles.

The statement that “Many, many publishers have pushed back on this” is false. Fewer than a handful of publishers have established systematic policies to require waivers of the license, which accounts for the exceptionally low waiver rate. Indeed, over a third of all waivers are attributable to a single journal.

The Harvard approach to rights retention and open-access provision for articles is not a silver bullet to solve all problems in scholarly publishing. It has a limited goal: to provide an alternate venue for openly disseminating our articles and to retain the rights to do so. It is extremely successful at that goal. Many thousands of articles have been deposited in DASH, accounting for over half a million downloads. Nonetheless, other efforts need to be made to address the underlying market dysfunction in scholarly publishing, and we are actively engaged there too. For those interested in what we’re up to along those lines, I recommend taking a look at the various posts at my blog, The Occasional Pamphlet, which discusses issues of open access and scholarly communication more generally.

“…dog-eared in thirty-one places…”

I’ve been reading Arthur Conan Doyle’s first novel, The Narrative of John Smith, just published for the first time by the British Library. It’s no The Adventures of Sherlock Holmes, that’s for sure. For one thing, he seems to have left out any semblance of plot. But it does incorporate some entertaining pronouncements. Here’s one I identify with strongly:

There should be a Society for the Prevention of Cruelty to Books. I hate to see the poor patient things knocked about and disfigured. A book is a mummified soul embalmed in morocco leather and printer’s ink instead of cerecloths and unguents. It is the concentrated essence of a man. Poor Horatius Flaccus has turned to an impalpable powder by this time, but there is his very spirit stuck like a fly in amber, in that brown-backed volume in the corner. A line of books should make a man subdued and reverent. If he cannot learn to treat them with becoming decency he should be forced.

If a bibliophile House of Commons were to pass a ‘Bill for the better preservation of books’ we should have paragraphs of this sort under the headings of ‘Police Intelligence’ in the newspapers of the year 2000: ‘Marylebone Police Court. Brutal outrage upon an Elzevir Virgil. James Brown, a savage-looking elderly man, was charged with a cowardly attack upon a copy of Virgil’s poems issued by the Elzevir press. Police Constable Jones deposed that on Tuesday evening about seven o’clock some of the neighbours complained to him of the prisoner’s conduct. He saw him sitting at an open window with the book in front of him which he was dog-earing, thumb-marking and otherwise ill using. Prisoner expressed the greatest surprise upon being arrested. John Robinson, librarian of the casualty section of the British Museum, deposed to the book, having been brought in in a condition which could only have arisen from extreme violence. It was dog-eared in thirty-one places, page forty-six was suffering from a clean cut four inches long, and the whole volume was a mass of pencil — and finger — marks. Prisoner, on being asked for his defence, remarked that the book was his own and that he might do what he liked with it. Magistrate: “Nothing of the kind, sir! Your wife and children are your own but the law does not allow you to ill treat them! I shall decree a judicial separation between the Virgil and yourself: and condemn you to a week’s hard labour.” Prisoner was removed, protesting. The book is doing well and will soon be able to quit the museum.’

Portrait of Arthur Conan Doyle by Sidney Paget, c. 1890

What a wonderful, wonderful thing it is, though use has dulled our admiration of it! Here are all these dead men lurking inside my oaken case, ready to come out and talk to me whenever I may desire it. Do I wish philosophy? Here are Aristotle, Plato, Bacon, Kant and Descartes, all ready to confide to one their very inmost thoughts upon a subject which they have made their own. Am I dreamy and poetical? Out come Heine and Shelley and Goethe and Keats with all their wealth of harmony and imagination. Or am I in need of amusement on the long winter evenings? You have but to light your reading lamp and beckon to any one of the world’s great storytellers, and the dead man will come forth and prattle to you by the hour. That reading-lamp is the real Aladdin’s wonder for summoning the genii with. Indeed, the dead are such good company that one is apt to think too little of the living.

I know that there are those who think it is a sign of appreciation to write in, dog-ear, underline, highlight, and otherwise modify books — Anne Fadiman lauds such things as carnal acts — but I can’t bring myself to do so. I just can’t.

“…a drop in the bucket.”
Drop I (2007) by Delox – Martin Deák via flickr. Used by permission (CC by-nc-nd)

At the recent Berlin 9 conference, there was much talk about the role of funding agencies in open-access publication, both through funding-agency-operated journals like the new eLife journal and through direct reimbursement of publication fees. I’ve written in the past about the importance of universities underwriting open-access publication fees, but only tangentially about the role of funding agencies. To correct that oversight, I provide in this post my thoughts on how best to organize a funding agency’s open-access underwriting system.

The motivation for underwriting publication fees is simple. Publishers provide valuable services to authors: management of peer review; production (copy-editing and typesetting); and filtering, branding, and imprimatur. Although access to scholarly articles can now be provided at essentially zero marginal cost through digital networks, some means for paying for these so-called first-copy costs needs to be found in order to preserve these services. The natural business model is the open-access journal funded by article processing fees. (Although most current open-access journals charge no article processing fees, I will abuse the term “open-access journal” for this model.) Open-access (OA) journals are no longer an oddity, a fringe phenomenon. The largest scholarly journal on earth, PLoS ONE, is an OA journal. Major publishers (Springer, Elsevier, SAGE, Nature Publishing Group) are now publishing OA journals.

However, OA journals are currently at a significant disadvantage with respect to subscription journals, because universities and funding agencies subsidize the costs of subscription journals in such a way that authors do not need to trade off money used for the subsidy against money used for other purchases. In particular, subscription fees are paid by universities through their library budgets and by funding agencies through their overhead payments that fund those libraries. Authors do not see, let alone attend to, these costs. In such a situation, an author is inclined to publish in a subscription journal, where they do not need to use any moneys that could otherwise be applied to other uses, rather than an OA journal that requires payment of a publication fee. And if authors are unwilling to publish in open-access journals because of the fees, publishers — even those interested and motivated to switch to an OA revenue model — are unable to do so.

The solution is clear: universities and funding agencies should underwrite reasonable OA publication fees just as they do subscription fees. But how should this be done? Each kind of institution needs to provide its fair share of support.

As I’ve written about before, universities can underwrite processing fees on behalf of their faculty, and do so in a way that does not reintroduce a moral hazard, by reimbursing faculty for OA publication fees up to a fixed cap per year. Since these funds can only be used for open access fees, they can’t be traded off against other purchases, so they don’t provide a disincentive against open access journals. On the other hand, since these funds are limited (capped), they provide a market signal to motivate choosing among open access journals so that the economic incentives will militate toward low-cost high-service open access journals.

This is the argument for the Compact for Open-Access Publishing Equity (COPE), a commitment by universities to establish mechanisms for underwriting OA publication fees. COPE has grown well beyond its initial five signatories and is supported by a wide range of institutions and people. Harvard and other COPE signatories have already set up such OA funds, which work in just this way.

Many COPE-compliant OA funds don’t underwrite articles that were developed under research grants, under the view that such funding is the responsibility of the granting institutions. COPE calls for universities to do their fair share of paying OA fees, no less, but no more. Funding agencies need to underwrite their share of OA fees as well, and crucially should do so in a way that respects several important criteria:

  1. They level the playing field completely, at least for cost-efficient OA journals.
  2. They recognize that publication of research results often occurs after grants have ended.
  3. They provide incentive for publishers to switch revenue model to the OA publication fee model, or at least provide no disincentive.
  4. They avoid the moral hazard of insulating authors from the costs of their publishing.
  5. They don’t place an undue burden on funders that would require reducing the impact of research they fund.

Of course, many funders already allow grantees to pay for OA publication fees from their grants. But this method runs afoul of some of these criteria. With respect to criterion (1), grantees are forced to trade off uses of grant moneys to pay OA fees against uses to pay for other research expenses, providing incentive to publish in subscription-fee journals where these costs are hidden. This approach maintains the tilted playing field against OA journals. With respect to criterion (2), because the funds must be expended during the granting period, grantees must predict ahead of time how many articles they will be publishing in OA journals and where, and those articles must be completed and accepted for publication by the end of the granting period.

The mechanism that satisfies these criteria is for funding agencies to provide non-fungible funds specifically for OA publication fees, funds that are not usable for purchasing other grant-related materials. Funders would establish a policy that grantees could be reimbursed for OA publication fees for articles based on grant-funded research at any time during or after the period of the grant. This satisfies criterion (1) because grantees would no longer have to pay publication fees out of pocket or from grant funds that could be used otherwise. It satisfies criterion (2) because payments can be provided after the end of the grant. (If desired, the delay after the grant ends can be limited to, say, a year or two.) A reasonable requirement for reimbursement of publication fees would be that the article explicitly acknowledge the grant as a source of research funding.

Wellcome Trust already uses a similar incremental funding system. However, they (inadvisably in my mind) allow the funds to apply to so-called hybrid publication fees, where an additional fee can be paid to make a single article available open access. These reimbursements should be limited to publication fees for true OA journals, not hybrid fees for subscription journals. Willingness to pay hybrid fees provides an incentive for a publisher to maintain the subscription revenue model for a journal, because the publisher can acquire these funds without converting the journal as a whole to open access. Eschewing hybrid fees is necessary to satisfy criterion (3).
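
To make the reimbursement conditions concrete, here is a minimal sketch of an eligibility check that combines them: the fee must be for a true OA journal (not a hybrid fee), the article must acknowledge the grant, and publication must fall during the grant or within a limited window afterwards. Everything named here (`FeeRequest`, `is_reimbursable`, `grace_days`) is hypothetical, and the two-year default window is just one reading of "a year or two"; no funder's actual system is being described.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class FeeRequest:
    journal_is_fully_oa: bool   # False for a hybrid fee at a subscription journal
    acknowledges_grant: bool    # the article explicitly credits the grant
    publication_date: date


def is_reimbursable(request, grant_start, grant_end, grace_days=730):
    """Return True if the request meets the sketched conditions.

    grace_days is an assumed post-grant window (about two years by default).
    """
    if not request.journal_is_fully_oa:
        return False  # hybrid fees are excluded, per criterion (3)
    if not request.acknowledges_grant:
        return False
    latest_allowed = grant_end + timedelta(days=grace_days)
    return grant_start <= request.publication_date <= latest_allowed


# Hypothetical grant period and requests, for illustration only.
grant_start, grant_end = date(2010, 1, 1), date(2012, 12, 31)
after_grant = FeeRequest(True, True, date(2013, 6, 1))
hybrid = FeeRequest(False, True, date(2011, 6, 1))
too_late = FeeRequest(True, True, date(2016, 1, 1))

print(is_reimbursable(after_grant, grant_start, grant_end))
print(is_reimbursable(hybrid, grant_start, grant_end))
print(is_reimbursable(too_late, grant_start, grant_end))
```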

If funders were willing to pay arbitrary amounts for publication fees without limit, a new moral hazard would be introduced into the publishing market. Authors would become price-insensitive and hyperinflation of publication fees would be possible. To retain a functioning market in publication fees, we must be careful in designing the reimbursement scheme for OA journals; we need to make sure that there is still some scarce resource that authors must manage. This can be achieved in a couple of ways: by capping reimbursements or by copayments. First, reimbursement of OA publication fees can be offered only up to a fixed percentage of the grant amount. By way of example, if an average NIH grant is $300,000 (excluding overhead[1]), a cap of, say, 2% would provide up to $6,000 available for OA fees. That would cover two PLoS Biology papers, three BMC papers, four or five PLoS ONE papers, or eight or so Hindawi papers. (Robert Kiley, Head of Digital Services at the Wellcome Trust, estimates that at present rates all funded papers of the Wellcome Trust could be underwritten for about 1.25% of their total granted funds. In the short run, nowhere near that level of underwriting is necessary, since the number of publication-fee-charging OA journals is so small. In the long run, as competition in the publication fee market increases, this number may well go down.) A grantee would apply separately for these funds to reimburse reasonable OA fees. Some grantees might use all of these funds, some none, with most falling in the middle (and currently at the low end); but in any case the funds would not be usable for other purposes. Since these funds can only be used for OA publication fees, they can’t be traded off against other purchases, so there is no disincentive against selecting OA journals. On the other hand, since these funds are limited (capped), they provide a market signal to motivate choosing among open access journals, so that the economic incentives will militate toward low-cost, high-service OA journals. (This can’t be repeated often enough.)
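The capping arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the $300,000 grant and 2% cap are the example values from the paragraph above, the per-article fees are the approximate figures used elsewhere in this post, and the function names are made up for the sketch.

```python
# Example values from the text; none of these are official figures.
AVERAGE_GRANT = 300_000   # direct costs, excluding overhead
CAP_RATE = 0.02           # hypothetical 2% cap on OA fee reimbursement

# Approximate per-article fees used as examples in this post.
EXAMPLE_FEES = {
    "PLoS Biology": 2900,
    "typical BMC journal": 2000,
    "PLoS ONE": 1350,
    "typical Hindawi journal": 700,
}


def oa_fee_cap(grant_amount, cap_rate=CAP_RATE):
    """Maximum reimbursable OA fees for a grant, in whole dollars."""
    return round(grant_amount * cap_rate)


def papers_covered(cap, fee):
    """Number of whole articles at the given fee that fit under the cap."""
    return cap // fee


cap = oa_fee_cap(AVERAGE_GRANT)
for journal, fee in EXAMPLE_FEES.items():
    print(f"{journal}: {papers_covered(cap, fee)} papers under a ${cap:,} cap")
```

Because the cap is a fixed pool, each article published in an expensive journal uses up more of it, which is the scarce resource the author must manage.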

Alternatively, a copayment approach can be used to provide economic pressure to keep publication fees down. Reimbursement would cover only part of the fee, at least at the expensive end of such fees. It is important (criterion 1) that for cost-efficient OA journals, authors should not be out of pocket for any fees. Thus, reimbursement should be at 100% for journals charging less than some threshold amount, say, $1,500. (As publishers become more efficient, this threshold can and should be reduced over time.) Above that level, the funder might pay only a proportion of the fee, say, 50%, so that grantees have some “skin in the game” and are motivated to trade off publication fees against quality of publisher services. With these parameters, the payment schedule would provide for the following kinds of payments:

Publication fee   Funder pays   Author copays   Examples
$700              $700          $0              typical Hindawi journal, SAGE Open
$1350             $1350         $0              PLoS ONE, Scientific Reports
$2000             $1750         $250            typical BMC journal
$2900             $2200         $700            PLoS Biology

(What the right parameters of such an approach are may depend on field and may change over time. I don’t propose these as the correct values, but merely provide an example of the workings of such a system.)
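For concreteness, here is a minimal Python sketch of the copayment schedule with the example parameters above. It assumes that the 50% share applies only to the part of the fee above the $1,500 threshold, which is the reading consistent with the example payments listed; the names are hypothetical.

```python
# Example parameters from the text, not recommended values.
FULL_REIMBURSEMENT_THRESHOLD = 1500  # fees up to this amount are paid in full
FUNDER_SHARE_ABOVE_THRESHOLD = 0.5   # funder's share of any excess over it


def split_fee(fee,
              threshold=FULL_REIMBURSEMENT_THRESHOLD,
              share=FUNDER_SHARE_ABOVE_THRESHOLD):
    """Return (funder_pays, author_copays) for a single publication fee."""
    excess = max(fee - threshold, 0)
    funder_pays = min(fee, threshold) + share * excess
    author_copays = fee - funder_pays
    return funder_pays, author_copays


for fee in (700, 1350, 2000, 2900):
    funder, author = split_fee(fee)
    print(f"${fee}: funder pays ${funder:.0f}, author copays ${author:.0f}")
```

Lowering the threshold over time, as suggested above, would only require changing one parameter.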

These two approaches are complementary. A policy could involve both a per-article copayment and a maximum per-grant outlay.

Finally, criterion (5) calls for implementing such an underwriting scheme as cost-effectively as possible, so that a funder’s research impact is not lessened by paying for publication fees. Indeed, one might expect that impact would be increased by such a move, given that the tiny percentage of funds going to OA fees would mean that those research results were freely and openly available to readers and to machine analysis throughout the world. I would think (and I recall a claim to this effect at Berlin 9) that the impact benefit of providing open access to a funder’s research results is greater than the impact of the marginal funded research grant. To the extent that this is so, it behooves funders to underwrite OA fees even at the expense of funding the incremental research. Nonetheless, there may be no need to forgo funding research just to pay OA fees. Suppose that on the average grant incremental funds of $200 are used to pay OA publication fees. (With current availability and usage of OA journals, this is likely an overestimate of current demand for OA fees.) Where would this money come from? To the extent that faculty are publishing in OA journals, funders should not need to underwrite subscription journals, so their overhead rates can be reduced accordingly. On the $300,000 example grant above, an overhead rate of 67% (Harvard’s current rate) would need to be reduced by a minuscule 0.067% to compensate, since $200 is 0.067% of $300,000. (This is not a typo. The number really is 0.067%, not 6.7%.) This constitutes a percentage reduction in overhead of one part in a thousand, a drop in the bucket. In the longer term, over several years, if usage of the funds rises to, say, $1000 per grant, the overhead rate would need to be reduced by a still tiny 0.33% for cost neutrality. As more OA journals become available and more funds are used, the overhead rate would be adjusted accordingly. If hypothetically all journals became OA, and all articles incurred these charges, the cost per grant might rise to the Wellcome Trust’s estimated 1.25% (though by this point competition may have substantially reduced the fees), but then larger reductions in overhead rates would be met by reduced university costs, since libraries would no longer need to pay subscription fees.
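The overhead arithmetic is easy to check. The sketch below assumes the $300,000 example grant (direct costs) and the 67% overhead rate used in this post, and treats cost neutrality as the condition that the cut in overhead on a grant equals the OA fees paid on it; the function names are hypothetical.

```python
# Figures taken from the text; the cost-neutrality model is an assumption.
DIRECT_COSTS = 300_000   # example average grant, excluding overhead
OVERHEAD_RATE = 0.67     # overhead rate cited in the text


def overhead_rate_reduction(oa_fees_per_grant, direct_costs=DIRECT_COSTS):
    """Percentage points to cut from the overhead rate so that the
    funder's total outlay per grant is unchanged."""
    return 100 * oa_fees_per_grant / direct_costs


def relative_reduction(points, overhead_rate=OVERHEAD_RATE):
    """The same cut expressed as a fraction of the overhead rate itself."""
    return points / (100 * overhead_rate)


for fees in (200, 1000):
    points = overhead_rate_reduction(fees)
    print(f"${fees} per grant: cut the rate by {points:.3f} percentage points "
          f"({relative_reduction(points):.4f} of the overhead itself)")
```

For $200 per grant the relative figure comes out at roughly one part in a thousand, as stated above.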

One of the nice properties of this approach is that it doesn’t require synchronization of the many actors involved. Each funding agency can unilaterally start providing OA fee reimbursement along these lines. Until a critical mass does so, the costs would be minimal. Once a critical mass is reached, and journals feel confident that a sufficient proportion of their author pool will be covered by such funds to switch to an open-access revenue model, subscription fees to libraries will drop, allowing overhead rates to be reduced commensurately to cover the increasing underwriting costs. Each actor (author, funder, publisher, university, library) acts independently, with a market mechanism to move all towards a system based on open access.

It is time for funding agencies to take on the responsibility not only to fund research but also to fund its optimal distribution. Part of that responsibility is putting in place an economically sustainable system of underwriting open-access publication fees.

[1] The NIH Data Book reports average grant size for 2010 as around $450,000, which corresponds to something like $270,000 in direct costs assuming a 67% overhead rate. $300,000 is thus likely on the high side.

Petrus Spronk, “Architectural Fragment”, 1992. Photo © 2005 Robert Laddish (www.laddish.net), used by permission.

I’ve just been at the conference in honor of the 30th anniversary of the University of Sao Paulo Integrated Library System (SIBi USP). David Palmer, one of the speakers at the conference, used in his presentation a picture of a wonderful sculpture that I had never seen before, which turned out to be a public art piece at the State Library of Victoria in Melbourne, Australia by Petrus Spronk entitled “Architectural Fragment”. I place a couple of pictures of it here in honor of Spronk’s 72nd birthday, which happens to be today. You can find more images here.

Petrus Spronk, “Architectural Fragment”, 1992. Photo by flickr user madam3181, used by permission (CC by-nc-nd).