What is arXiv and how can we get one?

After ckelty’s post on the SSRN/Elsevier merger, fellow mind Ryan Anderson gave me a shout-out on Twitter.

This is a pretty interesting idea. What would taking arXiv as a role model entail?

What is arXiv?

Like SSRN, arXiv is a digital repository. Both are examples of Green OA, a type of open access in which authors deposit versions of their work so that readers can access them for free. Which version of an article makes it into the repository depends on the publisher you’re working with, but almost all of them allow authors to deposit the original submission: no peer review, no mark-up, no typesetting. Others are more generous, and a few even allow the post-print to be deposited. It just depends. If you want to go Green, do some research on your publisher’s homepage or ask a company rep.

Green OA is frequently contrasted with Gold OA, where the author submits to a journal that makes the final product available to readers for free. Examples include HAU and Cultural Anthropology. Again, there is great diversity among Gold OA publishers, just as there is among Green repositories, but we’re not getting into that here.

arXiv is Green OA: it is a pre-print repository, but of a particular kind. If you’re at an elite or second-tier R1 you probably already have access to a repository through your institution. However, many of these institutional repositories (IRs) share a common problem: faculty participation is low. Some universities have attempted to address this with OA mandates, but this is not always sufficient to change faculty behavior. People are really busy, or maybe they don’t see the value in access. Perhaps they think someone else will do it for them, or are mistaken about their rights as authors. For whatever reason, many people who can go Green choose not to.

The generally poor showing of institutional repositories has led some in the digital libraries field to argue that IRs are not the way forward for Green OA. Instead they anticipate that disciplinary repositories (DRs), sometimes called subject repositories, will be more successful. Perhaps in our neoliberal world faculty are less tied to their institution than to their discipline? Both SSRN and arXiv are DRs.

N.B. There are other ways to get your pre-prints out there without a repository. Social networks like academia.edu and researchgate.net exist to facilitate self-promotion and the sharing of information, and the same goes for personal webpages. However, a major advantage of IRs and DRs over social networks and personal webpages is that the former are run by information professionals: digital librarians who will work to ensure that your metadata meet international standards and that technical stuff, like bit stream preservation, is well maintained, thus keeping your work discoverable and accessible. To get the best of both worlds, deposit your work in a repository. This will result in a URL that you can embed on your webpage or network.

So both SSRN and arXiv are Green OA, disciplinary repositories run by full-time professionals. But there’s a difference!

What makes arXiv special?

Three things make arXiv magic. It carries prestige among the community it is intended to serve, it has a very successful business plan, and it has a very successful governance model. All of this was achieved with only twenty-five years of hard work and sacrifice.

On its front page arXiv boasts 1,148,725 e-prints available for download, better than double SSRN’s 563,300. This is pretty remarkable when you consider that arXiv was originally intended as a repository just for physicists (today it has slightly expanded that scope) whereas SSRN aims to serve all of the social sciences and the humanities.

For some reason physicists have really bought into what arXiv has to offer. Maybe it is a result of the collaborative atmosphere of physics research? In my imagination, cultural anthropology romanticizes the lone genius: we’re either collecting ethnographic data in some out-of-the-way place isolated from the rest of the world, or cloistered in a solitary office channeling that experience into text, only to emerge with the finished product. Maybe we just don’t do enough stuff as part of a team?

Aside: I think an integral part of the professional life of someone in the social sciences or humanities is the experience of being in a constant state of precarity. Not only do our colleagues in the STEM fields have more resources at their disposal, they are not constantly called upon to defend their very existence to powerful outsiders. Shoestring budgets and the unending barrage of existential threats make a lot of social science and humanities faculty “small-c conservative,” risk averse and skeptical about diverging from the well-worn path of professional success offered by conventional publications.

All of this is to say, there seems to be something about the “culture” of physicists that contributes to their high participation rate. If I were a physicist and I deposited an e-print in arXiv, I could tell my buddies and everyone would pat me on the back. Good job! Another publication! If as an anthropologist I made a deposit to SSRN and told my buddies about it, everyone would say, What’s SSRN? and, When are you going to publish that? Raising the prestige and visibility of Green OA is one of the best possible outcomes of the SSRN/Elsevier merger. But more on that later.

One more detail: arXiv does not rely on peer review. Instead, there is a system of vetting author submissions. Like peer reviewers, experts volunteer to vet submissions but instead of writing comments they are primarily interested in making sure that papers are acceptable and properly sorted into their appropriate sub-field. Many arXiv papers go on to have other lives as traditional publications and thus go through a peer review process eventually. My point here is that physicists think sharing unreviewed work is a notable accomplishment in a way that anthropologists (currently) do not.

The business plan for arXiv is perhaps the most crucial ingredient to its success. Born at Los Alamos National Laboratory in 1991, arXiv has since migrated with its founder Paul Ginsparg to Cornell. Cornell University Library (CUL), known as a leader even among the most elite libraries, has since 2011 provided the repository with infrastructural support and staffing. Annual funding from CUL is supplemented by the deep pockets of the Simons Foundation, in the form of both annual funding and challenge grants. Further contributing to this are voluntary pledges from about 200 institutions that represent arXiv’s heaviest users. These pledges range from $1,500 to $3,000 per institution per year. The above are 2012 numbers.

Oya Yildirim Rieger writes in “Sustainability: Scholarly Repository as Enterprise” (Bulletin of the American Society for Information Science and Technology, Vol. 39, No. 1) that there are five sustainability principles CUL adheres to in planning the future of arXiv. (1) Deep integration into the scholarly community and scholarly process: scientists take a leadership role in guiding arXiv, and it reflects their values and their community as a result. (2) A clearly defined mandate and governance structure: if you’re running a digital archive then you’re playing a long game. The long game is the whole point of archiving things! (3) Technology platform stability and innovation: the data architecture and user interface game has to be top notch and always responsive to the constantly changing expectations of the users. (4) Systematic development of content policies: be crystal clear about collection policies, submission guidelines, copyright status, etc. (5) Reliance on business planning strategies: you want big money for your repository? Then you better be able to talk the business talk and show value to your investors.

The long-term viability of arXiv is sustained by its thoughtful governance structure. CUL upholds its end of the bargain from a managerial and administrative standpoint: it houses the archive. But it is in constant communication with two boards: a Member Advisory Board, consisting of elected representatives from stakeholder institutions, and a Scientific Advisory Board that consists of researchers in the fields arXiv serves. These two boards are responsible for providing input on different aspects of arXiv’s development. The MAB is more concerned with implementing information standards, working towards interoperability, and planning. The SAB is concerned with intellectual oversight and the vetting process.

What about the future of SSRN?

As you can see, arXiv has a lot to recommend it, and matching its achievements will be a tall order. In the meantime, we already have SSRN plus all manner of other institutional repositories and digital libraries. So, do we really need something like arXiv instead?

To be sure, Elsevier’s acquisition of SSRN is disappointing to many open access activists because it represents another step towards the corporate enclosure of intellectual life. Elsevier is perceived by many as among the worst of the bunch because of its reputation for playing hardball, being litigious, gobbling up authors’ rights, and even on a few occasions acquiring OA journals and then charging for access to them. What a bully!

Elsevier has also been the focus of past and ongoing boycotts with scholars refusing to cite their journals, submit work, or volunteer as reviewers or editors. No doubt some authors will feel that they should abstain from participating in SSRN as an extension of their OA activism. If this sounds like you, then you will no doubt be in good company. There are lots of other ways you can go Green OA. Vote with your feet and choose one of those instead.

I’m actually going to run the other way with this. I am not going to advocate for the boycotting of SSRN and this is why: open access is not free.

arXiv solved the problem of funding open access by aligning itself with an Ivy League school and the philanthropic arm of a crazy rich hedge fund manager. Plus it still manages to get hundreds of other libraries to send it checks on an annual basis, NPR-style. SSRN saw another way forward: selling out to corporate America. Did the professional staff at SSRN all get big bonuses and golden parachutes as a result? Probably not. They do, however, get to have salaries and benefits, which is pretty cool considering they give their product away for free. SSRN had reached the limit of its growth as an independent entity and needed more resources to advance its goal. Ckelty is probably right that Elsevier sees this as an opportunity to feast upon data, but if the service remains free to use while increasing in quality then that might be an acceptable trade-off. As we have known since Darwin, trade-offs are an integral part of distributing risk through populations living in complex ecosystems.

I suspect that some will not be persuaded. Contributing to an Elsevier property will be an ideological bridge too far and, to be honest, I am sympathetic to this position. An arXiv for the social sciences holds a lot of promise, but here’s the rub. Even if you had arXiv’s mad funding and governance skillz, you would still be missing a key ingredient: acceptance and prestige among the community the archive intends to serve. The physics community has embraced arXiv and Green OA in a way that anthropology and the social sciences have not. That’s on us.

So maybe we don’t need to replicate arXiv. Maybe we need to have a period of reflection, reflexivity in that classic ANTH 101 sense, about our “culture” as a discipline. Then conceptualize a repository that reflects that in a way that other anthropologists will think is valuable. That might result in something that is not identical to arXiv, but uniquely our own.

But the mad dough and Ivy library would probably be pretty helpful too.

Matt Thompson

Matt Thompson is Project Cataloger at The Mariners’ Museum in Newport News, Virginia, and is currently working on a CLIR ‘hidden collections’ grant to describe the museum’s collection of early 20th-century photography. He has a doctorate in anthropology from the University of North Carolina and a master’s in information science from the University of Tennessee.

8 thoughts on “What is arXiv and how can we get one?”

  1. It’s important to keep in mind the culture aspect: arXiv is successful because it grew out of a community of practice; if you look at SSRN, the most successful parts of it (Law, Economics) also grew out of a scholarly practice where sharing preprints was already the norm and the technology was just a way to facilitate what was already happening. There are lots of calls for arXiv for the social sciences as if arXiv’s success was simply technology+cash, but this is the wrong way of looking at things. I agree with Matt that what is needed is not just another subject repository (though if you want to put your Anthro stuff in one, there is still Mana’o: https://evols.library.manoa.hawaii.edu/handle/10524/1511). What anthropologists need to think about is what problems exist within current anthropology scholarly communication, and then go from there. It may be that AnthroArXiv is not what’s needed; maybe it’s more platforms like culanth.org, or maybe something else entirely.

  2. It’s worth noting that SSRN’s rate of popular acceptance varies across fields. For legal scholars, for instance, SSRN (and, more dubiously, SSRN’s download counts) is/are an important part of the research ecosystem.

    The point being that I don’t think universal acceptance is a prerequisite to building a successor system; there just needs to be enough support from enough communities to get off the ground. FWIW, we (i.e., Authors Alliance) are doing some exploratory work on that front. Not that the possibility of a successor system should caution against building something unique and tailored to the needs of anthropology!

  3. I really dig the idea of an anthropology (or social sciences in general, really) arXiv.
    But something that I think should be pointed out is that (at least according to Traweek’s Beamtimes and Lifetimes) active (high-energy) physicists only ever read preprints – that might be why arXiv was so successful. Published articles (thus, after peer-review) are already old news, and are important in other senses (for citations, impact, legitimacy, etc.), but not for reading.
    On the other hand, we anthropologists have the disciplinary practice of reading journals – in fact, during your training as an anthropologist you develop the skill to know which journals are more aligned with your particular research theme, or dedicated to particular topics or geographical locations.
    So I agree that we should not try to replicate arXiv, but nevertheless we could think about the creation of a discipline-wide OA repository – although, as you said, mad dough and an Ivy library would be really helpful in that.

  4. Thanks for this great post Matt. I’d like to see this conversation keep going further. I think you nail it when you write:

    “Even if you had arXiv’s mad funding and governance skillz, you would still be missing a key ingredient: acceptance and prestige among the community the archive intends to serve. The physics community has embraced arXiv and Green OA in a way that anthropology and the social sciences have not. That’s on us.”

    I think for many anthros this issue isn’t even on the radar, at all. So I think you’re right that we need to reflect on our culture as a discipline–specifically our “publishing culture” if you will (and how it fits within our larger scholarly system). But this is going to require some work to really push this issue onto the radar. I’m game.

  5. I like Jason Jackson’s suggestion via Twitter: it might be possible to build a hybrid disciplinary repository that expands upon or connects institutional repositories. The conversation continues…

  6. Why has the physics community embraced arXiv and Green OA? Pure speculation, but I can think of two possible answers: (1) a discipline where state-of-the-art is not only constantly changing but also well defined, so that keeping up with the latest thinking is an absolute imperative, in combination with (2) intense competition between teams working on similar problems, making who finds a solution first vital to fame and fortune. In these circumstances there are strong incentives to not wait to publish until after a long period of peer review. 

    Am I nuts?

  7. I stepped out of the world of anthropology publishing years ago, and thus, my ethnographic sense of it is outdated, but I’d offer a few comments. First, although there is subfield variability, I’d argue the currency of value is not necessarily the research article, but the book, whether for the scholar, those who are collaborating with or evaluating the scholar, or readers of anthropology, whether professional or just interested others. A good number of book proposals that come to university presses are for reworked or elaborated upon ethnographic research written up in dissertations. A central disciplinary repository for those dissertations might be of value then. Similarly, a number of book proposals are for edited volumes of contributions that stem out of seminars, conferences, sessions, and the like. Could you imagine if SAR submitted the preprints of all their seminar papers to a disciplinary repository?

    Still, there is the question of whether Green or Gold OA is better. Funders take different approaches to this in the world of biomedicine. The NIH public access policy is a Green model in that they don’t pay the publishers to make the final version of a publication OA, but have mandated authors to submit the peer-reviewed manuscript to a disciplinary repository. Wellcome Trust, on the other hand, perhaps because they don’t have to go through Congress to develop a policy, pays publishers to make the final version OA (hence they are Gold). Publishers prefer Wellcome’s model. In the social sciences / humanities, NSF and NEH-funded papers will go Green soon. It would be nice if large funders of anthropology like Wenner-Gren, SAR, ACLS, and the like would pave a path for Gold OA, but unlike Wellcome, many of them already have traditional library subscription products, and will likely not want to take that risk.

    I can’t say I know much about the culture of physicists, but I do know a bit about the culture of biomedical researchers, and it feels quite similar to me to John’s description of physicists. I am not sure then why for biomedical researchers it took, in large part, funder mandates and funder initiatives to forge a path to OA content, easily accessible from a free disciplinary citation index (PubMed). I can say, there are ironies in this. Last week I did a lot of deep searching about the limbic system, and the articles behind paywalls (of mostly commercial publishers) are those that describe stressors, environmental and sociocultural/economic, on this system that contribute to “mental disorders,” whereas those publicly available took a more innately genetic and biophysical approach. I found that deeply disturbing, as there are critical philosophical and political implications to our understanding of that phenomenon.
