
Researchers need ‘open’ bibliographic databases, new declaration says

Major platforms such as the Web of Science, widely used to generate metrics and evaluate researchers, are proprietary

An illustration of an oversize padlock. The front face has a key inserted into it and it is shown open like a door. The inside of the padlock is filled with pieces of paper and people in lab coats are around it.
Davide Bonazzi/Salzmanart

When universities are deciding whom to hire and promote, or grant organizations are selecting projects to fund, there’s a good chance they’re referencing more than just the application materials. Many organizations rely on databases that compile publication information including authors, affiliations, citations, and funding sources to create metrics meant to quantify a researcher’s productivity and the quality of their work.

Some of the best-known databases, such as the Web of Science and Scopus, are proprietary and offer pay-to-access data and services supporting these and other metrics, including university rankings and journal impact factors. But in a declaration posted today, more than 30 research and funding organizations call for the community to commit to platforms that instead are free for all, more transparent about their methods, and free of restrictions on how the data can be used.

“At a time when decision making in science is increasingly guided by indicators and analytics, addressing the problems of closed research information must be a top priority,” states the Barcelona Declaration on Open Research Information. Signatories so far include funders such as the Bill & Melinda Gates Foundation and the French National Research Agency, as well as more than a dozen academic institutions.

Sorbonne University—which discontinued its subscription to the Web of Science last year and switched to a newer, open platform called OpenAlex—said in a statement that “by signing the Declaration, we want to show that not only this move towards open research information should be an objective, but that it can be done.” The move may help remediate existing databases’ focus on English-language journals, advocates say. It could also help improve “circulation of scientific and local knowledge produced in different languages, formats, and in different geographic regions,” says another signatory, the Latin American Council of Social Sciences through the Latin American Forum for Scientific Evaluation.

The declaration is an “excellent development,” says Elizabeth Gadd, a scholarly communications expert and head of research and innovation culture and assessment at Loughborough University who was not involved in preparing it. “Many organizations have made public commitments to open research practices but continue to use closed and commercial bibliographic data sources for research analytics.” The announcement should “spur a wider range of organizations to ‘put their money where their mouth is.’”

To that end, the declaration’s supporters hope to establish a Coalition for Open Research Information to plan next steps, says coordinator Bianca Kramer, a scholarly communications expert with consultancy Sesame Open Science. “We want to make it easier for organizations to work towards that transition, among other things by benefiting from each other’s expertise and by exploring collective action.”

Establishing and maintaining a research information database is no easy feat. Although computer algorithms can gather a lot of data automatically, database owners additionally fix errors and fill in missing information, as well as provide search capabilities and analytical tools that allow users to navigate these vast resources.

Existing alternatives to proprietary databases include PubMed, Crossref, OpenCitations, and OpenAlex. The last of these, established in 2022 by nonprofit OurResearch with funding from U.K. charity Arcadia Fund, has recently attracted high-profile endorsements, striking up partnerships with organizations including the French Ministry of Higher Education and Research.

But some experts raise concerns about the quality of these open alternatives compared with the proprietary databases at this stage. In a recent analysis, Frédérique Bordignon, a researcher in bibliometrics at École des Ponts ParisTech, found that a large chunk of the thousands of articles returned by OpenAlex for her institution were incorrectly assigned. Errors included confusing conference papers with academic articles and mixing up institution names. Commercial databases such as the Web of Science had some of these problems at inception, she says, but have eliminated many thanks to corrections provided by the institutions, work that would have to be repeated with the new databases.

Jason Portenoy, senior data engineer at OurResearch, acknowledges this need for community input, saying it’s understandable institutions might balk at putting the work in again. “But the difference is that with OpenAlex, it is happening in the open,” he says. Any gaps in data quality are closing fast, adds OurResearch CEO Jason Priem. “OpenAlex is evolving very quickly, and so often issues uncovered by our community are fixed in a few months.” When one of the declaration’s coordinators—Ludo Waltman, scientific director at the Centre for Science and Technology Studies at Leiden University—and colleagues recently used OpenAlex to rank more than 1400 universities worldwide, they concluded that although it needs improvement, the approach yielded “surprisingly good data quality.”

Clarivate, which runs the Web of Science and helped fund OurResearch’s predecessor, ImpactStory, “broadly support[s] the aims of this declaration,” says Senior Vice President of Research and Analytics for Academia and Government Emmanuel Thiveaud. “However, we believe there is room and need for multiple perspectives. … No single approach nor single entity could address the most pressing challenges facing research.” Elsevier, which runs Scopus, says it has long backed open initiatives and welcomes “any projects that support research as we share the same goal.”

Commercial databases can still provide value in this new landscape, Waltman says. But “instead of monetizing their data, [they could] shift to a new way of working in which their data is open and users instead pay for services on top of the data.”

Daniel Hook, CEO of Digital Science, which operates the proprietary database Dimensions, welcomes this idea. (Dimensions already offers a free version for noncommercial use, though users have to subscribe to access full features.) Still, such a transition will take time, Hook says. Proprietary databases need to recover the costs of establishing, enhancing, and maintaining such large data sets—a challenge that open equivalents such as OpenAlex could also face long term.

