You searched for dsreports - Digital Science https://www.digital-science.com/ Mon, 17 Nov 2025 13:34:10 +0000

Machine-First FAIR: Realigning Academic Data for the AI Research Revolution https://www.digital-science.com/blog/2025/11/machine-first-fair-academic-data-for-the-ai-research-revolution/ Mon, 17 Nov 2025 12:16:40 +0000 The best way for humankind to benefit from research is to prioritize machines over people when sharing data. Here’s why.

The post Machine-First FAIR: Realigning Academic Data for the AI Research Revolution appeared first on Digital Science.

The best way for humankind to benefit from research is to prioritize machines over people when sharing data. Here’s why.

We often push the line that academic research needs to be Findable, Accessible, Interoperable and Reusable (FAIR) for humans and machines. This suggests humans and machines should get equal priority when it comes to FAIR. That is not the case: we should prioritize the machines, because machine-generated knowledge will accelerate knowledge discovery.

While humans can infer insights from sparse information in academic literature and datasets – thanks to our ability to find additional context online – machines currently cannot. To go further, faster in knowledge discovery, we need to move past human-powered discovery alone. To do this, the machines need structure and pattern, and every research-generating organization should be prioritizing both.

Academia is Ignoring Decades of Advancement

Academic research generates more than 6.5 million papers and over 20 million datasets annually, each representing potential training signals for the artificial intelligence systems reshaping discovery. Yet most institutional data remains locked in formats optimized for human consumption rather than computational processing.

While most stakeholders know the theoretical merits of making data FAIR (Findable, Accessible, Interoperable, Reusable) for both humans and machines, the practical reality is starker: in an era where language models can process orders of magnitude more literature than any human researcher, we are still organizing our most valuable research assets for the wrong consumer.

The economic implications are substantial. Organizations like the Chan Zuckerberg Initiative (CZI) have committed over $3.4 billion toward AI-powered biology, funding projects ranging from their 1,024 GPU DGX SuperPOD cluster for computational biology research to the Virtual Cell Platform that aims to create predictive models of cellular behavior. The Navigation Fund, with its $1.3 billion endowment, has invested in AI infrastructure through their Voltage Park subsidiary, while simultaneously funding open science initiatives focused on machine-actionable intelligence and metadata enhancement. Astera Institute has deployed portions of its $2.5 billion endowment to support projects like their $200 million investment in Imbue’s AI agent research and their Science Entrepreneur-in-Residence program specifically targeting scientific publishing infrastructure. Meanwhile, the Allen Institute for AI demonstrates the practical returns on machine-first approaches through projects like their OLMo series of fully open language models, where complete training datasets, code, and methodologies are published in computational formats, and their Semantic Scholar platform, which processes millions of academic papers to extract structured, machine-readable knowledge graphs.


Yet the vast majority of academic institutions continue to publish their findings in PDFs or as poorly described datasets. While LLMs are getting better at ingesting multi-modal content, PDF is a format that remains surprisingly resistant to reliable automated extraction, despite decades of advancement in natural language processing. This is not merely a technical limitation. Modern large language models struggle with PDFs because these documents prioritize visual presentation over semantic structure. Critical information becomes trapped in figures, tables, and formatting that computational systems cannot reliably parse. A reaction scheme embedded as an image, a dataset described in paragraph form, or experimental parameters scattered across multiple tables represent precisely the kind of structured knowledge that could accelerate discovery if only machines could access it consistently.

The Architecture of Computational Research Infrastructure

The solution requires a fundamental reorientation toward machine-first data architecture. Rather than retrofitting human-readable outputs for computational consumption, we can take inspiration from pharma and industry writ large, which are designing their data flows to serve algorithms from the ground up, with human-friendly interfaces emerging as downstream products of this computational foundation.

Consider the transformation pathway implemented by teams working with Digital Science’s suite of computational research tools. We’re building workflows in our tools for automated knowledge extraction at scale. The extracted knowledge gains semantic coherence through integration into domain-specific knowledge graphs. Platforms like metaphacts (metaphactory) provide the infrastructure to align these signals with established ontologies while enforcing quality constraints through SHACL validation integrated into continuous deployment pipelines. The result is not merely a database of facts, but a queryable intelligence system that can answer novel questions through automated reasoning over validated relationships.
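As a simplified sketch of what such a quality gate does: every node of a given type must carry a minimum set of predicates before the graph is allowed to deploy. The node types, predicate names, and required "shapes" below are invented for illustration; a production pipeline would express them as SHACL shapes and check them with a dedicated validator rather than hand-rolled Python.

```python
# Minimal stand-in for a SHACL-style quality gate in a CI pipeline.
# The types and predicates here are illustrative only.
REQUIRED_PREDICATES = {
    "Compound": {"hasName", "hasInChIKey"},
    "Assay": {"hasTarget", "hasUnit"},
}

def validate_graph(nodes: dict) -> list:
    """nodes maps a subject URI to {"type": str, "predicates": set}.
    Returns (subject, sorted missing predicates) for each violation."""
    violations = []
    for subject, info in nodes.items():
        missing = REQUIRED_PREDICATES.get(info["type"], set()) - info["predicates"]
        if missing:
            violations.append((subject, sorted(missing)))
    return violations

graph = {
    "ex:aspirin": {"type": "Compound", "predicates": {"hasName", "hasInChIKey"}},
    "ex:assay42": {"type": "Assay", "predicates": {"hasTarget"}},  # unit missing
}
print(validate_graph(graph))  # -> [('ex:assay42', ['hasUnit'])]
```

A CI job that fails whenever `validate_graph` returns a non-empty list is the essence of the quality gate: invalid facts never reach the deployed knowledge graph.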

Simultaneously, the operational requirements of research continue through dedicated literature management systems. Tools like ReadCube maintain the audit trails and conflict resolution workflows that regulatory environments demand, while ensuring that every screening decision and data extraction connects to persistent identifiers. The curated evidence flows directly into the computational infrastructure rather than terminating in isolated spreadsheets.

The critical innovation lies in packaging. While human researchers expect PDFs and narrative summaries, machine learning pipelines require structured metadata that specifies exactly what each dataset contains, where to retrieve it, and how to interpret every field.

The Metadata Multiplier Effect on Repository Platforms

Academic data repositories like Figshare occupy a unique position in the machine-first FAIR ecosystem. They serve as the critical junction between human research practices and computational discovery. When researchers publish datasets with comprehensive, structured metadata, these platforms transform from simple storage services into computational assets that can feed directly into AI research pipelines. The difference lies entirely in how authors describe their work at the point of deposit.

The REAL (Real-world multi-center Endoscopy Annotated video Library) – colon dataset on Figshare: https://doi.org/10.25452/figshare.plus.22202866.v2

Consider two datasets published on the same platform: one uploaded with a generic title like “experiment_data_final.xlsx” and minimal description, the other with machine-readable field descriptions, standardized vocabulary terms, and explicit links to ontologies and methodologies. The first requires human interpretation before any computational system can make sense of its contents. The second can be discovered, validated, and integrated into training pipelines automatically. Figshare’s API can surface the rich metadata to computational systems, but only if researchers have provided it in the first place.
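As a sketch of what that machine-side consumption looks like, the helper below targets the public Figshare API v2 article endpoint and pulls out the fields a pipeline would need. The sample payload values are invented, and field names should be checked against the current API documentation before relying on them.

```python
API_BASE = "https://api.figshare.com/v2"  # Figshare's public REST API

def article_url(article_id: int) -> str:
    """URL of the machine-readable metadata record for one deposit."""
    return f"{API_BASE}/articles/{article_id}"

def extract_machine_fields(record: dict) -> dict:
    """Pull the fields a training pipeline needs from an article record."""
    return {
        "doi": record.get("doi"),
        "title": record.get("title"),
        "license": (record.get("license") or {}).get("name"),
        "files": [f.get("download_url") for f in record.get("files", [])],
    }

# Offline demonstration on a trimmed, invented payload shaped like the API's
# article response (in practice you would GET article_url(id) and json-decode).
sample = {
    "doi": "10.1234/example.v2",
    "title": "Example annotated video metadata",
    "license": {"name": "CC BY 4.0"},
    "files": [{"download_url": "https://ndownloader.figshare.com/files/1"}],
}
fields = extract_machine_fields(sample)
print(fields["doi"], fields["license"], len(fields["files"]))
```

Everything the function returns is only as good as what the depositor supplied: with the "experiment_data_final.xlsx" deposit, `title` and `description` carry no usable signal.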

The platform infrastructure already supports the technical requirements for machine-first FAIR. Persistent DOIs ensure stable identifiers, while structured metadata fields can accommodate everything from ORCID researcher identifiers to detailed provenance information. When authors invest time in describing their data using controlled vocabularies, specifying units of measurement, documenting collection methodologies, and linking to relevant publications, they create computational assets rather than digital archives. The same dataset that might languish undiscovered with poor metadata becomes a valuable training resource when described with machine-readable precision.

This creates a powerful feedback loop. Datasets with excellent metadata get discovered and reused more frequently, driving citation counts and demonstrating impact. Meanwhile, poorly described data remains computationally invisible regardless of its scientific value. Platforms like Figshare could amplify this effect by providing better authoring tools that encourage structured metadata entry, perhaps even using AI to suggest appropriate ontology terms or validate metadata completeness before publication. The infrastructure for machine-first FAIR already exists; it simply requires researchers to embrace metadata as a first-class research output rather than an administrative afterthought. But this is an evolving field, and new standards are emerging that repositories need to engage with.

The Croissant format, a lightweight JSON-LD descriptor based on schema.org, provides this computational bridge. A single Croissant file enables any training pipeline to hydrate datasets without custom loaders while simultaneously supporting discovery through standard web infrastructure. 
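A minimal example of the idea, built on the schema.org Dataset vocabulary that Croissant extends. The dataset name, DOI, and URLs below are invented for illustration, and a real Croissant file layers its own constructs (file and record-set descriptions) on top of this base.

```python
import json

# Minimal schema.org-level JSON-LD descriptor for a dataset. A full Croissant
# file would add Croissant-specific distribution and record-set metadata.
descriptor = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "example-assay-results",
    "description": "Dose-response measurements with units and method links.",
    "identifier": "https://doi.org/10.1234/example",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "distribution": [
        {
            "@type": "DataDownload",
            "contentUrl": "https://example.org/assay-results.csv",
            "encodingFormat": "text/csv",
        }
    ],
}

# Serialized as JSON-LD, any pipeline that speaks schema.org can discover what
# the dataset contains, where to fetch it, and under what license.
serialized = json.dumps(descriptor, indent=2)
print(serialized.splitlines()[1])
```

Because the descriptor is plain JSON-LD, the same file serves search engines, repository APIs, and training-pipeline loaders without format conversion.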

Practical Implementation in Institutional Contexts

The transition to machine-first FAIR follows a predictable arc when properly resourced. Initial implementations focus on proving the fundamental workflow with narrowly scoped pilot projects. A team might select a single dataset and one sharply defined outcome – perhaps drug-target interaction prediction or materials property modeling – and implement the complete pipeline from literature extraction through validated knowledge graph construction to machine-readable packaging.

The critical insight from successful implementations is the importance of automation as the second phase. Manual processes that work for pilot projects become bottlenecks at scale. The most effective teams invest heavily in converting their proven workflows into tested, continuous integration pipelines that enforce quality gates automatically. This includes SHACL validation for knowledge graphs, automated license checking, and provenance tracking.

Production deployment requires infrastructure investments that many academic institutions are not yet considering. Successful implementations provide stable, resolvable URLs for every dataset and descriptor, enable content negotiation so that both machines and humans receive appropriate formats, and implement comprehensive monitoring of data quality trends and usage patterns. This is the stack that Digital Science can provide.

Quantifying Institutional Success

Organizations can assess their progress toward machine-first FAIR through several concrete indicators. Successful implementations demonstrate that every significant dataset resolves to a persistent identifier that returns structured JSON-LD for computational consumers while maintaining readable landing pages for human users. Knowledge graphs pass automated validation, maintain stable URI schemes, and support catalogued query patterns rather than requiring ad hoc exploration.

Literature workflows leave complete audit trails with PRISMA-compliant reporting that can be generated automatically rather than assembled manually. Licensing and provenance information becomes verifiable through computational means rather than requiring human interpretation. Most importantly, the time taken from initial hypothesis to trained model decreases as institutional infrastructure matures and teams spend more of their time on discovery rather than data preparation.

The research organizations that define the next decade will not necessarily be those with the largest datasets, but rather those whose data infrastructure works most effectively at computational scale. Every day spent optimizing publishing workflows for human-readable reports while leaving data computationally inaccessible represents lost ground in an increasingly competitive landscape.

The funders backing this transformation, from CZI’s investments in computational biology to Astera’s focus on AI-native research infrastructure, are betting that machine-first approaches will determine which institutions can effectively leverage artificial intelligence for discovery. The technical architecture exists today. The standards are stable. The remaining barrier is institutional commitment to prioritizing computational accessibility over familiar but inefficient human-centered workflows.

Academic research stands at yet another technology-driven inflection point. The institutions that embrace machine-first FAIR will find themselves having more impact for their research and researchers.

2024 Annual Report https://www.digital-science.com/blog/2025/10/2024-annual-report/ Fri, 17 Oct 2025 07:23:00 +0000 Insights into our vision and values, detailing contributions to improving research outcomes, driving innovation, and promoting open data standards.

The post 2024 Annual Report appeared first on Digital Science.

Shaping the future of research

Welcome to Digital Science’s 2024 Annual Report, a comprehensive overview of our efforts to revolutionize the global research ecosystem. We empower researchers and institutions with innovative tools, including those leveraging AI, to drive collaboration, transparency, and impactful discoveries. The report highlights our achievements in advancing open research practices, supporting the academic community, fostering research integrity, and championing sustainability.

Download the report for insights into our vision and values, detailing contributions to improving research outcomes, driving innovation, and promoting open data standards. It also outlines our Environmental, Social, and Governance (ESG) commitments, showcasing efforts to reduce our carbon footprint. Through impactful partnerships and groundbreaking tools, Digital Science continues to lead in transforming how science is conducted and shared for the benefit of society.

“I am excited to share with you our wide and varied contributions to the needs of the research ecosystem and our communities.”
Daniel Hook
CEO, Digital Science

Highlights


Launch of Research Transformation campaign

In 2024, Digital Science initiated the Research Transformation campaign, a global effort to understand and support the evolving research landscape. Through surveys and interviews with nearly 400 academics across 70 countries, the campaign explored themes like AI, openness, and research security, culminating in the publication of the report Research Transformation: Change in the Era of AI, Open and Impact.

Commitment to Open Data practices

Digital Science pledged support for the Barcelona Declaration on Open Research Information in 2024, launching its own Open Principles to promote inclusivity, reproducibility, and accessibility in research. The annual State of Open Data Report revealed growing global recognition of open data practices, while highlighting disparities in resources that impede progress.


Advancing forensic scientometrics

Digital Science made significant strides in the emerging field of Forensic Scientometrics (FoSci) in 2024, developing tools such as Author Check to uncover errors and manipulations in scientific publications. This work strengthens trust in scholarly communication and addresses systemic vulnerabilities in research integrity.

Strengthening research in Sub-Saharan Africa

In partnership with the Training Centre in Communication (TCC Africa), Digital Science helped to train over 570 early-career researchers across seven African nations in 2024. The collaboration enhanced open access adoption, expanded African scholarship in the Dimensions database, and advanced equitable scholarly publishing practices.


Environmental sustainability initiatives

Digital Science demonstrated its commitment to sustainability by setting net-zero targets aligned with the Paris Agreement goals. In 2024, the company reported its carbon emissions, purchased renewable electricity certificates, and invested in high-quality offsets to mitigate its environmental impact.

“Driven by curiosity and guided by a strong sense of purpose, Digital Science champions a global research ecosystem that values integrity, inclusivity, and impact.”
Stefan von Holtzbrinck
CEO, Holtzbrinck

Articles referenced in this report

The state of Open Data 2024: Special report

A detailed and sustained study revealing the motivations, challenges, perceptions, and behaviors of researchers towards open data.

Research transformation: Change in the era of AI, open and impact

Insights from our academic research community on how research transformation is experienced across different roles and responsibilities.

FoSci – The emerging field of forensic scientometrics

Our VP Research Integrity, Dr Leslie McIntosh, on the emerging field focused on inspecting and upholding the integrity of scientific research.

Launching our blog series on natural language processing (NLP) https://www.digital-science.com/blog/2020/03/launching-our-blog-series-on-natural-language-processing-nlp/ Wed, 04 Mar 2020 15:25:30 +0000 Launching our blog series on Natural Language Processing (NLP)

The post Launching our blog series on natural language processing (NLP) appeared first on Digital Science.

Today we launch our blog series on Natural Language Processing, or NLP. A facet of artificial intelligence, NLP is increasingly being used in many aspects of our everyday life, and its capabilities are being implemented in research innovation to improve the efficiency of many processes.

Over the next few months, we will be releasing a series of articles looking at NLP from a range of viewpoints, showcasing what NLP is, how it is being used, what its current limitations are, and how we can use NLP in the future. If you have any burning questions about NLP in research that you would like us to find answers to, please email us or send us a tweet. As new articles are released, we will add a link to them on this page.

Our first article is an overview from Isabel Thompson, Head of Data Platform at Digital Science. Her day job is also her personal passion: understanding the interplay of emerging technologies, strategy and psychology, to better support science. Isabel is on the Board of Directors for the Society of Scholarly Publishing (SSP), and won the SSP Emerging Leader Award in 2018. She is on Twitter as @IsabelT5000


NLP is here, it’s now – and it’s useful

I find Natural Language Processing (NLP) to be one of the most fascinating fields in current artificial intelligence. Take a moment to think about everywhere we use language: reading, writing, speaking, thinking – it permeates our consciousness and defines us as humans unlike anything else. Why? Because language is all about capturing and conveying complex concepts using symbols and socially agreed contracts – that is to say: language is the key means of transferring knowledge. It is therefore foundational to science.

We are now in the dawn of a new era. After years of promise and development, the latest NLP algorithms now regularly score more highly than humans on structured language analysis and comprehension tests. There are of course limitations, but these should not blind us to the possibilities. NLP is here, it’s now – and it’s useful.

NLP’s new era is already impacting our daily lives: we are seeing much more natural interactions with our computers (e.g. Alexa), better quality predictive text in our emails, and more accurate search and translation. However, this is just the tip of the iceberg. There are many applications beyond this – many areas where NLP makes the previously impossible, possible.

Perhaps most exciting for science at present is the expansion of language processing into big data techniques. Until now, the processing of language has been almost entirely dependent on the human mind – but no longer. Machines may not currently understand language in the same way that we do (and, let’s be clear, they do not), but they can analyse it and extract deep insights from it that are broader in nature and greater in scale than humans can achieve.

For example, NLP offers us the ability to do a semantic analysis on every bit of text written in the last two decades, and to get insight on it in seconds. This means we can now find relationships in corpora of text today that it would previously have taken a PhD to discover. To be able to take this approach to science is powerful, and this is but one example – given that so much of science and its infrastructure is rooted in language, NLP opens up the possibility for an enormous range of new tools to support the development of scientific knowledge and insight.
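A toy illustration of the mechanics: comparing two texts as term-frequency vectors with cosine similarity. Modern NLP systems use learned embeddings rather than raw word counts, so treat this purely as a sketch of the vector-space idea.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between term-frequency vectors of two texts."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(count * vb[term] for term, count in va.items())
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

# Related texts score higher than unrelated ones.
print(cosine_similarity("gene expression in tumours", "gene expression profiling"))
print(cosine_similarity("gene expression", "telescope mirror alignment"))
```

Scaling this comparison from two sentences to millions of papers, and replacing word counts with semantic embeddings, is what turns the toy into the discovery tool described above.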

Google’s free NLP sentence parsing tool

NLP is particularly interesting for the research sector because these techniques are – by all historical comparisons – highly accessible. The big players have been making their ever-increasingly good algorithms available to the public, ready for tweaking into specific use cases. Therefore, for researchers, funding agencies, publishers, and software providers, there’s a lot of opportunity to be had without (relatively-speaking) much technical requirement.

Stepping back, it is worth noting that we have made such extreme advances in NLP in recent years due to the collaborative and open nature of AI research. Unlike any cutting edge discipline in science before, we are seeing the most powerful tools open sourced and available for massive and immediate use. This democratises the ability to build upon the work of others and to utilise these tools to create novel insights. This is the power of open science.

Here at Digital Science, we have been investigating and investing in NLP techniques for many years. In this blog series, we will share an overview of what NLP is, examine how its capabilities are developing, and look at specific use cases for research communication – to demonstrate that NLP is truly here. From offering researchers writing support and article summarisation, to assessing reproducibility and spotting new technology breakthroughs in patents, all the way through to the detection and reduction of bias in recruitment: this new era is just getting started – where it can go next is up to your imagination.

Look out for the next article in our series, “What is NLP?”, and follow the conversation using the hashtag #DSreports.

Australian research well placed for adoption of National Persistent Identifier (PID) Strategy https://www.digital-science.com/blog/2025/10/australian-research-national-persistent-identifier-strategy/ Thu, 09 Oct 2025 07:15:07 +0000 Digital Science has made a series of recommendations for Australia’s research future in a report published into the use of PIDs in research.

The post Australian research well placed for adoption of National Persistent Identifier (PID) Strategy appeared first on Digital Science.

Digital Science report offers “mixed scorecard”, makes 23 recommendations including mandatory ORCID iDs for all Aussie researchers

Thursday 9 October 2025

Digital Science, a technology company serving stakeholders across the research ecosystem, has made a series of 23 recommendations for Australia’s research future in a report published today into the use of persistent identifiers (PIDs) in research.

The report is the Australian National Persistent Identifier (PID) Benchmarking Toolkit, available now on Figshare.

Commissioned by the Australian Research Data Commons (ARDC), Digital Science was tasked with developing a comprehensive PID benchmarking framework, and to conduct a benchmarking process that could be used to monitor the effectiveness of Australia’s National PID Strategy over time. The report, developed collaboratively with the ARDC, also benefited from consultation and engagement with the Australian research community. 

The lead author of the report, Digital Science’s VP of Research Futures, Simon Porter, will discuss the findings at two upcoming events in Brisbane, Australia: International Data Week (13-16 October) and the eResearch Australasia Conference (20-24 October).

A unique opportunity for Australian research

“This is the first time Australia’s National PID Strategy has been benchmarked, and it represents a unique opportunity for the Australian research system to benefit from that process,” Simon Porter said.

“What we’ve seen from the benchmarking is that Australia’s adoption of ORCID for research publications across the research sector has been extremely successful – and Australia is now third in the world for including DOI (Digital Object Identifier) links with dissertations published online.

“Workflows between publishers, institutional research information systems, and ORCID are also sufficiently strong, and we can see that Australia is well placed for a more comprehensive use of the ORCID infrastructure.

“However, our comprehensive review gave Australian research a mixed score card and recommended several changes and interventions to help strengthen the national strategy,” Mr Porter said.

“One of the key issues we’ve seen is that although Australian researchers are more engaged than the global average in the practice of data citation, they trail significantly behind their European peers.

“And while ORCID and ROR adoption has been strong for publications, the use of persistent identifiers with datasets and non-traditional research outputs (NTROs) remains the exception rather than the norm. As significant publishers of NTRO items in their own right, institutions should hold themselves to the same standards that they expect from publishers – all creators should ideally be described with an ORCID and an affiliation ID (ROR).”

Natasha Simons, Director of National Coordination at the ARDC, congratulated Digital Science on the release of the National PID Benchmarking Toolkit. “The Australian Persistent Identifier Strategy is a critical national initiative to benefit the Australian people by strengthening our digital information ecosystem, the quality of our research and our capacity for effective research engagement, innovation and impact,” she said. “So it is essential to develop robust benchmarks that can track our progress and measure outcomes. The Toolkit provides us with exactly what’s needed.”

Recommendations to strengthen Australia’s research future

Some of the 23 recommendations made in the report include:

  • Australian research has progressed to the point where ORCIDs should now be mandatory for all researchers; Australian Institutions should require ORCID registration within their institutional research information management systems.
  • Australian research institutions should adopt the best practices of publishers to ensure that all authors are described by ORCIDs and affiliations via ROR.
  • Australia should join international pressure to ensure that all publishers both collect ORCID iDs and push the associated metadata into Crossref, and to avoid publishers that do not support ORCID workflows.
  • Australia should consider a national policy for publishing dissertations with DOIs in institutional repositories, formalizing the use of ORCIDs for authors and their supervisors.
  • Reports published by universities and their research centres should ideally be published in institutional repositories, with associated identifiers.
  • Ongoing benchmarking analysis of PIDs should not ignore closed access material. (e.g., ignoring closed-access publications would result in missing 35% of Australia’s research output in 2024.)
  • RAiDs (Research Activity Identifiers) should be added from “day one” of the creation of a funding grant.
  • Grants funding organizations should create persistent identifiers “as soon as is practical” – including complete metadata – to enable research funding to be visible and tracked earlier.

“We welcome the opportunity to have led this benchmarking process, and we hope our recommendations will lead to some meaningful improvements within Australian research,” Mr Porter said.

“Importantly, we’ve also demonstrated that it is possible to produce a benchmarking toolkit for PIDs, and our work may have implications for other nations and their roadmaps towards a persistent identifier future.”

Background: The importance of PIDs

Persistent identifiers (PIDs) are unique numbered references to individual researchers and their work, which are connected to digital outputs and resources. They help connect researchers, projects, outputs, and institutions, and have become critical for:

  • Making research inputs and outputs FAIR (findable, accessible, interoperable, and reusable)
  • Enabling research outputs to be identified, tracked and cited
  • Analyzing research impact
  • Supporting national-scale research analytics

Widely used PIDs include ORCID iDs, DOIs, and RORs; emerging identifiers include DOIs for grants and identifiers for projects (RAiDs).
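As a concrete example of how machine-checkable these identifiers are, an ORCID iD carries an ISO 7064 MOD 11-2 check character in its final position, so any system can verify an iD without a network call. This is a small sketch; the sample iD 0000-0002-1825-0097 is the one ORCID's own documentation uses for testing.

```python
def orcid_check_digit(base15: str) -> str:
    """ISO 7064 MOD 11-2 check character over the first 15 ORCID digits."""
    total = 0
    for d in base15:
        total = (total + int(d)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)

def is_valid_orcid(orcid: str) -> bool:
    """True if a hyphenated 16-character ORCID iD has a correct checksum."""
    chars = orcid.replace("-", "")
    if len(chars) != 16 or not chars[:15].isdigit():
        return False
    return orcid_check_digit(chars[:15]) == chars[15]

print(is_valid_orcid("0000-0002-1825-0097"))  # True: ORCID's documented sample iD
print(is_valid_orcid("0000-0002-1825-0096"))  # False: checksum no longer matches
```

The same self-verifying property is what lets repositories and publishers reject mistyped identifiers at the point of deposit rather than discovering them downstream.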

Note: In the report, Simon Porter declares that he is also a member of the ORCID Board.

Discover more at International Data Week (13-16 October) and the eResearch Australasia Conference (20-24 October).

About Digital Science

Digital Science is an AI-focused technology company providing innovative solutions to complex challenges faced by researchers, universities, funders, industry and publishers. We work in partnership to advance global research for the benefit of society. Through our brands – Altmetric, Dimensions, Figshare, IFI CLAIMS Patent Services, metaphacts, OntoChem, Overleaf, ReadCube, Symplectic, and Writefull – we believe when we solve problems together, we drive progress for all. Visit digital-science.com and follow Digital Science on Bluesky, on X or on LinkedIn.

Media contact

David Ellis, Press, PR & Social Manager, Digital Science: Mobile +61 447 783 023, d.ellis@digital-science.com

Fewer dollars. Fewer people. Higher stakes. https://www.digital-science.com/blog/2025/10/fewer-dollars-fewer-people-higher-stakes/ Fri, 03 Oct 2025 13:11:12 +0000 Discover how federal research agencies are delivering more impact with smaller teams and tighter budgets.

The post Fewer dollars. Fewer people. Higher stakes. appeared first on Digital Science.


Staffing cuts and budget reductions are squeezing federal research agencies from both sides — yet your mission hasn’t gotten any smaller.


When critical reviews take 15–20 days, every lost day means slower funding decisions, higher risk exposure, and reduced program impact. Smaller teams simply can’t afford to waste time chasing data across siloed systems.

Waiting for resources to improve isn’t a strategy.

With fewer people to share the load, inefficiencies multiply — and so do the risks of missed impacts, unvetted partners, and misaligned funding.

Our new report, Doing More with Less: How Federal Research Agencies Are Maximizing Impact with Smarter Data Intelligence, reveals how agencies are:

  • Cutting review times by up to 90% — without adding headcount
  • Gaining real-time visibility into performance, partnerships, and risk
  • Reducing reliance on overburdened staff for manual data work
  • Securing data access in alignment with FedRAMP and DoD IL-4 requirements, pending 2026 certification

With Dimensions, your smaller team can work like a larger one — unifying publications, grants, patents, policy, collaborator data, and risk insights in one secure platform.

Get the report. Get the advantage.

Fill out the form to access your copy of Doing More with Less and see how other agencies are meeting higher expectations with fewer resources.

Doing More with Less: How Federal Research Agencies Are Maximizing Impact with Smarter Data Intelligence

Get the report

Podcasts now count towards research impact in world first for Altmetric
https://www.digital-science.com/blog/2025/10/podcasts-now-count-towards-research-impact/
Wed, 15 Oct 2025
In a major step forward for tracking the real-world impact of research, Altmetric has added a new attention source: Podcasts.

The post Podcasts now count towards research impact in world first for Altmetric appeared first on Digital Science.

Altmetric adds podcasts as an attention source, offering a more complete view of research influence

Wednesday 15 October 2025

In a major step forward for tracking the real-world impact of research, Digital Science today announces that Altmetric has added a new attention source: Podcasts.

Altmetric is the first in the world to include podcasts among its measures of research impact.

Podcasts will now be reflected in the distinctive Altmetric Badges – represented by the color purple – as well as in Altmetric Attention Scores, with more detail displayed in Altmetric Explorer.

In addition to podcasts, Altmetric’s many attention sources include select social media channels, news, blogs, public policy sites, patents, clinical guidelines, and more.

A complete view of research influence

Miguel Garcia, VP of Product, Digital Science, said: “Altmetric is about tuning in to where research conversations are really happening, and understanding how that research is being received, discussed, debated, and shared. A complete view of research influence isn’t possible without podcasts.

“With Altmetric podcast tracking, we recognize that these real-world conversations play a critical role in shaping public understanding and acceptance of research. Podcasts add rich, narrative-driven evidence to the impact story, offering a more complete view of research influence across scholarly, professional, and public domains.

“With more than half a billion people listening to podcasts for information, and at a time when podcasts are growing as a communication and educational platform, we feel the moment is right to include these conversations as an attention source. Publishers, academics, industry, governments, and funders will all now benefit from better understanding the impact of research.”
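Altmetric's actual scoring algorithm is proprietary, but the idea of a source-weighted attention score that the announcement describes can be sketched in a few lines. The weights below are purely illustrative assumptions, not Altmetric's real values:

```python
# Purely illustrative source weights -- NOT Altmetric's actual, proprietary values.
SOURCE_WEIGHTS = {
    "news": 8.0,
    "blogs": 5.0,
    "podcasts": 3.0,
    "social": 1.0,
}

def weighted_attention_score(mentions):
    """Combine per-source mention counts into a single weighted score.

    Sources absent from the weight table contribute nothing."""
    return sum(SOURCE_WEIGHTS.get(source, 0.0) * count
               for source, count in mentions.items())

# A paper mentioned in 2 news stories, 1 blog post, and 3 podcast episodes:
score = weighted_attention_score({"news": 2, "blogs": 1, "podcasts": 3})
print(score)  # 30.0
```

The point of such a scheme is that adding a new source (here, podcasts) only extends the weight table; existing scores for other sources are unaffected.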

Benefits of podcast tracking

By adding podcasts as an attention source, Altmetric will enable users to:

  • Strengthen reporting on research impact
  • Capture a broader, more complete attention landscape
  • Gain deeper public engagement insights
  • Diversify research impact data sources

All user segments within the research ecosystem will benefit from Altmetric’s podcast tracking:

  • Academics: Strengthen submissions that demonstrate the real-world impact and influence of research
  • Enterprise: Identify emerging Key Opinion Leaders (KOLs) and track therapeutic-area conversations, even outside traditional publishing
  • Publishers: Highlight where journals are discussed in accessible, mainstream forums that boost author engagement
  • Funders: Ensure research funded is making an impact in broader public discourse, justifying investment

Podcasts in Altmetric

About Altmetric

Altmetric is a leading provider of alternative research metrics, helping everyone involved in research gauge the impact of their work. We serve diverse markets including universities, institutions, government, publishers, corporations, and those who fund research. Our powerful technology searches thousands of online sources, revealing where research is being shared and discussed. Teams can use our powerful Altmetric Explorer application to interrogate the data themselves, embed our dynamic ‘badges’ into their webpages, or get expert insights from Altmetric’s consultants. Altmetric is part of the Digital Science group, dedicated to making the research experience simpler and more productive by applying pioneering technology solutions. Find out more at altmetric.com and follow @altmetric on X and @altmetric.com on Bluesky.

About Digital Science

Digital Science is an AI-focused technology company providing innovative solutions to complex challenges faced by researchers, universities, funders, industry and publishers. We work in partnership to advance global research for the benefit of society. Through our brands – Altmetric, Dimensions, Figshare, IFI CLAIMS Patent Services, metaphacts, Overleaf, ReadCube, Symplectic, and Writefull – we believe when we solve problems together, we drive progress for all. Visit digital-science.com and follow Digital Science on Bluesky, on X or on LinkedIn.

Media Contact

David Ellis, Press, PR & Social Manager, Digital Science: Mobile +61 447 783 023, d.ellis@digital-science.com

Digital Science Launches Digital Research Reports
https://www.digital-science.com/blog/2014/06/digital-science-launches-digital-research-reports/
Sun, 08 Jun 2014
We’d like to announce the launch of our Digital Research Reports, a new quarterly series of publications about research data and analytical possibilities in a practical, applied context. […]

The post Digital Science Launches Digital Research Reports appeared first on Digital Science.

We’d like to announce the launch of our Digital Research Reports, a new quarterly series of publications about research data and analytical possibilities in a practical, applied context.

A massive volume and diversity of data is associated with research. Most people are familiar with analyses of publications and citations (bibliometrics), especially around research performance benchmarking. They are aware that such analyses have both limitations and flaws and are often misused. But performance is only one part of the publication story and bibliometrics are only one part of the data portfolio.

This series will report on what publication analysis can tell us about other aspects of researcher activity and behaviour, such as collaboration and interdisciplinarity. In the first report, we look at what researchers choose to submit for assessment compared to what they say best represents leading research in their field.

We will also report on other parts of the research ecosystem. For example, what can we learn about research activity from data about data, figures, graphs and tables? How will the system respond to mandates to make all publicly funded research data openly available? And we will also look at the other ways in which people mention and alert one another to new research papers via Twitter, Facebook and blogs. Can this be a source of valid information about the social and economic impact of research?

Overall, we aim to address the challenge of better information for people engaged in research as well as sounder and more relevant information for policy and evaluation purposes. Our reports are written for all kinds of people who deal with ‘research’, to inform, to stimulate discussion and sometimes to provoke debate. And our focus is on how to use the available numbers to deliver more, better research as well as tracking what research has already been done.

Download our first report: Evidence for excellence: has the signal overtaken the substance? An analysis of journal articles submitted to RAE2008.

Taking Open Access book usage from reports to operational strategy
https://www.digital-science.com/blog/2022/08/taking-open-access-book-usage-from-reports-to-operational-strategy/
Thu, 25 Aug 2022
Understanding how many times an open access (OA) book has been viewed or downloaded is only part of the story.

The post Taking Open Access book usage from reports to operational strategy appeared first on Digital Science.

Make the most of your OA data

Understanding how many times an open access (OA) book has been viewed or downloaded is only part of the story – what you then do with the data is when the tale really unfolds…

Note: This blog post includes excerpts from the OA eBook Usage Data Analytics and Reporting Use-cases by Stakeholder report by Drummond and Hawkins.

While the term “usage data” most often refers to webpage views and downloads associated with a given book or book chapter, scholarly communications stakeholders have identified a near future where linked open access (OA) scholarship usage data analytics could directly inform publishing, discovery, and collections development in addition to impact reporting. 

In the 2020-2022 Exploring Open Access Ebook Usage research project supported by the Mellon Foundation, publisher and library representatives expressed their interests in using OA eBook Usage (OAeBU) data analytics to inform overall OA program investment, strategy and fundraising. A report summarizing a year of virtual focus groups noted multiple operational use cases for OA book usage analytics, spanning book marketing, sales, and editorial strategy; collections development and hosting; institutional OA program strategy, reporting, and investment; and OA impact reporting for institutions and authors to support reporting to their funding agencies, donors, and policy-makers.

Figure 1: OA Book Usage Data Use Cases for Publishers, Libraries, and Book Publishing Platforms and Services. Excerpted from Drummond, Christina, & Hawkins, Kevin. (2022). OA eBook Usage Data Analytics and Reporting Use-cases by Stakeholder. Pg 2. Zenodo. https://doi.org/10.5281/zenodo.7017047
“In order to realize the full benefits of OA data usage we must create an ecosystem that operates according to open data sharing, security, and use principles.”
— Christina Drummond, OA Book Usage Data Trust

Evidence-based decision-making depends on comprehensive, quality data. The tale of an OA book or author’s impact depends upon marrying usage data created by a plethora of publisher platforms, digital libraries, and OA repositories with linked citation information from scholarship, syllabi, gray literature and policy proceedings. As Laura Ricci and Michael Clarke elegantly documented, OA book usage data is created at multiple points across the book supply chain, to be ultimately curated and collated by each library, publisher, and library management system working with OA titles.

Figure 2: Open Access EBook Supply Chain: Usage Data Capture and Reporting Data Flows. Excerpted from Clarke, Michael, & Ricci, Laura. (2021). Open Access eBook Supply Chain Maps for Distribution and Usage Reporting. Zenodo. https://doi.org/10.5281/zenodo.4681871

The work required to produce cross-platform, contextual usage-related OA reports and analytics is significant. Such data aggregation and reporting requires the processing and curation of numerous COUNTER-compliant and non-compliant reports, APIs, dashboards, and spreadsheets. This resource-intensive exercise requires specialized expertise to understand which metrics can (and cannot) be combined while annotating bot traffic, avoiding data quality issues, and reporting on publicly available data alongside information that’s accessible per data-use agreements. While larger operations have access to such expertise, smaller presses, publishers, and start-ups risk lacking it, thereby missing out on derived strategic insights.
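The aggregation step described above can be sketched, in heavily simplified form, as follows. The record schema, field names, and the pre-annotated bot flag are illustrative assumptions; real COUNTER reports and platform exports differ in structure, and bot detection is itself a hard sub-problem:

```python
from collections import defaultdict

# Hypothetical, simplified usage records; in practice these arrive as
# COUNTER reports, API responses, and spreadsheets with differing schemas.
usage_rows = [
    {"isbn": "978-0-123", "platform": "press_site", "views": 120, "bot": False},
    {"isbn": "978-0-123", "platform": "oa_repo", "views": 300, "bot": False},
    {"isbn": "978-0-123", "platform": "oa_repo", "views": 950, "bot": True},  # crawler traffic
    {"isbn": "978-0-456", "platform": "press_site", "views": 80, "bot": False},
]

def aggregate_usage(rows):
    """Sum human views per title across platforms, tracking excluded bot traffic."""
    totals = defaultdict(int)
    excluded = defaultdict(int)
    for row in rows:
        if row["bot"]:
            excluded[row["isbn"]] += row["views"]  # annotate, don't silently drop
        else:
            totals[row["isbn"]] += row["views"]
    return dict(totals), dict(excluded)

totals, excluded = aggregate_usage(usage_rows)
print(totals)    # {'978-0-123': 420, '978-0-456': 80}
print(excluded)  # {'978-0-123': 950}
```

Even this toy version shows why shared infrastructure helps: every curation decision (here, the bot exclusion) must be recorded alongside the totals if figures from different platforms are to be combined responsibly.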

OAeBU data can inform service relationships as publishers and libraries seek to understand online book distribution niches or evaluate book hosting and dissemination offerings. Similarly, publishing platforms and services can leverage usage data to improve and target their own offerings for the scholarly communications community. Yet context is key. As illustrated through pilot OA monograph usage data dashboards developed by the Curtin Open Knowledge Initiative, innovation is occurring around usage data dashboards and analytics services to meet the specialized needs of publishers, libraries, funding agencies, and scholars.

University of Michigan Press OA Book Usage Data Dashboard: Graphics reflect screen captures of the Authors Page Prototype.

To fully analyze the impacts of Open Access on society, scholarly communications stakeholders must improve OAeBU data quality, processing, and reliability. Federated national or regional data infrastructure efforts already suggest ways to facilitate data processing and exchange across public and private organizations big and small. The US-based National COVID Cohort Collaborative is simplifying the controlled, ethical data sharing, aggregation, and use of COVID trial data across public and commercial research labs. European industry-based collaboratives are applying International Data Space (IDS) standards and certifications to facilitate public and private data exchange for mobility, logistics, and healthcare. Supported by the Mellon Foundation, a team led by PIs at the University of North Texas, OPERAS, OpenAIRE, and Johns Hopkins University is working to build upon past efforts to: a) host community consultations to create a multilateral data-processing and stewardship rule book for OA book usage data, b) quantify data-trust participation benefits for book publishing stakeholders, and c) understand the full operational costs of an international data space for OA book usage. If successful, this OA Book Usage Data Trust effort could make usage data management and reporting less costly and more accessible for all open monograph stakeholders.

While the use of data space infrastructure will drive economies of scale – and therefore cost savings – for book publishing stakeholders, in order to realize the full benefits of OA data usage we must create an ecosystem that operates according to open data sharing, security, and use principles. By fostering trusted, responsible, direct data exchange, our researchers, publishers, libraries, funders and the wider community will all stand to gain.

About the author

Christina Drummond, Executive Director | OA Book Usage Data Trust

For over 20 years, Christina has worked at the intersection of data analytics, strategy, and policy. As the Executive Director for the OA eBook Usage Data Trust effort, Christina is helping to improve the quality, completeness, and timeliness of OA impact data while reducing reporting costs through better global usage data exchange, aggregation and governance.

Digital Science reports that preprints account for one quarter of COVID-19 research
https://www.digital-science.com/blog/2020/06/digital-science-reports-that-preprints-account-for-one-quarter-of-covid-19-research/
Thu, 04 Jun 2020
Digital Science, a leading technology company serving emergent needs across the research sector, has today released a report highlighting the global research landscape trends and cultural changes in response to the COVID-19 pandemic. […]

The post Digital Science reports that preprints account for one quarter of COVID-19 research appeared first on Digital Science.

Digital Science, a leading technology company serving emergent needs across the research sector, has today released a report highlighting the global research landscape trends and cultural changes in response to the COVID-19 pandemic.

The report How COVID-19 is Changing Research Culture analyses publication trends, regional focal points of research, collaboration patterns, and top institutional producers of COVID-19 research.

The report’s key findings include:

  • As of 1 June 2020, upwards of 42,700 scholarly articles on COVID-19 had been published, alongside 3,100 clinical trials, 420 datasets, 270 patents, 750 policy documents, and 150 grants.
  • Preprints have rapidly become established as a mainstream research output and a key part of COVID-19 research efforts. From relatively low levels in early January 2020, they accounted for around one quarter of research output by the beginning of May 2020.
  • To date, more than 8,300 organisations have been involved in supporting COVID-19 research, with over 71,800 individual researchers identified as working on it.
  • The highest intensity of research into COVID-19 began in China and gradually migrated west, mirroring the movement of the virus itself.
  • While the US and EU have both now published more than China in journals such as The Lancet, the New England Journal of Medicine and JAMA, China continues to benefit from an early-mover advantage and continues to enjoy the lion’s share of the citations. While research in the field is clearly moving quickly, it currently remains anchored to China’s early publications.
  • A density map of global COVID-19 paper production shows three to four major centres of research: an extended area in China composed of several cities – Wuhan, where the outbreak is believed to have begun, plus Beijing and Shanghai; Europe, specifically Italy and the UK, two of the harder-hit countries; the US east-coast research corridor, including Boston and New York; and, finally, a lighter concentration among Californian institutions on the west coast.
  • The top-producing institution of COVID-19 research (since the beginning of 2020) is Huazhong University of Science and Technology in China, followed by Harvard University and the University of Oxford.
  • The top healthcare producers of COVID-19 research (since the beginning of 2020) are Zhongnan Hospital of Wuhan University, then Renmin Hospital of Wuhan University, and Massachusetts General Hospital.
  • While the proportion of internationally co-authored work is steady, the vast majority of COVID-19 research to date has, unusually, been authored within single countries.
  • At the time of writing, 156 grants totalling at least US$20.8 million had been awarded for COVID-themed research in public institutions.
  • Much of the clinical trial initiation activity in January and February was sponsored by China, and this began to fall off in March, April and May. A similar wave appears for Europe and the US, shifted back by two months, beginning in March.
Daniel Hook, CEO of Digital Science and co-author of the report, said: “Although the situation around COVID-19 is unfortunate, it is allowing us to see the development of a new field, and the culture change associated with that development in ‘real time’.  Dimensions’ combination of full text search and daily updates, together with the inclusion of data beyond publications and citations, gives us critical insights not only into the research itself, but also into how the community is responding to this important issue.

“That response has been immediate and intensive. The research world has moved faster than many would have suspected possible. As a result, many issues in the scholarly communication system, that so many have been working to improve in recent years, are being highlighted in this extreme situation. From peer review to scholarly search, we are seeing changes in behaviours condensed into a few months that many have long campaigned for. We live in interesting times.”

The report can be read in full here

Notes to editors:

Digital Science is a technology company working to make research more efficient. We invest in, nurture and support innovative businesses and technologies that make all parts of the research process more open and effective. Our portfolio includes admired brands including Altmetric, CC Grant Tracker, Dimensions, Figshare, Gigantum, ReadCube, Symplectic, IFI Claims, GRID, Overleaf, Ripeta, Scismic and Writefull. Digital Science’s Consultancy group works with organisations around the world to create new insights based on data to support decision makers. We believe that together, we can help researchers make a difference. Visit www.digital-science.com and follow @digitalsci on Twitter.

Dimensions is a modern, innovative, linked research knowledge system that re-imagines discovery and access to research. Developed by Digital Science in collaboration with over 100 leading research organizations around the world, Dimensions brings together grants, publications, citations, alternative metrics, clinical trials, patents and datasets to deliver a platform that enables users to find and access the most relevant information faster, analyze the academic and broader outcomes of research, and gather insights to inform future strategy. Visit Dimensions’ website and find us on Twitter @DSDimensions.

Media contact

David Ellis, Press, PR & Social Manager, Digital Science: Mobile +61 447 783 023, d.ellis@digital-science.com

Exploring Arctic Research with the Dimensions Database: Three Pilot Reports Released Today
https://www.digital-science.com/blog/2016/09/exploring-arctic-research-dimensions-database-three-pilot-reports-released-today/
Wed, 14 Sep 2016
In recent weeks, several members of the University of the Arctic (UArctic), a cooperative network of universities, colleges, research institutes and other organizations concerned with education and research in and around the north, have produced a pilot report, “International Arctic Research: Analyzing Global Funding Trends”,

The post Exploring Arctic Research with the Dimensions Database: Three Pilot Reports Released Today appeared first on Digital Science.

In recent weeks, several members of the University of the Arctic (UArctic), a cooperative network of universities, colleges, research institutes and other organizations concerned with education and research in and around the north, have produced a pilot report, “International Arctic Research: Analyzing Global Funding Trends”, in collaboration with Digital Science. This report aims to analyze the current state of Arctic research using international funding information and relate this to both activities of the Arctic Council Member and Observer states, as well as international research activity generally.

The UArctic aims to build and strengthen collective resources and collaborative infrastructure by the provision of unique educational and research opportunities through collaboration within a powerful network of members. This pilot report is the initiation of a strategic partnership between Digital Science and UArctic and is the first ever attempt to create a comprehensive view of global Arctic research funding using a dataset of such magnitude.

“…first ever attempt to create a comprehensive view of global Arctic research funding using a dataset of such magnitude.”

Key findings of the pilot report include:

  • 1% of all recorded research funding is related to Arctic research.
  • Earth Sciences and Biological Sciences are the two most-funded subfields of Arctic research.
  • The USA is the largest Arctic research nation both in total spending and number of projects started.
  • UArctic institutions are central actors in Arctic research globally. Overall, UArctic member institutions represent approximately 35% of all Arctic research funding, based on a total of US$4.8 billion in funding.
  • Arctic Council Observer nations are increasingly doing more research on the Arctic. The UK, in particular, has a considerable number of Arctic research projects.
  • The analysis suggests that there is neither significant growth nor shrinkage in the volume of Arctic research funding over the period 2008-2014.

The report illustrates what is possible when using the Dimensions database from ÜberResearch for a detailed analysis of the research funding and activity in any given research area. With over 200 funders representing 2.5 million awarded research grants, and growing all the time, Dimensions is the only tool with both the breadth of data and the precision technology behind it to produce this kind of landscape analysis of any research area.

Alongside the main report, two working papers have also been released which analyze different aspects of Arctic research, in publications data and alternative metrics respectively. The first of these – “Arctic Research Publications” – examines the research activity in Arctic research using the same definition as the main report.

The second working report – “Arctic Altmetrics” – explores alternative metrics in Arctic research. Using the Altmetric Attention Score, this report highlights the distinct benefits of considering attention information when analyzing the impact of a given field of study. Various levels of data aggregation were taken into consideration, from the individual publication all the way up to the impact that an entire country is making on a given field. Examples were also given of how one might use altmetrics to characterize the contributions of a given scientist, journal, university, or Arctic research subfield. Finally, the report explored the potential to use real-time Altmetric attention spikes as a trigger to begin new survey-driven, social-science-focused research projects.


Together, these three pilot reports highlight three distinct ways in which any given research area can be examined with Digital Science tools. The resulting complementary information can build a more comprehensive overview of the activity in any given research area, offering increased understanding and awareness, information that until now has not been easily accessible.

Digital Science reports that just 10% of global research output relates to the UN’s Sustainable Development Goals
https://www.digital-science.com/blog/2020/05/digital-science-reports-that-just-10-of-global-research-output-relates-to-the-uns-sustainable-development-goals/
Thu, 07 May 2020
Digital Science, a leading technology company serving emergent needs across the research sector, has today released a report highlighting the growth in research around the UN’s Sustainable Development Goals (SDGs). […]

The post Digital Science reports that just 10% of global research output relates to the UN’s Sustainable Development Goals appeared first on Digital Science.

London, UK – 7 May: Digital Science, a leading technology company serving emergent needs across the research sector, has today released a report highlighting the growth in research around the UN’s Sustainable Development Goals (SDGs).

The report Contextualizing Sustainable Development Research, which was produced by Digital Science’s Consultancy Group led by Dr Juergen Wastl, looks at the state of the world’s research on the UN’s SDGs. It finds that more than 500,000 publications relating to the 17 SDGs were published in 2019, constituting around 10 percent of the world’s total research output. This is three times the percentage produced in 2000, when the UN introduced the Millennium Development Goals, the predecessors of the SDGs.

The SDGs recognise that ending poverty and other imbalances in society and our relationship with the natural environment must go hand-in-hand with strategies that improve health and education, reduce inequality, and spur economic growth – all while tackling climate change and working to preserve our oceans and forests.

Digital Science’s Consultancy group used the Dimensions platform to analyse research data on the SDGs.

The key findings include:

  • 10% of global research output now relates to the UN’s Sustainable Development Goals.
  • While the US remains the top producer of SDG research, China’s research footprint has grown rapidly and it is now the second largest SDG-research power.
  • India has grown quickly in SDG research, overtaking all the traditionally developed research economies in Europe. It is now in fourth place globally behind the US, China and the UK in SDG research volume. 
  • The UK has a more well-rounded or evenly distributed footprint, reminiscent of a larger and more diversified research economy such as the US.

The report follows Digital Science’s release of an SDG classification system in its Dimensions platform in March. Research into these areas is critical to help transform the world, and each development goal has a list of targets that are measured with indicators. Dimensions users (including those on the free version) can now filter this research within the publications content type.

Daniel Hook, CEO of Digital Science and one of the co-authors of the report, said: “It is at once encouraging and concerning to see 10% of the world’s research volume clustered around SDGs. It has taken more than 20 years of sustainable development initiatives to reach a figure of 10% – the research sector so often leads the way in thinking, in debate and in setting the stage for the informed development of public policy. Yet, with just 10% of research dedicated to this most immediate issue of our time, I wonder if we are equipped to take that lead.

“The UN’s Sustainable Development Goals call on the world’s leaders to look beyond Gross Domestic Product (GDP) in their decision-making processes. As we have demonstrated in our report, Dimensions can support all stakeholders across the international research ecosystem and challenge them to embrace a broader range of research information in their decision-making processes.”

The full free report is available here  


Media contact

David Ellis, Press, PR & Social Manager, Digital Science: Mobile +61 447 783 023, d.ellis@digital-science.com

The post Digital Science reports that just 10% of global research output relates to the UN’s Sustainable Development Goals appeared first on Digital Science.

Global Attitudes towards Open Data #Stateofopendata 2018
https://www.digital-science.com/blog/2018/10/the-state-of-open-data-2018-global-attitudes-towards-open-data-stateofopendata/
Mon, 22 Oct 2018 13:02:23 +0000

The report is the third in the series and includes a foreword from Ross Wilkinson, Global Strategy at the Australian Research Data Commons.

The post Global Attitudes towards Open Data #Stateofopendata 2018 appeared first on Digital Science.

A surprising number of respondents (60%) had never heard of the FAIR principles, guidelines to enhance the reusability of academic data!

Our portfolio company, Figshare, has launched its annual report, The State of Open Data 2018, to coincide with global celebrations around Open Access Week. The report is the third in the series and includes survey results and a collection of articles from global industry experts, as well as a foreword from Ross Wilkinson, Global Strategy at the Australian Research Data Commons.

Two years on from the first report in 2016, which was created to examine the attitudes and experiences of researchers working with open data – sharing it, reusing it, redistributing it – survey results continue to show encouraging progress: open data is becoming more embedded in the research community.

Key findings include:

  • 64% of respondents revealed they made their data openly available in 2018, up 7% on 2016
  • Data citations are motivating more respondents to make data openly available, up 7% from 2017 to 46%
  • 60% of respondents had never heard of the FAIR principles (Findability, Accessibility, Interoperability and Reusability), which provide guidelines for data producers and publishers to enhance the reusability of academic data
  • Support for national mandates for open data rose to 63%, up from 55% in 2017
  • The proportion of respondents who had reused open data in their research continues to shrink: 48% in 2018, down from 50% in 2017 and 57% in 2016
  • Most researchers (58%) felt they did not get sufficient credit for sharing data, compared with 9% who felt they did
  • Fewer respondents reported having lost research data: 30% in 2018, down from 36% in 2017

We asked a number of questions about the FAIR principles this year, with some surprising results. The percentage of respondents who reported being familiar with the principles was just 15%, with 25% having previously heard of FAIR and 60% never having heard of them.

The results confirmed that, despite publishers, funders and institutions rapidly adopting these principles, there remains a crucial gap in educating researchers. They further show the need for initiatives like GO FAIR, which gives researchers clear instructions on how to become FAIR-compliant.

Mark Hahnel, CEO and Founder, Figshare, said: “In recent years we’ve seen the conversation move from data not only being open but being FAIR. This is a major shift considering we spent the early years of Figshare trying to convince researchers to share their data full stop.

“For every new feature we build at Figshare, we have one eye on the FAIR principles, so as a repository we are doing as much of the heavy lifting as possible for researchers. There is still a lot of work to be done to educate researchers on what is expected of them, but the report highlights many new initiatives from across the research ecosystem, all pulling together in the same direction.”

DOWNLOAD FULL REPORT
