Data sharing is considered one of the effective practical means to enhance research transparency. Data repositories are pivotal e-infrastructure in fostering this. As a generalist data repository, Science Data Bank (ScienceDB) offers free services to the global community for sharing and dissemination of non-traditional research outputs, such as datasets and codes. It has been built and...
The EU-funded [OSTrails project][1] is advancing a federated approach to Open Science by addressing a key challenge: the fragmentation of research data management (RDM) practices across disciplines, tools, and institutions. By building a network of interoperable services for planning, tracking, and assessing research activities, OSTrails promotes reproducible, FAIR-aligned, and responsible...
Our university is built on the traditional and contemporary homelands of the Dakota people, a federally recognized Tribal Nation made up of four communities and their sovereign governments. We recognize the importance of acknowledging the People on whose land we live, learn, and work, but understand that words are not enough. Within our institutional data repository, we seek to improve and...
With the pace of research accelerating into the age of Quantum computing and AI, data mobility has become the lifeblood of modern scientific research. As unprecedented volumes of research data are being created at increasing speed, it is imperative that data can be easily moved and shared to be findable and accessible. Yet achieving data mobility in scientific research faces significant...
Vertical Federated Learning (VFL) has emerged as a transformative approach in collaborative machine learning, enabling multiple parties to jointly train models while maintaining data privacy through vertical partitioning of features. This paradigm has gained significant traction in privacy-sensitive domains such as healthcare and finance, where different organisations possess distinct feature...
Addressing the requirements for transparent, cost effective and high-impact research in this era of big data and cross-disciplinary research will require significant community changes to how research software is created, managed and maintained. In this talk I will introduce Astronomy Data And Computing Services (ADACS): a highly successful initiative of the Australian astronomy community...
Many current solutions for data management are expensive or require considerable technical underpinnings (or both). The global data community needs to consider simpler approaches in order to include more participants and to improve equity, but this requires guidance about minimal requirements. The Protocols for the Implementation of Archival Repository Services are an attempt to start the...
The Australian National Persistent Identifier (PID) Strategy is a critical national initiative that aims to accelerate Australian research quality, efficiency and impact through universal use of connected persistent identifiers. It supports a vision where researchers, institutions, and infrastructures are connected through a universal, trusted, and interoperable system of PIDs. This strategy...
The Health Studies Australian National Data Asset (HeSANDA), led by the Australian Research Data Commons (ARDC), is building national research infrastructure to enhance the discoverability, access, and reuse of data from health research studies across Australia. HeSANDA was established as a response to the critical need for more accessible and interoperable health research data.
The...
As demand grows to improve health outcomes for Aboriginal and Torres Strait Islander clients, health practitioners and researchers are increasingly embedding Indigenous perspectives through co-design and applying Indigenous data governance frameworks. This paper shares the preliminary reflections from a research program co-led by a Queensland-based Indigenous community-controlled organisation...
The reliable reuse of language data largely depends on both managing the data in ways that respect the rights, responsibilities and communities from whom it originates, and allowing any user with the appropriate skills and resources to inspect, rerun and extend the analyses that underlie the published findings.
In practice, these goals often collide where data may be preserved in one...
The COVID-19 pandemic has altered how health data is regarded and was a distinct driver for change. The need for rapid analysis and assessment of health data at scale brought sharp focus to the challenges, highlighting the importance of Findable, Accessible, Interoperable and Reusable (FAIR) data. The heterogeneous nature of health data, together with the wide array of systems and associated...
Natural disasters occur more frequently and intensely and impact predominantly vulnerable and under-resourced communities. Crisis Map suggests an end-to-end real-time, privacy respecting platform based on federated data mesh architecture and predictive artificial intelligence models for better disaster response and resource deployment.
With the integration of satellite imagery, public...
This presentation explores ways of improving researchers’ data skills by creating an environment that engages learners, helps them form networks and gives them greater control over what happens. It draws on three years’ experience facilitating a national research data skills summer school for the Australian Research Data Commons (ARDC). The presentation is suitable for anyone who mentors,...
In the current data-driven era, open government data serves as a catalyst for open innovation and the development of value-added services, while also promoting governmental transparency. These attributes collectively contribute to advancing the United Nations' Sustainable Development Goals (SDGs), a set of 17 objectives outlined in the 2030 Agenda for Sustainable Development, which encompass...
National metrology institutes (NMIs) serve to establish measurement standards for their respective nations and disseminate these standards to end users, including various industries. Through this process, NMIs facilitate freedom in economic activities by overcoming technical trade barriers through compliance with the International Committee for Weights and Measures Mutual Recognition...
Many universities have adopted the use of Data Management Plans (DMPs) for research teams to outline how their research data will be handled both during and after a project. DMPs support responsible data management in accordance with the FAIR principles: Findable, Accessible, Interoperable, and Reusable. The objectives are to contribute towards research integrity, reproducibility, and...
The German climate research initiative Paleo Modeling, or PalMod (currently in phase III), is presented here as an exclusive example where the project end product is unique scientific paleo-climate data. PalMod data products include simulated climate data from three state-of-the-art coupled climate models of varying complexity and spatial resolutions. Integrating this simulated or...
This study focuses on intelligent data mining and knowledge discovery services, addressing the critical challenges researchers face in processing massive datasets and optimizing experimental schemes. We propose an innovative solution integrating knowledge graph and artificial intelligence technologies, with a specific application to thermoelectric domain through the development of a...
As global awareness of climate change risks deepens, companies are facing increasing pressure from key stakeholders such as investors, regulators and consumers to adopt more transparent and structured approaches to environmental, social and governance (ESG) practices. ESG disclosures have emerged as critical tools for demonstrating corporate accountability and long-term value creation....
The Taiwan Gateway to Health Data (GHD TW) is a government-funded project that collaborates with all primary data custodians and data controllers in Taiwan. We establish a data portal for various data users, including industrial and academic researchers, to promote community health and clinical and biomedical research. Our primary responsibility is to provide services that enhance the...
The RO-Crate (Research Object Crate) specification (Sefton et al. 2023) is a method for describing data sets with rich, interoperable Linked Data metadata. This presentation will show how we, the Language Data Commons of Australia project (LDaCA), use well described RO-Crate data packages (Soiland-Reyes et al. 2022) to enable CARE (Carroll et al. 2020) and FAIR (Wilkinson et al. 2016)...
With the onset of the Open Science movement, research sites and clinical research sponsors are becoming increasingly entrusted with the storage of large amounts of research data and samples. The prospect of sharing a wide array of health data is an exciting one, as the collaboration of ideas and the expansion of shared knowledge promises to lead to accelerated research outcomes. However,...
The Cross Domain Interoperability Framework (CDIF) provides a set of implementation guidelines designed to lower the barriers to cross-domain research data reuse. CDIF provides standards and methodologies for addressing interoperability issues preventing cross-domain research data utilization. CDIF’s initial version comprises five core profiles: Discovery, Access, Controlled Vocabularies, Data...
Background
This study investigated mental health research data analysis across institutions, which is constrained by data privacy requirements, regulatory restrictions, and variation in how data is structured and stored. These limitations are especially pronounced in resource-constrained settings, and federated data analysis offers a promising solution. The Observational Medical Outcomes Partnership Common Data...
The Open Science landscape today is full of good intentions, grand infrastructures, and endless new portals. Yet for many researchers, the daily reality has changed very little. Managing data, publications, and workflows remains fragmented, bureaucratic, and painfully slow. FAIR principles are widely endorsed, but in practice, they are often difficult and time-consuming to implement. The...
Delivering AI systems in enterprise settings requires more than model optimization — it demands infrastructure clarity, cross-functional orchestration, and the ability to navigate complexity over time. This poster highlights real-world lessons from deploying intelligent systems within a cloud-native architecture, combining applied technical experience with strategic delivery in organizational...
Over the past decade, open science has moved from the margins to the mainstream. Yet for many researchers, putting open science into practice remains challenging, due to disciplinary norms, fragmented support systems, and tools that prioritize policy over usability. At Springer Nature, we are evolving our approach to open science support by embedding FAIR principles into product design and...
Introduction
Citizens' juries and assemblies are increasingly popular public participation methodologies for deliberation on data and artificial intelligence. They are formal, top-down exercises that aim to address power imbalances in the design, application, or regulation of data and AI processes through allowing residents to debate and provide recommendations on specific questions. In March...
The International Meridian Circle Program (IMCP) represents a pivotal international initiative aimed at advancing coordinated space weather observations to address critical global scientific challenges and enhance operational applications. Effective data governance, sharing, and utilization form the cornerstone for transforming multi-national observational synergies into scientific...
Douala General Hospital is a first-class hospital in Cameroon where we meet a multidisciplinary medical team treating several thousand patients each year. This hospital hosts numerous patient records that may be useful for public health research. However, the majority of these records are paper-based, hence limiting their exploitation. For some cases, particularly the pulmonology department,...
Data repositories recognize that the CARE Principles (Collective Benefit, Authority to Control, Responsibility and Ethics) and data sovereignty are integral when working with indigenous communities, but it can be difficult to put words into action. Ocean Networks Canada (ONC) has been working with Local Contexts to address this gap.
ONC has been working on integrating Local Context Labels...
Australia is one of the most food secure countries in the world. However, long-term strategies are needed to ensure Australia has a resilient and sustainable food industry that maintains its ability and reputation for delivering high-quality food nationally and internationally.
The Australian Research Data Commons (ARDC) established its Food Security Data Challenges program to support the...
The Polar Environment Data Science Center (PEDSC) of the Joint Support-Center for Data Science Research (DS), Research Organization of Information and Systems (ROIS), aims to promote the opening and sharing of scientific data obtained through research activities in polar regions. Its purpose is to strengthen collaboration with universities and other communities, and to support creation of further...
The Australian Internet Observatory (AIO - https://internetobservatory.org.au/) was funded by the Australian Research Data Commons (ARDC – www.ardc.edu.au) in 2024 to support large-scale access to and use of social media and digital data more broadly by Australian researchers. A core part of the AIO is the Australian Internet Observatory Research Dashboard (AIReD -...
QCIF has developed a purpose-built trusted research environment named KeyPoint to address the increasing need for secure and trusted digital environments for sensitive data in various research fields, including population health, biosecurity, food security, environmental science, and social science. KeyPoint provides a remote analysis environment for sensitive data which enables robust...
To address the pressing challenge of capturing complex non-linear structures in semi-supervised multi-view clustering, we introduce a fundamentally novel framework: Label Propagation Assisted Soft-constrained Deep Non-negative Matrix Factorization for Semi-supervised Multi-view Clustering (LapSDNMF). Unlike prior approaches, LapSDNMF innovatively integrates deep hierarchical modelling with label...
With the rise of cyber threats, automating Named Entity Recognition (NER) in open-source documents is crucial for Cyber Threat Intelligence (CTI). However, cybersecurity NER models face challenges in maintaining large annotated datasets due to the ever-evolving threat landscape. To address this, we introduce LETNER, a label-efficient NER framework that balances performance and annotation...
Automatically linking controlled vocabulary terms in metadata enhances semantic consistency and improves data interoperability across systems—particularly by connecting terms from frameworks such as OntoPortal, Skosmos, Wikidata, and others. This work presents an AI-driven approach that leverages Large Language Models (LLMs) in combination with knowledge graph techniques to identify and...
The field of corpus linguistics has revolutionised linguistic research by providing data-driven insights into the structure, usage, and evolution of languages. By leveraging large-scale text corpora, researchers can uncover patterns in grammar, vocabulary, syntax, and language use that are not easily observable through traditional methods (Omarova et al., 2025). This data-driven approach...
This presentation aims to build a local generative search engine that demonstrates how generative AI can be effectively integrated with semantic search to enhance information retrieval and user interaction. The proposed interface, powered by large language models (LLMs), will simplify access to polar datasets stored in catalog applications such as GeoNetwork and ERDDAP. By leveraging LLMs,...
Introduction: China has rich geographical and biological diversity, and cultural resources, which have enriched people’s lives. Its diverse and complex geographical environment has given rise to a wealth of geographical products. Geographical Indications (GIs) products hold significant importance in promoting agriculture, enhancing product quality, preserving cultural heritage, and driving...
Effective research data management (RDM) is essential for ensuring that data adheres to the FAIR principles of Findability, Accessibility, Interoperability, and Reusability. In this session we will examine how these principles drive the life cycle of metagenomic data at the Uniklinikum University in Aachen, from data generation to long-term storage and reuse.
The process begins when...
Because permafrost develops at certain subsurface depths, it cannot be directly observed by remote sensing, and ground-based surveys are costly. As a result, there remains considerable uncertainty in our current understanding of permafrost distribution. This study employs an ensemble simulation approach using multiple machine learning models, integrating the most comprehensive international...
The Macro View, reported at IDW2023, set out to estimate the national scale of research data that is under management for the purpose of future access in Australia and New Zealand. Two key observations can be made:
- The participating institutions lacked internal reports on data as an asset, from which a total could be easily aggregated. Instead one-off measurement tasks were...
Systemic reform in science continues to face a collective action problem: researchers agree that contributions such as data sharing, peer review, software development, and community engagement are essential, yet these remain structurally undervalued in current incentive systems. Although the Open Science movement has promoted greater transparency and expanded recognition, uptake of alternative...
Dataspaces can unlock the potential of data-intensive research by enabling trusted data sharing. This session explores the challenges and opportunities facing organisations and researchers as they navigate the adoption of trusted sharing using dataspaces.
Bringing together key stakeholders, including researchers, policymakers, and infrastructure providers, this forum will identify barriers,...
There is an urgent need for continental-scale monitoring of threatened species and ecosystems. Acoustic monitoring of the environment, ecoacoustics, provides a scalable way to achieve this. The Open Ecoacoustics platform supports ecoacoustics monitoring of the environment and is open to everyone to aggregate and share data, analyses and tools. The project goal is to enable open science and...
In Australia, the interest in Indigenous Data Sovereignty (IDSov) has increased over the past decade. There are now emerging competing interests between the state as holders of vast Indigenous data assets and how these data are governed and Indigenous communities to drive their agenda on Indigenous data priorities including data sharing from the state to communities and state actors...
The Intergovernmental Panel on Climate Change (IPCC) has been providing climate Assessment Reports (ARs) since 1988, which document the state of climate change and future projections under various options for action. These ARs form the basis of international agreements and actions. UN Secretary-General Guterres described climate action as “the 21st century's greatest opportunity to drive...
The Terrestrial Ecosystem Research Network (TERN) is Australia's national collaborative research Infrastructure for long-term environmental monitoring, data-driven ecological research, and evidence-based decision-making. TERN provides an integrated, standardised, and openly accessible data infrastructure that facilitates collecting, curating, analysing, and distributing high-quality ecological...
Introduction: Proper data management is essential to ensure the integrity, transparency, and reuse of data in scientific research. Objectives: This study aimed to investigate the practices and perceptions related to data management among researchers at a university hospital in southern Brazil. Methods: This was a cross-sectional, exploratory study. Consecutive sampling was used to...
Whole-slide images (WSIs) drive state-of-the-art computational pathology, but hospitals typically restrict their analysis to isolated, air-gapped workstations because these gigapixel slides contain highly sensitive patient data. On such systems the workflow for a single case is onerous: (i) technicians copy the multi-gigabyte WSI to a removable medium and walk it to the secure workstation;...
Dengue surveillance in many countries still relies on a simple outbreak rule: declare an alert when reported cases exceed the historical mean + 2 standard deviations. Although easy to apply, this cut‑off does not adapt to changing transmission patterns, generates frequent false positives, and ignores forecast uncertainty. We develop and evaluate risk‑based outbreak thresholds that incorporate...
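As a minimal sketch of the simple baseline rule described above (not the risk-based thresholds the abstract goes on to develop), the mean + 2 standard deviations cut-off could be written as follows. The function names, the use of the sample standard deviation, and the example counts are illustrative assumptions, not details taken from the study.

```python
from statistics import mean, stdev


def outbreak_threshold(historical_cases, k=2.0):
    """Return the alert threshold: historical mean + k standard deviations.

    Assumption: the sample standard deviation is used; a surveillance
    programme may define the historical baseline differently.
    """
    return mean(historical_cases) + k * stdev(historical_cases)


def is_alert(current_cases, historical_cases, k=2.0):
    """Return True when the current count exceeds the static threshold."""
    return current_cases > outbreak_threshold(historical_cases, k)


# Hypothetical case counts for the same period in five earlier years.
history = [10, 12, 8, 10, 10]
threshold = outbreak_threshold(history)
print(f"threshold = {threshold:.2f}")
print("alert at 13 cases:", is_alert(13, history))
print("alert at 12 cases:", is_alert(12, history))
```

Because the threshold depends only on past counts, it stays fixed regardless of current transmission dynamics or forecast uncertainty, which is the limitation the abstract points to.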
Against the global backdrop of "data sharing and reuse", publishing scientific data as a formal research output—compared to traditional data submission or repository archiving—proves more effective in advancing open data sharing and academic recognition. Currently, data publishing remains in its early exploratory stages in China and worldwide, with dedicated scientific data publishing journals...
Since 2015, China has introduced policies and regulations to encourage the sharing of scientific data. Among them, the publication of scientific data is an important part of scientific data sharing. The publication of scientific data includes two parts: data paper and dataset, in which the data paper is published as journal paper, and the dataset is published by registering DOI or e-journal on...
RADAR, developed and operated by FIZ Karlsruhe – Leibniz Institute for Information Infrastructure, provides a robust and versatile research data repository designed to facilitate adherence to FAIR principles—ensuring data is Findable, Accessible, Interoperable, and Reusable. Since its inception, RADAR has evolved significantly, offering enhanced services tailored to diverse research...
The presentation will discuss the importance of relationship building between university researchers and local Aboriginal community organisations as part of a national research project. The project, from the University of Melbourne, is led by Distinguished Prof Marcia Langton and includes the current presenters, who are members of the research team.
The research project was codesigned by...
This study investigates the current research status of trustworthiness evaluation in China through literature review and web-based surveys, revealing a lack of tailored evaluation frameworks and practices specifically targeting scientific data management platforms in Chinese universities. Building upon the FAIR principles (Findability, Accessibility, Interoperability, and Reusability), this...
Modern life science research increasingly relies on complex data analysis, demanding robust bioinformatics tools, substantial computational resources, and specialised expertise. The German Network for Bioinformatics Infrastructure (de.NBI) addresses these challenges as a national, academic research infrastructure funded by the German Federal Ministry of Education and Research (BMBF) and...
In the context of the digital economy, data has become an important strategic foundational resource and economic growth engine for a country. China is the first country to elevate data to the level of production factors in its policy system, and officially listed data as another key production factor after land, labor, capital, and technology in 2019. When the transformation of production...
The Australian Research Data Commons (ARDC) funded Bushfire Data Commons (BDC) established a range of projects focused on an ever increasing problem to Australia: bushfires. These projects are highlighted in https://ardc.edu.au/program/bushfire-data-challenges/. One of the ARDC funded projects focused on establishment of a front end dashboard to showcase the data sets and tools arising from...
Artificial intelligence (AI) offers powerful potential to address pressing challenges in transboundary water management, especially in regions with insufficient infrastructure for in-situ water quantity and quality monitoring and modeling. However, the successful application of AI in this context depends on more than algorithmic accuracy and can be challenging to achieve even in a system with...
Crisis management plays a role in achieving a sustainable and resilient future by preparing governments and communities to effectively respond to and recover from disruptions. Crisis management generates large volumes of heterogeneous data, spanning structured databases, unstructured government reports, real-time news reports, and social media channels. Despite such an availability...
Background: Antimicrobial resistance (AMR) is a growing concern in agribusiness sectors with serious consequences to productivity and public health. A data-centric approach is needed to support Australian agribusinesses and water sectors to understand the impact of antimicrobial usage on the emergence of resistance for diseases that farmers face on a daily basis. The SAAFE CRC...
As digital health technologies proliferate, the potential to harness real-world data (RWD) for improving healthcare outcomes grows dramatically. However, the realization of a truly responsive Learning Health System remains hindered by the complexities surrounding health data sharing. These complexities span technical, legal, regulatory, financial, organizational, and ethical domains and are...
Background: Antimicrobial resistance (AMR) is a growing concern in agribusiness sectors with serious consequences to productivity and public health. A data-centric approach is needed to support Australian agribusinesses and water sectors to understand the impact of antimicrobial usage on the emergence of resistance for diseases that farmers face on a daily basis. The SAAFE CRC...
Defined as the presence of any infectious microorganisms in the bloodstream, bloodstream infections (BSIs) pose a major threat to public health. BSI is an important complication that may affect the recovery time and treatment of injured patients. The studies on patients with injury-related BSIs report data from single or selected hospitals. No population-based studies have been conducted on...
In a context where the amount of data is doubling every three years, there is an urgent need to develop and define policies and harmonised practices for research data at an institutional level, connected to national strategies and international frameworks. The aim of this paper will be to demonstrate how international, national and institutional levels can be connected, through the example of...
The World Data System (WDS) is an affiliated body of the International Science Council (ISC) that supports research data repositories and data service providers worldwide. The WDS Early Career Researcher (WDS-ECR) Network is dedicated to nurturing, advancing, and strengthening the capacities of early career scientists within data-centric fields. The network has ten overarching goals as part of...