The [Australian Reference Genome Atlas][1] (ARGA) is a next-generation platform designed to index, connect and expose genomic data for Australia’s mega-biodiversity. It sits at the intersection of two problems: genomic data discovery in an age of rapidly proliferating data, and the crisis in documenting and understanding Australia’s vulnerable biodiversity. More than 80% of...
Background
The digitalisation of Electronic Health Record (EHR) data has unlocked unique opportunities for research. Unlike administrative datasets, EHRs provide granular clinical data, real-time updates within systems, and access to detailed clinical notes. Despite these advantages, EHR data, primarily collected for operational purposes, remain siloed and lack standardisation between systems,...
Overview
Understanding unnecessary data, known as "dark data", is a major operational challenge in large-scale shared storage. We propose an alternative approach that leverages HPC workflow tools to collect extended metadata at each stage of job execution, minimizing the need for fundamental system changes.
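The staged metadata capture described above can be sketched as follows. This is a minimal illustration only: the stage names, the metadata fields, and the `run_stage` wrapper are assumptions for the sketch, not part of any specific HPC workflow tool.

```python
import time

def run_stage(name, func, metadata, *args, **kwargs):
    """Run one workflow stage and append an extended-metadata record for it.

    Illustrative sketch: a real workflow tool would also capture things like
    input/output file paths, job IDs and resource usage at each stage.
    """
    start = time.time()
    result = func(*args, **kwargs)
    metadata.append({
        "stage": name,                     # which step of the job produced this record
        "started": start,                  # wall-clock start time (epoch seconds)
        "duration_s": time.time() - start, # how long the stage ran
    })
    return result

# Toy two-stage "job": preprocess some numbers, then reduce them.
metadata = []
data = run_stage("preprocess", lambda xs: [x * 2 for x in xs], metadata, [1, 2, 3])
total = run_stage("reduce", sum, metadata, data)

print(total)          # 12
print(len(metadata))  # 2 (one record per stage)
```

Because the metadata records accumulate alongside normal execution, the storage system itself needs no fundamental changes; the per-stage records can simply be written out (e.g. as JSON) next to the job's outputs.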
Background
The High Performance Computing Infrastructure (HPCI) project was...
The increasing digitalization of science, coupled with the push for Open Science and FAIR data (Findable, Accessible, Interoperable, Reusable), presents significant challenges for managing diverse research outputs effectively throughout their lifecycle. Traditional Data Management Plans (DMPs) often lack the detail and machine-actionability needed for dynamic research processes, while current...
Agriculture Victoria Research (AVR) undertook a project on multimodal data analysis and anomaly detection for beehive health, addressing the critical challenge of assessing hive health comprehensively in a pollination environment where diverse data modalities interact. Honeybee populations play an essential role in pollination within Victoria's horticulture industry, making the...
Academic institutions often hold large volumes of unstructured text data—such as chat transcripts, research publications, and strategic documents—but may lack accessible methods to analyze and interpret these resources effectively. This presentation shares how Singapore Management University (SMU) Libraries leveraged BERTopic, an AI-driven topic modeling tool for text clustering, along with...
About Data Commons and Data Meshes
A data commons is a cloud-based software platform with a governance framework that enables a research community to manage, analyze and share its data. A data mesh is a collection of two or more data commons, cloud-based computational resources, and other cloud-based resources that interoperate using a common set of core software services and a hybrid...
In response to inefficiencies in governmental regulation, such as over-regulation, an overabundance of laws and procedures, inflexibility, and disregard for costs, countries around the world began efforts to optimize regulation, in part through privatization. The trend toward privatization of social services necessitated substantial development of governmental supervision practices....