13–16 Oct 2025
Brisbane Convention & Exhibition Centre
Australia/Brisbane timezone

Bridging the Data Science and Research Data Communities Through Education and Shared Practices

14 Oct 2025, 11:30
1h 30m
Brisbane Convention & Exhibition Centre

Merivale St, South Brisbane QLD 410
Session Data Science and Data Analysis

Speakers

Carolynne Hultquist (University of Canterbury, New Zealand)
Christine Kirkpatrick (San Diego Supercomputer Center / CODATA)
Daphne Raban (University of Haifa, CODATA Israel NC)
Kelsey Druken (ACCESS-NRI)
Leo Lahti (University of Turku)
Padmanabhan Seshaiyer (George Mason University, US National Committee for CODATA)
Phil Bourne (University of Virginia, US National Committee for CODATA, ADSA Board member)

Description

The data science and research data communities share many common goals and challenges. Despite this, the two communities tend to have separate venues for convening, separate memberships, and separate educational tracks. This session of short presentations and a panel discussion will explore some of the ways that these two worlds can come together in the areas of education, training, and data stewardship practices.

This session explores the evolving intersection of the research data and data science communities through diverse lenses ranging from foundational stewardship to citizen engagement with speakers from across the world. Leo Lahti emphasizes the critical journey from observation to interpretation, underscoring the importance of data throughout the research lifecycle. Daphne Raban highlights how data stewardship serves as a vital bridge between research and data science, ensuring that data is managed, documented, and reused effectively. Phil Bourne further examines this bridge, focusing on practical integration between data science techniques and robust research data infrastructures and suggesting ways the organizations that support both communities can come together. Padmanabhan Seshaiyer brings an educational perspective, advocating for embedding research data principles into K-12 and community college bridge programs to foster inclusive data science literacy. Carolynne Hultquist (invited) offers a citizen science and earth sciences vantage point, illustrating how environmental hazard monitoring can blend data stewardship and data science in participatory ways. Kelsey Druken will discuss how ACCESS-NRI is embedding FAIR through software and data workflows for Australia's climate modelling infrastructure. Together, these contributions reveal key synergies and a shared commitment to building interoperable, ethical, and impactful data ecosystems.
The session is also meant to elicit ideas and suggestions for action from the audience. One outcome will be input for a roadmap and for a similar session proposal at the next Academic Data Science Alliance (ADSA) meeting. Neither this proposed session nor the one at ADSA has been held before. The closest precedents are the prior sessions that sought to bridge the gap between the research data community and high performance computing (HPC), which were held at IDW 2022, 2024, ISC (IDW for HPC people in Europe), and Supercomputing 22 & 23. In those sessions, establishing baselines of understanding and presenting shared priorities and goals were key to driving progress and creating awareness of the gap. Aside from companion presentations, a desired outcome might be a future CODATA Task Group, an RDA interest group, or an ADSA activity. Such a group would look at a roadmap for shared education (including professional development and training opportunities), ecosystem tools and services, and shared research priorities.

Issues to Be Addressed by the Session

1) Fragmentation Across Communities

Despite overlapping goals, research data and data science communities often operate in siloed venues with different infrastructures, memberships, and training programs. This session will explore how to foster cross-community dialogue and collaboration.

2) Disconnect Between Practice and Infrastructure

Research data practices (e.g., data curation, stewardship, provenance) are not always integrated into the day-to-day workflows of data science, limiting reproducibility, transparency, and reuse. How can we better embed stewardship into data science infrastructures?

3) Educational Gaps and Opportunities

There is a lack of integrated education and training pathways that bridge research data management and data science skills. The session will address how to co-develop curricula, professional development, and early education programs that reflect both domains.

4) Institutional and Organizational Coordination

Many data-related and data science-related consortia (e.g., CODATA, ADSA, RDA, WDS, GO FAIR) operate in domains that are partly overlapping and partly unique. The session will discuss how these entities might coordinate activities and policies and collaborate on funding priorities to support shared goals.

5) Roadmapping Shared Goals

There is currently no unified roadmap for aligning the research data and data science ecosystems. The session aims to collect input from the audience to help define shared priorities, pain points, and actions for a future roadmap and collaboration agenda.

6) Sustainability of Shared Ecosystems

Tools, standards, and services that support FAIR, open, and ethical data practices often struggle with sustainability. The session will explore how shared infrastructures and cross-community collaboration can improve the resilience and utility of data ecosystems.

7) Governance, Ethics, and Impact

As both fields intersect with sensitive data (health, environment, education), there’s a need for shared approaches to governance, equity, and ethical AI/data practices. How can we co-create responsible frameworks for stewardship across domains?

Speaker Name, Affiliation, Topic
1. Leo Lahti, University of Turku, Finland, CODATA Executive Committee, From Observation to Interpretation
2. Daphne Raban, University of Haifa, Chair Israel CODATA NC, The Role of Data Stewardship in Research and Data Science
3. Phil Bourne, University of Virginia, US National Committee for CODATA, ADSA Board member, Bridging Data Science and Research Data
4. Padmanabhan Seshaiyer, George Mason University, US National Committee for CODATA, Weaving research data lessons into K-12 and community college data science bridge programs
5. Carolynne Hultquist (invited), University of Canterbury, New Zealand, A citizen science and earth sciences perspective on environmental hazard monitoring
6. Kelsey Druken, Australian National University, Embedding FAIR through software and data workflows for Australia's climate modelling infrastructure

Panel moderated by Christine Kirkpatrick, San Diego Supercomputer Center, US National Committee for CODATA

Primary authors

Bonnie Carroll (Co-Chair, US National Committee for CODATA)
Christine Kirkpatrick (San Diego Supercomputer Center / CODATA)
Kelsey Druken (ACCESS-NRI)
Leo Lahti (University of Turku)
Padmanabhan Seshaiyer (George Mason University, US National Committee for CODATA)
Rania Kosti (National Academies of Sciences, Engineering, Medicine)
