Dr Susanna-Assunta Sansone

Associate Director (Life, Natural and BioMedical Sciences)

http://uk.linkedin.com/in/sasansone - twitter: @SusannaASansone

I am Associate Director at the University of Oxford e-Research Centre and I also work at Nature Publishing Group as a data consultant and Honorary Academic Editor for Scientific Data, an open access data publication platform.


As Principal Investigator at the Centre, my activities centre on and support data curation, management and publication, and their pivotal roles in enabling reproducible research and driving science and discovery. I focus on the life science, environmental and biomedical domains, collaborating with data producers, service providers, pre-competitive informatics initiatives, journals and funding agencies to develop software and promote the creation and uptake of community-developed ontologies and standards.

I lead the Centre in several RCUK, EC and pharma-funded projects and in the ELIXIR UK Node, where I am responsible for standards and curation areas; I am also the international partner in two NIH Big Data to Knowledge Centers of Excellence.

I am a founding and core member of several international grass-roots standards and advocacy groups, and sit on the boards of a few not-for-profit efforts, including Dryad, the Research Data Alliance and Force11, which work to promote and support the data reproducibility agenda.

I also co-lead the development of a new module on “Data Management, Analysis and Statistics” for the BBSRC Oxford Interdisciplinary Bioscience Doctoral Training Programme.

I hold a PhD in Molecular Biology from Imperial College of Science, Technology and Medicine, London; after a few years working on vaccine genetics at an Imperial spin-off (now known as Emergent BioSolutions, Inc.) I moved to the European Bioinformatics Institute (EBI, Cambridge), where I worked for nine years as a Project and Team Coordinator and Principal Investigator.

  • Nature Publishing Group - Open Data; Consultant
  • Dryad - Board of Directors; Vice-Chair
  • Research Data Alliance - Technical Advisory Board; Member
  • Storage as a Service Project Board, University of Oxford - Chair
  • IT Architecture Advisory Group, University of Oxford - Member
  • Research Data Oxford Management, University of Oxford - Member
  • Data Intensive Bioscience Expert Working Group, BBSRC - Member
  • BBSRC Oxford Interdisciplinary Bioscience DTP - Data Management Analysis & Statistics module; Co-lead
  1. Philippe Rocca-Serra, Senior Research Lecturer
  2. Alejandra Gonzalez-Beltran, Research Lecturer
  3. Milo Thurston, Research Software Engineer
  4. Peter McQuilton, Knowledge Engineer
  5. Allyson Lister, Knowledge Engineer
  6. David Johnson, Research Software Engineer - Senior Research Associate
  7. Eamonn Maguire, Contractor (former DPhil student)
  8. Research Software Developer(s) - NEW POSTS....


BioSharing and ISA are long-standing, mature infrastructure and resources that I run. They serve a variety of stakeholder communities in the life sciences, providing them with access to registries of information on open community standards and a suite of software for the collection, curation and storage of data and its provenance, along with semantic technologies and data publication methods.

ISA infrastructure and ISA Commons

Embedded in several funded projects

Providing a toolkit and a community-driven format, implemented by a growing community of service providers, institutional projects and data journals, to facilitate standards-compliant collection, curation, sharing and publication of experiments in the life, natural and biomedical sciences.

Part of the UK interoperability components in the ELIXIR Excelerate platform.

BioSharing Information Resources

Embedded in several funded projects

Web-based curated and searchable portal to ensure that standards, databases and data policies in life sciences are informative and discoverable by researchers, publishers, funders and all stakeholders looking to make informed decisions.

Part of the UK interoperability components in the ELIXIR Excelerate platform.

StatO and OBI - Ontologies for Statistics Results and BioMedical Investigation

Embedded in several funded projects

The Ontology for Biomedical Investigations (OBI) project is an international, collaborative effort to build an integrated ontology for the description of biological and clinical investigations.

Digital platforms for scholarly publishing


Collaborations with scientific, technical and medical publishers, including Nature Publishing Group's Scientific Data, to develop novel data platforms to track and publish scholarly outputs.


NIH BD2K CEDAR - Centre for Expanded Data Annotation and Retrieval

Funds and duration: NIH, 2014-2018

CEDAR works to facilitate the use of metadata in the analysis of Big Data sets, contributing to the implementation of the NIH Big Data to Knowledge (BD2K) initiative's vision. We work with colleagues at Stanford and Yale Universities to create a unified framework that researchers can use to create consistent, easily searchable, standards-compliant metadata. As a partner in the centre, I also sit on the Steering Committee, bringing in ISA, BioSharing and our ontology activities.

NIH BD2K BioCADDIE - Biomedical and healthCAre Data Discovery and Indexing Ecosystem

Funds and duration: NIH, 2014-2017

BioCADDIE engages a broad community of stakeholders to create the NIH Big Data to Knowledge (BD2K) Data Discovery Index (DDI). The DDI will do for data what PubMed (and PubMed Central) did for the literature. I sit on its Executive and Steering Committee and lead several working groups, bridging our BioSharing activities on standards and metadata.


Funds and duration: BBSRC, MRC, NERC, 2014-2017 (phase 1); EC, 2015-2018

The UK Node contributes the country’s substantial bioinformatics expertise for researchers, computer scientists and data managers in the Life, Natural and Medical Sciences. We lead on standards and curation areas. The UK Node is also funded as part of the larger ELIXIR EXCELERATE grant, set to better integrate activities across all nodes.

IMI eTRIKS - European Translational Information and Knowledge Management Services

Funds and duration: Roche, 2014-2017

eTRIKS develops the knowledge management platform and services to support data-intensive translational research for the Innovative Medicines Initiative (IMI), Europe’s largest public-private initiative. Funded by Roche, we bring ISA, BioSharing and our expertise on community standards to this project.

COPO - Collaboratively Open Plant Omics

Funds and duration: BBSRC, 2015-2018

COPO develops a framework that uses existing services to facilitate the description, deposition and publication of datasets, and to enable their identification and citation, thereby increasing transparency and reproducibility.

UK-China collaboration on omics data publication and curation

Funds and duration: BBSRC, 2012-2015 (phase 1), 2015-2018 (phase 2)

Collaboration with GigaScience, a joint BioMed Central and BGI data journal with an associated database, to define common curation practices for omics-based datasets.

Metagenomics Data Infrastructure

Funds and duration: BBSRC, 2012-2015

Coordinated by EMBL-European Bioinformatics Institute, the Metagenomics service is being developed to be an automated pipeline for the curation, archiving and analysis of metagenomic data.

COSMOS - COordination Of Standards In MetabOlomicS

Funds and duration: EC FP7, 2012-2015

Coordinated by EMBL-European Bioinformatics Institute, COSMOS (Coordination of Standards in Metabolomics) has brought together European metabolomics data providers to set and promote community standards.

PhenoMeNal: Infrastructure for phenome and metabolome analysis

Funds and duration: EC H2020, 2015-2018

NEW project starting Sep 2015. More info and website coming soon!

MultiMot: Infrastructure for cell migration data

Funds and duration: EC H2020, 2015-2018

NEW project starting Sep 2015. More info and website coming soon!


The NIH BD2K biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE). White Paper.
From peer-reviewed to peer-reproduced: a role for data standards, models and computational workflows in scholarly publishing. PLoS ONE.
The NIH BD2K Center for Expanded Data Annotation and Retrieval (CEDAR). J Am Med Inform Assoc.
An open data ecosystem for cell migration research. Trends Cell Biol.
Modeling a microbial community and biodiversity assay with OBO Foundry ontologies: the interoperability gains of a modular approach. Database.
LinkedISA: semantic representation of ISA-Tab experimental metadata. BMC Bioinformatics.
A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nature Biotechnology.
The Risa R/Bioconductor package: integrative data analysis from experimental metadata and back again. BMC Bioinformatics.
A sea of standards for omics data: sink or swim? J Am Med Inform Assoc.
Standardizing data. Nature Nanotechnology.
OntoMaton: a Bioportal powered ontology widget for Google Spreadsheets. Bioinformatics.
MetaboLights—an open-access general-purpose repository for metabolomics studies and associated meta-data. NAR Database Issue.
Towards interoperable bioscience data. Nature Genetics.
The Stem Cell Discovery Engine: An integrated repository and analysis system for cancer stem cell comparisons. NAR Database Issue.

All my publications are here.