
Documentation

Introduction

The Research Software Ecosystem (RSEc) version-controlled repository federates metadata for research software, predominantly within the life sciences domain. These metadata cover a wide spectrum of use cases, spanning software discovery, evaluation, deployment, and execution. Centralised in an open and version-controlled repository, these metadata can be used to cross-link multiple services, to facilitate curation, and to provide insights on bioinformatics software through their aggregation and analysis. This page provides the information and links needed to access and understand the software metadata provided by the Research Software Ecosystem, as well as contribution guidelines and support channels.

Metadata Repository contents

The RSEc metadata can be accessed in the dedicated GitHub repository. The two main folders are the imports folder, which contains one subfolder per metadata source, and the data folder, which contains one subfolder per bio.tools entry. Each data subfolder combines the bio.tools metadata file with the metadata files from other sources that are directly linked to that entry. An example of this organisation is illustrated in Figure 1.

Fig. 1: Example organisation of the metadata files imported in the RSEc metadata repository

imports/
    biotools/
        software1.biotools.json
        software2.biotools.json
        software3.biotools.json
    bioconda/
        bioconda_software1.yaml
        bioconda_software2.yaml
        bioconda_software4.yaml
data/
    software1/
        software1.biotools.json
        bioconda_software1.yaml
    software2/
        software2.biotools.json
        bioconda_software2.yaml
    software3/
        software3.biotools.json
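
As an illustration of the data folder organisation described above, the following minimal sketch lists the metadata files found in each tool subfolder. It assumes a local checkout laid out as in the Figure 1 example; the function names `list_metadata_files` and `build_example_layout` are hypothetical helpers introduced for this example, not part of the RSEc tooling.

```python
import tempfile
from pathlib import Path


def list_metadata_files(data_dir: Path) -> dict[str, list[str]]:
    """Map each tool subfolder of `data_dir` to the sorted names of its files."""
    return {
        tool_dir.name: sorted(f.name for f in tool_dir.iterdir() if f.is_file())
        for tool_dir in sorted(data_dir.iterdir())
        if tool_dir.is_dir()
    }


def build_example_layout(root: Path) -> Path:
    """Recreate the `data` part of the Figure 1 example with empty placeholder files."""
    layout = {
        "software1": ["software1.biotools.json", "bioconda_software1.yaml"],
        "software2": ["software2.biotools.json", "bioconda_software2.yaml"],
        "software3": ["software3.biotools.json"],
    }
    data_dir = root / "data"
    for tool, files in layout.items():
        (data_dir / tool).mkdir(parents=True)
        for name in files:
            (data_dir / tool / name).touch()
    return data_dir


if __name__ == "__main__":
    # Build the example layout in a temporary directory and print it per tool.
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = build_example_layout(Path(tmp))
        for tool, files in list_metadata_files(data_dir).items():
            print(tool, files)
```

Against a real checkout, `list_metadata_files(Path("data"))` would be called from the repository root instead of on the temporary example layout.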

Supported Formats

Details about the specific formats for each of the federated resources can be found in the following places:

Resource | Description / Link
--- | ---
Bio.tools | Bio.tools API Reference
OpenEBench | OpenEBench Technical metrics and endpoints description; Tool JSON Schema; Metrics JSON Schema
Bioconda | Bioconda contribution guidelines
Biocontainers | Work in progress
Galaxy Codex | Documentation work-in-progress
Debian Med | The YAML files describing the packages are based on information extracted from the Ultimate Debian Database using a custom import script
BIII | The metadata describing the software are serialized as Bioschemas-based JSON-LD files, using a custom import script

Most metadata formats for a given source include cross-links to other sources:

  • bio.tools to Bioconda: url entries of the download key where the url starts with "https://anaconda.org/bioconda/", the remainder of the URL being the Bioconda package name.
  • bio.tools to Debian Med: url entries of the download key where the url starts with "https://tracker.debian.org/pkg/", the remainder of the URL being the Debian package name.
  • OpenEBench to bio.tools: list elements that have an @id key starting with "https://openebench.bsc.es/monitor/metrics/biotools".
  • OpenEBench to Bioconda: list elements that have an @id key starting with "https://openebench.bsc.es/monitor/metrics/bioconda".
  • OpenEBench to Galaxy: list elements that have an @id key starting with "https://openebench.bsc.es/monitor/metrics/galaxy".
  • Bioconda to bio.tools: YAML list extra.identifiers, CURIEs starting with "biotools:".
  • Bioconda to Galaxy (usegalaxy.eu): YAML list extra.identifiers, CURIEs starting with "usegalaxy-eu:".
  • Galaxy Codex to bio.tools: ‘bio.tool_id’ key in the JSON file.
  • Galaxy Codex to Bioconda: ‘Conda_id’ key in the JSON file.
  • Debian Med to bio.tools: YAML list registries; the CURIEs are in the entry key of items whose name is "bio.tools".
  • Debian Med to Bioconda: YAML list registries; the CURIEs are in the entry key of items whose name is "conda:bioconda".
  • Biocontainers and BIII: no cross-links are listed.
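
The cross-link rules above can be sketched in a few lines of Python. This is only an illustration: it assumes the metadata files have already been parsed into dictionaries and lists (for instance with a JSON or YAML parser), the function names are hypothetical, and the sample records are made up. Only the key names and URL prefixes are taken from the descriptions above.

```python
BIOCONDA_URL_PREFIX = "https://anaconda.org/bioconda/"
DEBIAN_URL_PREFIX = "https://tracker.debian.org/pkg/"
OPENEBENCH_ID_PREFIX = "https://openebench.bsc.es/monitor/metrics/"


def packages_from_biotools(entry: dict, url_prefix: str) -> list[str]:
    """Package names taken from `download` URLs of a bio.tools entry that start with `url_prefix`."""
    names = []
    for download in entry.get("download") or []:
        url = download.get("url") or ""
        if url.startswith(url_prefix):
            # The remainder of the URL is the package name.
            names.append(url[len(url_prefix):].strip("/"))
    return names


def openebench_entries_for(entries: list[dict], source: str) -> list[dict]:
    """OpenEBench list elements whose `@id` starts with the prefix for `source` (e.g. "biotools")."""
    prefix = OPENEBENCH_ID_PREFIX + source
    return [e for e in entries if (e.get("@id") or "").startswith(prefix)]


def curies_from_bioconda(recipe: dict, curie_prefix: str) -> list[str]:
    """Identifiers from `extra.identifiers` of a parsed Bioconda recipe, e.g. with prefix "biotools:"."""
    identifiers = (recipe.get("extra") or {}).get("identifiers") or []
    return [i[len(curie_prefix):] for i in identifiers if i.startswith(curie_prefix)]


def registry_entries_from_debian(package: dict, registry_name: str) -> list[str]:
    """`entry` values of the `registries` items whose `name` equals `registry_name`."""
    registries = package.get("registries") or []
    return [r["entry"] for r in registries if r.get("name") == registry_name and "entry" in r]


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    biotools_entry = {"download": [
        {"url": "https://anaconda.org/bioconda/software1"},
        {"url": "https://tracker.debian.org/pkg/software1"},
        {"url": "https://example.org/software1.tar.gz"},
    ]}
    print(packages_from_biotools(biotools_entry, BIOCONDA_URL_PREFIX))
    print(packages_from_biotools(biotools_entry, DEBIAN_URL_PREFIX))
    print(curies_from_bioconda({"extra": {"identifiers": ["biotools:software1"]}}, "biotools:"))
```

Keeping each rule in its own small function mirrors the list above, so a rule can be adjusted if a source changes its format without touching the others.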

Metadata Import Workflow

The metadata is imported and updated by a GitHub Actions workflow that runs weekly and imports the updated metadata from all sources. Each import task (listed in the table below) is an independent GitHub action that queries a specific source and updates the metadata in the git repository. It usually runs a Python script that:

  • removes all existing data for that source from the repository checkout;
  • retrieves the latest version of the metadata, using e.g. an HTTP API, a git repository checkout, or database access;
  • converts these metadata to a format that is as close as possible to the source format, yet compatible with git (e.g. YAML or JSON reformatting is sometimes required);
  • commits this version of the metadata.

The outline of this workflow is illustrated in Fig. 2.

Resource | CI code location
--- | ---
Bio.tools | https://github.com/research-software-ecosystem/utils/tree/main/biotools-import
OpenEBench | https://github.com/research-software-ecosystem/utils/tree/main/openebench-import
Bioconda | https://github.com/research-software-ecosystem/utils/tree/main/bioconda-import
Biocontainers | https://github.com/research-software-ecosystem/utils/tree/main/biocontainers-import
Galaxy Codex | https://github.com/research-software-ecosystem/utils/tree/main/galaxytool-import
Debian Med | https://github.com/research-software-ecosystem/utils/tree/main/debian-med-import
BIII | https://github.com/research-software-ecosystem/utils/tree/main/biii-import

Fig. 2: CI import workflows in the repository

bio.tools, OpenEBench, BioConda, BIII, BioContainers, Galaxy CoDeX, Debian Med → metadata import → RSEc
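
The four import steps described in this section (clean, retrieve, format, commit) can be sketched as follows. This is a simplified illustration under stated assumptions, not the code of the actual import scripts: `run_import` and its `fetch` callback are hypothetical names, records are assumed to be JSON-serialisable dictionaries keyed by tool name, and the file naming follows the Figure 1 example.

```python
import json
import shutil
import subprocess
from pathlib import Path
from typing import Callable


def run_import(repo_dir: Path, source: str,
               fetch: Callable[[], dict[str, dict]],
               commit: bool = False) -> list[Path]:
    """Re-import the metadata of one source into `repo_dir/imports/<source>`."""
    target = repo_dir / "imports" / source

    # 1. Clean: remove the data previously imported from this source.
    shutil.rmtree(target, ignore_errors=True)
    target.mkdir(parents=True)

    # 2. Retrieve: `fetch` stands in for an HTTP API call, a git checkout
    #    or a database query, depending on the source.
    records = fetch()

    # 3. Format: write stable, diff-friendly JSON (sorted keys, fixed indent).
    written = []
    for name, record in sorted(records.items()):
        path = target / f"{name}.{source}.json"
        path.write_text(json.dumps(record, indent=2, sort_keys=True) + "\n",
                        encoding="utf-8")
        written.append(path)

    # 4. Commit: only works if `repo_dir` is a git checkout; `git commit`
    #    exits with an error when there is nothing to commit.
    if commit:
        rel_target = str(target.relative_to(repo_dir))
        subprocess.run(["git", "add", "--all", rel_target], cwd=repo_dir, check=True)
        subprocess.run(["git", "commit", "-m", f"Update {source} metadata"],
                       cwd=repo_dir, check=True)
    return written
```

A call such as `run_import(Path("."), "biotools", fetch_biotools, commit=True)` would then refresh one source, where `fetch_biotools` is a placeholder for whatever retrieval function that source needs. Deleting the source folder before writing means entries removed upstream also disappear from the repository.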

Contributing guidelines

We welcome any contribution to the project. Please refer to the governance document, and get in contact with us (see Contact Information).

Citation

Ienasescu H, Capella-Gutiérrez S, Coppens F et al. The ELIXIR research software ecosystem [version 1; not peer reviewed]. F1000Research 2023, 12:988 (slides) https://doi.org/10.7490/f1000research.1119585.1

Contact Information

A public channel is available on Gitter to contact the project team: https://app.gitter.im/#/room/#bio-tools_ecosystem:gitter.im