Introduction
The RSEc version-controlled repository federates metadata for research software, predominantly within the life sciences domain. These metadata cover a wide spectrum of use cases, spanning software discovery, evaluation, deployment, and execution. Centralised in an open and version-controlled repository, these metadata can be used to cross-link multiple services, to facilitate curation, and to provide insights on bioinformatics software through their aggregation and analysis. This page contains the information and links to access and understand the software metadata provided by the Research Software Ecosystem, as well as contribution guidelines and support channels.
Metadata Repository contents
The RSEc metadata can be accessed in the dedicated GitHub repository. The two main folders are the imports folder, which contains one subfolder per metadata source, and the data folder, which contains one subfolder per bio.tools entry, combining the bio.tools entry itself with the metadata files from other sources that are directly linked to it. An example of this organisation is illustrated in Figure 1.
Fig. 1: Example organisation of the metadata files imported in the RSEc metadata repository
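As a rough illustration of this layout, the sketch below lists the metadata files found for each tool in a local clone. It assumes only what is described above (a data folder with one subfolder per bio.tools entry); the function name and the path passed to it are hypothetical, and the file names inside each subfolder are not prescribed here.

```python
from pathlib import Path


def list_tool_metadata(repo_root):
    """Map each tool subfolder under data/ to the sorted names of its files."""
    data_dir = Path(repo_root) / "data"
    tools = {}
    # One subfolder per bio.tools entry; stray files directly under data/ are ignored.
    for tool_dir in sorted(p for p in data_dir.iterdir() if p.is_dir()):
        tools[tool_dir.name] = sorted(
            f.name for f in tool_dir.iterdir() if f.is_file()
        )
    return tools
```

Calling `list_tool_metadata("path/to/clone")` on a local checkout returns a dictionary keyed by tool folder name, which can be a convenient starting point for counting how many sources describe each tool.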
Supported Formats
Details about the specific formats for each of the federated resources can be found in the following places:
Resource | Description / Link |
---|---|
Bio.tools API Reference | Bio.tools API Reference |
OpenEBench | OpenEBench Technical metrics and endpoints description Tool JSON Schema Metrics JSON Schema |
Bioconda | Bioconda contribution guidelines |
Biocontainers | WIP |
Galaxy Codex | Documentation work-in-progress |
Debian Med | The YAML files describing the packages are based on information extracted from the Ultimate Debian Database using a custom import script |
BIII | The metadata describing the software are serialized as Bioschemas-based JSON-LD files, using a custom import script |
Bioconductor | The metadata describing the software are in |
Most metadata formats for a given source include cross-links to other sources:
Source (rows) / Destination (columns) | bio.tools | OpenEBench | Bioconda | Biocontainers | Galaxy Codex | Debian Med | BIII | Bioconductor |
---|---|---|---|---|---|---|---|---|
bio.tools | n/a | | url entries of the download key where url starts with "https://anaconda.org/bioconda/", the remainder of the url being the Bioconda package name | | | url entries of the download key where url starts with "https://tracker.debian.org/pkg/", the remainder of the url being the Debian package name | | |
OpenEBench | List elements that have an @id key starting with "https://openebench.bsc.es/monitor/metrics/biotools" | n/a | List elements that have an @id key starting with "https://openebench.bsc.es/monitor/metrics/bioconda" | | List elements that have an @id key starting with "https://openebench.bsc.es/monitor/metrics/galaxy" | | | |
Bioconda | YAML list extra.identifiers, CURIEs starting with "biotools:" | | n/a | | For usegalaxy.eu, YAML list extra.identifiers, CURIEs starting with "usegalaxy-eu:" | | | |
Biocontainers | | | | n/a | | | | |
Galaxy Codex | ‘bio.tool_id’ key in the JSON file | | ‘Conda_id’ key in the JSON file | | n/a | | | |
Debian Med | YAML list registries, CURIEs are in the entry key when name is "bio.tools" | | YAML list registries, CURIEs are in the entry key when name is "conda:bioconda" | | | n/a | | |
BIII | | | | | | | n/a | |
Bioconductor | | | | | | | | n/a |
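To make two of these cross-link rules concrete, here is a minimal sketch that resolves links from already-parsed metadata. It assumes a bio.tools entry is loaded as a dictionary whose download key holds a list of objects with a url field, and that a Bioconda recipe is loaded as a dictionary with an extra.identifiers list; the function names are hypothetical and the real files may contain additional structure.

```python
BIOCONDA_PREFIX = "https://anaconda.org/bioconda/"
DEBIAN_PREFIX = "https://tracker.debian.org/pkg/"


def biotools_cross_links(entry):
    """Return Bioconda and Debian package names found in a bio.tools entry.

    `entry` is assumed to be a dict with a "download" list of {"url": ...} items.
    """
    links = {"bioconda": [], "debian": []}
    for download in entry.get("download", []):
        url = download.get("url", "")
        # The package name is whatever follows the known URL prefix.
        if url.startswith(BIOCONDA_PREFIX):
            links["bioconda"].append(url[len(BIOCONDA_PREFIX):].strip("/"))
        elif url.startswith(DEBIAN_PREFIX):
            links["debian"].append(url[len(DEBIAN_PREFIX):].strip("/"))
    return links


def bioconda_biotools_ids(recipe):
    """Return bio.tools IDs listed as "biotools:" CURIEs in extra.identifiers."""
    identifiers = (recipe.get("extra") or {}).get("identifiers") or []
    prefix = "biotools:"
    return [i[len(prefix):] for i in identifiers if i.startswith(prefix)]
```

The same prefix-matching pattern should carry over to the other rows of the table, for example the "usegalaxy-eu:" CURIEs in Bioconda recipes, by swapping the prefix.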
Metadata Import Workflow
The metadata are imported and updated by a GitHub Actions workflow that runs weekly and imports the updated metadata from all sources. Each import task (listed in the table below) is an independent GitHub Action that queries a specific source and updates the metadata in the git repository. It usually runs a Python script that:
- removes all existing data for that source from the repository checkout.
- retrieves the latest version of the metadata, using e.g. an HTTP API, a git repository checkout, or database access.
- formats these metadata in a format which is as close as possible to the source format, yet compatible with git (e.g. YAML or JSON reformatting is sometimes required).
- commits this version of the metadata.

The outline of this workflow is illustrated in Fig. 2.
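The four steps above can be sketched as follows. This is an illustrative outline, not one of the project's actual import scripts: `run_import`, `commit_import`, and the `fetch_records` callable are hypothetical names, and reading "compatible with git" as "sorted keys and stable indentation" is an assumption made for the example.

```python
import json
import shutil
import subprocess
from pathlib import Path


def run_import(source_dir, fetch_records):
    """Clean a source folder, then write each fetched record as a JSON file."""
    source_dir = Path(source_dir)
    # Step 1: remove previously imported data for this source.
    if source_dir.exists():
        shutil.rmtree(source_dir)
    source_dir.mkdir(parents=True)
    # Step 2: retrieve the latest metadata; fetch_records is a hypothetical
    # callable returning {record_name: record_dict} (HTTP API, git, database...).
    records = fetch_records()
    # Step 3: serialise with sorted keys and fixed indentation so that
    # unchanged records produce identical files between runs.
    written = []
    for name, record in sorted(records.items()):
        path = source_dir / f"{name}.json"
        text = json.dumps(record, indent=2, sort_keys=True) + "\n"
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written


def commit_import(repo_root, source_dir, message):
    """Step 4: stage and commit the refreshed folder using the git CLI."""
    subprocess.run(["git", "add", "--all", str(source_dir)], cwd=repo_root, check=True)
    subprocess.run(["git", "commit", "-m", message], cwd=repo_root, check=True)
```

Deleting the folder before re-importing means records that disappeared upstream also disappear from the repository. Note that `git commit` exits with an error when nothing changed, so a real script would likely need to check for changes before committing.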
CI Import workflows in the repository
Contributing guidelines
We welcome any contribution to the project. Please refer to the governance document, and get in contact with us (see Contact Information).
Citation
Ienasescu H, Capella-Gutiérrez S, Coppens F et al. The ELIXIR research software ecosystem [version 1; not peer reviewed]. F1000Research 2023, 12:988 (slides) https://doi.org/10.7490/f1000research.1119585.1
Contact Information
A public Gitter channel is available to get in contact with the project team: https://app.gitter.im/#/room/#bio-tools_ecosystem:gitter.im