Frequently Asked Questions (FAQ)

Are the dataset submissions open to all researchers?#

The repository is Open Access for all researchers. Viewing the data is possible without limitation; downloading datasets is only possible for registered users (to avoid misuse), but anyone interested can register. Submitting datasets requires registration (for further correspondence and curation requests, if applicable) but is open to all scientists.

Does the repository assign a stable persistent identifier (PID) for each dataset at publication, such as a digital object identifier (DOI)?#

The Chemotion-Repository assigns a DOI to each dataset (as well as to molecules and reactions) through DataCite, via the DataCite agency TIB Hannover. The name used in the Publisher namespace is “Chemotion.net”. The DOI prefix assigned to the Chemotion-Repository is 10.14272.
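
As a small illustration of the DOI prefix mentioned above, the following Python sketch checks whether a DOI carries the Chemotion-Repository prefix and builds a link through the public doi.org resolver. The function names and the suffix `EXAMPLE-SUFFIX` are hypothetical placeholders chosen for this example, not real repository identifiers.

```python
# Prefix stated in this FAQ for the Chemotion-Repository.
CHEMOTION_DOI_PREFIX = "10.14272"


def is_chemotion_doi(doi: str) -> bool:
    """Return True if the DOI has the Chemotion prefix and a non-empty suffix."""
    prefix, sep, suffix = doi.strip().partition("/")
    return prefix == CHEMOTION_DOI_PREFIX and sep == "/" and suffix != ""


def doi_to_url(doi: str) -> str:
    """Build a resolver link for a DOI using the public doi.org resolver."""
    return "https://doi.org/" + doi.strip()


if __name__ == "__main__":
    example = "10.14272/EXAMPLE-SUFFIX"  # hypothetical suffix, for illustration only
    print(is_chemotion_doi(example), doi_to_url(example))
```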

Under which kind of license are the data of the Chemotion-Repository available? Is there any restriction for derivative use or commercial use?#

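Data are available under the CC BY-SA 4.0 license: https://creativecommons.org/licenses/by-sa/4.0/. This license permits derivative and commercial use, provided that attribution is given and derivatives are shared under the same license.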

Does the Chemotion-Repository charge access fees or subscription fees?#

No. Viewing the data is possible without limitation (Open Access); only the download of datasets is limited to registered users (to avoid misuse). Anyone interested can register easily, at no cost and without any disadvantages.

Is there a long-term data management plan (including funding) to ensure that datasets are maintained for the foreseeable future?#

The repository will be part of the Science Data Center initiative of the federal state of Baden-Württemberg (https://mwk.baden-wuerttemberg.de/de/service/presse/pressemitteilung/pid/vier-science-data-centers-in-baden-wuerttemberg/) and is planned to become part of the national research data infrastructure in Germany. A long-term storage and archival plan is being developed with the Steinbuch Center for Computing under these umbrellas.

Are there examples that demonstrate the acceptance within the relevant research community?#

Two examples are given below: a very early one showing acceptance by reviewers, and another showing how repository data can be integrated into a publisher’s website. Further examples can be found in the repository, through the links from the datasets to the relevant publications.
The first published article with full data deposition in Chemotion appeared in Org. Lett., with all datasets referenced in the SI. All data were published in the Chemotion-Repository before the final acceptance of the publication, showing acceptance by the reviewers and the publisher:
N. Jung, S. Grässle, D. S. Lütjohann, S. Bräse, Org. Lett. 2014, 16, 1036. https://pubs.acs.org/doi/10.1021/ol403313h
Another example is a publication in BJOC in 2018, for which the publisher developed new options for listing research data based on the functions of the repository (indexing by RInChI and links to the Chemotion-Repository): Y.-C. Huang, A. Nguyen, S. Vanderheiden, S. Gräßle, N. Jung, S. Bräse, Beilstein J. Org. Chem. 2018, 14, 515. https://www.beilstein-journals.org/bjoc/articles/14/37
Relevance to materials science is demonstrated by a recently accepted manuscript: Synthesis of Functionalized Azobiphenyl- and Azoterphenyl- Ditopic Linkers: Modular Building Blocks for Photoresponsive Smart Materials, S. Grosjean, P. Hodapp, Z. Hassan, C. Wöll, M. Nieger, S. Bräse, ChemistryOpen 2019, accepted.

Are there any entries in other databases for the Chemotion-Repository?#

Chemotion is listed in the following registries:

re3data: https://www.re3data.org/repository/r3d100010748

FAIRsharing.org: https://fairsharing.org/biodbcore-001268/

RIsources: https://risources.dfg.de/detail/RI_00351_en.html

Is there a repository Twitter handle or similar activities?#

Hints and demos for the repository and the ELN can be found on YouTube: https://www.youtube.com/channel/UCWBwk4ZSXwmDzFo_ZieBcAw

How large is your current user base?#

Registered users: 163. No registration is necessary to view the data, and access to the data is not tracked.

How many datasets are currently hosted by the repository?#

The repository currently hosts 3251 analyses comprising 8045 datasets.

How long has your resource been available to the community?#

Since 2014 (with an interruption due to major updates and rework in 2016-2018).

What type of experimental data can be hosted by the repository? (If the repository only accepts specific file formats please state what these are.)#

There is no limitation with respect to file formats. Open file formats are preferred, and special visualization tools exist, e.g. for JCAMP spectra.

What is the maximum file size that can be handled by the repository?#

There is no limit so far, but we are considering a limit of 50 MB in the future.

Are there any limitations to the amount of data that an individual is able to upload?#

Not so far. We will limit the size if misuse is detected. Because all data to be disclosed are reviewed, misuse will be noticed.

Does the repository have the facility to provide controlled access to sensitive data?#

Users can (1) collect data in their private account without disclosure or (2) disclose the data under an embargo. If the embargo option is selected, the data will, after review, be available only to the user and to external reviewers (notified by email if the user so wishes). Release takes place after additional confirmation by the user.

Is the repository able to facilitate confidential peer review of hosted datasets? If yes, please briefly describe the workflow for reviewing hosted datasets, including how reviewers may access data which are not public at the time of review.#

Review begins with an automated quick check of the most common data types, such as NMR data (the number of signals required is compared with the number analyzed), followed by peer review by the repository owners. The review workflow includes comment functions for the datasets that allow reviewers to reject data and to ask for revision (handled via the repository UI, with notifications by email). If the data are submitted to the repository under embargo, the user can grant access to individual external reviewers (through limited accounts, provided by email).

How can I link hosted datasets to relevant articles after publication?#

A DOI or a reference can be added to a single data submission or to a collection of data submissions.

In addition, the repository provides a function to retrieve a virtual DOI for a dataset even if the data are not disclosed yet. This makes it possible to give the correct DOI in the supporting information of a publication before the dataset is available to the public, so that publication and dataset can be linked directly, for example in the SI. Please see our “How To” topics.

Are there examples of how to present the DOI of datasets in articles or the SI?#

To see how data are linked to the repository, please consult the supporting information of the selected examples:

Example 1: link to publication:

Example 2: link to publication:

Link to Datasets (examples):

Has the repository undergone WDS (World Data System), DSA (Data Seal of Approval) or CTS (Core Trust Seal) accreditation?#

No, but we plan to apply for the Core Trust Seal soon.

Are researchers able to modify or remove datasets after publication?#

Only the submission of a new version is possible. A new version can be submitted, but the old one will remain visible and accessible.

Do you guarantee persistent access to datasets, and for how long?#

For at least 10 years.

Is there any curation support for researchers uploading their datasets? If so, please describe this briefly.#

The data are curated manually by peer review, supported by an automated analysis of NMR data. The analyzed NMR signals are compared automatically with the signals expected from the formula of the compound provided.
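
To make the idea of such a check concrete, here is a minimal Python sketch of one possible plausibility test: it counts the hydrogen atoms in a simple sum formula and compares that number with the total of the integrals reported for a 1H NMR spectrum. This is an illustrative assumption only, not the repository's actual algorithm. The function names are hypothetical, and the parser handles only flat formulas without brackets, charges or isotopes.

```python
import re
from typing import Sequence


def count_atoms(formula: str, element: str) -> int:
    """Count atoms of one element in a flat sum formula such as 'C7H8'."""
    total = 0
    # Each match is an element symbol followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol == element:
            total += int(number) if number else 1
    return total


def protons_match(formula: str, reported_integrals: Sequence[int]) -> bool:
    """Check that the reported 1H integrals add up to the H count of the formula."""
    return sum(reported_integrals) == count_atoms(formula, "H")


if __name__ == "__main__":
    # Toluene, C7H8: five aromatic protons and three methyl protons.
    print(protons_match("C7H8", [5, 3]))
```

A check like this can only flag obvious inconsistencies, which fits the FAQ's description of automation as a support for manual peer review rather than a replacement.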

Do you capture any metadata about hosted datasets in a standardised way? If so, please state which metadata formats are used.#

Metadata support is provided for information such as analysis types and reaction names. The metadata can be selected via a searchable dropdown menu that offers terms from the embedded Ontobee ontologies CHEMINF and CHMO (http://www.ontobee.org/). We use the DataCite metadata format for DOI submissions.
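
For orientation, the sketch below shows what a minimal DataCite-style record could look like and how one might check it for the six properties that the DataCite schema marks as mandatory (identifier, creator, title, publisher, publication year, resource type). The key names loosely follow DataCite's JSON representation and should be verified against the schema documentation. The record values, the DOI suffix and the helper `missing_fields` are hypothetical and do not describe the repository's internal format.

```python
# Keys assumed to correspond to the mandatory DataCite properties.
REQUIRED_KEYS = ("doi", "creators", "titles", "publisher", "publicationYear", "types")


def missing_fields(record: dict) -> list:
    """Return the required keys that are absent or empty in the record."""
    return [key for key in REQUIRED_KEYS if not record.get(key)]


# Hypothetical example record; only the publisher name and DOI prefix come from this FAQ.
example_record = {
    "doi": "10.14272/EXAMPLE-SUFFIX",
    "creators": [{"name": "Doe, Jane"}],
    "titles": [{"title": "1H NMR of an example compound"}],
    "publisher": "Chemotion.net",
    "publicationYear": 2019,
    "types": {"resourceTypeGeneral": "Dataset"},
}

if __name__ == "__main__":
    print(missing_fields(example_record))
```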

How are data accessible if a dataset is under embargo?#

The embargo can only be released by the submitter of the datasets. Release is possible only once all data belonging to the embargo bundle are complete and reviewed. Please see also the Usage Pages for further information.
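
The release rule described above can be sketched as a simple check. The following Python snippet is an illustration under stated assumptions, not the repository's implementation: the `Submission` class, its fields and the `can_release` function are hypothetical, and treating an empty bundle as not releasable is an additional assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Submission:
    """One item in an embargo bundle (hypothetical model)."""
    name: str
    complete: bool
    reviewed: bool


def can_release(bundle: List[Submission], requested_by: str, submitter: str) -> bool:
    """Allow release only for the submitter and only if every item is complete and reviewed."""
    if requested_by != submitter:
        return False
    if not bundle:
        # Assumption: an empty bundle has nothing to release.
        return False
    return all(item.complete and item.reviewed for item in bundle)


if __name__ == "__main__":
    bundle = [Submission("1H NMR", True, True), Submission("13C NMR", True, False)]
    print(can_release(bundle, "alice", "alice"))
```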