Topics
Data journals: Methodologies for defining quality datasets (e.g. metadata, metrics, policies), standard narrative descriptions of datasets (data papers), and evaluation of datasets within given communities or across disciplines.
Data publishing workflows: Integration of the research life-cycle with research-outcome dissemination workflows, including data publishing policies, validation steps, and the combination of publication and dataset publishing across different deposition platforms (e.g. institutional repositories and data archives).
Dataset peer-review: Techniques for manually or automatically validating the quality of datasets prior to publishing and dissemination (e.g. tools, services, metrics), and integration of underlying research-infrastructure services for data validation/management with data journal practices or data publishing workflows.
Data citation: Data citation practices, including metadata about datasets, granularity of datasets, citation for rewards and citation for re-use of datasets, and interlinking with publications (a minimal citation-metadata sketch follows this list).
Dataset contextualization: Data models, human-facing tools, and services for enriching datasets with semantic information in order to improve their discovery, re-use and quality evaluation.
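As a concrete illustration of the data citation topic above, the sketch below builds a minimal, DataCite-style metadata record for a dataset and renders a human-readable citation from it. The field names loosely follow the DataCite kernel, and the DOI, values, and citation format are all hypothetical, meant only to show the kind of metadata involved.

```python
# Hypothetical DataCite-style metadata record for a dataset; all values illustrative.
dataset_metadata = {
    "identifier": "10.1234/example-dataset",   # hypothetical DOI
    "creators": ["Doe, Jane", "Smith, John"],
    "title": "Example survey dataset, wave 3",
    "publisher": "Example Data Archive",
    "publicationYear": 2014,
    "version": "1.2",
}

def format_citation(md):
    """Render a human-readable dataset citation from the metadata record."""
    creators = "; ".join(md["creators"])
    return (f"{creators} ({md['publicationYear']}): {md['title']} "
            f"(v{md['version']}). {md['publisher']}. "
            f"https://doi.org/{md['identifier']}")

print(format_citation(dataset_metadata))
```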
As in the previous edition, general topics of this workshop also include, but are not limited to, the following research areas:
Metadata formats: Bibliographic metadata formats (e.g. Dublin Core, JATS, MARC, MODS), scientific metadata formats (e.g. EAD, LIDO), authority file formats, and how these relate to interlinking or contextualizing research outputs, e.g. to enable automated re-use and citation of datasets (e.g. DataCite DOIs for research data).
Access services: Services for exporting metadata (e.g. OAI-PMH, RDA, OAI-ORE) in order to reduce interoperability barriers, improve discovery of research outcomes, and facilitate interlinking or contextualizing research outputs (a minimal OAI-PMH harvesting sketch follows this list).
Data models: Conceptual models expressing the relationships between publications, datasets and other information apt for re-use, contextualization, etc.; e.g. CERIF.
Aggregation services: Services for robust and scalable collection, integration, storage and interlinking of heterogeneous objects and metadata from publication, dataset, and contextual content data sources; e.g. OAI-PMH services (D-NET software toolkit, Repox, MINT, etc.), Linked Data services, Big Data facilities (column stores and NoSQL databases).
Linking and contextualization services: Services for processing interlinked objects and metadata relating to research outcomes for the purpose of enrichment, disambiguation, and annotation; e.g. de-duplication tools, mining tools for data inference, and Linked Data annotation tools (a Linked Data interlinking sketch follows this list).
Future publication models and services: Novel concepts of “compound” publications and services supporting such publications (see also the “Beyond the PDF” event and the FORCE11 manifesto), i.e. “information packages” combining datasets, traditional publications and other contextual entities into single navigable and/or machine re-usable objects (e.g. enhanced publications, modular articles, Elsevier’s article of the future, research objects). Services include the creation, storage, preservation, visualization and re-use of such publications (see previous bullet); e.g. Taverna for research objects, Utopia Documents for interactive publications.
Enabling experiment repeatability and reproducibility: Models and services supporting the realization of “executable” digital publications; these possibly include a narrative part plus the research data and components necessary to execute the experiment underlying the research; e.g. Taverna and myExperiment.org for research objects.
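To make the access and aggregation topics above more tangible, the sketch below harvests Dublin Core records from an OAI-PMH endpoint using only the Python standard library. The base URL is a placeholder, and a real harvester would also handle resumption tokens, selective harvesting, and error responses.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Placeholder endpoint; any OAI-PMH base URL is queried the same way.
BASE_URL = "https://repository.example.org/oai"

# ListRecords with the mandatory Dublin Core metadata format (oai_dc).
with urllib.request.urlopen(BASE_URL + "?verb=ListRecords&metadataPrefix=oai_dc") as response:
    tree = ET.parse(response)

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Print a title/identifier pair for each harvested record.
for record in tree.iterfind(".//oai:record", NS):
    title = record.find(".//dc:title", NS)
    identifier = record.find(".//dc:identifier", NS)
    print((title.text if title is not None else "(no title)"),
          "->",
          (identifier.text if identifier is not None else "(no identifier)"))
```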
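For the linking and contextualization topic, the sketch below uses the rdflib library to assert a simple Dublin Core Terms link between a publication and the dataset it references, and serializes the result as Turtle. The DOIs are hypothetical, and dcterms:references / dcterms:isReferencedBy is only one of several plausible vocabularies for expressing such links.

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS

g = Graph()

# Hypothetical identifiers for a paper and the dataset it builds on.
paper = URIRef("https://doi.org/10.1234/example-paper")
dataset = URIRef("https://doi.org/10.5678/example-dataset")

# Minimal contextual metadata plus the interlinking statements.
g.add((dataset, DCTERMS.title, Literal("Example ocean temperature dataset")))
g.add((paper, DCTERMS.references, dataset))
g.add((dataset, DCTERMS.isReferencedBy, paper))

print(g.serialize(format="turtle"))
```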