Sci-K 2021

1st International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment
April 13, 2021 — Online
Co-located with The Web Conf 2021


In recent decades, we have experienced a substantial increase in the volume of published scientific articles and related research objects (e.g., data sets, software packages), a trend that is expected to continue. This opens up fundamental challenges, including generating large-scale machine-readable representations of scientific knowledge, making scholarly data discoverable and accessible, and designing reliable and comprehensive metrics to assess scientific impact. The main objective of Sci-K is to provide a forum for researchers and practitioners from different disciplines to present, learn from, and guide research related to scientific knowledge. Specifically, we foresee three main themes that cover the most important challenges in the field: representation, discoverability, and assessment.


There is an urgent need for flexible, context-sensitive, fine-grained, and machine-actionable representations of scholarly knowledge that are at the same time structured, interlinked, and semantically rich. Scientific Knowledge Graphs (SKGs) are becoming increasingly popular as infrastructures for representing scholarly knowledge. They are large networks describing the actors (e.g., authors, organisations), documents (e.g., publications, patents), ancillary material (e.g., research data, software), contextual information (e.g., projects, funding), and research knowledge (e.g., research topics, tasks, technologies) in this space, as well as their reciprocal relationships. These resources provide substantial benefits to researchers, companies, and policymakers by powering several data-driven services for navigating, analysing, and making sense of research dynamics. Examples of SKGs include Microsoft Academic Graph (MAG), AMiner, Open Academic Graph, Semantic Scholar, PID Graph, Open Research Knowledge Graph, OpenCitations, and the OpenAIRE research graph. Here, the main challenge is the design of ontologies able to conceptualise scholarly knowledge, model its representation, and enable its exchange across different SKGs.
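As a rough illustration of the structure described above, an SKG can be sketched as a set of subject-predicate-object triples linking actors, documents, ancillary material, contextual information, and research topics. The identifiers and relation names below (e.g., `hasAuthor`, `fundedBy`) are hypothetical and are not taken from any of the SKGs listed; real graphs rely on their own ontologies.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers and relation names, for illustration only.
TRIPLES = [
    Triple("author:a1", "affiliatedWith", "org:o1"),
    Triple("paper:p1", "hasAuthor", "author:a1"),
    Triple("paper:p1", "hasTopic", "topic:knowledge-graphs"),
    Triple("paper:p1", "usesDataset", "dataset:d1"),
    Triple("paper:p1", "fundedBy", "project:pr1"),
    Triple("paper:p2", "cites", "paper:p1"),
]


def build_index(triples):
    """Index outgoing edges by subject so neighbours can be looked up quickly."""
    index = defaultdict(list)
    for t in triples:
        index[t.subject].append((t.predicate, t.obj))
    return index


def objects_of(index, subject, predicate):
    """Return all objects linked to `subject` via `predicate`."""
    return [o for p, o in index.get(subject, []) if p == predicate]


index = build_index(TRIPLES)
authors_of_p1 = objects_of(index, "paper:p1", "hasAuthor")
```

In this toy graph, `objects_of(index, "paper:p2", "cites")` walks a citation edge, and the same lookup works for any of the entity types mentioned above.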



It is important that scholarly information is easily findable, discoverable, and visible, so that it can be mined and organised within SKGs. To this end, we need discovery tools able to crawl the Web and identify scholarly data, whether on a publisher’s website or elsewhere (institutional repositories, pre-print servers, open-access repositories, and others). This is a particularly challenging endeavour because it requires a deep understanding of both the scholarly communication landscape and the needs of a variety of stakeholders: researchers, publishers, funders, and the general public. In addition to the journal landing page, a paper's open version, software, and data sets are often shared via alternative channels that are disconnected from the journal page. Currently, this is a major obstacle that practitioners face in creating comprehensive knowledge graphs. In brief, the challenges are related to the discovery and extraction of entities and concepts, integration of information from heterogeneous sources, identification of duplicates, finding connections between entities, and identifying conceptual inconsistencies.



Due to the continuous growth in the volume of research output, rigorous approaches for the assessment of research impact are now more valuable than ever. In this context, we urgently need reliable and comprehensive metrics and indicators of the scientific impact and merit of publications, data sets, research institutions, individual researchers, and other relevant entities. Scientific impact refers to the attention a research work receives within its own and related disciplines, in social and mass media, etc. Scientific merit, on the other hand, relates to the quality aspects of a work, such as its novelty, reproducibility, FAIR-ness, and readability. Nowadays, due to the growing popularity of Open Science initiatives, a large number of useful science-related data sets have been made openly available, paving the way for the synthesis of more sophisticated indicators of scientific impact and merit and, consequently, more rigorous research assessment. For instance, in recent years, we have observed a surge of large SKGs, which provide very rich and relatively clean sources of information about academics, their publications, and relevant metadata. These SKGs can be used for the development of novel research assessment approaches.



Sci-K is calling for high-quality submissions around the three main themes of research related to scientific knowledge: representation, discoverability, and assessment. Topics of interest include, but are not limited to:

Keynote Speakers

Prof. Ludo Waltman

Centre for Science and Technology Studies (CWTS) at Leiden University

FAIRness in publishing and evaluating scientific knowledge

Abstract: The FAIR (findable, accessible, interoperable, and reusable) principles for scientific data are well known and widely supported. I will argue that these principles have a much broader relevance, applying not only to scientific data but more generally to the publication and evaluation of scientific knowledge. The idea of FAIRness in the publication and evaluation of scientific knowledge offers a unifying perspective on developments in a number of closely related areas, including open access publishing, preprinting, transparent peer review, and open metadata. My proposal to take a broader perspective on FAIRness provides an agenda for improving scholarly communication and research assessment. I will outline this agenda, summarize the opportunities that it offers, and discuss the responsibilities that it requires different actors in the research system to take. Lessons learned from the COVID-19 pandemic underscore the importance of implementing this agenda.

Prof. Staša Milojević

Luddy School of Informatics, Computing and Engineering, Indiana University Bloomington (IUB)

Capturing tectonic shifts in contemporary science

Abstract: Scientific enterprise has undergone a major transformation since World War II. In order to help solve increasingly complex outstanding problems and questions, contemporary science has adopted new approaches to knowledge production that are predominantly team based and are less confined by disciplinary boundaries. The character of the social, institutional and intellectual aspects of science and their interplay is very complex, and requires new approaches. In this talk I will showcase a number of studies that use the data from scientific publications to shed light on contemporary research practices, such as the formation of research teams, research workforce, interdisciplinarity, productivity and citation dynamics.


- Workshop Opening - Welcome
- Keynote by Ludo Waltman - FAIRness in publishing and evaluating scientific knowledge
- Break
SESSION 1 - Representation Part 1
- FAIR Linked Data - Towards a Linked Data backbone for users and machines. Johannes Frey and Sebastian Hellmann
- HierClasSArt: Knowledge-Aware Hierarchical Classification of Scholarly Articles. Mehwish Alam, Russa Biswas, Yiyi Chen, Danilo Dessì, Genet Asefa Gesese, Fabian Hoppe and Harald Sack
- On Representation Learning for Scientific News Articles Using Heterogeneous Knowledge Graphs. Angelika Romanou, Panayiotis Smeros and Karl Aberer
- Coffee Break
SESSION 2 - Assessment
- BIP! DB: A Dataset of Impact Measures for Scientific Publications. Thanasis Vergoulis, Ilias Kanellos, Claudio Atzori, Andrea Mannocci, Serafeim Chatzopoulos, Sandro La Bruzzo, Natalia Manola and Paolo Manghi
- Predicting Paper Acceptance via Interpretable Decision Sets. Peng Bao, Weihui Hong and Xuanya Li
- Visualising Scientific Topic Evolution. Panagiotis Deligiannis, Thanasis Vergoulis, Serafeim Chatzopoulos and Christos Tryfonopoulos
- Lunch Break
- Keynote by Staša Milojević - Capturing tectonic shifts in contemporary science
- Break
SESSION 3 - Representation Part 2
- Extraction and Evaluation of Statistical Information from Social and Behavioral Science Papers. Sree Sai Teja Lanka, Sarah Rajtmajer, Jian Wu and C. Lee Giles
- Deep Learning meets Knowledge Graphs for Scholarly Data Classification. Fabian Hoppe, Danilo Dessì and Harald Sack
- Coffee Break
SESSION 4 - Discoverability
- C-Rex: A Large-Scale System for Context-Aware Citation Recommendation. Michael Färber, Vinzenz Zinecker, Isabela Bragaglia Cartus, Sebastian Celis and Maria Duma
- Rewarding Research Data Management. Joachim Schöpfel and Otmane Azeroual
- Finding Keystone Citations for Constructing Validity Chains among Research Papers. Yuanxi Fu, Jodi Schneider and Catherine Blake
- Closing

Accepted Papers




Important Dates

Paper submission

January 31, 2021 (23:59, AoE timezone); previously January 25, 2021

Notification of acceptance

February 19, 2021; previously February 15, 2021


Camera ready due

March 1, 2021

Workshop day

April 13, 2021



At least one author of each accepted paper should pay the author registration rate (268,40€) to support the production of the workshop proceedings.

Other participants or attendees can use the general (146,40€) or student (61€) rate.

Submission guidelines

Submissions are welcome in the following categories:

The workshop calls for full research papers (up to 8 pages + 1 page of references), describing original work on the listed topics, and short papers (up to 4 pages + 1 page of references), on early research results, new results on previously published works, demos, and projects. In accordance with Open Science principles, research papers may also be in the form of data papers and software papers (short or long papers). The former present the motivation and methodology behind the creation of data sets that are of value to the community; e.g., annotated corpora, benchmark collections, training sets. The latter present software functionality, its value for the community, and its application to a non-specialist reader. To enable reproducibility and peer-review, authors will be requested to share the DOIs of the data sets and the software products described in the articles and thoroughly describe their construction and reuse.

The workshop also calls for vision/position papers (up to 4 pages + 1 page of references) providing insights into new or emerging areas, innovative or risky approaches, or emerging applications that will require extensions to the state of the art. These do not have to include results yet, but should carefully elaborate on the motivation and the ongoing challenges of the described area.

Submissions must adhere to the ACM template and format. For LaTeX submissions, use the Master Article Template – LaTeX, and choose the “sigconf” option. (Read more about LaTeX documentation and ACM LaTeX best practices.) For Microsoft Word submissions, use the Interim Layout document. (Read more about interim sample pdf.) Submissions for review must be in PDF format. Submissions must be self-contained and in English. Submissions that do not follow these guidelines, or do not view or print properly, may be rejected without review. Authors are responsible for ensuring that submissions adhere strictly to the required format.

The proceedings of the workshops will be published jointly with The Web Conference 2021 proceedings.

Submit your contributions via the Sci-K 2021 EasyChair page:

UPDATE: Please note that in order to keep the Web Conference registration fees low, we have been asked to reduce the number of pages from 10 to 9.

Program Committee

Organising Committee

Co-chairs for Sci-K 2021 (alphabetically)

Paolo Manghi

Italian Research Council (CNR), Pisa (IT)

Andrea Mannocci

Italian Research Council (CNR), Pisa (IT)

Francesco Osborne

The Open University, Milton Keynes (UK)

Dimitris Sacharidis

Université Libre de Bruxelles (ULB), Brussels (BE)

Angelo A. Salatino

The Open University, Milton Keynes (UK)

Thanasis Vergoulis

“Athena” RC, Athens (GR)