SciCom KI: Science Communication Infrastructure

Updated 19 November 2025

SciCom KI is a networked, socio-technical system that collects, curates, and disseminates scientific knowledge in structured, machine-interpretable formats.
It integrates diverse media types to enable civic and expert engagement through semantic search, annotation, and visualizations.
Designed with FAIR principles and decentralized architectures, SciCom KI supports collaborative curation, fact-checking, and interoperability across scientific workflows.

A Science Communication Knowledge Infrastructure (SciCom KI) is a networked, socio-technical system designed to collect, organize, curate, and disseminate scientific knowledge through structured, machine-interpretable formats. It facilitates both civic and expert engagement with scientific outputs in textual and non-textual media, scales collaborative curation, supports fact-checking, and enables advanced semantic search, annotation, and visualization. SciCom KI transcends traditional document-based workflows, integrating people, artifacts, and institutions into a persistent, interoperable environment for leveraging and verifying scientific claims, tools, projects, and media (Kloppenborg et al., 15 Feb 2024, Wittenborg et al., 12 Nov 2025, Wittenborg et al., 12 May 2025).

1. Foundational Principles and Formal Definition

SciCom KI is formally defined as a tuple

$\text{SciCom KI} \;=\;\bigl(\mathcal{U},\;\mathcal{P},\;\mathcal{L},\;\mathcal{M}\bigr)$

where:

$\mathcal{U}$ : community of users/practitioners
$\mathcal{P}$ : set of pages or entities (tools, topics, projects, people, media)
$\mathcal{L} \subseteq \mathcal{P} \times \mathcal{P}$ : semantic links between entities
$\mathcal{M}$ : governance and moderation mechanisms (Kloppenborg et al., 15 Feb 2024)

This abstraction supports both fine-grained scientific workflows and mass-media communication. Deployments range from knowledge graphs centered on scholarly articles (ORKG) (Jaradeh et al., 2019, Brack et al., 2020), to wiki-based repositories (Personal Science Wiki, SciCom Wiki) (Kloppenborg et al., 15 Feb 2024, Wittenborg et al., 12 Nov 2025), and decentralized, peer-to-peer federations for data, workflows, and conversations (Saunders, 2022).

SciCom KI is distinguished from classical Knowledge Infrastructures by its explicit orientation toward science communication tasks, its inclusion of media artifacts beyond text, and its emphasis on collaborative curation under the FAIR (Findable, Accessible, Interoperable, Reusable) principles (Wittenborg et al., 12 Nov 2025).
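As a rough illustration of the tuple $(\mathcal{U}, \mathcal{P}, \mathcal{L}, \mathcal{M})$ defined in this section, the following Python sketch models the four components as plain sets. The class names, the `add_link` method, and the reduction of $\mathcal{M}$ to a set of moderator names are assumptions made for illustration; the cited works do not prescribe this representation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Page:
    """One element of P: a tool, topic, project, person, or media item."""
    title: str
    kind: str


@dataclass
class SciComKI:
    users: set = field(default_factory=set)       # U: user names
    pages: set = field(default_factory=set)       # P: Page objects
    links: set = field(default_factory=set)       # L: (Page, Page) pairs
    moderators: set = field(default_factory=set)  # crude stand-in for M (assumption)

    def add_link(self, source: Page, target: Page, editor: str) -> None:
        """Record a link, enforcing L ⊆ P × P and a simple moderation rule."""
        if editor not in self.moderators:
            raise PermissionError(f"{editor} may not edit links")
        if source not in self.pages or target not in self.pages:
            raise ValueError("both endpoints must already be pages in P")
        self.links.add((source, target))


# Hypothetical example data.
ki = SciComKI(users={"alice", "bob"}, moderators={"alice"})
topic = Page("Sleep tracking", "topic")
tool = Page("Example tracker", "tool")
ki.pages.update({topic, tool})
ki.add_link(topic, tool, editor="alice")
```

Real deployments attach types and provenance to links and use far richer governance than a moderator list; the sketch only shows how the constraint $\mathcal{L} \subseteq \mathcal{P} \times \mathcal{P}$ can be enforced at write time.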

2. Stakeholder Roles, Task Taxonomy, and Requirements

Stakeholder analysis in SciCom KI research identifies at least six core roles: viewer, researcher, teacher, content creator, curator, developer (Wittenborg et al., 12 Nov 2025, Wittenborg et al., 12 May 2025). Requirements elicitation (53 survey participants, 11 interviews) yields a ranked task taxonomy:

Task	Share ranking it #1	Examples
Find	72%	Locate podcast/video by topic, title, etc.
Compare	—	Cross-check statements across media
Curate	—	Sort/filter by topic, language, date
Debate	—	Discuss open points

Top user-valued media criteria include: topic, language, release date, sources/citations, license, length, transcript availability (Wittenborg et al., 12 Nov 2025).
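The "Find" task over these criteria amounts to faceted filtering of media metadata. A minimal sketch, assuming a flat record with one field per criterion; the field names, the `find` helper, and the sample records (including the placeholder DOI) are hypothetical and not taken from any SciCom KI implementation:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MediaItem:
    title: str
    topic: str
    language: str
    release_date: date
    sources: list
    license: str
    length_minutes: int
    has_transcript: bool


def find(items, **facets):
    """Return the items whose attributes equal every requested facet value."""
    return [
        item for item in items
        if all(getattr(item, name) == value for name, value in facets.items())
    ]


# Hypothetical catalog; the DOI is a placeholder, not a real identifier.
catalog = [
    MediaItem("Episode A", "climate", "en", date(2024, 3, 1),
              ["doi:10.0000/example"], "CC BY 4.0", 42, True),
    MediaItem("Episode B", "climate", "de", date(2023, 11, 5),
              [], "all rights reserved", 18, False),
]
hits = find(catalog, topic="climate", has_transcript=True)
```

Exact-match facets are the simplest case; range filters (for example on release date or length) would need a predicate per facet instead of an equality test.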

Further annotation needs cluster around neutrality, conflicts of interest, pseudoscience flags, inclusivity, accessibility, collaboration, and trust metrics (Wittenborg et al., 12 Nov 2025). These requirements inform both the data model and user-interface design of SciCom KI implementations.

3. Architecture, Data Models, and Platform Implementations

SciCom KI is realized via several architectural paradigms:

3.1. Knowledge Graph–Based Infrastructure

ORKG and related systems capture research contributions as triple-based statements in a labeled property graph $G = (V, E)$ with:

$V$ : resource nodes (articles, problems, methods, results, authors, organizations, media, claims)
$E \subseteq V \times P \times V$ : directed, typed edges (statements), where $P$ is the set of edge types (predicates)
The data model is extensible via third-party vocabularies, entity linking, and provenance annotations (Jaradeh et al., 2019)

3.2. Wiki-Based Collaborative Systems

Personal Science Wiki and SciCom Wiki combine MediaWiki/SemanticMediaWiki or Wikibase (for graph storage) with structured templates (infoboxes, semantic properties) and front-end modules:

Main category navigation
Semantic search and expansion
Open editing and incremental review cycles
Taxonomy and tagging with flexible schema overlays

For videos and podcasts, SciCom Wiki orchestrates a tripartite system: Linked Data Wiki (Wikibase, RDF triples), Full Text Wiki (transcripts), and Dashboard (React/TS, microservices for search, filtering, and integration) (Wittenborg et al., 12 Nov 2025, Wittenborg et al., 12 May 2025).

3.3. Decentralized, Federated Infrastructure

P2P approaches layer distributed hash tables (DHT), content-addressable swarms (BitTorrent/IPFS/Dat), modular DAG-based computational workflows, and federated wikis. RDF-style linked data, ActivityPub/Matrix for communication, and flexible governance structures support resilience, interoperability, and credit visibility (Saunders, 2022).
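The core idea behind content-addressable storage can be sketched in a few lines: a blob's address is derived from its bytes, so any peer can verify what it receives. This is a single-process toy that assumes SHA-256 hex digests as addresses; BitTorrent, IPFS, and Dat each use their own identifier formats and network protocols, none of which is reproduced here. The `ContentStore` class is hypothetical.

```python
import hashlib


class ContentStore:
    """In-memory stand-in for a content-addressed swarm (illustrative only)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Store data under the SHA-256 digest of its content and return that key."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        """Fetch data and verify that it still matches its address."""
        data = self._blobs[key]
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
address = store.put(b"transcript of a science podcast")
```

Because the address depends only on the content, storing the same bytes twice yields the same key, which is what makes deduplication and integrity checks across untrusted peers possible.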

3.4. Formal Data Model Summary

$\mathcal{G} = \{(s, p, o) \mid s \in \mathcal{E},\; p \in \mathcal{P},\; o \in \mathcal{V} \cup \mathcal{L}\}$

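with $\mathcal{E}$ entities, $\mathcal{P}$ properties, $\mathcal{V}$ entity references, and $\mathcal{L}$ literals (RDF triple structure) (Wittenborg et al., 12 Nov 2025). Note that $\mathcal{P}$ and $\mathcal{L}$ here denote properties and literals, not the pages and links of the tuple definition in Section 1.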
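The triple set can be illustrated with a small in-memory structure that keeps entity references and literals apart in the object position. The identifiers (`Q1`, `P1`, and so on) and the `objects` helper are hypothetical; a production system would use an RDF store or Wikibase rather than a Python set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityRef:
    """Reference to an entity; used for subjects and for non-literal objects."""
    id: str


@dataclass(frozen=True)
class Literal:
    """Plain value such as a string, number, or date."""
    value: str


# Each triple is (subject, property, object); the IDs are placeholders.
graph = {
    (EntityRef("Q1"), "P1", EntityRef("Q2")),  # object is an entity reference
    (EntityRef("Q1"), "P2", Literal("en")),    # object is a literal
}


def objects(graph, subject, prop):
    """Return all objects o such that (subject, prop, o) is in the graph."""
    return {o for (s, p, o) in graph if s == subject and p == prop}


languages = objects(graph, EntityRef("Q1"), "P2")
```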

4. Knowledge Acquisition, Curation, Fact-Checking, and FAIR Compliance

Knowledge acquisition in SciCom KI is multi-modal, leveraging manual curation, semi-automatic extraction, and fully automated pipelines:

Manual/community curation: Template/infobox creation, incremental page edits, provenance review (crowdsourced and structured) (Kloppenborg et al., 15 Feb 2024, Jaradeh et al., 2019).
Semi-automatic: NLP extractors suggest field values for infoboxes, with curator approval and active learning loops to optimize classifier accuracy (Brack et al., 2020).
Automated ingestion: NER, relation extraction, and domain-specific pipelines populate graphs from bulk sources (papers, corpora, media transcripts); entity linking to external ontologies (Jaradeh et al., 2019).

For non-textual media, SciCom Wiki integrates a neurosymbolic computational fact-checking pipeline:

Transcript extraction (Whisper)
NER and entity linking (Stanza, spaCy, Wikidata)
Relation extraction: semantic role labeling (SRL) and LLM-based extraction
Graph construction: $G_u$ (media claims), $G_t$ (ground-truth science)
Verification: subgraph matching, semantic proximity scoring
Confidence measurement: $c = w_1 \cdot \mathbf{1}_{\mathrm{exact}} + w_2 \cdot \mathrm{score}_{\mathrm{veracity}} + w_3 \cdot \mathrm{LLM\_confidence}$, with $\sum_i w_i = 1$ and $\mathbf{1}_{\mathrm{exact}}$ the indicator of an exact match (Wittenborg et al., 12 May 2025)
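The last two steps of the pipeline can be sketched as follows. The sketch assumes that claims and ground truth are sets of (subject, relation, object) tuples, that "exact match" means the claim triple occurs verbatim in $G_t$, and that both scores lie in $[0, 1]$. The default weights are arbitrary placeholders and the triples are abstract dummies; none of these values come from the cited work.

```python
import math


def confidence(exact_match, veracity_score, llm_confidence,
               weights=(0.5, 0.3, 0.2)):
    """Weighted sum c = w1*1_exact + w2*score_veracity + w3*LLM_confidence."""
    w1, w2, w3 = weights
    if not math.isclose(w1 + w2 + w3, 1.0):
        raise ValueError("weights must sum to 1")
    return w1 * float(exact_match) + w2 * veracity_score + w3 * llm_confidence


# Hypothetical ground-truth graph G_t and a claim taken from a media graph G_u.
ground_truth = {("A", "causes", "B"), ("B", "inhibits", "C")}
claim = ("A", "causes", "B")

exact = claim in ground_truth  # simplest possible form of subgraph matching
c = confidence(exact, veracity_score=0.5, llm_confidence=0.5)
```

With all three inputs in $[0, 1]$ and weights summing to 1, the resulting score also stays in $[0, 1]$, which keeps it comparable across claims.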

FAIR principles are deeply embedded:

Principle	SciCom KI Implementation	Example Features
Findable	Persistent Q-IDs, indexed metadata	Faceted search, SPARQL endpoint
Accessible	Open APIs, Web Dashboard	No-login data access, export facilities
Interoperable	RDF/Wikibase JSON, domain vocabularies	Cross-referencing (DOI, ORCID)
Reusable	Explicit licensing, provenance graphs	Versioning, full edit history

5. Evaluation, User Studies, and Metrics

Empirical evaluation of SciCom KI systems utilizes both objective and subjective metrics:

Usability tests (14–21 participants): task completion rates, mean time-to-completion ($\bar{x}$), subjective effectiveness and efficiency (After-Scenario Questionnaire, ASQ; User Experience Questionnaire, UEQ), and feature satisfaction ratings (Kloppenborg et al., 15 Feb 2024, Wittenborg et al., 12 Nov 2025, Wittenborg et al., 12 May 2025).
Cluster analysis of card sorts: hierarchical cluster analysis yields 6–7 archetypal page/resource groups (tracking variables, methods, projects, tools, people, community) (Kloppenborg et al., 15 Feb 2024).
Fact-checking tool evaluation: F-score for verifying peer-reviewed claims ( $\approx$ 0.81–0.85 neurosymbolic; $\approx$ 0.72 LLM-only), expert interpretability, public trust in scores (Wittenborg et al., 12 May 2025).
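For readers unfamiliar with the F-score, the sketch below computes the balanced variant (F1) from counts of true positives, false positives, and false negatives. It assumes that the reported figures are F1 values, which the source does not state explicitly, and the counts in the example are invented for illustration rather than taken from the cited evaluation.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 17 claims verified correctly, 3 false alarms, 4 misses.
score = f1_score(tp=17, fp=3, fn=4)
```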

Findings consistently show efficient and effective search (e.g., a mean task time under 3 min and a mean of 4 out of 5 tasks solved), excellent UEQ benchmark results, and robust support for the most valued criteria (topic, language, release date) (Wittenborg et al., 12 Nov 2025).
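The card-sort cluster analysis mentioned in this section can be illustrated with a small agglomerative procedure. The sketch assumes that similarity between two cards is the fraction of participants who placed them in the same pile and that clusters are merged by average linkage; the source does not state which similarity measure or linkage was used, and the cards and sorts below are invented.

```python
from itertools import combinations


def co_occurrence(sorts, cards):
    """Fraction of participants who put each pair of cards in the same pile."""
    sim = {frozenset(pair): 0.0 for pair in combinations(cards, 2)}
    for piles in sorts:
        for pile in piles:
            for pair in combinations(pile, 2):
                sim[frozenset(pair)] += 1 / len(sorts)
    return sim


def cluster(cards, sim, k):
    """Merge the most similar clusters (average linkage) until k remain."""
    clusters = [frozenset([card]) for card in cards]

    def linkage(a, b):
        total = sum(sim[frozenset((x, y))] for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        a, b = max(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return set(clusters)


# Hypothetical card sorts from two participants.
cards = ["tool 1", "tool 2", "person 1", "person 2"]
sorts = [
    [["tool 1", "tool 2"], ["person 1", "person 2"]],
    [["tool 1", "tool 2"], ["person 1"], ["person 2"]],
]
groups = cluster(cards, co_occurrence(sorts, cards), k=2)
```

A full analysis would keep the whole merge history as a dendrogram and choose the number of groups from it, rather than fixing `k` in advance.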

6. Challenges, Limitations, and Future Directions

Despite successes, SciCom KI faces several technical, legal, and social challenges:

Scaling community curation: seeding massive content repositories is critical; automated extraction and annotation must be accessible to non-technical contributors (Kloppenborg et al., 15 Feb 2024).
Ontology alignment and granularity: balancing specificity, quality, and coverage across domains, mapping internal URIs to broader vocabularies (Brack et al., 2020, Jaradeh et al., 2019).
Legal and ethical boundaries: transcript ingestion and sensitive metadata collection require robust consent, data-protection, and governance frameworks (Wittenborg et al., 12 Nov 2025).
Incentivization and sustainability: adoption among contributors (beyond viewers), reputation metrics, credit assignment, and low-friction entry points (Saunders, 2022, Wittenborg et al., 12 Nov 2025).
Interoperability and extensibility: abstraction for UI, stable APIs, developer documentation, plug-in support (Wittenborg et al., 12 Nov 2025).
Fact-checking of non-textual media: robust curation of ground-truth KGs, improvements in LLM extraction quality, and integration of symbolic rules (Wittenborg et al., 12 May 2025).

Recommended paths forward include collaborative KG hubs, plug-and-play microservice architectures for extraction and verification, extension to real-time media monitoring, and dedicated portals for semantic annotation by subject matter experts (Wittenborg et al., 12 May 2025, Wittenborg et al., 12 Nov 2025).

7. Prospects for Community-Driven, Open, and Interoperable Science Communication

SciCom KI offers a resilient foundation for federated, open science communication across heterogeneous media, research disciplines, and institutional boundaries. Key prospects include:

Integration of macro-to-micro exploration paradigms in knowledge visualization (e.g., phylomemetic networks, seabed/kinship views) to trace the genealogy, emergence, and dissemination of scientific ideas (Lobbé et al., 2021).
Embedding FAIR principles and provenance annotation in all workflows to support transparency, reusability, and public engagement (Wittenborg et al., 12 Nov 2025).
Deployment of decentralized infrastructures to combat platform-capture and foster autonomous, ethically governed federations (Saunders, 2022).
Sustained expansion of shared ontologies, semantic annotation platforms, and automated fact-checking pipelines to address the challenges posed by misinformation and scale (Wittenborg et al., 12 May 2025).

Realizing the full vision of SciCom KI requires ongoing, collaborative effort across technical, institutional, and community dimensions, ensuring that scientific knowledge remains transparent, actionable, and broadly accessible.