2 Workflow
2.1 Goal
The goal is to make the knowledge in a complex document corpus easier to use by employing a knowledge graph that enables the following:
- publishing of search results as multi-format publications, and
- data analysis through FAIR linked open data.
2.2 Workflow
The workflow covers the stages from harvesting an unstructured document corpus (web or PDF) to converting it into structured data.
Wikibase is used for storage and knowledge graph creation, supporting the following features:
- community annotation,
- search that outputs ‘publication ready documents’, including via LLMs, and
- knowledge graph data services (a minimal query sketch follows below).
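
As a minimal illustration of the knowledge graph data service, the sketch below fetches a single item record from a Wikibase instance through the standard Wikibase Action API module `wbgetentities`. The endpoint URL and item ID are placeholders, not the project's actual instance.

```python
# Minimal sketch: fetch one entity record from a Wikibase instance via the
# standard Action API module wbgetentities. Endpoint and item ID are placeholders.
import requests

WIKIBASE_API = "https://example-wikibase.org/w/api.php"  # hypothetical endpoint

def get_entity(qid: str) -> dict:
    """Return the raw JSON record (labels, descriptions, claims) for one entity."""
    response = requests.get(
        WIKIBASE_API,
        params={"action": "wbgetentities", "ids": qid, "format": "json", "languages": "en"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["entities"][qid]

item = get_entity("Q42")  # placeholder item ID
print(item["labels"]["en"]["value"])
```
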
2.2.1 Workflow steps
- Report sources
- IPCC web scrape of report texts (corpus harvest; see the harvest sketch at the end of this section)
- Generate initial knowledge graph data model (iterative development; see the data-model sketch at the end of this section)
- Wikibase report import (see the import sketch at the end of this section)
- Wikibase to MediaWiki report navigation mapping for report browsing
- Harvest data:
- Authors
- Glossary
- Acronyms list
- References
- Bibliographic data
- Import the above data to Wikibase
- Annotate report using the above data (see the annotation sketch at the end of this section)
- Community annotation: #semanticClimate, Stockholm Environment Institute (SEI), Potsdam Institute for Climate Impact Research (PIK), UNESCO, UNFCCC, etc.
- Wikibase to Wikidata data mapping
- Wikibase data analysis and visualisations
- Publications available via the following channels:
- REST API
- Command line
- Jupyter Notebooks
- Python CMS (e.g., Wagtail)
- Graph RAG LLM
- All of the above publishing channels use the same framework: the Computational Publishing Service (CPS) with its publishing engine, CPS_Impress. CPS_Impress publishes from Wikibase to HTML using Jinja templating in a ‘Model View Controller’ architecture (see the templating sketch at the end of this section). Paged Media CSS styles are used to create PDF-like layouts. Publications are saved back to the knowledge graph and published online as shareable resources.
- Knowledge Graph - FAIR linked open data and semantic outputs from Wikibase (see the SPARQL sketch at the end of this section):
- REST API
- RDF export
- Dokieli RDFa
- Wikidata export
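
The sketches below illustrate individual workflow steps under stated assumptions; they are not the project's actual implementation. First, the corpus-harvest step: scraping one IPCC report chapter page into paragraph records. The chapter URL and the record format are assumptions chosen for the example.

```python
# Illustrative corpus-harvest sketch: scrape one IPCC chapter page into
# paragraph records. URL and record format are assumptions for this example.
import requests
from bs4 import BeautifulSoup

CHAPTER_URL = "https://www.ipcc.ch/report/ar6/wg3/chapter/chapter-1/"  # example page

def harvest_chapter(url: str) -> list[dict]:
    """Return one {id, text} record per non-empty paragraph on the page."""
    html = requests.get(url, timeout=60).text
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for i, para in enumerate(soup.find_all("p")):
        text = para.get_text(" ", strip=True)
        if text:
            records.append({"id": f"para_{i:04d}", "text": text})
    return records

paragraphs = harvest_chapter(CHAPTER_URL)
print(f"harvested {len(paragraphs)} paragraphs")
```
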
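A possible shape for the initial, iteratively developed data model, expressed as a plain Python structure before it is mapped to Wikibase items and properties. All class and property names here are assumptions, not the project's schema.

```python
# Hypothetical first-pass data model; class and property names are placeholders
# to be refined iteratively and then mapped to Wikibase entities.
DATA_MODEL = {
    "classes": ["Report", "Chapter", "Paragraph", "Author",
                "GlossaryTerm", "Acronym", "Reference"],
    "properties": {
        "part of":       {"domain": "Chapter",   "range": "Report"},
        "has author":    {"domain": "Report",    "range": "Author"},
        "mentions term": {"domain": "Paragraph", "range": "GlossaryTerm"},
        "expands to":    {"domain": "Acronym",   "range": "GlossaryTerm"},
        "cites":         {"domain": "Paragraph", "range": "Reference"},
    },
}
```
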
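A sketch of the Wikibase import step using the Action API write module `wbeditentity`. The endpoint is a placeholder and the snippet assumes a `requests` session that has already logged in with write permissions.

```python
# Sketch of the Wikibase import step via the Action API module wbeditentity.
# Endpoint is a placeholder; the session is assumed to be logged in already.
import json
import requests

API = "https://example-wikibase.org/w/api.php"  # hypothetical endpoint

def create_item(session: requests.Session, label: str, description: str) -> str:
    """Create a new Wikibase item and return its Q-id."""
    # A CSRF token is required for all Wikibase write actions.
    token = session.get(
        API, params={"action": "query", "meta": "tokens", "format": "json"}
    ).json()["query"]["tokens"]["csrftoken"]

    data = {
        "labels": {"en": {"language": "en", "value": label}},
        "descriptions": {"en": {"language": "en", "value": description}},
    }
    result = session.post(
        API,
        data={
            "action": "wbeditentity",
            "new": "item",
            "data": json.dumps(data),
            "token": token,
            "format": "json",
        },
    ).json()
    return result["entity"]["id"]
```
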
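A sketch of the annotation step: harvested glossary terms and acronyms are matched against paragraph text by simple dictionary lookup. The term lists and item IDs are tiny placeholders standing in for the real harvested vocabularies.

```python
# Dictionary-based annotation sketch; terms and Wikibase IDs are placeholders.
import re

GLOSSARY = {"carbon budget": "Q101", "mitigation": "Q102"}  # term -> item id
ACRONYMS = {"GHG": "Q201", "NDC": "Q202"}                   # acronym -> item id

def annotate(text: str) -> list[dict]:
    """Return one annotation record per matched glossary term or acronym."""
    annotations = []
    for vocab in (GLOSSARY, ACRONYMS):
        for term, qid in vocab.items():
            for match in re.finditer(rf"\b{re.escape(term)}\b", text, re.IGNORECASE):
                annotations.append(
                    {"term": term, "item": qid, "start": match.start(), "end": match.end()}
                )
    return annotations

print(annotate("GHG mitigation depends on the remaining carbon budget."))
```
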
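A sketch of the CPS_Impress publishing pattern: Wikibase query results (the model) are rendered to HTML (the view) through a Jinja template, with a Paged Media stylesheet linked for PDF-like layout. The template structure, field names and stylesheet name are assumptions.

```python
# Jinja rendering sketch in the Model-View-Controller spirit of CPS_Impress.
# Field names, template markup and stylesheet name are illustrative only.
from jinja2 import Template

TEMPLATE = Template(
    """<html>
  <head><link rel="stylesheet" href="paged-media.css"/></head>
  <body>
    <h1>{{ title }}</h1>
    {% for section in sections %}
    <section id="{{ section.id }}">
      <h2>{{ section.heading }}</h2>
      <p>{{ section.text }}</p>
    </section>
    {% endfor %}
  </body>
</html>"""
)

model = {  # in practice the model would come from Wikibase queries
    "title": "Search results: carbon budget",
    "sections": [
        {"id": "s1", "heading": "AR6 WG3 Chapter 1", "text": "Example paragraph text."},
    ],
}

html = TEMPLATE.render(**model)
print(html[:80])
```
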
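Finally, a sketch of consuming the semantic outputs: a SPARQL query against the Wikibase query service for items of a placeholder ‘glossary term’ class. The endpoint URL and entity IDs are assumptions; the `wd:`/`wdt:` prefixes are the defaults configured by a Wikibase query service.

```python
# SPARQL consumption sketch; endpoint and entity IDs are placeholders.
import requests

SPARQL_ENDPOINT = "https://example-wikibase.org/query/sparql"  # hypothetical

QUERY = """
SELECT ?term ?termLabel WHERE {
  ?term wdt:P31 wd:Q300 .   # placeholder: instance of 'glossary term'
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
"""

rows = requests.get(
    SPARQL_ENDPOINT, params={"query": QUERY, "format": "json"}, timeout=60
).json()["results"]["bindings"]
for row in rows:
    print(row["termLabel"]["value"])
```
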