
Abstract Wiki Architect

Abstract Wiki Architect is a family-based, data-driven NLG toolkit for Abstract Wikipedia and Wikifunctions.

Instead of writing one renderer per language (“300 scripts for 300 languages”), this project organises NLG as:

  • engines shared by whole language families,
  • per-language configuration data,
  • family-agnostic constructions (sentence patterns),
  • plus lexica, semantic frames, discourse handling, and a router/API on top (see section 3).

The goal is to provide a professional, testable architecture for rule-based NLG across many languages, aligned with the ideas behind Abstract Wikipedia and Wikifunctions, but usable independently.


1. Abstract Wikipedia and Wikifunctions: motivation

Abstract Wikipedia and Wikifunctions aim to:

  • store encyclopaedic content in a language-independent, abstract form, and
  • render that content into many natural languages through functions hosted on Wikifunctions.

To do that at scale, you need more than “one script per language”. You need:

  • grammatical machinery that many related languages can share,
  • per-language data instead of per-language code, and
  • a testable pipeline that catches regressions as languages are added.

This repository is a way to:

  • prototype such an architecture in working code, and
  • iterate on semantic frames, constructions, and renderers outside the production Wikimedia stack.

It is not an official Wikimedia component, but it is designed with that ecosystem in mind.


2. Integration into Konnaxion

Konnaxion is a broader socio-technical platform project, focused on:

  • governance and coordination processes, and
  • structured knowledge about the actors, roles, mandates, and decisions involved in them.

Konnaxion is not an AI product and does not expose AI features to end users. AI is used on the builder side: to design, generate, and refactor entire files, modules, and documentation. The running system is conventional software.

The intended relationship between Abstract Wiki Architect and Konnaxion is:

  1. Shared semantic structures

    • Use semantic frames similar to Abstract Wikipedia’s (entities, roles, events, biographical frames, membership frames, etc.) for Konnaxion’s knowledge objects.
    • Keep the semantic layer close to what Abstract Wikipedia / Wikifunctions adopt, so ideas, and possibly data, can move between them.
  2. Renderer for multi-lingual narratives

    • Use Abstract Wiki Architect as a rendering layer for:
      • short descriptions of roles, mandates, and decisions,
      • summaries of processes and events,
      • biographies and contextual information about actors in a socio-technical system.
    • Reuse the same constructions that already express “X is a Polish physicist” to express things like “X is a mediator for Y” or “X coordinates process Z” (see the sketch after this list).
  3. Clear separation of concerns

    • Within Konnaxion, Abstract Wiki Architect is a library/subsystem, not the whole platform.
    • Konnaxion focuses on governance and coordination; Abstract Wiki Architect focuses on transforming structured data into multi-lingual text.
  4. Alignment with Wikimedia

    • Where Abstract Wikipedia / Wikifunctions stabilise on certain semantic patterns or APIs, Konnaxion can reuse them instead of reinventing them.
    • This repo is a place where that alignment can be tested and iterated in code.
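
For point 2, such reuse could start from a frame like the following sketch. RoleFrame and its fields are hypothetical; only Entity and the general frame style come from this toolkit (see section 3.3.1):

from dataclasses import dataclass
from typing import List, Optional

from semantics.types import Entity

@dataclass
class RoleFrame:
    """Hypothetical frame, written in the same style as BioFrame (section 3.3.1)."""
    main_entity: Entity
    role_lemmas: List[str]                 # e.g. ["mediator"]
    scope_entity: Optional[Entity] = None  # the "for Y" participant

The copular construction behind “X is a Polish physicist” would then render such a frame as “X is a mediator for Y”.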

In short: Abstract Wiki Architect is the NLG / language layer; Konnaxion is the larger socio-technical platform that may reuse it.


3. What the tool does (architecture overview)

Very roughly, the architecture is:

Engines (families) + Configs (languages) + Constructions (sentence patterns)
  + Lexica + Frames (semantics) + Discourse + Router/API
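
The directories and files named in this document map onto that formula roughly as follows (the engines’ own directory is not named here, so it is omitted):

constructions/     # sentence patterns, shared across families (section 3.2)
semantics/         # typed frames: types.py, normalization.py, aw_bridge.py (section 3.3)
discourse/         # referent tracking and information structure (section 3.3.2)
lexicon/           # lexicon subsystem (section 3.4)
data/lexicon/      # per-language data: en_lexicon.json, fr_lexicon.json, …
nlg/               # public API (nlg.api, section 3.5)
docs/              # FRAMES_*.md, FRONTEND_API.md, hosting.md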

3.1 Engines and morphology
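
Each engine implements shared grammar for a whole language family, while per-language configs carry the data. As an illustration of that split (all class and field names below are hypothetical, not the repo’s actual API):

from dataclasses import dataclass

@dataclass
class LanguageConfig:
    code: str             # e.g. "fr"
    copula: str           # e.g. "est"
    adj_after_noun: bool  # word order carried by data, not code

class RomanceEngine:
    """One engine per family; a config turns it into a concrete language."""

    def __init__(self, config: LanguageConfig):
        self.config = config

    def noun_phrase(self, noun: str, adj: str) -> str:
        # A real engine would also handle agreement and inflection here;
        # only the word-order decision is shown.
        if self.config.adj_after_noun:
            return f"{noun} {adj}"
        return f"{adj} {noun}"

Adding a new language inside a covered family then means writing configuration and lexicon data, not a new engine.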

3.2 Constructions (sentence patterns)

Under constructions/, reusable sentence patterns are defined once and shared across engines.

Constructions are family-agnostic: they describe a sentence in terms of semantic slots and delegate inflection, agreement, and word order to the family engines.
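
A construction can then be as small as the following sketch, which builds on the hypothetical RomanceEngine above and on the BioFrame fields from section 3.3.1 (the real constructions/ API may differ):

def bio_is_a(engine, frame):
    """Render “X is a <nationality> <profession>” from a BioFrame-like object.

    Articles, agreement, and lemma inflection are elided in this sketch;
    a real construction would delegate them to the engine.
    """
    np = engine.noun_phrase(
        noun=frame.primary_profession_lemmas[0],
        adj=frame.nationality_lemmas[0],
    )
    return f"{frame.main_entity.name} {engine.config.copula} {np}."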

3.3 Frames, semantics, and discourse

3.3.1 Semantic frames

Under semantics/ and docs/FRAMES_*.md, the toolkit defines and documents its typed semantic frames.

semantics/normalization.py and semantics/aw_bridge.py map “loose” inputs (dicts, CSV rows, Z-objects) into typed frames that the engines and constructions can consume.

A key example is the biography frame:

from semantics.types import Entity, BioFrame

marie = Entity(
    id="Q7186",
    name="Marie Curie",
    gender="female",
    human=True,
)

frame = BioFrame(
    main_entity=marie,
    primary_profession_lemmas=["physicist"],
    nationality_lemmas=["polish"],
)

This BioFrame can be passed to the internal router or the public NLG API.
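
For illustration, a loose-input mapping in the spirit of semantics/normalization.py might look like this (the function name, input keys, and defaults are assumptions):

from semantics.types import Entity, BioFrame

def bio_frame_from_row(row: dict) -> BioFrame:
    """Map a loose dict (e.g. a parsed CSV row) onto a typed BioFrame."""
    return BioFrame(
        main_entity=Entity(
            id=row.get("id", ""),
            name=row["name"],
            gender=row.get("gender", "unknown"),  # assumed default
            human=True,                           # assumed for biographies
        ),
        primary_profession_lemmas=row.get("professions", []),
        nationality_lemmas=row.get("nationalities", []),
    )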

3.3.2 Discourse and information structure

Under discourse/, the toolkit tracks referents across sentences and manages information structure.

This is what allows outputs like:

“Marie Curie is a Polish physicist. She discovered radium.”

instead of:

“Marie Curie is a Polish physicist. Marie Curie discovered radium.”

and makes it possible to build topic–comment variants for languages where that matters.
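
The underlying mechanism is referent tracking: the first mention of an entity uses its name, later mentions can use a pronoun. A minimal sketch of that decision (not the actual discourse/ code):

def mention(entity, seen_ids: set) -> str:
    """First mention: full name; later mentions: a pronoun."""
    if entity.id in seen_ids:
        return {"female": "she", "male": "he"}.get(entity.gender, "they")
    seen_ids.add(entity.id)
    return entity.name

seen: set = set()
mention(marie, seen)  # "Marie Curie"  (marie as defined in section 3.3.1)
mention(marie, seen)  # "she"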

3.4 Lexicon subsystem

Under lexicon/, the toolkit loads and queries per-language lexical data.

Lexicon data in data/lexicon/ (e.g. en_lexicon.json, fr_lexicon.json, it_lexicon.json, ru_lexicon.json, ja_lexicon.json, …) typically contains lemmas together with the grammatical information the engines need: part of speech, gender, number, irregular forms, and similar.
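
As a purely illustrative example, English entries might be shaped like this (the actual *_lexicon.json schema is not shown in this document, so all field names are assumptions):

# Hypothetical entry shapes; the real en_lexicon.json schema may differ.
en_entries = {
    "physicist": {"pos": "noun", "plural": "physicists"},
    "polish": {"pos": "adj", "display": "Polish"},
}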

Supporting tools include:

3.5 Router, profiles, and API

The internal router selects, for a requested language and profile, the engine, configuration, and constructions to use.

On top of this, a small public NLG API (docs/FRONTEND_API.md) exposes:

from nlg.api import generate_bio, generate
from semantics.types import Entity, BioFrame

bio = BioFrame(
    main_entity=Entity(name="Douglas Adams", gender="male", human=True),
    primary_profession_lemmas=["writer"],
    nationality_lemmas=["british"],
)

result = generate_bio(lang="en", bio=bio)
print(result.text)       # "Douglas Adams was a British writer."
print(result.sentences)  # ["Douglas Adams was a British writer."]

result2 = generate(lang="fr", frame=bio)
print(result2.text)

The API returns a GenerationResult (final text, sentence list, debug info), and hides router / engine / lexicon details from callers.
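
Based on that description, the result type looks roughly like this (field names beyond text and sentences are assumptions):

from dataclasses import dataclass, field
from typing import List

@dataclass
class GenerationResult:
    text: str                                  # final rendered text
    sentences: List[str]                       # individual sentences
    debug: dict = field(default_factory=dict)  # router/engine/lexicon trace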


4. QA and test-driven development

The toolkit is built around test suites and regression checks.

The intent is to make it easy to:

  • catch regressions when engines, constructions, or lexica change, and
  • keep outputs stable across many languages as the architecture evolves.
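
A regression check in that spirit can pin expected outputs per language; this test is illustrative, reusing the API example from section 3.5:

from nlg.api import generate_bio
from semantics.types import Entity, BioFrame

def test_en_bio_is_stable():
    bio = BioFrame(
        main_entity=Entity(name="Douglas Adams", gender="male", human=True),
        primary_profession_lemmas=["writer"],
        nationality_lemmas=["british"],
    )
    result = generate_bio(lang="en", bio=bio)
    assert result.text == "Douglas Adams was a British writer."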


5. Hosting and HTTP exposure

Beyond the Python API, the stack can be exposed as a web service.

The HTTP API simply wraps the same frame-based generate(...) / generate_bio(...) calls and returns JSON, making it easier to integrate with other systems (including potential Wikifunctions prototypes or Konnaxion).
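
A minimal wrapper in that style, using Flask as an illustrative choice (the actual hosting setup is described in docs/hosting.md):

from flask import Flask, jsonify, request

from nlg.api import generate_bio
from semantics.types import Entity, BioFrame

app = Flask(__name__)

@app.post("/bio")
def bio_endpoint():
    payload = request.get_json()
    frame = BioFrame(
        main_entity=Entity(
            name=payload["name"],
            gender=payload.get("gender", "unknown"),
            human=True,
        ),
        primary_profession_lemmas=payload.get("professions", []),
        nationality_lemmas=payload.get("nationalities", []),
    )
    result = generate_bio(lang=payload.get("lang", "en"), bio=frame)
    return jsonify({"text": result.text, "sentences": result.sentences})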

Details are in docs/hosting.md.


6. How it is built (development approach)

The project emphasises architecture and file-level organisation.

AI is used on the builder side, to design, generate, and refactor entire files, modules, and documentation; the running system remains conventional software.

Automated tests and QA suites provide stability as the architecture evolves.

The aim is a codebase that is:

  • testable and covered by regression checks,
  • organised with clear file-level responsibilities, and
  • reusable both inside Konnaxion and alongside Abstract Wikipedia / Wikifunctions.