King Klown Logo
King Klown& KOA

Consumers / AI Agent Integration

AI agents are a primary type of consumer for Ariane.

They use the UI graphs stored in Atlas as a reference for planning and executing actions inside existing software, turning high-level user goals into concrete sequences of UI operations.

This page describes how an AI agent can integrate with Ariane conceptually.


High-Level Flow

From an agent’s point of view, the interaction with Ariane typically follows this loop:

  1. Understand the user’s goal.
  2. Identify the current UI state.
  3. Query Atlas for available actions and intents.
  4. Plan a sequence of transitions toward the goal.
  5. Execute (or instruct) the steps in the live UI.
  6. Observe the resulting state and adjust if necessary.

Ariane supports steps 2–4 by providing structured, semantic information about the UI.


1. Goal Representation

An agent starts with a goal, often expressed in natural language:

Internally, the agent should map this to:

Ariane does not perform this mapping itself; it exposes a vocabulary of intents that the agent can align to.

See: Atlas/Ontology-Vocabulary


2. State Recognition Against Atlas

To act correctly, the agent must know which state in Atlas corresponds to the user’s current screen.

A typical process:

  1. Observe the live UI via:

    • Direct access to accessibility APIs, or
    • Screen capture + OCR + element detection.
  2. Compute or approximate fingerprints compatible with Atlas:

    • Structural representation (if a tree is accessible).
    • Visual/perceptual hash (if screenshots are available).
    • Semantic hints from labels and titles.
  3. Query Atlas:

    • “Given these fingerprint components and context (app, version, platform), what is the closest state_id?”

Atlas returns:

The agent then:


3. Inspecting Available Actions

Once the agent has a state_id, it can ask:

  1. What actions are available?

    • Query Atlas for all outgoing transitions from this state.
    • Retrieve each transition’s:
      • Action type.
      • Target element.
      • Target state.
      • Optional intent.
  2. How are actions presented in the UI?

    • For each associated element, retrieve:
      • Role (button, menu item, etc.).
      • Label text.
      • Bounds/coordinates.
      • Patterns (primaryAction, destructiveAction, etc.).

The agent now knows, for this state:


4. Planning with Intents

Given a goal intent (e.g., ExportToPDF), the agent can treat the UI graph as a planning space.

Local Decision

In many cases, the next step is local:

Example:

current_state: state_home
goal_intent: ExportToPDF

Atlas returns:
- transition: trans_home_to_export_dialog
  - intent: Export