Theseus / State Identification

State identification is how Theseus decides whether the current UI is:

To build a stable UI graph, Theseus needs state identifiers that:


What Is a State?

For Ariane, a state is:

A specific configuration of the user interface that is relevant for interaction and navigation.

Examples:

A state is represented by:


Fingerprints

Each observed UI tree is converted into a fingerprint. A state identifier is then derived from the fingerprint.

A typical fingerprint is composed of:

  1. Structural hash
    • Encodes the shape and roles of the UI tree:
      • Node types (button, input, menu item, etc.).
      • Hierarchical relationships.
    • Insensitive to purely visual changes (e.g., colors, fonts).
  2. Visual hash (perceptual)
    • Encodes the appearance of the screen (e.g., pHash of a screenshot).
    • Robust to small visual differences but changes when layout/content changes visibly.
  3. Optional semantic hash
    • Encodes textual content (labels, titles) using OCR or accessibility text.
    • Useful when structure is similar but text differences matter.
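
To make the structural hash more concrete, here is a minimal sketch of one way it could be computed. The UINode class, its fields, and the use of SHA-256 are assumptions made for illustration; they are not taken from Theseus's actual data model. The sketch hashes only node roles and nesting, so visual attributes such as color never reach the hash.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class UINode:
    """Hypothetical UI tree node; not Theseus's actual data model."""
    role: str                                   # e.g. "button", "input"
    style: dict = field(default_factory=dict)   # visual attributes, ignored below
    children: list = field(default_factory=list)


def hash_tree_structure(node: UINode) -> str:
    """Hash roles and hierarchy only, so style changes do not affect the result."""
    child_hashes = [hash_tree_structure(child) for child in node.children]
    # Parentheses and commas keep the serialization unambiguous about nesting.
    serialized = f"{node.role}({','.join(child_hashes)})"
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


red = UINode("window", children=[UINode("button", style={"color": "red"})])
blue = UINode("window", children=[UINode("button", style={"color": "blue"})])
other = UINode("window", children=[UINode("input")])

assert hash_tree_structure(red) == hash_tree_structure(blue)   # color ignored
assert hash_tree_structure(red) != hash_tree_structure(other)  # role differs
```

In this sketch the order of children is significant: reordering siblings changes the hash. Whether that is desirable depends on how the UI toolkit reports its tree.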

Example: Fingerprint Computation (Conceptual)

Pseudo-code, omitting implementation details:

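def compute_fingerprint(ui_tree, screenshot=None, text_tokens=None):
    """Build a fingerprint from a UI tree plus an optional screenshot and text."""
    # The structural hash is always computed; the other two are optional.
    structure_id = hash_tree_structure(ui_tree)
    visual_id = perceptual_hash(screenshot) if screenshot is not None else None
    semantic_id = hash_text_tokens(text_tokens) if text_tokens is not None else None

    return {
        "structure": structure_id,
        "visual": visual_id,
        "semantic": semantic_id,
    }

def compute_state_id(fingerprint):
    """Combine the fingerprint components into a single state identifier."""
    visual_id = fingerprint.get("visual")
    semantic_id = fingerprint.get("semantic")
    # Missing optional components become "" so that every fingerprint yields
    # the same number of parts in the same order. The check is against None
    # rather than truthiness, so a legitimate hash value of 0 is preserved.
    parts = [
        fingerprint["structure"],
        "" if visual_id is None else visual_id,
        "" if semantic_id is None else semantic_id,
    ]
    return hash_concatenate(parts)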
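
The pseudo-code leaves its helpers undefined. The following sketch shows one possible shape for the visual, semantic, and combining helpers, using only the Python standard library. Everything here is an assumption for illustration: average_hash is a simpler stand-in for pHash (it is not the DCT-based algorithm), hamming_distance is a hypothetical helper, and the bodies of hash_text_tokens and hash_concatenate are guesses at behavior the original does not specify. The image is assumed to be a grayscale grid (a list of pixel rows) at least 8 pixels wide and high.

```python
import hashlib


def average_hash(gray, size=8):
    """Average hash of a grayscale image given as a list of pixel rows.

    Downsample to size x size by block averaging, then emit one bit per
    cell: 1 if the cell is brighter than the mean of all cells.
    """
    height, width = len(gray), len(gray[0])
    cells = []
    for row in range(size):
        y0, y1 = row * height // size, (row + 1) * height // size
        for col in range(size):
            x0, x1 = col * width // size, (col + 1) * width // size
            block = [gray[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = "".join("1" if cell > mean else "0" for cell in cells)
    return f"{int(bits, 2):0{size * size // 4}x}"


def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two hex-encoded hashes."""
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def hash_text_tokens(tokens):
    """Hash text tokens, ignoring case, surrounding whitespace, order, and repeats."""
    normalized = sorted({t.strip().lower() for t in tokens if t.strip()})
    return hashlib.sha256("\x1f".join(normalized).encode("utf-8")).hexdigest()


def hash_concatenate(parts):
    """Hash a sequence of parts; length prefixes keep part boundaries distinct."""
    digest = hashlib.sha256()
    for part in parts:
        data = str(part).encode("utf-8")
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()


# A 64x64 horizontal gradient, and the same image brightened by one level.
img = [[x * 4 for x in range(64)] for _ in range(64)]
brighter = [[pixel + 1 for pixel in row] for row in img]

assert hamming_distance(average_hash(img), average_hash(brighter)) == 0
assert hash_text_tokens(["Save", "cancel "]) == hash_text_tokens(["cancel", "SAVE"])
assert hash_concatenate(["ab", "c"]) != hash_concatenate(["a", "bc"])
```

One design point to keep in mind: a perceptual hash tolerates small changes only when two hashes are compared by distance. Once it is fed into an exact hash such as hash_concatenate, a single flipped bit yields a different state identifier, so an implementation may prefer to compare the visual component separately with a distance threshold.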