AI Agentic Engine: Data Dictionary
This document defines the schema, logic, and purpose for every entity in the AI Engine. It is designed to support the Tri-State Architecture (Pipeline, Orchestrator, Persona), JIT Execution, and Strict Compliance Versioning.
1. Core Assets (The Building Blocks)
Engine_KnowledgeBase
Represents a collection of documents used for RAG (Retrieval Augmented Generation).
id (UUID): Primary Key.
name (String): Display name (e.g., "AML Policy Documents").
vector_provider (Enum): The backend service hosting the vectors (e.g., PINECONE, QDRANT, PGVECTOR).
embedding_model (String): The specific model used to embed the text (e.g., text-embedding-3-small). Critical: You cannot mix models in one index.
retrieval_config (JSON): Default settings for querying this base (e.g., { "top_k": 5, "threshold": 0.7 }).
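As an illustration of how retrieval_config might be applied at query time, the sketch below filters scored chunks by threshold and keeps the top_k best. Only top_k and threshold come from this dictionary; the ScoredChunk shape, the applyRetrievalConfig helper, and the assumption that scores are similarities in [0, 1] (higher is better) are hypothetical.

```typescript
// Hypothetical shapes: only top_k and threshold are defined by the dictionary.
interface RetrievalConfig {
  top_k: number;
  threshold: number;
}

interface ScoredChunk {
  text: string;
  score: number; // assumed similarity in [0, 1], higher is better
}

// Drop chunks below the threshold, then return the top_k best-scoring ones.
function applyRetrievalConfig(chunks: ScoredChunk[], cfg: RetrievalConfig): ScoredChunk[] {
  return chunks
    .filter((c) => c.score >= cfg.threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, cfg.top_k);
}

const retrievalConfig: RetrievalConfig = { top_k: 2, threshold: 0.7 };
const hits = applyRetrievalConfig(
  [
    { text: "KYC refresh cycle", score: 0.91 },
    { text: "Office seating plan", score: 0.42 },
    { text: "Sanctions screening", score: 0.78 },
    { text: "PEP escalation", score: 0.74 },
  ],
  retrievalConfig,
);
console.log(hits.map((h) => h.text)); // ["KYC refresh cycle", "Sanctions screening"]
```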
Engine_Document
Represents a single file inside a Knowledge Base.
id (UUID): Primary Key.
kb_id (UUID): Foreign Key to Engine_KnowledgeBase.
source_url (String): The S3 path or original URL of the uploaded file.
status (Enum): INDEXING | READY | ERROR. Used to show UI spinners.
token_count (Int): Used for billing storage costs.
Engine_Tool (Container)
The parent container for a capability. Does not hold code, only identity.
id (UUID): Primary Key.
name (String): Unique identifier (e.g., company_house_search).
type (Enum):
    CODE: Internal JavaScript/TypeScript executed in a sandbox.
    API: External HTTP call defined by an OpenAPI spec.
    MCP: Connection to a Model Context Protocol server.
Engine_ToolVersion (Immutable Snapshot)
The actual logic of a tool. Versioned to prevent breaking flows.
id (UUID): Primary Key.
tool_id (UUID): Foreign Key to Engine_Tool.
version (Int): Sequential version number.
status (Enum): DRAFT (mutable) | PUBLISHED (immutable).
input_schema (JSON): Zod/JSON Schema defining arguments (e.g., { "company_number": "string" }). Passed to the LLM for function calling.
source_code (String): The executable logic (if type=CODE).
api_spec (String): The OpenAPI JSON (if type=API).
mcp_endpoint (String): The URL of the MCP server (if type=MCP).
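The DRAFT/PUBLISHED rule can be sketched as follows. This is a minimal illustration, not the engine's implementation: the ToolVersion interface is a trimmed-down subset of the fields above, and editSourceCode, publishVersion, and forkDraft are hypothetical helper names.

```typescript
type VersionStatus = "DRAFT" | "PUBLISHED";

// Trimmed-down subset of Engine_ToolVersion, for illustration only.
interface ToolVersion {
  tool_id: string;
  version: number;
  status: VersionStatus;
  source_code?: string;
}

// Edits are only allowed while the version is still a DRAFT.
function editSourceCode(v: ToolVersion, source_code: string): ToolVersion {
  if (v.status === "PUBLISHED") {
    throw new Error(`Version ${v.version} is published and immutable`);
  }
  return { ...v, source_code };
}

// Publishing freezes the snapshot so flows pinned to it cannot break.
function publishVersion(v: ToolVersion): ToolVersion {
  return Object.freeze({ ...v, status: "PUBLISHED" as const });
}

// Changing a published tool means starting a new DRAFT with the next number.
function forkDraft(v: ToolVersion): ToolVersion {
  return { ...v, version: v.version + 1, status: "DRAFT" };
}

const toolV1 = publishVersion(
  editSourceCode({ tool_id: "t1", version: 1, status: "DRAFT" }, "return 1;"),
);
const toolV2 = forkDraft(toolV1);
console.log(toolV1.status, toolV2.version, toolV2.status); // PUBLISHED 2 DRAFT
```

The same pattern would apply to Engine_AgentVersion and Engine_FlowVersion, which use the same status values.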
Engine_ToolSecret
Maps abstract keys in the code to environment variables.
id (UUID): Primary Key.
tool_version_id (UUID): Foreign Key to Engine_ToolVersion.
key_reference (String): The variable name used in the code (e.g., API_KEY). The runner looks this up in the secure vault at runtime.
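A rough sketch of the runtime lookup described above. The Vault type is a plain in-memory stand-in for whatever secure vault the runner actually uses, and resolveSecrets is a hypothetical helper; the point is only that the table stores key names, never values.

```typescript
interface ToolSecret {
  tool_version_id: string;
  key_reference: string;
}

// Stand-in for the secure vault; a real runner would query its secret store here.
type Vault = Record<string, string>;

// Build the key -> value map a sandboxed tool would receive at runtime.
function resolveSecrets(secrets: ToolSecret[], toolVersionId: string, vault: Vault): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const s of secrets) {
    if (s.tool_version_id !== toolVersionId) continue;
    const value = vault[s.key_reference];
    if (value === undefined) {
      throw new Error(`Missing secret: ${s.key_reference}`);
    }
    resolved[s.key_reference] = value;
  }
  return resolved;
}

const secretRows: ToolSecret[] = [
  { tool_version_id: "tv-1", key_reference: "API_KEY" },
  { tool_version_id: "tv-2", key_reference: "OTHER_KEY" },
];
const toolEnv = resolveSecrets(secretRows, "tv-1", { API_KEY: "example-value" });
console.log(Object.keys(toolEnv)); // ["API_KEY"]
```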
2. Agent Definition (The "Brain")
Engine_Agent (Container)
The parent identity of an AI Worker (e.g., "Kira").
id (UUID): Primary Key.
name (String): Display name.
Engine_AgentVersion (Immutable Snapshot)
The configuration of the AI at a specific point in time.
id (UUID): Primary Key.
agent_id (UUID): Foreign Key to Engine_Agent.
version (Int): Sequential version number.
status (Enum): DRAFT | PUBLISHED.
builder_type (Enum): FORM (Simple) | FLOW (Built via Type C Flow).
source_flow_id (UUID): If builder_type=FLOW, links to the definition.
model_provider (String): e.g., OPENAI, ANTHROPIC.
model_name (String): e.g., gpt-4-turbo.
temperature (Float): 0.0 to 1.0. Controls randomness.
max_tokens (Int): Hard limit on output length.
system_prompt (String): The compiled instructions.
knowledge_base_id (UUID): Optional link to RAG data.
response_format (JSON): JSON Schema for Structured Output (e.g., force the agent to return valid JSON).
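The field constraints above (temperature range, the FLOW/source_flow_id dependency, max_tokens as a limit) could be checked before a version is saved. The sketch below is one possible validator; validateAgentVersion and the AgentVersionDraft subset are hypothetical, and the positive-integer rule for max_tokens is an assumption.

```typescript
// Subset of Engine_AgentVersion fields that carry explicit constraints.
interface AgentVersionDraft {
  builder_type: "FORM" | "FLOW";
  source_flow_id?: string;
  temperature: number;
  max_tokens: number;
}

// Returns a list of human-readable problems; an empty list means valid.
function validateAgentVersion(v: AgentVersionDraft): string[] {
  const errors: string[] = [];
  if (v.builder_type === "FLOW" && !v.source_flow_id) {
    errors.push("source_flow_id is required when builder_type is FLOW");
  }
  if (!(v.temperature >= 0 && v.temperature <= 1)) {
    errors.push("temperature must be between 0.0 and 1.0");
  }
  // Assumption: max_tokens must be a positive whole number.
  if (!(v.max_tokens > 0) || Math.floor(v.max_tokens) !== v.max_tokens) {
    errors.push("max_tokens must be a positive integer");
  }
  return errors;
}

const agentProblems = validateAgentVersion({
  builder_type: "FLOW",
  temperature: 1.4,
  max_tokens: 1024,
});
console.log(agentProblems.length); // 2
```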
Engine_AgentToolLink
Many-to-Many relationship defining which tools an agent can use.
agent_version_id (UUID): FK.
tool_version_id (UUID): FK.
3. Flow Definition (The Architecture)
Engine_Flow (Container)
The parent container for a process (e.g., "Vera Due Diligence").
id (UUID): Primary Key.
name (String): Display name.
type (Enum):
    PIPELINE (Type A): Linear, no loops.
    ORCHESTRATOR (Type B): State machine, loops allowed.
    PERSONA (Type C): Compiles to System Prompt.
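The "no loops" rule for PIPELINE flows can be enforced with a standard depth-first cycle check over the flow's edges. The sketch below shows one way to do that; hasCycle and validateFlowGraph are hypothetical names, and the check covers loops only, not any stricter reading of "linear" such as forbidding branches.

```typescript
type FlowType = "PIPELINE" | "ORCHESTRATOR" | "PERSONA";

interface EdgeRef {
  source_node_id: string;
  target_node_id: string;
}

// Depth-first search: meeting a node that is still being visited means a loop.
function hasCycle(edges: EdgeRef[]): boolean {
  const next: Record<string, string[]> = {};
  for (const e of edges) {
    (next[e.source_node_id] = next[e.source_node_id] || []).push(e.target_node_id);
  }
  const state: Record<string, "visiting" | "done"> = {};
  const visit = (id: string): boolean => {
    if (state[id] === "visiting") return true;
    if (state[id] === "done") return false;
    state[id] = "visiting";
    for (const target of next[id] || []) {
      if (visit(target)) return true;
    }
    state[id] = "done";
    return false;
  };
  return Object.keys(next).some((id) => visit(id));
}

// Only PIPELINE flows are rejected for loops; ORCHESTRATOR flows may contain them.
function validateFlowGraph(type: FlowType, edges: EdgeRef[]): string[] {
  return type === "PIPELINE" && hasCycle(edges)
    ? ["PIPELINE flows must not contain loops"]
    : [];
}

const loopEdges: EdgeRef[] = [
  { source_node_id: "a", target_node_id: "b" },
  { source_node_id: "b", target_node_id: "a" },
];
console.log(validateFlowGraph("PIPELINE", loopEdges).length); // 1
console.log(validateFlowGraph("ORCHESTRATOR", loopEdges).length); // 0
```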
Engine_FlowTrigger
Defines how a flow starts.
id (UUID): Primary Key.
flow_id (UUID): FK.
type (Enum): WEBHOOK | SCHEDULE | MANUAL | EVENT.
cron_expression (String): Standard Cron syntax (e.g., 0 9 * * 1).
webhook_slug (String): The URL path component (e.g., /api/hooks/v1/{slug}).
input_mapping (JSON): Maps incoming webhook payload fields to Flow Input Variables.
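The dictionary does not specify the format of input_mapping. As one plausible reading, the sketch below assumes it maps each Flow Input Variable name to a dot-separated path into the webhook payload; that format, and the getPath and mapWebhookPayload helpers, are assumptions for illustration.

```typescript
// Assumed format: { flowInputVariable: "dot.path.into.payload" }
type InputMapping = Record<string, string>;

// Walk a dot-separated path through nested objects; undefined if any step is missing.
function getPath(obj: unknown, path: string): unknown {
  return path.split(".").reduce<unknown>((cur, key) => {
    if (cur !== null && typeof cur === "object") {
      return (cur as Record<string, unknown>)[key];
    }
    return undefined;
  }, obj);
}

// Build the flow's input object from an incoming webhook payload.
function mapWebhookPayload(payload: unknown, mapping: InputMapping): Record<string, unknown> {
  const inputs: Record<string, unknown> = {};
  for (const flowVar of Object.keys(mapping)) {
    inputs[flowVar] = getPath(payload, mapping[flowVar]);
  }
  return inputs;
}

const flowInputs = mapWebhookPayload(
  { data: { company: { number: "01234567" } }, sent_by: "crm" },
  { company_number: "data.company.number", source: "sent_by" },
);
console.log(flowInputs); // { company_number: "01234567", source: "crm" }
```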
Engine_FlowVariable
Environment-specific configuration constants.
id (UUID): Primary Key.
flow_id (UUID): FK.
key (String): Variable name (e.g., RISK_THRESHOLD).
value (String): The value (e.g., 80).
environment (Enum): DEV | PROD. Allows testing different settings without changing the graph.
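To show how the environment column lets the same graph run with different settings, here is a minimal sketch of variable resolution. resolveVariables is a hypothetical helper and the sample values are made up; note that value is stored as a String, so numeric settings need parsing by the consumer.

```typescript
type FlowEnvironment = "DEV" | "PROD";

interface FlowVariable {
  flow_id: string;
  key: string;
  value: string;
  environment: FlowEnvironment;
}

// Collect the key/value pairs that apply to one flow in one environment.
function resolveVariables(rows: FlowVariable[], flowId: string, env: FlowEnvironment): Record<string, string> {
  const vars: Record<string, string> = {};
  for (const row of rows) {
    if (row.flow_id === flowId && row.environment === env) {
      vars[row.key] = row.value;
    }
  }
  return vars;
}

const variableRows: FlowVariable[] = [
  { flow_id: "f1", key: "RISK_THRESHOLD", value: "50", environment: "DEV" },
  { flow_id: "f1", key: "RISK_THRESHOLD", value: "80", environment: "PROD" },
];
const prodVars = resolveVariables(variableRows, "f1", "PROD");
// Values are strings, so convert before comparing against a score.
console.log(Number(prodVars["RISK_THRESHOLD"])); // 80
```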
Engine_FlowVersion (The Executable Graph)
id (UUID): Primary Key.
flow_id (UUID): FK.
version (Int): Sequential number.
status (Enum): DRAFT | PUBLISHED.
input_schema (JSON): Defines what data is required to start this flow. Used to auto-generate UI forms.
output_schema (JSON): Defines the guaranteed shape of the final result.
compiled_graph (JSON): The optimized adjacency list used by the JIT Runner. Strips out UI metadata.
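The exact layout of compiled_graph is not given here. The sketch below shows one way an adjacency list could be derived from node and edge rows while dropping UI-only fields such as the node label; the CompiledGraph shape and the compileGraph helper are assumptions, not the engine's actual format.

```typescript
interface DraftNode {
  id: string;
  type: string;
  label: string; // UI-only, dropped during compilation
}

interface DraftEdge {
  source_node_id: string;
  target_node_id: string;
  type: string;
  condition_value?: string;
}

interface CompiledNode {
  type: string;
  next: { target: string; type: string; condition_value: string | null }[];
}

// Assumed layout: node ID -> node type plus its outgoing edges.
type CompiledGraph = Record<string, CompiledNode>;

function compileGraph(nodes: DraftNode[], edges: DraftEdge[]): CompiledGraph {
  const graph: CompiledGraph = {};
  for (const n of nodes) {
    graph[n.id] = { type: n.type, next: [] }; // label is intentionally not copied
  }
  for (const e of edges) {
    const source = graph[e.source_node_id];
    if (!source || !graph[e.target_node_id]) {
      throw new Error(`Edge ${e.source_node_id} -> ${e.target_node_id} references an unknown node`);
    }
    source.next.push({
      target: e.target_node_id,
      type: e.type,
      condition_value: e.condition_value ?? null,
    });
  }
  return graph;
}

const compiled = compileGraph(
  [
    { id: "a", type: "AGENT", label: "Summarise filing" },
    { id: "b", type: "LOGIC", label: "Score gate" },
  ],
  [{ source_node_id: "a", target_node_id: "b", type: "DEFAULT" }],
);
console.log(JSON.stringify(compiled.a));
// {"type":"AGENT","next":[{"target":"b","type":"DEFAULT","condition_value":null}]}
```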
Engine_Node (The Step)
A single unit of work in the graph.
id (UUID): Primary Key.
version_id (UUID): FK to FlowVersion.
label (String): UI Label.
type (Enum):
    AGENT: Calls an LLM.
    TOOL: Calls a specific tool directly.
    SUBFLOW: Triggers another Flow.
    ROUTER: Semantic (LLM) decision.
    LOGIC: Deterministic (Code) decision.
    HANDOVER: Pauses for user interaction.
    WEBHOOK_WAIT: Pauses for external event.
ref_agent_version_id (UUID): FK. If set, uses a specific version of a reusable Agent.
ref_tool_version_id (UUID): FK. If set, uses a specific version of a reusable Tool.
inline_system_prompt (String): If no reference, defines the prompt here.
router_rules (JSON): For Semantic Routers. Maps intents to Target Node IDs.
logic_conditions (JSON): For Logic Gates. JS expressions (e.g., input.score > 50).
wait_event_slug (String): For WEBHOOK_WAIT. The event name to listen for.
input_map (JSON): Maps outputs from previous nodes to inputs of this node (e.g., {{node_1.output}}).
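The {{node_1.output}} placeholder in input_map suggests a template lookup against earlier node results. The sketch below is one possible resolver, assuming placeholders are dot-separated paths into a map keyed by node ID; lookupOutput, resolveInputMap, and the rule that a value consisting of a single placeholder keeps its raw type are all assumptions.

```typescript
type NodeOutputs = Record<string, unknown>;

// Follow a dot-separated path such as "node_1.output.score".
function lookupOutput(outputs: NodeOutputs, path: string): unknown {
  let cur: unknown = outputs;
  for (const key of path.split(".")) {
    if (cur === null || typeof cur !== "object") return undefined;
    cur = (cur as Record<string, unknown>)[key];
  }
  return cur;
}

function resolveInputMap(inputMap: Record<string, string>, outputs: NodeOutputs): Record<string, unknown> {
  const resolved: Record<string, unknown> = {};
  for (const key of Object.keys(inputMap)) {
    const template = inputMap[key];
    const whole = /^\{\{\s*([\w.]+)\s*\}\}$/.exec(template);
    resolved[key] = whole
      ? lookupOutput(outputs, whole[1]) // single placeholder: keep the raw value
      : template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (_match, path: string) =>
          String(lookupOutput(outputs, path)), // embedded placeholder: interpolate as text
        );
  }
  return resolved;
}

const nodeInputs = resolveInputMap(
  { score: "{{node_1.output.score}}", note: "Score was {{node_1.output.score}}" },
  { node_1: { output: { score: 82 } } },
);
console.log(nodeInputs); // { score: 82, note: "Score was 82" }
```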
Engine_Edge (The Connection)
id (UUID): Primary Key.
source_node_id (UUID): FK.
target_node_id (UUID): FK.
type (Enum):
    DEFAULT: Standard path.
    SEMANTIC: Chosen by LLM Router.
    CONDITIONAL: Chosen by Logic Gate.
    ERROR: Followed if the Source Node fails/crashes.
condition_value (String): The value that triggers this path (e.g., "High Risk" or "true").
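Edge selection after a node finishes could look roughly like the sketch below. The precedence used here (ERROR edge on failure, then a SEMANTIC or CONDITIONAL edge whose condition_value matches the node's decision, then DEFAULT) is an assumed reading of the edge types, and NodeResult and nextEdge are hypothetical names.

```typescript
type EdgeType = "DEFAULT" | "SEMANTIC" | "CONDITIONAL" | "ERROR";

interface Edge {
  id: string;
  source_node_id: string;
  target_node_id: string;
  type: EdgeType;
  condition_value?: string;
}

// Hypothetical summary of how a node ended.
interface NodeResult {
  failed: boolean;
  decision?: string; // router intent or logic-gate result, as a string
}

function nextEdge(edges: Edge[], nodeId: string, result: NodeResult): Edge | undefined {
  const outgoing = edges.filter((e) => e.source_node_id === nodeId);
  if (result.failed) {
    return outgoing.find((e) => e.type === "ERROR");
  }
  const matched = outgoing.find(
    (e) =>
      (e.type === "SEMANTIC" || e.type === "CONDITIONAL") &&
      result.decision !== undefined &&
      e.condition_value === result.decision,
  );
  // Fall back to the standard path when no condition matches.
  return matched || outgoing.find((e) => e.type === "DEFAULT");
}

const sampleEdges: Edge[] = [
  { id: "e1", source_node_id: "n1", target_node_id: "n2", type: "CONDITIONAL", condition_value: "true" },
  { id: "e2", source_node_id: "n1", target_node_id: "n3", type: "DEFAULT" },
  { id: "e3", source_node_id: "n1", target_node_id: "n4", type: "ERROR" },
];
console.log(nextEdge(sampleEdges, "n1", { failed: false, decision: "true" })?.target_node_id); // n2
console.log(nextEdge(sampleEdges, "n1", { failed: false, decision: "false" })?.target_node_id); // n3
console.log(nextEdge(sampleEdges, "n1", { failed: true })?.target_node_id); // n4
```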
4. Execution (The Runtime)
Engine_Execution (The Job)
id (UUID): Primary Key.
flow_version_id (UUID): FK. Links to the specific version used.
trigger_id (UUID): FK. Which trigger started this?
status (Enum): RUNNING | SLEEPING | PAUSED | COMPLETED | FAILED.
session_id (String): External ID for grouping (e.g., User Session).
global_memory (JSON): The "State Object." Accumulates data as the flow runs.
Engine_StepState (The Checkpoint)
id (UUID): Primary Key.
execution_id (UUID): FK.
current_node_id (String): The ID of the node that just finished or is about to run.
status (Enum): ACTIVE | AWAITING_CALLBACK | AWAITING_USER.
node_outputs (JSON): A map of every node ID to its result. Used for "Time Travel" debugging.
stack_trace (JSON): If failed, the error stack.
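One way to picture a checkpoint write is sketched below. The mapping from node type to checkpoint status (HANDOVER to AWAITING_USER, WEBHOOK_WAIT to AWAITING_CALLBACK, everything else ACTIVE) is an assumption inferred from the enum names rather than something this dictionary states, and statusFor and writeCheckpoint are hypothetical helpers. Each checkpoint is built as a new object, so earlier snapshots stay intact for "Time Travel" inspection.

```typescript
type NodeType = "AGENT" | "TOOL" | "SUBFLOW" | "ROUTER" | "LOGIC" | "HANDOVER" | "WEBHOOK_WAIT";
type StepStatus = "ACTIVE" | "AWAITING_CALLBACK" | "AWAITING_USER";

interface StepState {
  execution_id: string;
  current_node_id: string;
  status: StepStatus;
  node_outputs: Record<string, unknown>;
}

// Assumed mapping from node type to checkpoint status.
function statusFor(type: NodeType): StepStatus {
  if (type === "HANDOVER") return "AWAITING_USER";
  if (type === "WEBHOOK_WAIT") return "AWAITING_CALLBACK";
  return "ACTIVE";
}

// Produce the next checkpoint without mutating the previous one.
function writeCheckpoint(prev: StepState, nodeId: string, nodeType: NodeType, output: unknown): StepState {
  return {
    ...prev,
    current_node_id: nodeId,
    status: statusFor(nodeType),
    node_outputs: { ...prev.node_outputs, [nodeId]: output },
  };
}

const checkpoint0: StepState = { execution_id: "ex-1", current_node_id: "start", status: "ACTIVE", node_outputs: {} };
const checkpoint1 = writeCheckpoint(checkpoint0, "n1", "AGENT", { summary: "ok" });
const checkpoint2 = writeCheckpoint(checkpoint1, "n2", "HANDOVER", null);
console.log(checkpoint2.status, Object.keys(checkpoint2.node_outputs)); // AWAITING_USER ["n1", "n2"]
```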
Engine_RuntimeArtifact (Generated Files)
id (UUID): Primary Key.
execution_id (UUID): FK.
node_id (UUID): FK. Which node created this file?
filename (String): e.g., due_diligence_report.pdf.
storage_path (String): S3 Key.
mime_type (String): e.g., application/pdf.
Engine_StepLog (Billing & Debugging)
id (UUID): Primary Key.
execution_id (UUID): FK.
node_id (UUID): FK.
input_tokens (Int): Tokens sent to LLM.
output_tokens (Int): Tokens received from LLM.
cost_usd (Float): Calculated cost of this step.
duration_ms (Float): Execution time.
provider_response_id (String): OpenAI/Anthropic Request ID (for tracing).
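How cost_usd is calculated is not specified. A common approach, sketched below, multiplies token counts by per-1,000-token prices and sums the step logs per execution. The Pricing shape, the helper names, and the prices in the example are made up for illustration and are not real provider rates.

```typescript
// Hypothetical price table entry; real rates depend on the provider and model.
interface Pricing {
  input_per_1k_usd: number;
  output_per_1k_usd: number;
}

interface StepLog {
  execution_id: string;
  input_tokens: number;
  output_tokens: number;
  cost_usd: number;
}

// Cost of one step from its token counts.
function stepCostUsd(input_tokens: number, output_tokens: number, p: Pricing): number {
  return (input_tokens / 1000) * p.input_per_1k_usd + (output_tokens / 1000) * p.output_per_1k_usd;
}

// Total billed cost for one execution across all of its step logs.
function executionCostUsd(logs: StepLog[], executionId: string): number {
  return logs
    .filter((l) => l.execution_id === executionId)
    .reduce((sum, l) => sum + l.cost_usd, 0);
}

const examplePricing: Pricing = { input_per_1k_usd: 0.01, output_per_1k_usd: 0.03 };
const exampleCost = stepCostUsd(2000, 500, examplePricing);
console.log(exampleCost.toFixed(3)); // "0.035"
```

Because cost_usd is a Float, sums over many steps can pick up small rounding errors; rounding at display time, as toFixed does above, is one way to hide that.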
5. Interaction (Handover)
Engine_ChatSession
id (UUID): Primary Key.
execution_id (UUID): FK.
mode (Enum):
    CO_PILOT: Iterative. User can trigger re-runs.
    ANALYST: Consultative. Read-only report, Q&A only.
is_active (Boolean): True if the session is currently open.
Engine_ChatMessage
id (UUID): Primary Key.
session_id (UUID): FK.
role (Enum): USER | ASSISTANT | SYSTEM | TOOL.
content (String): The message text.
tool_calls (JSON): If the agent used a tool (e.g., "Re-run Node"), the call details are stored here.