AI Interfaces ¶
AI Interfaces are responsible for structuring, controlling, and managing the
interaction between your agent and the LLM. While models abstract
away the details of an AI provider, AI interfaces control the conversation flow.
They can also be connected to a Runtime, which
allows them to handle the execution of code and tool calls.
What are AI interfaces? ¶
An AI Interface is both a prompt constructor and a conversation manager. Its job is to create the messages that get sent to the model, manage the history of the conversation, interpret responses, and decide when to stop or continue.
Most importantly, an Agent can use one or
several AI interfaces as part of its process. In fact, by default, the
Agent will use a
PlanningAIInterface
to generate a strategy, then hand things off to an
ExecutionAIInterface
for tool use and code execution.
Built-in AI interfaces ¶
Conatus provides several ready-to-use AI interfaces for different types of agent reasoning and action patterns:
PlainAIInterface: Handles one-shot Q&A and simple schema-extraction tasks, where there's no need for multi-step execution or tool tracking.
PlanningAIInterface: Designed for the "planning" stage. Typically uses reasoning models.
ExecutionAIInterface: Handles step-by-step execution, combining tool use and code. This is the "workhorse" for Tasks.
ReactAIInterface: Implements a ReAct-style loop. It's a more basic version of the ExecutionAIInterface.
ComputerUseAIInterface: A version of the ExecutionAIInterface that is more tailored to computer-use AI models.
The two underlying base classes ¶
There are actually two underlying classes for AI interfaces:
BaseAIInterface: The base class for all AI interfaces. It contains the core logic for managing the conversation and the prompt.
BaseAIInterfaceWithTask: The base class for all AI interfaces that are linked to a Task. It contains some extra methods that are useful for dealing with tasks (e.g. displaying the expected inputs and outputs). It also continues the run if no tool calls are made, essentially forcing the AI to call the terminate action.
Most of the time, you will not have to deal with these classes directly, but the built-in interfaces listed above inherit from them.
The conversation loop of an AI interface ¶
Every AI interface is a loop that manages prompt construction, LLM calls, and
step-by-step updating of context and messages. It's abstracted away by the
run method.
Here's the typical lifecycle of the
run method:
- Making the prompt: The interface assembles an AIPrompt, which includes the system message, user message, tools, schemas, and optionally output schemas. This is done through the make_prompt method.
- Calling the AI model: The prompt is sent to the chosen AI model, and a response is received. This can be streamed or non-streaming.
- Deciding whether to continue: Through the should_continue method, the interface decides whether to keep the loop going (e.g., is another action or observation required? Did we reach the maximum number of turns?) or to return a result.
- Generating new messages: If continuing, the interface may generate new messages (including tool response messages) and update the conversation history for the next turn. This is done through the make_new_messages method.
- Returning the result: Once finished, it returns a payload with the result, cost, finish reason and, as needed, the full runtime state.
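The lifecycle above can be sketched as a small Python loop. This is an illustrative sketch only: the method names (make_prompt, should_continue, make_new_messages, run) come from the docs, but the signatures, the payload class, and the stubbed model call are assumptions, not the real Conatus implementation.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class SketchPayload:
    # Mirrors the payload fields described in these docs; the real
    # Conatus payload class may differ.
    result: str
    finish_reason: str
    cost: float = -1  # -1 when the provider does not report cost


class SketchAIInterface:
    """Toy stand-in for an AI interface, showing the run() lifecycle."""

    def __init__(self, user_prompt: str, max_turns: int = 3):
        self.user_prompt = user_prompt
        self.max_turns = max_turns
        self.messages: list[dict[str, Any]] = []

    def make_prompt(self) -> list[dict[str, Any]]:
        # 1. Assemble system + user messages plus accumulated history.
        return [
            {"role": "system", "content": "You are a helpful agent."},
            {"role": "user", "content": self.user_prompt},
            *self.messages,
        ]

    def call_model(self, prompt: list[dict[str, Any]]) -> str:
        # 2. Stand-in for a real (possibly streamed) LLM call.
        return f"response after {len(self.messages)} prior turns"

    def should_continue(self, response: str, turn: int) -> bool:
        # 3. Keep going until the turn budget runs out. A real interface
        # would also inspect tool calls, terminate actions, etc.
        return turn + 1 < self.max_turns

    def make_new_messages(self, response: str) -> list[dict[str, Any]]:
        # 4. Fold the model's reply (and any tool results) into history.
        return [{"role": "assistant", "content": response}]

    def run(self) -> SketchPayload:
        response = ""
        for turn in range(self.max_turns):
            prompt = self.make_prompt()
            response = self.call_model(prompt)
            if not self.should_continue(response, turn):
                break
            self.messages.extend(self.make_new_messages(response))
        # 5. Return a payload with the final result.
        return SketchPayload(result=response, finish_reason="stop")


payload = SketchAIInterface("What is 2 + 2?").run()
print(payload.result)
```

The point of the sketch is the shape of the loop, not the stubbed internals: each numbered comment corresponds to one step of the lifecycle described above.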
AI interface payloads ¶
For every conversation handled by an AI interface, the interface will return a payload. Each of these payloads contains the following information:
cost: The cost of the conversation (might be -1 if the cost is not available).
finish_reason: The reason the conversation finished.
result: The result of the conversation. If an output schema was requested, the result should be an instance of the schema. Otherwise, it will be the raw text from the AI model.
response: The raw response from the AI model.
For some AI interfaces (e.g.
ExecutionAIInterface),
the payload will also contain a
state attribute,
which contains the state of the Runtime.
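The payload fields listed above can be pictured as a simple dataclass. This is a hypothetical sketch: the actual Conatus payload class name and structure are not shown in these docs, only its attributes.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class PayloadSketch:
    # Field names taken from the docs; types are assumptions.
    cost: float          # -1 when cost is not available
    finish_reason: str   # why the conversation finished
    result: Any          # schema instance, or raw text if no schema
    response: Any        # raw response from the AI model
    state: Any = None    # runtime state; only present for some
                         # interfaces, e.g. ExecutionAIInterface


payload = PayloadSketch(
    cost=-1,
    finish_reason="stop",
    result="The answer is 4.",
    response={"text": "The answer is 4."},
)
print(payload.result)
```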
Example: structured output via PlainAIInterface ¶
Below is a minimal working example demonstrating the core idea: create a one-shot Q&A session with structured output. This is tested and works as written.
from dataclasses import dataclass
from conatus.agents.ai_interfaces.plain import PlainAIInterface
@dataclass
class Macros:
calories: int
protein: float
carbs: float
fat: float
interface = PlainAIInterface(
user_prompt="Please provide the typical macros for 100g of almonds.",
output_schema=Macros,
)
result_payload = interface.run()
print(result_payload.result)
# Example output: Macros(calories=579, protein=21.0, carbs=22.0, fat=50.0)
You can use this pattern for any single-turn or simple structured output task.