AI Interfaces ¶
AI Interfaces are responsible for structuring, controlling, and managing the
interaction between your agent and the LLM. While models abstract
away the details of an AI provider, AI interfaces control the conversation flow.
They can also be connected to a Runtime, which
allows them to handle the execution of code and tool calls.
What are AI interfaces? ¶
An AI Interface is both a prompt constructor and a conversation manager. Its job is to create the messages that get sent to the model, manage the history of the conversation, interpret responses, and decide when to stop or continue.
Most importantly, an Agent can use one or
several AI interfaces as part of its process. In fact, by default, the
Agent will use a
PlanningAIInterface
to generate a strategy, then hand things off to an
ExecutionAIInterface
for tool use and code execution.
Built-in AI interfaces ¶
Conatus provides several ready-to-use AI interfaces for different types of agent reasoning and action patterns:
PlainAIInterface: Handles one-shot Q&A and simple schema-extraction tasks, where there's no need for multi-step execution or tool tracking.
PlanningAIInterface: Designed for the "planning" stage. Typically uses reasoning models.
ExecutionAIInterface: Handles step-by-step execution, combining tool use and code. This is the "workhorse" for Tasks.
ReactAIInterface: Implements a ReAct-style loop. It's a more basic version of the ExecutionAIInterface.
ComputerUseAIInterface: A version of the ExecutionAIInterface that is more tailored to computer-use AI models.
The two underlying base classes ¶
There are actually two underlying classes for AI interfaces:
BaseAIInterface: The base class for all AI interfaces. It contains the core logic for managing the conversation and the prompt.
BaseAIInterfaceWithTask: The base class for all AI interfaces that are linked to a Task. It contains some extra methods that are useful for dealing with tasks (e.g. displaying the expected inputs and outputs). It also continues the run if no tool calls are made, essentially forcing the AI to call the terminate action.
Most of the time, you will not have to deal with these classes directly, but the built-in interfaces listed above inherit from them.
The conversation loop of an AI interface ¶
Every AI interface is a loop that manages prompt construction, LLM calls, and
step-by-step updating of context and messages. It's abstracted away by the
run method.
Here's the typical lifecycle of the
run method:
- Making the prompt: The interface assembles an AIPrompt, which includes the system message, user message, tools, schemas, and optionally output schemas. This is done through the make_prompt method.
- Calling the AI model: The prompt is sent to the chosen AI model, and a response is received. This can be streamed or non-streaming.
- Deciding whether to continue: Through the should_continue method, the interface decides whether to keep the loop going (e.g., is another action or observation required? Did we reach the maximum number of turns?) or to return a result.
- Generating new messages: If continuing, the interface may generate new messages (including tool response messages) and update the conversation history for the next turn. This is done through the make_new_messages method.
- Returning the result: Once finished, it returns a payload with the result, cost, finish reason and, as needed, the full runtime state.
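The lifecycle above can be sketched as a small Python loop. This is an illustrative sketch only: the method names (make_prompt, should_continue, make_new_messages, run) come from the docs, but the signatures, the payload class, and the stubbed model call are assumptions, not the real Conatus implementation.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class SketchPayload:
    # Mirrors the payload fields described in these docs; the real
    # Conatus payload class may differ.
    result: str
    finish_reason: str
    cost: float = -1  # -1 when the provider does not report cost


class SketchAIInterface:
    """Toy stand-in for an AI interface, showing the run() lifecycle."""

    def __init__(self, user_prompt: str, max_turns: int = 3):
        self.user_prompt = user_prompt
        self.max_turns = max_turns
        self.messages: list[dict[str, Any]] = []

    def make_prompt(self) -> list[dict[str, Any]]:
        # 1. Assemble system + user messages plus accumulated history.
        return [
            {"role": "system", "content": "You are a helpful agent."},
            {"role": "user", "content": self.user_prompt},
            *self.messages,
        ]

    def call_model(self, prompt: list[dict[str, Any]]) -> str:
        # 2. Stand-in for a real (possibly streamed) LLM call.
        return f"response after {len(self.messages)} prior turns"

    def should_continue(self, response: str, turn: int) -> bool:
        # 3. Keep going until the turn budget runs out. A real interface
        # would also inspect tool calls, terminate actions, etc.
        return turn + 1 < self.max_turns

    def make_new_messages(self, response: str) -> list[dict[str, Any]]:
        # 4. Fold the model's reply (and any tool results) into history.
        return [{"role": "assistant", "content": response}]

    def run(self) -> SketchPayload:
        response = ""
        for turn in range(self.max_turns):
            prompt = self.make_prompt()
            response = self.call_model(prompt)
            if not self.should_continue(response, turn):
                break
            self.messages.extend(self.make_new_messages(response))
        # 5. Return a payload with the final result.
        return SketchPayload(result=response, finish_reason="stop")


payload = SketchAIInterface("What is 2 + 2?").run()
print(payload.result)
```

The point of the sketch is the shape of the loop, not the stubbed internals: each numbered comment corresponds to one step of the lifecycle described above.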
AI interface payloads ¶
For every conversation handled by an AI interface, the interface will return a payload. Each of these payloads contains the following information:
cost: The cost of the conversation (might be -1 if the cost is not available).
finish_reason: The reason the conversation finished.
result: The result of the conversation. If an output schema was requested, the result should be an instance of the schema. Otherwise, it will be the raw text from the AI model.
response: The raw response from the AI model.
For some AI interfaces (e.g.
ExecutionAIInterface),
the payload will also contain a
state attribute,
which contains the state of the Runtime.
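The payload fields listed above can be pictured as a simple dataclass. This is a hypothetical sketch: the actual Conatus payload class name and structure are not shown in these docs, only its attributes.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class PayloadSketch:
    # Field names taken from the docs; types are assumptions.
    cost: float          # -1 when cost is not available
    finish_reason: str   # why the conversation finished
    result: Any          # schema instance, or raw text if no schema
    response: Any        # raw response from the AI model
    state: Any = None    # runtime state; only present for some
                         # interfaces, e.g. ExecutionAIInterface


payload = PayloadSketch(
    cost=-1,
    finish_reason="stop",
    result="The answer is 4.",
    response={"text": "The answer is 4."},
)
print(payload.result)
```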
Example: structured output via PlainAIInterface ¶
Below is a minimal working example demonstrating the core idea: create a one-shot Q&A session with structured output. This is tested and works as written.
from dataclasses import dataclass
from conatus.agents.ai_interfaces.plain import PlainAIInterface
@dataclass
class Macros:
calories: int
protein: float
carbs: float
fat: float
interface = PlainAIInterface(
user_prompt="Please provide the typical macros for 100g of almonds.",
output_schema=Macros,
)
result_payload = interface.run()
print(result_payload.result)
# Example output: Macros(calories=579, protein=21.0, carbs=22.0, fat=50.0)
You can use this pattern for any single-turn or simple structured output task.