AI Messages: Inputs / Outputs Data Structures

Because Conatus works across multiple AI providers, it needs to be able to represent all AI inputs and outputs. This document covers the core data structures for everything you send to the AI model, everything you get back, and all the objects that appear in between (such as messages, content parts, tool calls, and usage).

Summary of the main data structures

Concept          | Class(es)                      | Purpose
-----------------|--------------------------------|--------------------------------------------------------------
Input prompt     | AIPrompt                       | The entire user input to an AI model
Messages         | AIMessage, UserAIMessage, etc. | Individual utterances, both in and out
Content parts    | UserAIMessageContentPart       | Structure of (possibly multi-media) message content
Tool calls       | AIToolCall, ComputerUseAction  | Requests from the AI to execute code (functions, tools)
Output/response  | AIResponse                     | Everything the AI model returns (messages, tool calls, etc.)
Streaming (WIP)  | Incomplete*                    | For partial/resumable/streamed results
Usage/cost       | CompletionUsage                | Tracks tokens, model name, and pricing

Unsupported (for now)

The following concepts are not yet supported:

  • Audio messages (inputs or outputs)
  • Image outputs (inputs are fine)
  • Citations / references
  • File uploads

Visual overview

This diagram shows the hierarchy of the main classes:

(Diagram: Message class hierarchy)

Accordingly, we'll start by looking at the AIPrompt and AIResponse classes (in red), since they are the main data structures you'll be using.

We'll then go one level down to the various AIMessage subclasses (in green).

Finally, we'll look at the UserAIMessageContentPart (in orange) and AssistantAIMessageContentPart (in blue) classes, which are the building blocks of message content.

We'll also look at the AIToolCall and ComputerUseAction classes (in gray), which are the building blocks of tool calls.

The Input: AIPrompt Structure

(Once again, we're talking here about the classes in red in the image above.)

The AIPrompt class is the single entry point for everything you send to the AI model. It is designed to be flexible and extensible, handling:

  • Message histories (previous_messages, new_messages)
  • System/developer messages (system)
  • Tool/function call specifications (tools, actions). These are converted to an AIToolSpecification object, which is essentially a wrapper around a Pydantic JSON schema.
  • Output schemas/structured output formats
  • "Computer use" configuration (for browser or simulated agent interfaces)
from conatus import AIPrompt, action

@action
def my_adder_function(a: int, b: int) -> int:
    return a + b

prompt = AIPrompt(
    user="What's the sum of 2 and 2?",
    system="Be concise.",
    actions=[my_adder_function],  # For tool use
    output_schema=int,            # Request structured output
)

There are three ways to initialize the AIPrompt class:

  1. Simple, new conversation: Pass a string to user. This will create a new conversation with a single message.
  2. Conversation with multiple messages: Pass a list of messages to messages. This will create a new conversation with the given messages.
  3. Conversation with history: You can distinguish between previous and new messages. This is helpful because some AI providers (like OpenAI's Responses API) allow you to send only the new messages, as long as you provide the ID of the previous messages. In this case, you need to pass (1) a list of new_messages, along with (2) previous_messages, previous_messages_id, or both.

In each case, you can optionally pass a system message with system, a list of tool specifications with tools, a list of actions with actions, and an output schema with output_schema.
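
To make the three initialization options concrete, here is a minimal sketch (the imports follow the examples later on this page):

from conatus import AIPrompt
from conatus.models.inputs_outputs.messages import UserAIMessage, AssistantAIMessage

# 1. Simple, new conversation
p1 = AIPrompt(user="Hello!")

# 2. Conversation with multiple messages
p2 = AIPrompt(messages=[
    UserAIMessage(content="What's 5+7?"),
    AssistantAIMessage.from_text("12"),
])

# 3. Conversation with history: only the new messages are sent,
#    alongside the ID of the previous messages
p3 = AIPrompt(
    new_messages=[UserAIMessage(content="What about 8+9?")],
    previous_messages_id="resp_123",
)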

For more information, see AIPrompt's API reference.

Example: Passing actions as tools

from conatus import action, AIPrompt

@action
def add(a: int, b: int) -> int:
    return a + b

prompt = AIPrompt(
    user="Add 100 and 23.",
    actions=[add],
    system="Use tools if needed.",
)

Example: Structured output

from conatus import AIPrompt
from typing_extensions import TypedDict

class MenuItem(TypedDict):
    name: str
    price: float

prompt = AIPrompt(
    user="Give me 3 coffee drinks with prices.",
    output_schema=list[MenuItem],
)

Example: Conversation history

from conatus import AIPrompt
from conatus.models.inputs_outputs.messages import UserAIMessage, AssistantAIMessage

prompt = AIPrompt(
    previous_messages=[
        UserAIMessage(content="What's 5+7?"),
        AssistantAIMessage.from_text("12"),
    ],
    new_messages=[
        UserAIMessage(content="What about 8+9?")
    ],
    # This is the ID of the previous response,
    # which is given by some AI providers (e.g. OpenAI's Responses API)
    previous_messages_id="resp_123",
)

Once again, for more information, see AIPrompt's API reference.

The Output: AIResponse Structure

(Once again, we're talking here about the classes in red in the image above.)

The AIResponse class encapsulates everything returned by an AI model call: the assistant's message, its text content and code snippets, any requested tool calls, structured output, and usage information.

from conatus import AIPrompt
from conatus.models import OpenAIModel

model = OpenAIModel()
prompt = AIPrompt(user="What's the sum of 2 and 2?")
response = model.call(prompt)
print(response.message_received)    # The AI's message (as an AssistantAIMessage)
print(response.all_text)            # The text content of the AI's message
print(response.code_snippets)       # Any code snippets in the text
print(response.tool_calls)          # Any requested tool calls
print(response.structured_output)   # Parsed to the requested schema/type
print(response.usage)               # Tokens/cost info

For more information, see AIResponse's API reference.

Message Hierarchy (AIMessage)

(Once again, we're talking here about the messages in green in the image above.)

Messages are the backbone of any conversation. Conatus represents them via a class hierarchy (shown in the diagram above).

Here's an example of a simple conversation:

from conatus.models.inputs_outputs.messages import UserAIMessage, AssistantAIMessage

m1 = UserAIMessage(content="Hi!")
m2 = AssistantAIMessage.from_text("Hello. How can I help you?")

Message (Content) Parts

Many messages (especially those going to or from providers like OpenAI and Anthropic) may contain rich, multi-part content, not just text:

  • Text (multi-part or segmented)
  • Images (base64 or URLs)
  • Tool calls (serialized commands)
  • Refusals, rationale, etc.

Conatus has standardized "content part" classes for both users and assistants.

User Message Content Parts

(Once again, we're talking here about the classes in orange in the image above.)

from conatus import AIPrompt
from conatus.models import OpenAIModel
from conatus.models.inputs_outputs.messages import UserAIMessage
from conatus.models.inputs_outputs.content_parts import (
    UserAIMessageContentTextPart,
    UserAIMessageContentImagePart,
)

URL = "https://upload.wikimedia.org/wikipedia/commons/thumb/8/85/Tour_Eiffel_Wikimedia_Commons_%28cropped%29.jpg/1280px-Tour_Eiffel_Wikimedia_Commons_%28cropped%29.jpg"

image_part = UserAIMessageContentImagePart(image_data=URL, is_base64=False)
text_part = UserAIMessageContentTextPart(text="Where was this picture taken?")

m = UserAIMessage(content=[text_part, image_part])

prompt = AIPrompt(messages=[m])
model = OpenAIModel()
response = model.call(prompt)
print(response.all_text)
# > This picture was taken in Paris, France, featuring the Eiffel Tower.

Assistant Message Content Parts

(Once again, we're talking here about the classes in blue in the image above.)

See Content Parts for more details.
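
As a rough illustration, you can inspect an assistant message part by part, reusing the model and prompt from the previous example. This is a sketch only: it assumes the assistant message exposes its content parts as a list, and the concrete part classes (text, refusal, etc.) should be taken from the Content Parts page.

# Sketch only: assumes message_received.content is a list of
# AssistantAIMessageContentPart instances (see Content Parts).
response = model.call(prompt)
for part in response.message_received.content:
    print(type(part).__name__, part)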

Tool Calls

(Once again, we're talking here about the classes in gray in the image above.)

Tool calls allow the AI to ask the runtime to invoke a function, perform a computer use action, or otherwise call tools.

Function/Tool Calls

AIToolCall is the protocol-agnostic structure:

  • name: The name of the function/tool to call
  • returned_arguments: The arguments, which can arrive as a dict or a string
  • call_id: An identifier that must be passed back to the AI model along with the result of the tool call

The arguments are accessible as a dict or a str via two helper properties.
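
For illustration, here's how you might consume a tool call from a response, using only the fields listed above (the exact names of the two helper properties are in the API reference):

# Sketch: iterating over the tool calls of a response.
for tool_call in response.tool_calls:
    print(tool_call.name)                 # e.g. "add"
    print(tool_call.returned_arguments)   # the raw arguments (dict or string)
    print(tool_call.call_id)              # pass this back with the tool's result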

Computer Use Actions

Some providers (OpenAI, Anthropic) use special tool calls to perform computer use actions. We unify them under the ComputerUseAction family:

  • Simulates clicks, typing, scrolling, screenshots, etc.
  • Each type (e.g., CULeftMouseClick) is a dataclass with clearly specified fields

Note that these computer use objects are somewhat tightly linked to the Runtime class, which is responsible for executing tool calls. In particular, ComputerUseAction objects require an environment_variable argument that refers to a RuntimeVariable object.
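
As a loose sketch, dispatching on the action type might look like the following. The import path below is an assumption for illustration; only the CULeftMouseClick name is taken from this page.

# Sketch only: the import path is an assumption.
from conatus.models.inputs_outputs.tool_calls import CULeftMouseClick

for call in response.tool_calls:
    if isinstance(call, CULeftMouseClick):
        ...  # e.g., forward the click to the Runtime for execution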

These tool call objects appear in assistant messages as content parts, and may show up in responses (and can be passed to the execution/runtime subsystem).

See Tool Calls for more details.

Usage and Cost Tracking

Each AI response may have usage and pricing statistics attached.

CompletionUsage records usage statistics such as token counts and the model name.

Note that you can add CompletionUsage instances together, which is useful for accumulating usage during streaming.

Finally, the cost property can be used to compute the cost of the response. This requires that the model name and prices are present in the conatus.models.prices module.
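
For example, given two responses (a minimal sketch; the usage objects come from responses as shown earlier):

total = response_1.usage + response_2.usage  # CompletionUsage instances support +
print(total.cost)  # Requires the model's prices in conatus.models.prices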

Streaming and Incomplete Structures

To handle streaming/partial results, Conatus provides a family of Incomplete* classes.

  • You can add them together with + (e.g., to combine chunks of output), as sketched below
  • Each has a .complete() method that "finalizes" the object (for example, turning a partial message into a proper AssistantAIMessage)
  • They merge text and tool calls intelligently
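
Here's a rough sketch of the accumulation pattern. The stream method name is hypothetical; see Streaming and Incompletes for the real entry points.

# Sketch only: model.stream is a hypothetical name; the point is the
# + accumulation and the final .complete() call described above.
partial = None
for chunk in model.stream(prompt):
    partial = chunk if partial is None else partial + chunk
message = partial.complete()  # e.g., a finished AssistantAIMessage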

This protocol is crucial for implementers writing streaming adapters for new models/APIs.

See Streaming and Incompletes for more details.

API Reference Summary

The documentation pages for each data structure provide full auto-generated documentation of its fields and methods.

Where do I start if I want to implement my own model/provider?

See How-To: Add a new AI provider and refer to the BaseAIModel API docs for the expected protocols and required conversions.