Creating a custom AI interface ¶
Conatus is designed to be extensible, and one way to extend it is to create a
custom BaseAIInterface subclass.

AI interfaces are the classes that handle the interaction between the agent and the LLM. Another way to think of them is as a way to add a step to your agent. In essence, they are glorified prompt constructors and conversation managers.
Pre-loaded AI interfaces include:

- `PlanningAIInterface`, which is used to plan the task;
- `ExecutionAIInterface`, which is used to execute the task;
- `ReactAIInterface`, which is used to implement the ReAct framework;
- or even `PlainAIInterface`, which is used to implement very simple conversations.
But you can also create your own AI interfaces. This is useful if you want to add a step to your agent that takes in the previous steps and the final answer, and decides whether the answer is correct or not.
In this how-to, we'll see how to implement a reflection step in the agent that takes in the previous steps and the current state, and decides whether the agent should adjust.
The BaseAIInterface class ¶
The BaseAIInterface class
is the abstract base class for all AI interfaces. It really only exposes two
methods:
- `run`: This method takes no arguments, and returns an `AIInterfacePayload`, containing the result of the AI interface, the cost, the finish reason, the response from the LLM, and potentially a structured output.
- `arun`: The asynchronous version of the `run` method.
In reality, these methods implement a loop that relies on a few other methods,
including `make_prompt`, `should_continue`, `make_new_messages`, and
`extract_result`.
**More details**: For more details on how the loop works, see the AI interfaces documentation.
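To build intuition for that loop, here is a minimal, library-agnostic sketch. The method names mirror the ones listed above, but the loop body and the `call_llm` stand-in are illustrations of the general pattern, not Conatus's actual implementation.

```python
# Illustrative sketch of an AI-interface loop -- NOT Conatus's actual code.


def call_llm(prompt: str) -> str:
    # Stand-in for a real LLM call.
    return f"response to: {prompt}"


class SketchAIInterface:
    def __init__(self) -> None:
        self.messages: list[str] = []

    def make_prompt(self) -> str:
        # Build the prompt from the system message plus any new messages.
        return "\n".join(["(system prompt)", *self.messages])

    def should_continue(self, response: str) -> bool:
        # Decide whether another turn is needed; here we always stop.
        return False

    def make_new_messages(self, response: str) -> list[str]:
        # Messages to append if the loop continues (e.g. "fix the format").
        return []

    def extract_result(self, response: str) -> str:
        # Turn the final response into the payload returned by `run`.
        return response

    def run(self) -> str:
        while True:
            response = call_llm(self.make_prompt())
            if not self.should_continue(response):
                return self.extract_result(response)
            self.messages.extend(self.make_new_messages(response))


result = SketchAIInterface().run()
print(result)
# > response to: (system prompt)
```

The real loop also tracks costs, finish reasons, and structured outputs, but the shape is the same: build a prompt, call the model, and either stop or append corrective messages and go around again.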
BaseAIInterfaceWithTask vs BaseAIInterface ¶
Another wrinkle to keep in mind is that there are two flavors of the
`BaseAIInterface` class:

- `BaseAIInterface`, which does not have a task;
- `BaseAIInterfaceWithTask`, which is linked to a `Task`.
Here, we will focus on the
BaseAIInterfaceWithTask
class, since we need to communicate to the AI information such as previous
steps, the task definition, etc. Note, however, that the lessons you will learn
here will apply to the
BaseAIInterface class as
well.
Preparing the prompt ¶
First, you need to prepare the prompt. This is done in the make_prompt
method.
This method should return an
AIPrompt object. While the
structure of that class might seem daunting at first, it's actually quite simple
to instantiate:
```python
from conatus import AIPrompt

prompt = AIPrompt(
    system="This is a system (or developer) prompt.",
    user="This is a user message.",
)
```
Your only job is to prepare the two string parameters. Let's see how this is done in practice.
Creating a system message for a reflection step ¶
Let's say that you want to add a reflection step to your agent that does the following:
- Takes in the previous steps and the current state.
- Decides whether the agent should continue or not.
- Adds instructions to the agent on how to adjust its behavior.
Let's say that we want to communicate this to the LLM. We could have a system message that says something like:
```python
system_message = """
You are part of a state-of-the-art multi-agent system that executes tasks for
the user. The distinctive feature of this system is that we control a secure
Python runtime. This means that you can use **Actions** (essentially functions),
and you can manipulate **variables** by passing them as references.

THINK OF THIS TASK LIKE A FUNCTION: you have inputs, and our goal is to return
the outputs as specified. Therefore, we want this task to be accomplished in a
way that is repeatable. The starting variables in the environment section are
essentially just an example of the inputs of the function.

Given the following steps and the current state, decide whether you think that
the agent has accomplished the task. If it has, set `has_accomplished_task` to
`true`. If it has not, set it to `false` and add instructions on how to adjust
the agent's behavior.

Please respond with a JSON object with the following fields:

* `has_accomplished_task`: Whether the agent has accomplished the task or not.
  Should be a boolean.
* `instructions`: Instructions on how to adjust the agent's behavior. Should be
  a string.
"""
```
Showing the previous steps and the current state ¶
Next, we need to prepare the user prompt. This will take the previous steps and the current state, and format them in a way that is easy for the LLM to understand.
Thankfully, the
BaseAIInterface class has
several methods that help with this.
For instance, the get_steps_so_far_xml
method returns the previous steps in XML format, which is easy for the LLM to
understand.
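To see why XML helps, here is a hypothetical formatter (not the real `get_steps_so_far_xml`, whose tags and fields may differ) showing the kind of output such a method might produce:

```python
# Hypothetical illustration of XML-formatted steps.
from xml.sax.saxutils import escape


def steps_to_xml(steps: list[str]) -> str:
    # Wrap each step in an indexed tag, escaping any XML-special characters.
    lines = ["<steps_so_far>"]
    for i, step in enumerate(steps, start=1):
        lines.append(f'  <step index="{i}">{escape(step)}</step>')
    lines.append("</steps_so_far>")
    return "\n".join(lines)


xml = steps_to_xml(["Opened https://en.wikipedia.org", "Clicked 'Random article'"])
print(xml)
# > <steps_so_far>
# >   <step index="1">Opened https://en.wikipedia.org</step>
# >   <step index="2">Clicked 'Random article'</step>
# > </steps_so_far>
```

The explicit tags and indices give the LLM unambiguous boundaries between steps, which plain concatenated text does not.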
We could do something like this:
```python
# This CANNOT be run as is.
from collections.abc import Collection
from dataclasses import dataclass

from conatus.agents.ai_interfaces.base import BaseAIInterfaceWithTask
from conatus.models.inputs_outputs.messages import (
    ConversationAIMessage,
    SystemAIMessage,
)
from conatus.models.inputs_outputs.prompt import AIPrompt
from conatus.models.inputs_outputs.response import AIResponse


@dataclass
class ReflectionPayload:
    has_accomplished_task: bool
    instructions: str


class ReflectionAIInterface(BaseAIInterfaceWithTask[ReflectionPayload]):
    # ...

    def make_prompt(
        self,
        *,
        conversation_history: Collection[ConversationAIMessage] = (),
        conversation_history_id: str | None = None,
        conversation_history_system_message: SystemAIMessage | None = None,
        previous_response: AIResponse[ReflectionPayload]
        | AIResponse
        | None = None,
        new_messages: list[ConversationAIMessage] | None = None,
    ) -> AIPrompt[ReflectionPayload]:
        task_definition = self.get_task_definition_xml()
        steps_so_far = self.get_steps_so_far_xml()
        variables = self.get_all_variables_xml()
        return AIPrompt(
            system=system_message,
            user=f"{task_definition}\n\n{steps_so_far}\n\n{variables}",
            output_schema=ReflectionPayload,
        )
```
Now, `ReflectionAIInterface.make_prompt` will return an `AIPrompt` object with
the system and user prompts prepared. This can be safely passed to the LLM
(e.g. `OpenAIModel`'s `acall` or `acall_stream` methods).
You'll note that we also passed an output_schema to the AIPrompt
object. This is because we want
to validate the response from the LLM. If the response is not in the correct
format, the LLM will be asked to fix it.
Determining whether to continue ¶
By default, the
BaseAIInterface class
will stop after one turn. But you can change the behavior by overriding the
should_continue
method.
For instance, let's say that we require the instructions, when they are given, to be longer than 100 characters; otherwise, we ask the LLM to fix them.
```python
# This CANNOT be run as is.
from dataclasses import dataclass
from typing import Literal

from conatus.agents.ai_interfaces.base import BaseAIInterfaceWithTask
from conatus.models.inputs_outputs.messages import (
    ConversationAIMessage,
    UserAIMessage,
)
from conatus.models.inputs_outputs.response import AIResponse


@dataclass
class ReflectionPayload:
    has_accomplished_task: bool
    instructions: str


class ReflectionAIInterface(BaseAIInterfaceWithTask[ReflectionPayload]):
    # ...

    continuation_reason: Literal["wrong_format", "too_short"] | None = None

    def should_continue(self, response: AIResponse[ReflectionPayload]) -> bool:
        # If the response is not in the right format, we ask the LLM to fix it.
        if response.structured_output is None:
            self.continuation_reason = "wrong_format"
            return True
        # If we don't need to look at the instructions,
        # we can stop the conversation here.
        if response.structured_output.has_accomplished_task:
            return False
        # If the instructions are not long enough, we ask the LLM to fix them.
        if len(response.structured_output.instructions) <= 100:
            self.continuation_reason = "too_short"
            return True
        return False

    async def make_new_messages(
        self, response: AIResponse[ReflectionPayload] | AIResponse
    ) -> list[ConversationAIMessage]:
        if self.continuation_reason == "wrong_format":
            return [UserAIMessage("Instructions not in correct format.")]
        if self.continuation_reason == "too_short":
            return [UserAIMessage("Instructions too short.")]
        return []
```
Putting it all together ¶
Now, we can put it all together to create a new AI interface that takes in the previous steps and the current state, and decides whether the agent should continue or not.
At the end of the script, you can see how to use the AI interface. Here, the AI should tell us that we're not done yet.
```python
from collections.abc import Collection
from dataclasses import dataclass
from typing import Literal

from conatus import browsing_actions, task
from conatus.agents.ai_interfaces.base import BaseAIInterfaceWithTask
from conatus.io.file import FileWriter
from conatus.models.inputs_outputs.messages import (
    ConversationAIMessage,
    SystemAIMessage,
    UserAIMessage,
)
from conatus.models.inputs_outputs.prompt import AIPrompt
from conatus.models.inputs_outputs.response import AIResponse
from conatus.runtime.runtime import Runtime


@dataclass
class ReflectionPayload:
    has_accomplished_task: bool
    instructions: str


system_message = """
You are part of a state-of-the-art multi-agent system that executes tasks for
the user. The distinctive feature of this system is that we control a secure
Python runtime. This means that you can use **Actions** (essentially functions),
and you can manipulate **variables** by passing them as references.

THINK OF THIS TASK LIKE A FUNCTION: you have inputs, and our goal is to return
the outputs as specified. Therefore, we want this task to be accomplished in a
way that is repeatable. The starting variables in the environment section are
essentially just an example of the inputs of the function.

Given the following steps and the current state, decide whether you think that
the agent has accomplished the task. If it has, set `has_accomplished_task` to
`true`. If it has not, set it to `false` and add instructions on how to adjust
the agent's behavior.

Please respond with a JSON object with the following fields:

* `has_accomplished_task`: Whether the agent has accomplished the task or not.
  Should be a boolean.
* `instructions`: Instructions on how to adjust the agent's behavior. Should be
  a string.
"""


class ReflectionAIInterface(BaseAIInterfaceWithTask[ReflectionPayload]):
    continuation_reason: Literal["wrong_format", "too_short"] | None = None

    def make_prompt(
        self,
        *,
        conversation_history: Collection[ConversationAIMessage] = (),
        conversation_history_id: str | None = None,
        conversation_history_system_message: SystemAIMessage | None = None,
        previous_response: AIResponse[ReflectionPayload]
        | AIResponse
        | None = None,
        new_messages: list[ConversationAIMessage] | None = None,
    ) -> AIPrompt[ReflectionPayload]:
        task_definition = self.get_task_definition_xml()
        steps_so_far = self.get_steps_so_far_xml()
        variables = self.get_all_variables_xml()
        return AIPrompt(
            system=system_message,
            user=f"{task_definition}\n\n{steps_so_far}\n\n{variables}",
            output_schema=ReflectionPayload,
        )

    def should_continue(self, response: AIResponse[ReflectionPayload]) -> bool:
        # If the response is not in the right format, we ask the LLM to fix it.
        if response.structured_output is None:
            self.continuation_reason = "wrong_format"
            return True
        # If we don't need to look at the instructions,
        # we can stop the conversation here.
        if response.structured_output.has_accomplished_task:
            return False
        # If the instructions are not long enough, we ask the LLM to fix them.
        if len(response.structured_output.instructions) <= 100:
            self.continuation_reason = "too_short"
            return True
        return False

    async def make_new_messages(
        self, response: AIResponse[ReflectionPayload] | AIResponse
    ) -> list[ConversationAIMessage]:
        if self.continuation_reason == "wrong_format":
            return [UserAIMessage("Instructions not in correct format.")]
        if self.continuation_reason == "too_short":
            return [UserAIMessage("Instructions too short.")]
        return []


@task(actions=[browsing_actions])
def get_random_wikipedia_article() -> str:
    """Navigate to Wikipedia and get a random article.

    Returns:
        The URL of the random Wikipedia article.
    """


ai_interface = ReflectionAIInterface(
    runtime=Runtime(actions=browsing_actions.actions),
    task_definition=get_random_wikipedia_article.definition,
    task_config=get_random_wikipedia_article.config,
    run_writer=FileWriter(),
)
print(ai_interface.run().result)
# > ReflectionPayload(
# >     has_accomplished_task=False,
# >     instructions="The agent has not accomplished the task yet...."
# > )
```