Basic tests

This guide covers writing basic tests with Maia.
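
The examples below assume pytest with asyncio support, plus Maia's test base class and helpers. A header along the following lines is assumed throughout; the module paths are assumptions, so adjust them to your installation:

import pytest
from functools import partial

# Import paths below are assumptions; adjust them to match your Maia installation.
from maia import MaiaTest, GenericLiteLLMProvider
from maia.assertions import assert_agent_participated, assert_contains_pattern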

Scenario 1: Two agents talk to each other without a user trigger.

class TestConversationSessions(MaiaTest):
    def setup_agents(self):
        self.create_agent(
            name="Alice",
            provider=GenericLiteLLMProvider(config={
                "model": "ollama/mistral",
                "api_base": "http://localhost:11434"
            }),
            system_message="You are a weather assistant. Only describe the weather.",
        )

        self.create_agent(
            name="Bob",
            provider=GenericLiteLLMProvider(config={
                "model": "ollama/mistral",
                "api_base": "http://localhost:11434"
            }),
            system_message="You are an assistant who only suggests clothing.",
        )

    @pytest.mark.asyncio
    async def test_agent_to_agent_conversation(self):
        session = self.create_session(["Alice", "Bob"])

        # Alice initiates the conversation with Bob
        await session.agent_says("Alice", "Bob", "Given the weather: rainy and 20 degrees Celsius, what clothes should I wear?")
        response = await session.agent_responds("Bob")
        assert_agent_participated(session, "Bob")

        # Bob responds back to Alice
        await session.agent_says("Bob", "Alice", f"Based on my info: {response.content}")
        response = await session.agent_responds("Alice")
        assert_agent_participated(session, "Alice")

Explanation

This is the basic scenario: one agent talks directly to another, with no user involvement. The participation assertions confirm that each agent actually joined the exchange and produced a response. Both agents use GenericLiteLLMProvider, which here routes requests through LiteLLM to a local Ollama instance serving mistral.
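
Because the provider takes a LiteLLM-style model string in its config, the same wrapper should work with other LiteLLM backends. A sketch, with the caveat that whether the config dict forwards an api_key to LiteLLM is an assumption:

import os

# Illustrative only: point the same wrapper at a hosted model instead of
# local Ollama. The "api_key" config key is an assumption; check the
# provider's documentation for the supported fields.
provider = GenericLiteLLMProvider(config={
    "model": "gpt-4o-mini",
    "api_key": os.environ["OPENAI_API_KEY"],
})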

Scenario 2: Check the content of the message.

class TestContentAssertions(MaiaTest):
    def setup_agents(self):
        self.create_agent(
            name="Alice",
            provider=GenericLiteLLMProvider(config={
                "model": "ollama/mistral",
                "api_base": "http://localhost:11434"
            }),
            system_message="You are a helpful AI assistant. You will follow user instructions precisely."
        )

    def setup_session(self):
        self.create_session(
            session_id="common_session"
        )

    @pytest.mark.asyncio
    async def test_pattern_contains_assertion(self):
        session = self.extend_session(
            "common_session",
            agent_names=["Alice"],
            assertions=[partial(assert_contains_pattern, pattern="sunny")]
        )
        await session.user_says("What is the weather in a typically sunny place?")
        await session.agent_responds("Alice")

Explanation

The agent and the session are created in the setup phase; the test then extends the session, naming the agent that should take part in it. We also attach an assertion, bound with functools.partial, that checks that the word "sunny" appears in every message added to the session, including the agent's response.
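
Session assertions are plain callables, which is why partial works for binding parameters. As a sketch, a custom assertion might look like the following; the exact signature Maia invokes is an assumption here, namely a single message object with a .content attribute like the response objects in the other scenarios:

# Hypothetical custom assertion. The single-message signature and the
# .content attribute are assumptions based on the examples above.
def assert_reply_is_short(message, max_words=100):
    word_count = len(message.content.split())
    assert word_count <= max_words, (
        f"message has {word_count} words, expected at most {max_words}"
    )

session = self.extend_session(
    "common_session",
    agent_names=["Alice"],
    assertions=[partial(assert_reply_is_short, max_words=50)]
)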

Scenario 3: A user broadcasts a message, and agents decide how to react.

class TestConversationSessions(MaiaTest):
    def setup_agents(self):
        self.create_agent(
            name="Alice",
            provider=GenericLiteLLMProvider(config={
                "model": "ollama/mistral",
                "api_base": "http://localhost:11434"
            }),
            system_message="You are a weather assistant. Only describe the weather.",
            ignore_trigger_prompt="You MUST NOT answer questions about any other topic, including what to wear. If the user asks about anything other than the weather, you MUST respond with only the exact text: IGNORE_MESSAGE",
        )

        self.create_agent(
            name="Bob",
            provider=GenericLiteLLMProvider(config={
                "model": "ollama/mistral",
                "api_base": "http://localhost:11434"
            }),
            system_message="You are a pirate assistant who only suggests clothing.",
            ignore_trigger_prompt="If the question is not about what to wear, you MUST respond with only the exact text: IGNORE_MESSAGE"
        )

    @pytest.mark.asyncio
    async def test_conversation_broadcast(self):
        session = self.create_session(["Alice", "Bob"])

        # Test that only Alice responds to a weather question
        response_a, responder_a = await session.user_says_and_broadcast("Please describe the usual weather in London in July, including temperature and conditions.")
        
        assert responder_a == 'Alice'
        assert_agent_participated(session, 'Alice')

        # Test that only Bob responds to a clothing question
        response_b, responder_b = await session.user_says_and_broadcast(f"Given the weather: {response_a.content}, what clothes should I wear?")
        
        assert responder_b == 'Bob'
        assert_agent_participated(session, 'Bob')

        # Test that no one responds to an irrelevant question
        response_c, responder_c = await session.user_says_and_broadcast("What is the capital of France?")
        assert response_c is None
        assert responder_c is None

Explanation

This is a more complex scenario in which the framework decides automatically which agent should respond. Each agent is configured to answer with IGNORE_MESSAGE when a question falls outside its role, so the framework knows the message reached the agent and that the agent deliberately declined to take a turn. This is configured through the ignore_trigger_prompt parameter, which tells the agent exactly when to return IGNORE_MESSAGE.

IGNORE_MESSAGE is a reserved piece of text that the framework treats as "no response" rather than as conversation content.

In the test, user_says_and_broadcast delivers the message to every agent in the session. Without ignore_trigger_prompt, both agents would respond; with our configuration, only Alice answers the weather question. The user then broadcasts a clothing question built from Alice's response, and this time only Bob replies. Finally, we check that no agent responds to the unrelated question, in which case both the response and the responder come back as None.
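
Because user_says_and_broadcast returns a (response, responder) pair that can be (None, None), it is worth guarding before reading the content:

# Both values are None when every agent returns IGNORE_MESSAGE.
response, responder = await session.user_says_and_broadcast("What is the capital of France?")
if responder is not None:
    print(f"{responder} answered: {response.content}")
else:
    print("No agent claimed the message")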

Scenario 4: Trigger a multi-turn conversation between agents.


class TestMultiTurnConversations(MaiaTest):
    @pytest.mark.asyncio
    async def test_multi_turn_agent_conversation(self):
        self.create_agent(
            name="Alice",
            provider=self.get_provider("ollama"),
            system_message="You are a weather assistant."
        )
        self.create_agent(
            name="Bob",
            provider=self.get_provider("ollama"),
            system_message="You are a helpful assistant who is also a pirate."
        )

        session = self.create_session(["Alice", "Bob"])

        conversation_log = await session.run_agent_conversation(
            initiator="Bob",
            responder="Alice",
            initial_message="What's the weather?",
            max_turns=3
        )

        assert len(conversation_log) == 7  # initial message + 3 replies from each agent

We can also trigger a multi-turn conversation between agents by calling run_agent_conversation on the session. It runs an automatic back-and-forth capped at max_turns and returns the conversation log: with max_turns=3, the log contains the initial message plus three replies from each agent, seven entries in total. Such a test is useful for observing how agents behave together without any user interaction.
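
The returned log can be inspected directly. A minimal sketch, assuming each entry exposes the sender's name and a .content attribute like the response objects in the earlier scenarios (the .sender attribute name is an assumption):

# Print the transcript turn by turn. The .sender attribute is assumed;
# .content matches the response objects used in the scenarios above.
for i, message in enumerate(conversation_log):
    print(f"{i}: {message.sender}: {message.content[:80]}")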

In the last test, we load the provider from a config file via self.get_provider("ollama") instead of constructing it inline. This concept is explained in Config File.