Build Your First Custom Agent¶
This guide walks you through creating a custom agent from scratch, demonstrating the power of direct instantiation and the three-layer architecture.
What We’ll Build¶
We’ll create a Smart Assistant Agent that:
Analyzes user requests for complexity
Routes simple requests to fast processing
Uses tools for complex requests
Returns structured responses with metadata
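Here's the shape of the structured response we're aiming for (we'll build it in Step 3; the values below are illustrative):
# Illustrative only: the response shape our agent will return (see Step 3)
{
    "response": "The answer to your question is ...",
    "metadata": {
        "complexity": "complex",          # "simple" | "moderate" | "complex"
        "processing_time": 1.234,         # seconds, rounded to 3 decimals
        "tools_used": ["calculate"],      # names of tools the LLM invoked
        "timestamp": "2024-01-01T00:00:00+00:00"
    }
}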
Prerequisites¶
Arshai installed:
pip install arshai[openai]
OpenAI API key:
export OPENAI_API_KEY="sk-..."
Basic understanding of core concepts from the Quickstart Guide
Step 1: Set Up the Environment¶
First, let's create our project file and set up the imports:
# smart_agent.py
import asyncio
import os
from datetime import datetime, timezone
from typing import Dict, Any, List
from enum import Enum
# Arshai imports - Layer 1 (LLM Clients)
from arshai.llms.openai import OpenAIClient
from arshai.core.interfaces.illm import ILLMConfig, ILLMInput
# Arshai imports - Layer 2 (Agents)
from arshai.agents.base import BaseAgent
from arshai.core.interfaces.iagent import IAgentInput
print("✅ Imports successful")
Step 2: Design Your Agent¶
Before coding, let’s design our agent’s behavior:
class RequestComplexity(Enum):
"""Complexity levels for user requests."""
SIMPLE = "simple" # Basic questions, greetings
MODERATE = "moderate" # Requires some analysis
COMPLEX = "complex" # Needs tools or deep reasoning
class ResponseMetadata:
"""Metadata about the agent's response."""
def __init__(self, complexity: RequestComplexity, processing_time: float, tools_used: List[str]):
self.complexity = complexity.value
self.processing_time = processing_time
self.tools_used = tools_used
        self.timestamp = datetime.now(timezone.utc).isoformat()
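For a quick sanity check, here's hypothetical usage of the metadata container (the process method in Step 3 builds an equivalent dict inline):
# Hypothetical usage of ResponseMetadata
meta = ResponseMetadata(RequestComplexity.SIMPLE, 0.123, [])
print(meta.complexity)   # "simple"
print(meta.tools_used)   # []
print(meta.timestamp)    # ISO-8601 string, e.g. "2024-01-01T00:00:00+00:00"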
Step 3: Implement the Agent Core¶
Now let’s build our custom agent:
class SmartAssistantAgent(BaseAgent):
"""
A smart assistant that adapts its processing based on request complexity.
Features:
- Complexity analysis
- Adaptive processing strategies
- Tool integration
- Structured metadata responses
"""
def __init__(self, llm_client, system_prompt: str, complexity_threshold: float = 0.5):
"""Initialize the smart assistant.
Args:
llm_client: LLM client for processing
system_prompt: Base system prompt
            complexity_threshold: Threshold for complexity routing (0.0-1.0); reserved for custom routing strategies, not used by the default heuristic routing below
"""
super().__init__(llm_client, system_prompt)
self.complexity_threshold = complexity_threshold
async def process(self, input: IAgentInput) -> Dict[str, Any]:
"""Process user input with adaptive complexity handling."""
        start_time = asyncio.get_running_loop().time()
# Step 1: Analyze complexity
complexity = await self._analyze_complexity(input.message)
# Step 2: Route based on complexity
if complexity == RequestComplexity.SIMPLE:
response = await self._process_simple(input)
tools_used = []
elif complexity == RequestComplexity.MODERATE:
response = await self._process_moderate(input)
tools_used = []
else: # COMPLEX
response, tools_used = await self._process_complex(input)
# Step 3: Calculate metadata
        end_time = asyncio.get_running_loop().time()
processing_time = end_time - start_time
# Step 4: Return structured response
return {
"response": response,
"metadata": {
"complexity": complexity.value,
"processing_time": round(processing_time, 3),
"tools_used": tools_used,
"timestamp": datetime.utcnow().isoformat()
}
}
Step 4: Implement Complexity Analysis¶
Add intelligence to classify request complexity:
async def _analyze_complexity(self, message: str) -> RequestComplexity:
"""Analyze the complexity of a user request."""
# Simple heuristics (you can make this more sophisticated)
simple_patterns = [
"hello", "hi", "thanks", "thank you", "bye", "goodbye",
"what is your name", "who are you", "how are you"
]
complex_patterns = [
"calculate", "compute", "search", "find", "lookup",
"analyze", "compare", "research", "explain in detail"
]
message_lower = message.lower()
# Check for simple patterns
if any(pattern in message_lower for pattern in simple_patterns):
return RequestComplexity.SIMPLE
# Check for complex patterns
if any(pattern in message_lower for pattern in complex_patterns):
return RequestComplexity.COMPLEX
# Use LLM to analyze ambiguous cases
analysis_prompt = ILLMInput(
system_prompt="Analyze request complexity. Reply only with 'simple', 'moderate', or 'complex'.",
user_message=f"Classify this request: {message}"
)
result = await self.llm_client.chat(analysis_prompt)
complexity_str = result.get("llm_response", "moderate").lower().strip()
if "simple" in complexity_str:
return RequestComplexity.SIMPLE
elif "complex" in complexity_str:
return RequestComplexity.COMPLEX
else:
return RequestComplexity.MODERATE
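Because the keyword checks short-circuit before the LLM fallback, you can exercise the heuristic paths without any API call. A minimal sketch, assuming BaseAgent accepts any object as the client:
# Sketch: messages matching a keyword pattern never reach the LLM fallback
from unittest.mock import AsyncMock

heuristic_agent = SmartAssistantAgent(AsyncMock(), "test prompt")
assert asyncio.run(heuristic_agent._analyze_complexity("hello there")) == RequestComplexity.SIMPLE
assert asyncio.run(heuristic_agent._analyze_complexity("calculate 2 + 2")) == RequestComplexity.COMPLEX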
Step 5: Implement Processing Strategies¶
Add different processing approaches for each complexity level:
async def _process_simple(self, input: IAgentInput) -> str:
"""Fast processing for simple requests."""
llm_input = ILLMInput(
system_prompt=f"{self.system_prompt} Be brief and friendly.",
user_message=input.message
)
result = await self.llm_client.chat(llm_input)
return result.get("llm_response", "I'm here to help!")
async def _process_moderate(self, input: IAgentInput) -> str:
"""Standard processing for moderate requests."""
llm_input = ILLMInput(
system_prompt=f"{self.system_prompt} Provide a thoughtful response.",
user_message=input.message
)
result = await self.llm_client.chat(llm_input)
return result.get("llm_response", "Let me think about that...")
async def _process_complex(self, input: IAgentInput) -> tuple[str, List[str]]:
"""Advanced processing with tools for complex requests."""
# Define tools
def search_web(query: str) -> str:
"""Search the web for information."""
return f"Found information about: {query}"
def calculate(expression: str) -> str:
"""Calculate mathematical expression."""
try:
                # Demo only: eval is unsafe on untrusted input (see the safer sketch below)
result = eval(expression.replace("^", "**"))
return f"Result: {result}"
            except Exception:
return "Invalid calculation"
def get_current_time() -> str:
"""Get the current time."""
return datetime.now().strftime("%Y-%m-%d %H:%M:%S")
# Process with tools
llm_input = ILLMInput(
system_prompt=f"{self.system_prompt} You have access to tools. Use them when helpful.",
user_message=input.message,
regular_functions={
"search_web": search_web,
"calculate": calculate,
"get_current_time": get_current_time
}
)
result = await self.llm_client.chat(llm_input)
# Extract which tools were used
function_calls = result.get("function_calls", {})
tools_used = list(function_calls.keys()) if function_calls else []
response = result.get("llm_response", "I processed your complex request.")
return response, tools_used
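As the comment in calculate warns, eval is risky on untrusted input. Here's a safer drop-in sketch using only the standard library's ast module (not part of Arshai; it permits basic arithmetic and nothing else):
import ast
import operator

# Allowed operators for the arithmetic-only evaluator
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.Pow: operator.pow, ast.USub: operator.neg,
}

def safe_calculate(expression: str) -> str:
    """Calculate a basic arithmetic expression without eval()."""
    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")
    try:
        tree = ast.parse(expression.replace("^", "**"), mode="eval")
        return f"Result: {_eval(tree.body)}"
    except (ValueError, SyntaxError, ZeroDivisionError):
        return "Invalid calculation"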
Step 6: Create the Agent Instance¶
Now let’s put it all together:
async def create_smart_agent():
"""Create and configure the smart assistant agent."""
# Check for API key
if not os.getenv("OPENAI_API_KEY"):
raise ValueError("Please set OPENAI_API_KEY environment variable")
# Configure LLM client (Layer 1)
llm_config = ILLMConfig(
model="gpt-3.5-turbo",
temperature=0.7,
max_tokens=200
)
llm_client = OpenAIClient(llm_config)
print("✅ LLM client created")
# Create agent (Layer 2)
agent = SmartAssistantAgent(
llm_client=llm_client,
system_prompt="You are a helpful AI assistant that adapts to user needs.",
complexity_threshold=0.5
)
print("✅ Smart agent created")
return agent
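To try the factory right away, here's a one-shot usage sketch (it assumes OPENAI_API_KEY is set):
# Quick one-shot run of the factory defined above
async def demo():
    agent = await create_smart_agent()
    result = await agent.process(IAgentInput(message="Hello!"))
    print(result["response"])
    print(result["metadata"])

asyncio.run(demo())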
Step 7: Test Your Agent¶
Let’s test our agent with different complexity levels:
async def test_agent():
"""Test the smart agent with various inputs."""
agent = await create_smart_agent()
test_cases = [
# Simple requests
"Hello!",
"Thanks for your help",
# Moderate requests
"Explain what artificial intelligence is",
"What are the benefits of exercise?",
# Complex requests
"Calculate 15 * 23 + 45",
"Search for information about Python programming",
"What time is it now?"
]
print("\n" + "=" * 60)
print("TESTING SMART ASSISTANT AGENT")
print("=" * 60)
for message in test_cases:
print(f"\n👤 User: {message}")
# Process with agent
result = await agent.process(IAgentInput(message=message))
# Display results
print(f"🤖 Agent: {result['response']}")
print(f"📊 Metadata:")
print(f" Complexity: {result['metadata']['complexity']}")
print(f" Processing time: {result['metadata']['processing_time']}s")
if result['metadata']['tools_used']:
print(f" Tools used: {', '.join(result['metadata']['tools_used'])}")
print("-" * 40)
# Run the test
if __name__ == "__main__":
asyncio.run(test_agent())
Step 8: Make It Interactive¶
Add an interactive chat loop:
async def interactive_chat():
"""Interactive chat with the smart agent."""
agent = await create_smart_agent()
print("\n🤖 Smart Assistant Ready!")
print("Features: Adaptive complexity, tool usage, detailed metadata")
print("Type 'quit' to exit, 'help' for commands")
print("=" * 60)
while True:
try:
# Get user input
user_input = input("\n👤 You: ").strip()
if user_input.lower() == 'quit':
print("👋 Goodbye!")
break
if user_input.lower() == 'help':
print("""
Available commands:
- 'quit' - Exit the chat
- 'help' - Show this help
- Any other text - Process with the agent
Try different types of requests:
- Simple: "Hello", "Thanks"
- Moderate: "Explain AI", "What is Python?"
- Complex: "Calculate 25*4", "What time is it?"
""")
continue
if not user_input:
continue
# Process with agent
result = await agent.process(IAgentInput(message=user_input))
# Display response
print(f"\n🤖 Agent: {result['response']}")
# Display metadata (optional)
metadata = result['metadata']
print(f"\n📊 [Complexity: {metadata['complexity']} | "
f"Time: {metadata['processing_time']}s")
if metadata['tools_used']:
print(f" | Tools: {', '.join(metadata['tools_used'])}", end="")
print("]")
except KeyboardInterrupt:
print("\n👋 Goodbye!")
break
except Exception as e:
print(f"❌ Error: {e}")
# Run interactive mode
if __name__ == "__main__":
asyncio.run(interactive_chat())
Complete Example¶
Here’s the full working agent in one file:
Complete smart_agent.py
# smart_agent.py - Complete Smart Assistant Agent
import asyncio
import os
from datetime import datetime, timezone
from typing import Dict, Any, List
from enum import Enum
from arshai.llms.openai import OpenAIClient
from arshai.core.interfaces.illm import ILLMConfig, ILLMInput
from arshai.agents.base import BaseAgent
from arshai.core.interfaces.iagent import IAgentInput
class RequestComplexity(Enum):
SIMPLE = "simple"
MODERATE = "moderate"
COMPLEX = "complex"
class SmartAssistantAgent(BaseAgent):
def __init__(self, llm_client, system_prompt: str, complexity_threshold: float = 0.5):
super().__init__(llm_client, system_prompt)
self.complexity_threshold = complexity_threshold
async def process(self, input: IAgentInput) -> Dict[str, Any]:
        start_time = asyncio.get_running_loop().time()
complexity = await self._analyze_complexity(input.message)
if complexity == RequestComplexity.SIMPLE:
response = await self._process_simple(input)
tools_used = []
elif complexity == RequestComplexity.MODERATE:
response = await self._process_moderate(input)
tools_used = []
else:
response, tools_used = await self._process_complex(input)
        end_time = asyncio.get_running_loop().time()
processing_time = end_time - start_time
return {
"response": response,
"metadata": {
"complexity": complexity.value,
"processing_time": round(processing_time, 3),
"tools_used": tools_used,
"timestamp": datetime.utcnow().isoformat()
}
}
async def _analyze_complexity(self, message: str) -> RequestComplexity:
simple_patterns = ["hello", "hi", "thanks", "bye"]
complex_patterns = ["calculate", "search", "analyze"]
message_lower = message.lower()
if any(pattern in message_lower for pattern in simple_patterns):
return RequestComplexity.SIMPLE
if any(pattern in message_lower for pattern in complex_patterns):
return RequestComplexity.COMPLEX
return RequestComplexity.MODERATE
async def _process_simple(self, input: IAgentInput) -> str:
llm_input = ILLMInput(
system_prompt=f"{self.system_prompt} Be brief and friendly.",
user_message=input.message
)
result = await self.llm_client.chat(llm_input)
return result.get("llm_response", "Hello!")
async def _process_moderate(self, input: IAgentInput) -> str:
llm_input = ILLMInput(
system_prompt=f"{self.system_prompt} Provide a thoughtful response.",
user_message=input.message
)
result = await self.llm_client.chat(llm_input)
return result.get("llm_response", "Let me help you with that.")
async def _process_complex(self, input: IAgentInput) -> tuple[str, List[str]]:
def calculate(expression: str) -> str:
try:
result = eval(expression.replace("^", "**"))
return f"Result: {result}"
            except Exception:
return "Invalid calculation"
def get_current_time() -> str:
return datetime.now().strftime("%Y-%m-%d %H:%M:%S")
llm_input = ILLMInput(
system_prompt=f"{self.system_prompt} Use tools when helpful.",
user_message=input.message,
regular_functions={
"calculate": calculate,
"get_current_time": get_current_time
}
)
result = await self.llm_client.chat(llm_input)
function_calls = result.get("function_calls", {})
tools_used = list(function_calls.keys()) if function_calls else []
return result.get("llm_response", "Processed"), tools_used
async def main():
if not os.getenv("OPENAI_API_KEY"):
print("❌ Please set OPENAI_API_KEY environment variable")
return
config = ILLMConfig(model="gpt-3.5-turbo", temperature=0.7)
llm_client = OpenAIClient(config)
agent = SmartAssistantAgent(
llm_client=llm_client,
system_prompt="You are a helpful AI assistant."
)
print("🤖 Smart Assistant Ready! (Type 'quit' to exit)")
while True:
user_input = input("\n👤 You: ").strip()
if user_input.lower() == 'quit':
break
result = await agent.process(IAgentInput(message=user_input))
print(f"🤖 Agent: {result['response']}")
print(f"📊 [{result['metadata']['complexity']} | {result['metadata']['processing_time']}s]")
if __name__ == "__main__":
asyncio.run(main())
Key Concepts Demonstrated¶
1. Direct Instantiation¶
# You create everything explicitly
llm_client = OpenAIClient(config)
agent = SmartAssistantAgent(llm_client, prompt)
2. Single Responsibility¶
# Agent has ONE clear purpose: smart assistance with adaptive complexity
class SmartAssistantAgent(BaseAgent):
"""Smart assistant that adapts to request complexity."""
3. Dependency Injection¶
# Dependencies passed in constructor
def __init__(self, llm_client, system_prompt: str, complexity_threshold: float):
super().__init__(llm_client, system_prompt) # Explicit
self.complexity_threshold = complexity_threshold
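This is what makes the agent easy to re-wire or test offline. A sketch swapping in a stub client (any object exposing a compatible async chat() works):
# A stub client stands in for OpenAIClient thanks to dependency injection
from unittest.mock import AsyncMock

stub_llm = AsyncMock()
stub_llm.chat.return_value = {"llm_response": "stubbed reply"}
offline_agent = SmartAssistantAgent(stub_llm, "test prompt", complexity_threshold=0.5)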
4. Stateless Design¶
# No mutable per-request state: configuration is set once at construction
async def process(self, input: IAgentInput) -> Dict[str, Any]:
    # All conversational state arrives via the input parameters
    pass
5. Three-Layer Architecture¶
Layer 1: OpenAIClient - LLM access
Layer 2: SmartAssistantAgent - Business logic
Layer 3: Your application - Orchestration
Next Steps¶
Now that you’ve built your first custom agent:
Experiment - Try different complexity analysis strategies
Add Tools - Integrate more external capabilities
Compose Systems - Combine multiple agents
Explore Layers - Learn more in Building Systems (Layer 3)
Test Thoroughly - Add comprehensive unit tests
Testing Your Agent¶
import pytest
from unittest.mock import AsyncMock

from arshai.core.interfaces.iagent import IAgentInput
from smart_agent import SmartAssistantAgent
@pytest.mark.asyncio
async def test_smart_agent():
# Mock LLM client
mock_llm = AsyncMock()
mock_llm.chat.return_value = {"llm_response": "Test response"}
# Create agent with mock
agent = SmartAssistantAgent(mock_llm, "Test prompt")
# Test processing
result = await agent.process(IAgentInput(message="Hello"))
# Verify structure
assert "response" in result
assert "metadata" in result
assert result["metadata"]["complexity"] == "simple"
Congratulations! You’ve built a sophisticated agent that demonstrates all the key principles of Arshai’s architecture. You now have complete control over an AI component that can adapt, use tools, and provide rich metadata. 🎉
Ready for more advanced patterns? Check out Building Systems (Layer 3) →