Design Decisions¶
This document explains the key design decisions in Arshai and the reasoning behind them. Understanding these choices will help you work with the framework more effectively.
Why We Removed the Settings Pattern¶
The Problem with Settings¶
The Settings pattern was initially created to simplify configuration, but it became a bottleneck:
# Old approach - Settings controlled everything
settings = Settings()
settings.load_config("config.yaml")
agent = settings.create_agent("assistant", agent_config)
llm = settings.create_llm()
Issues:
- 🔒 Single point of failure - Everything depends on Settings
- 🎭 Hidden dependencies - Components created behind the scenes
- 🧩 Poor testability - Hard to mock Settings-created components
- 📦 Coupling - All components tied to the Settings implementation
The Solution: Direct Instantiation¶
# New approach - Direct instantiation
llm_client = OpenAIClient(ILLMConfig(model="gpt-4"))
memory = InMemoryManager(ttl=3600)
agent = AssistantAgent(llm_client, memory)
Benefits:
- ✅ No single point of failure - Components are independent
- ✅ Explicit dependencies - See exactly what’s needed
- ✅ Excellent testability - Easy to mock individual components (see the sketch below)
- ✅ Loose coupling - Components only know their interfaces
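Explicit dependencies pay off most in tests: each collaborator can be replaced with a stub at the call site. A minimal sketch, assuming the constructor and async process method shown elsewhere in this document; MockLLMClient here is a hypothetical test double, not a framework class:

# Hypothetical unit test - inject a stub instead of a real client
class MockLLMClient:
    async def chat(self, input):
        return {"llm_response": "stubbed answer"}

async def test_assistant_agent():
    agent = AssistantAgent(MockLLMClient(), InMemoryManager(ttl=60))
    result = await agent.process(IAgentInput(message="hello"))
    assert result is not None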
Why Direct Instantiation Over Factories¶
Factory Pattern Limitations¶
# Factory pattern - abstraction without benefit
factory = AgentFactory()
agent = factory.create("assistant", config)
# What type is agent? What dependencies does it have?
Problems:
- Type information is lost
- Dependencies are hidden
- Customization is limited
- Testing requires mocking the factory
Direct Instantiation Benefits¶
# Direct instantiation - clear and type-safe
agent = AssistantAgent(
    llm_client=OpenAIClient(config),
    memory=RedisMemory(url="redis://localhost"),
    tools=[SearchTool(), CalculateTool()],
)
# Type is clear: AssistantAgent
# Dependencies are explicit
Advantages:
- Full type safety and IDE support
- Clear dependencies for testing
- Easy customization
- No abstraction overhead
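Customizing a directly instantiated agent means changing a single argument, with the type checker verifying the result. A sketch reusing the names above; the model string is illustrative:

# Swap one dependency without touching the rest
cheap_agent = AssistantAgent(
    llm_client=OpenAIClient(ILLMConfig(model="gpt-4o-mini")),  # different model
    memory=RedisMemory(url="redis://localhost"),
    tools=[SearchTool()],  # smaller tool set
)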
Why Agents Are Stateless¶
The Problem with Stateful Agents¶
# Stateful agent - problematic
class StatefulAgent:
    def __init__(self):
        self.conversation_history = []  # Internal state
        self.user_preferences = {}      # More state

    def process(self, message):
        self.conversation_history.append(message)  # Mutation
        # What if multiple requests come in parallel?
Issues:
- 🔄 Concurrency problems - Race conditions with shared state
- 🧪 Hard to test - Must reset state between tests
- 📊 Difficult scaling - Can’t easily distribute stateful agents
- 🔍 Debugging complexity - State makes behavior unpredictable
Stateless Design Benefits¶
# Stateless agent - clean and scalable
class StatelessAgent(BaseAgent):
    async def process(self, input: IAgentInput) -> Any:
        # Get state from an external source
        context = input.metadata.get("context", {})
        # Process without mutation
        result = await self._process_with_context(input.message, context)
        # Return new state, don't mutate
        return {
            "response": result,
            "new_context": self._update_context(context, result),
        }
Advantages:
- ✅ Thread-safe - No shared mutable state
- ✅ Easy testing - No state to reset
- ✅ Horizontal scaling - Deploy multiple instances
- ✅ Predictable - Same input = same output
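Because the instance holds no mutable state, one agent can serve many requests at once. A sketch, assuming the minimal BaseAgent constructor shown later in this document and illustrative IAgentInput fields:

import asyncio

# One shared, stateless instance safely handles parallel requests
agent = StatelessAgent(llm_client, system_prompt="You are helpful.")

async def handle_many(messages):
    return await asyncio.gather(
        *(agent.process(IAgentInput(message=m, metadata={"context": {}}))
          for m in messages)
    )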
Why Interface-First Design¶
Interfaces Define Contracts¶
# Interface defines the contract
from typing import Any, AsyncGenerator, Dict, Protocol

class ILLM(Protocol):
    async def chat(self, input: ILLMInput) -> Dict[str, Any]: ...
    async def stream(self, input: ILLMInput) -> AsyncGenerator: ...
# Multiple implementations
class OpenAIClient(ILLM): ...
class GeminiClient(ILLM): ...
class MockLLMClient(ILLM): ... # For testing
Benefits:
- 📋 Clear contracts - Know exactly what methods are available
- 🔄 Easy substitution - Swap implementations without changing code
- 🧪 Testability - Create test doubles that match interfaces (sketched below)
- 📚 Documentation - Interfaces document expected behavior
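A test double only needs the same method shapes; nothing has to be registered or inherited. A sketch (ScriptedLLM is hypothetical, not part of the framework):

# Structural typing: any object with matching methods satisfies ILLM
class ScriptedLLM:
    def __init__(self, replies):
        self._replies = list(replies)

    async def chat(self, input):
        return {"llm_response": self._replies.pop(0)}

    async def stream(self, input):
        for chunk in self._replies:
            yield chunk

llm: ILLM = ScriptedLLM(["first reply", "second reply"])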
Dependency Inversion¶
# Depend on interfaces, not implementations
class Agent:
    def __init__(self, llm: ILLM):  # Interface, not a concrete class
        self.llm = llm

    async def process(self, input: Any):
        # Works with any ILLM implementation
        return await self.llm.chat(input)
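The call site then decides which implementation to use; the Agent code never changes. Model names below are illustrative:

# Same Agent code, different ILLM implementations
prod_agent = Agent(OpenAIClient(ILLMConfig(model="gpt-4")))
test_agent = Agent(MockLLMClient())  # e.g., the test double sketched above
alt_agent = Agent(GeminiClient(ILLMConfig(model="gemini-1.5-pro")))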
Why Environment Variables for Secrets¶
Security First¶
# Secrets from environment - secure
import os

api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise ValueError("OPENAI_API_KEY not set")
# Never in code or config files
# BAD: api_key = "sk-..." # Never do this!
Reasons:
- 🔒 Security - Secrets stay out of source control
- 🚀 Deployment - Different secrets per environment
- 🔄 Rotation - Change secrets without code changes
- 📋 Compliance - Meet security standards
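A small helper keeps the fail-fast check in one place. This is a sketch, not a framework API:

# Hypothetical helper - fail fast on missing secrets
import os

def require_env(name: str) -> str:
    value = os.getenv(name)
    if not value:
        raise ValueError(f"{name} not set")
    return value

api_key = require_env("OPENAI_API_KEY")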
Why Three Layers Instead of More (or Fewer)¶
Not Too Simple (Two Layers)¶
# Two layers - too limiting
LLM Clients ← → Applications
# Where does business logic go?
# How do you compose complex systems?
Not Too Complex (Many Layers)¶
# Too many layers - over-engineered
Infrastructure → Providers → Clients → Services →
Agents → Orchestrators → Controllers → Applications
# Too much abstraction, hard to understand
Just Right (Three Layers)¶
# Three layers - balanced
Layer 1: LLM Clients (Foundation)
Layer 2: Agents (Logic)
Layer 3: Systems (Orchestration)
Perfect because:
- Simple enough to understand
- Flexible enough for complex systems
- Clear separation of concerns
- Natural progression of complexity
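In code, the layers compose as plain Python, each feeding the next. A sketch using names from earlier examples:

# Layer 1: LLM client (foundation)
llm_client = OpenAIClient(ILLMConfig(model="gpt-4"))

# Layer 2: agent (logic)
agent = AssistantAgent(llm_client, InMemoryManager(ttl=3600))

# Layer 3: system (orchestration) - here, just an async function
async def answer_question(question: str):
    return await agent.process(IAgentInput(message=question))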
Why Python Protocols Over ABC¶
Abstract Base Classes - Rigid¶
# ABC - forces inheritance
from abc import ABC, abstractmethod

class AbstractAgent(ABC):
    @abstractmethod
    def process(self): ...

class MyAgent(AbstractAgent):  # Must inherit
    def process(self): ...
Protocols - Flexible¶
# Protocol - structural typing
from typing import Protocol

class IAgent(Protocol):
    def process(self): ...

class MyAgent:  # No inheritance needed!
    def process(self): ...

# Still type-safe
def use_agent(agent: IAgent): ...

use_agent(MyAgent())  # Works!
Benefits:
- No forced inheritance hierarchies
- Duck typing with type safety
- One class can satisfy multiple protocols
- Easier testing with mocks
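Protocols can also be checked at runtime via the standard runtime_checkable decorator; isinstance then verifies that the methods exist (signatures are not checked):

from typing import Protocol, runtime_checkable

@runtime_checkable
class IAgent(Protocol):
    def process(self): ...

class MyAgent:
    def process(self): ...

assert isinstance(MyAgent(), IAgent)  # structural match, no inheritance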
Why Async-First Design¶
Modern AI is I/O Bound¶
# Synchronous - blocking and slow
response1 = llm.chat(prompt1) # Wait...
response2 = llm.chat(prompt2) # Wait...
response3 = llm.chat(prompt3) # Wait...
# Asynchronous - concurrent and fast
import asyncio

responses = await asyncio.gather(
    llm.chat(prompt1),
    llm.chat(prompt2),
    llm.chat(prompt3),
)  # All three in parallel!
Benefits:
- ⚡ Performance - Handle multiple requests concurrently
- 📊 Scalability - Better resource utilization
- 🔄 Non-blocking - UI stays responsive
- 🎯 Modern - Aligns with the async Python ecosystem
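In practice, fan-out is usually bounded so a burst of calls does not trip provider rate limits. A sketch using a semaphore to cap concurrency:

import asyncio

async def chat_bounded(llm, prompts, max_concurrent=5):
    sem = asyncio.Semaphore(max_concurrent)

    async def one(prompt):
        async with sem:
            return await llm.chat(prompt)

    return await asyncio.gather(*(one(p) for p in prompts))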
Why Minimal Base Classes¶
Heavy Base Classes - Constraining¶
# Heavy base class - too much assumption
class HeavyBaseAgent:
    def __init__(self):
        self.logger = create_logger()
        self.metrics = create_metrics()
        self.cache = create_cache()
        self.validator = create_validator()
        # What if I don't need all this?
Minimal Base Classes - Freedom¶
# Minimal base class - just essentials
from abc import abstractmethod

class BaseAgent:
    def __init__(self, llm_client: ILLM, system_prompt: str):
        self.llm_client = llm_client
        self.system_prompt = system_prompt

    @abstractmethod
    async def process(self, input: IAgentInput) -> Any:
        pass

# That's it! Add what YOU need
Benefits:
- No unnecessary overhead
- Add only what you need
- Clear and simple
- Easy to understand
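Extras are opt-in through ordinary subclassing. A sketch; the ILLMInput field names are assumptions for illustration:

import logging

class LoggingAgent(BaseAgent):
    def __init__(self, llm_client: ILLM, system_prompt: str):
        super().__init__(llm_client, system_prompt)
        self.logger = logging.getLogger(__name__)  # opt-in, not imposed

    async def process(self, input: IAgentInput) -> Any:
        self.logger.info("processing message")
        # Field names below are assumed for this sketch
        return await self.llm_client.chat(
            ILLMInput(system_prompt=self.system_prompt, user_message=input.message)
        )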
Summary of Design Principles¶
1. Explicit > Implicit - Show what’s happening
2. Simple > Complex - Start simple, add complexity as needed
3. Composition > Inheritance - Combine simple parts
4. Interfaces > Implementations - Depend on contracts
5. Stateless > Stateful - Avoid mutable state
6. Direct > Indirect - Minimize abstraction layers
7. Developer Control > Framework Magic - You’re in charge
These decisions create a framework that is:
- Powerful yet simple
- Flexible yet structured
- Type-safe yet dynamic
- Testable yet practical