From Agno CLI Script to Production-Ready FastAPI in Minutes
Transform your local Agno experiments into FastAPI endpoints your team can integrate and deploy
You've built an amazing AI agent using Agno. It works perfectly on your local machine, answers questions, processes data, and delivers exactly what you need. But now you want to share it with the world—or at least make it accessible to your team through an API.
Today, I'm walking you through the exact process of transforming any Agno CLI script into a production-ready FastAPI endpoint. Step by step, with complete code examples.
Why This Matters
Most developers start with local scripts when experimenting with AI agents. That's smart. It's fast, iterative, and perfect for testing. But the moment you want to integrate your agent into a web application, mobile app, or share it with others, you need an API.
FastAPI makes this transition seamless. And with Agno's architecture, the transformation is surprisingly straightforward.
What Is Agno?
Agno is a Python framework for building multi-agent systems with shared memory, knowledge, and reasoning. It's one of the best frameworks available for creating production-grade AI agents that actually work in real business environments.
What makes Agno special? It's model-agnostic, highly performant, and includes built-in reasoning, memory, and multi-agent capabilities. Whether you're building simple tool-using agents or complex multi-agent workflows, Agno handles the heavy lifting.
The framework comes with an exceptional cookbook filled with hands-on code examples that any developer can use as a starting point. These examples cover everything from basic agents to advanced multi-agent systems.
FastAPI: The Production Bridge
FastAPI has become the go-to choice for Python APIs in production. It's fast, automatically generates documentation, handles validation, and integrates beautifully with modern deployment platforms.
For AI applications, this combination is powerful: Agno handles the agent intelligence, FastAPI handles the web infrastructure.
The Starting Point: A Simple CLI Agent
Let's start with a basic Agno script that uses the YFinance tool to get stock prices. This is the kind of script you might build while following the Agno cookbook:
from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.yfinance import YFinanceTools
from dotenv import load_dotenv
load_dotenv()
agent = Agent(
    model=OpenAIChat(id="gpt-4o-mini"),
    tools=[YFinanceTools(stock_price=True)],
    instructions="Use tables to display data. Don't include any other text.",
    markdown=True,
)
agent.print_response("What is the stock price of Apple?", stream=True)
This script works perfectly. Run it locally, and you'll get Apple's current stock price in a nicely formatted table. But it's limited to your local environment and requires manual execution. Let’s turn our Agno CLI script into a FastAPI endpoint, step by step.
Step 1: Add Input Validation with Pydantic
The first step in any API is defining what data you expect to receive. FastAPI uses Pydantic models for this, which automatically handle validation and generate documentation.
from pydantic import BaseModel

class QueryRequest(BaseModel):
    question: str

    class Config:
        json_schema_extra = {
            "example": {"question": "What is the stock price of Apple?"}
        }
This model tells FastAPI to expect a JSON payload with a question field. The Config class provides an example for the auto-generated documentation.
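One nice side effect: if a request body doesn't match the model, FastAPI rejects it with a 422 validation error before your endpoint code ever runs. You can see the same validation by instantiating the model directly; here's a quick sketch (the ValidationError import is standard Pydantic, nothing Agno-specific):
from pydantic import ValidationError

# A well-formed payload parses cleanly
valid = QueryRequest(question="What is the stock price of Apple?")
print(valid.question)

# A payload missing "question" is rejected; FastAPI turns this into a 422 response
try:
    QueryRequest()
except ValidationError as e:
    print(e)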
Step 2: Initialize FastAPI and Move Agent Outside Endpoints
Create your FastAPI app and move the agent initialization outside any endpoint functions. This ensures the agent is created once when the server starts, not on every request:
from fastapi import FastAPI
app = FastAPI(title="Stock Price Agent API", version="1.0.0")
# Initialize the agent once (reuse across requests)
agent = Agent(
    model=OpenAIChat(id="gpt-4o-mini"),
    tools=[YFinanceTools(stock_price=True)],
    instructions="Use tables to display data. Don't include any other text.",
    markdown=True,
)
This pattern is crucial for performance. Creating agents on every request would be slow and wasteful.
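If you prefer explicit startup and shutdown hooks, FastAPI's lifespan handler achieves the same one-time initialization. Here's a sketch that stores the agent on app.state (storing it there is my convention, not something Agno or FastAPI requires; Agent, OpenAIChat, and YFinanceTools are the same imports as in the CLI script):
from contextlib import asynccontextmanager
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Build the agent once at startup and keep it on app.state
    app.state.agent = Agent(
        model=OpenAIChat(id="gpt-4o-mini"),
        tools=[YFinanceTools(stock_price=True)],
        instructions="Use tables to display data. Don't include any other text.",
        markdown=True,
    )
    yield
    # Teardown code (closing clients, flushing logs) would go after the yield

app = FastAPI(title="Stock Price Agent API", version="1.0.0", lifespan=lifespan)
Endpoints would then read the agent from request.app.state.agent. For this tutorial, the module-level agent keeps the examples shorter, so that's what we'll stick with.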
Step 3: Create Your API Endpoints
Now we'll create multiple endpoints to demonstrate different API patterns:
Health Check Endpoint:
@app.get("/")
async def root():
"""Health check endpoint"""
return {"message": "Stock Price Agent API is running"}
Path Parameter Endpoint:
@app.get("/stock/{symbol}")
async def get_stock_price(symbol: str):
"""Get stock price for a given symbol"""
try:
response = agent.run(f"What is the stock price of {symbol}?")
return {"symbol": symbol, "response": response.content, "status": "success"}
except Exception as e:
return {"symbol": symbol, "error": str(e), "status": "error"}
POST Endpoint with Request Body:
@app.post("/query")
async def custom_query(query: QueryRequest):
"""Custom query endpoint for any stock-related question"""
try:
user_query = query.question
if not user_query:
return {"error": "No question provided", "status": "error"}
response = agent.run(user_query)
return {"query": user_query, "response": response.content, "status": "success"}
except Exception as e:
return {"query": query.question, "error": str(e), "status": "error"}
Step 4: Add the Server Runner
Finally, add the code to run the server:
import uvicorn

if __name__ == "__main__":
    uvicorn.run("stock_agent_api:app", host="0.0.0.0", port=8000, reload=True)
The reload=True parameter automatically restarts the server when you make code changes during development. Note that uvicorn only honors reload (and workers) when the app is passed as an import string, which is why the call above uses "stock_agent_api:app" (the filename we'll save the script as below) rather than the app object itself.
The Complete Transformation
Here's the full FastAPI version of our original CLI script:
from fastapi import FastAPI
import uvicorn
from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.yfinance import YFinanceTools
from pydantic import BaseModel
from dotenv import load_dotenv
load_dotenv()
class QueryRequest(BaseModel):
    question: str

    class Config:
        json_schema_extra = {
            "example": {"question": "What is the stock price of Apple?"}
        }

app = FastAPI(title="Stock Price Agent API", version="1.0.0")

# Initialize the agent once (reuse across requests)
agent = Agent(
    model=OpenAIChat(id="gpt-4o-mini"),
    tools=[YFinanceTools(stock_price=True)],
    instructions="Use tables to display data. Don't include any other text.",
    markdown=True,
)

@app.get("/")
async def root():
    """Health check endpoint"""
    return {"message": "Stock Price Agent API is running"}

@app.get("/stock/{symbol}")
async def get_stock_price(symbol: str):
    """Get stock price for a given symbol"""
    try:
        # Use the agent to get stock price
        response = agent.run(f"What is the stock price of {symbol}?")
        return {"symbol": symbol, "response": response.content, "status": "success"}
    except Exception as e:
        return {"symbol": symbol, "error": str(e), "status": "error"}

@app.post("/query")
async def custom_query(query: QueryRequest):
    """Custom query endpoint for any stock-related question"""
    try:
        user_query = query.question
        if not user_query:
            return {"error": "No question provided", "status": "error"}
        response = agent.run(user_query)
        return {"query": user_query, "response": response.content, "status": "success"}
    except Exception as e:
        return {"query": query.question, "error": str(e), "status": "error"}
if __name__ == "__main__":
    uvicorn.run("stock_agent_api:app", host="0.0.0.0", port=8000, reload=True)
Running Your New API
Save the code to a file (e.g., stock_agent_api.py) and run:
uvicorn stock_agent_api:app --reload
Your API will be available at http://localhost:8000. FastAPI automatically generates interactive documentation at http://localhost:8000/docs, where you can test your endpoints directly.
Testing Your Endpoints
Health Check:
curl http://localhost:8000/
Get Stock Price by Symbol:
curl http://localhost:8000/stock/TSLA
Custom Query:
curl -X POST "http://localhost:8000/query" \
-H "Content-Type: application/json" \
-d '{"question": "Compare the stock prices of Apple and Microsoft"}'
Key Patterns for Success
Agent Reuse: Always initialize your agent once outside the endpoint functions. Creating agents on every request kills performance.
Error Handling: Wrap agent calls in try/except blocks. AI agents can fail, and your API should handle that gracefully (a variation that returns proper HTTP status codes follows this list).
Input Validation: Use Pydantic models to validate incoming data. It prevents errors and generates better documentation.
Multiple Endpoint Types: Offer both GET endpoints for simple queries and POST endpoints for complex requests.
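On the error-handling point: the endpoints above return a 200 response even when the agent fails, with the error tucked into the JSON body. If you'd rather surface failures as real HTTP status codes, one possible variation uses FastAPI's HTTPException (the choice of 502 here is mine; pick whatever status fits your conventions):
from fastapi import HTTPException

@app.get("/stock/{symbol}")
async def get_stock_price(symbol: str):
    """Get stock price for a symbol, signalling agent failures via HTTP status codes"""
    try:
        response = agent.run(f"What is the stock price of {symbol}?")
    except Exception as e:
        # A failed agent call becomes an explicit 502 instead of a 200 with an error field
        raise HTTPException(status_code=502, detail=str(e))
    return {"symbol": symbol, "response": response.content, "status": "success"}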
Scaling This Pattern
This same transformation pattern works with any Agno agent, regardless of complexity:
Agents with Knowledge: Your vector databases and RAG capabilities work exactly the same
Agents with Memory: Session management transfers seamlessly to API endpoints
Multi-Agent Teams: The entire team can be exposed through a single API
Workflow Agents: Complex workflows become powerful API services
The beauty of Agno's architecture is that the agent behavior remains identical whether it's running in a CLI script or behind a FastAPI endpoint.
Production Considerations
For production deployment, consider:
Environment Variables: Use proper environment management for API keys and configuration
Authentication: Add API key authentication or OAuth depending on your needs (a minimal API key sketch follows this list)
Rate Limiting: Implement rate limiting to prevent abuse
Logging: Add comprehensive logging for monitoring and debugging
Containerization: Use Docker for consistent deployment across environments
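To make the authentication point concrete, here's one minimal way to protect an endpoint with an API key header using FastAPI's built-in security utilities. The X-API-Key header name and the API_KEY environment variable are my own choices, not part of Agno or FastAPI:
import os
from fastapi import Depends, HTTPException
from fastapi.security import APIKeyHeader

api_key_header = APIKeyHeader(name="X-API-Key", auto_error=False)

async def require_api_key(api_key: str = Depends(api_key_header)):
    # Compare against an API_KEY environment variable; swap in your secret store of choice
    if not api_key or api_key != os.getenv("API_KEY"):
        raise HTTPException(status_code=401, detail="Invalid or missing API key")

# Attach the dependency to any endpoint that should require a key
@app.post("/query", dependencies=[Depends(require_api_key)])
async def custom_query(query: QueryRequest):
    response = agent.run(query.question)
    return {"query": query.question, "response": response.content, "status": "success"}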
Next Steps
You now have the blueprint for transforming any Agno CLI script into a production-ready API. The process is consistent, the performance is excellent, and the resulting APIs are robust and well-documented.
Take any example from the Agno cookbook, apply this transformation pattern, and you'll have a working API in minutes.
The gap between experimentation and production just got a lot smaller.