Building a Quant Trading Bot with Phidata and CrewAI
Oliver Schmidt
1. What It Is and Who It’s For
This guide shows how to assemble a quantitative trading bot that combines Phidata (a Python library for fetching market data and managing feature stores) with CrewAI (a framework for orchestrating multiple LLM‑driven agents). The bot is aimed at developers and researchers who want to experiment with AI‑augmented trading strategies without building a full‑scale infrastructure from scratch. It assumes basic Python proficiency, familiarity with pandas, and access to a brokerage API or a paper‑trading sandbox.
2. Key Features and Capabilities
- Data ingestion via Phidata – connectors to Yahoo Finance, Alpha Vantage, and CCXT for crypto; automatic caching of OHLCV bars; built‑in technical indicator calculators (SMA, EMA, RSI, MACD).
- Multi‑agent reasoning with CrewAI – separate agents for data collection, signal generation, risk management, and order execution; each agent can use tools, retain short‑term memory, and iterate on its output.
- Pluggable strategy logic – the strategy agent receives a feature DataFrame and returns a signal (-1, 0, 1) using any user‑defined function or LLM prompt.
- Backtest‑ready execution – the execution agent can simulate trades using historical data or route real orders through a broker’s REST/WebSocket API (example uses Alpaca paper trading).
- Observability – logs are written to a rotating file and can be streamed to Loki or ELK via standard Python logging.
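The observability item can be illustrated with the standard library alone. The following is a minimal sketch, assuming a logger named `quant_bot`, a 1 MB rotation threshold, and five backups; none of these values come from the guide, and shipping the file to Loki or ELK is left out.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_logger(path: str = "quant_bot.log") -> logging.Logger:
    """Return a logger that writes to a size-rotated file.

    The logger name, file name, and rotation limits are illustrative assumptions.
    """
    logger = logging.getLogger("quant_bot")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
        handler = RotatingFileHandler(path, maxBytes=1_000_000, backupCount=5)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Each agent could then call `build_logger().info("signal=%s symbol=%s", signal, symbol)` so that every pipeline stage writes to the same rotating file.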
3. Architecture and Workflow
The bot follows a pipeline pattern where each stage is a CrewAI agent. Phidata supplies the data layer; CrewAI handles coordination.
flowchart TD
A[Scheduler] --> B[DataAgent]
B --> C[FeatureAgent]
C --> D[StrategyAgent]
D --> E[RiskAgent]
E --> F[ExecutionAgent]
F --> G[Broker/PaperTrading]
G --> H[TradeLogger]
H --> A
- Scheduler – triggers the pipeline at a fixed interval (e.g., every 5 minutes) using APScheduler.
- DataAgent – calls phidata.fetch(symbol, interval, limit) to retrieve raw candles and stores them in a local SQLite cache.
- FeatureAgent – computes technical indicators via phidata.indicators.add_all(df) and writes the enriched DataFrame to a feature store (pickle or Parquet).
- StrategyAgent – receives the latest feature row, builds a prompt for the LLM (or runs a deterministic rule), and outputs a signal.
- RiskAgent – checks position limits, max drawdown, and volatility; can attenuate the signal.
- ExecutionAgent – translates the final signal into an order size, sends it to the broker, and records the fill.
- TradeLogger – appends each trade to a CSV for post‑mortem analysis.
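The RiskAgent's role of attenuating a signal can be sketched as a plain function that an agent tool might wrap. This is an illustration under stated assumptions: the `RiskLimits` fields, their default thresholds, and the volatility-target scaling rule are hypothetical and not taken from Phidata or CrewAI.

```python
from dataclasses import dataclass


@dataclass
class RiskLimits:
    # All thresholds are illustrative assumptions, not recommendations.
    max_position: float = 100.0  # absolute position cap, in shares
    max_drawdown: float = 0.10   # stop adding risk beyond a 10% drawdown
    vol_target: float = 0.02     # per-bar volatility the sizing aims for


def attenuate(signal: int, position: float, drawdown: float,
              volatility: float, limits: RiskLimits) -> float:
    """Scale a raw signal in {-1, 0, 1} to a value in [-1.0, 1.0]."""
    if signal == 0 or drawdown >= limits.max_drawdown:
        return 0.0
    # Block trades that would push the position further past the cap.
    if signal > 0 and position >= limits.max_position:
        return 0.0
    if signal < 0 and position <= -limits.max_position:
        return 0.0
    # Shrink the signal when realized volatility exceeds the target.
    scale = 1.0 if volatility <= 0 else min(1.0, limits.vol_target / volatility)
    return signal * scale
```

With the defaults above, a long signal at twice the target volatility is halved, and any signal is zeroed once the drawdown limit is reached. The ExecutionAgent would then multiply the result by its base order size.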
All agents are crewai.Agent instances, each given a list of tools (e.g., fetch_data, calculate_indicator, place_order). Any OpenAI‑compatible model can serve as the LLM; the examples below use gpt-4o-mini.
4. Real‑World Use Cases
- Intraday mean‑reversion on equities – a user subscribes to 1‑minute bars for AAPL, computes a 20‑period SMA, and lets the StrategyAgent decide to go long when price deviates >1.5 σ below the mean.
- Crypto arbitrage signal – the DataAgent pulls tickers from Binance and Coinbase Pro via CCXT; the FeatureAgent calculates the spread; the StrategyAgent triggers when spread exceeds a threshold, and the ExecutionAgent places simultaneous limit orders.
- Research prototype – a quant researcher swaps the LLM‑based StrategyAgent for a simple moving‑average crossover to compare AI‑generated signals against classic rules without changing the surrounding pipeline.
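The mean‑reversion use case reduces to a z‑score test on the latest close. Here is a minimal deterministic sketch, assuming a plain list of closes and a symmetric short side (the text above only describes the long entry); the function name and defaults are hypothetical.

```python
from statistics import mean, pstdev


def mean_reversion_signal(closes: list[float], window: int = 20,
                          entry_z: float = 1.5) -> int:
    """Return 1 (long), -1 (short), or 0 (flat) from the latest z-score."""
    if len(closes) < window:
        return 0  # not enough history to form the rolling window
    recent = closes[-window:]
    mu = mean(recent)
    sigma = pstdev(recent)
    if sigma == 0:
        return 0  # flat prices carry no deviation to trade on
    z = (closes[-1] - mu) / sigma
    if z < -entry_z:
        return 1   # price is more than entry_z sigma below the mean
    if z > entry_z:
        return -1  # symmetric short entry (an assumption of this sketch)
    return 0
```

A rule like this is also a convenient baseline when comparing LLM‑generated signals against classic logic, as in the research prototype case.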
5. Strengths and Limitations
Strengths
- Clear separation of concerns: data, features, decision, risk, execution are independently testable.
- Adding a new agent (e.g., a news sentiment agent) only requires implementing a tool and registering it with CrewAI.
- Phidata’s caching reduces redundant API calls, important for rate‑limited data vendors.
Limitations
- The latency of an LLM call (≈300‑800 ms for gpt‑4o-mini) may be too high for sub‑second strategies; users should restrict LLM‑based agents to longer timeframes.
- Phidata does not yet provide a distributed feature store; scaling to hundreds of symbols requires a custom solution (e.g., writing to a shared S3 bucket).
- CrewAI’s memory implementation is in‑process; persisting agent state across restarts needs extra work.
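One low‑effort way to work around the in‑process memory limitation is to keep the state you care about (positions, last signals) in a JSON file. The sketch below is an assumption about how that could look; it does not use CrewAI's memory API, and the function names and file layout are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def save_state(state: dict, path: Path) -> None:
    """Write state atomically: dump to a temp file, then rename over the target."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(state, fh)
    os.replace(tmp_name, path)  # atomic replacement on the same filesystem


def load_state(path: Path) -> dict:
    """Return the saved state, or an empty dict on first run."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))
```

Writing to a temporary file first means a crash mid‑write leaves the previous state file intact rather than a truncated one.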
6. Comparison with Alternatives
| Feature | Phidata + CrewAI | LangChain/LangGraph | AutoGen | Backtrader + Zipline |
|---|---|---|---|---|
| Multi‑agent orchestration | Native (role‑based agents) | Requires custom graph | Built‑in conversational | Not applicable |
| Data connector library | Phidata (Yahoo, CCXT) | User‑built or external | User‑built | Built‑in (Yahoo, CSV) |
| Technical indicator helpers | Yes (via phidata.indicators) | No (rely on pandas‑ta) | No | Yes (built‑in) |
| LLM‑driven decision making | Direct prompt per agent | Possible via chains | Possible via agents | No |
| Backtesting framework | ExecutionAgent can simulate | Needs extra wrapper | Needs extra wrapper | Full‑featured |
| Community size (2026) | Growing (~2k stars) | Large (~30k stars) | Medium (~8k stars) | Established (~15k stars) |
| Setup complexity | Low‑medium | Medium | Medium | Low |
The table shows that Phidata+CrewAI trades off some backtesting depth for a more flexible AI‑agent workflow. If your primary goal is rigorous statistical backtesting, a dedicated engine like Zipline may be preferable; if you want to experiment with LLM‑generated signals and rapid iteration, the agent stack is a strong fit.
7. Getting Started Guide
Prerequisites
- Python 3.10+
- Data access: Yahoo Finance data via yfinance needs no API key; crypto or broker data (e.g., Binance, Alpaca) requires an API key from that provider.
- Optional: OpenAI API key for LLM agents.
Installation
pip install phidata crewai yfinance ccxt alpaca-trade-api pandas-ta APScheduler pyyaml pyarrow
Project Layout
quant_bot/
├─ agents.py # CrewAI agent definitions
├─ data.py # Phidata wrapper
├─ strategy.py # Signal logic (LLM or rule)
├─ execution.py # Order submission
├─ config.yaml # API keys, symbols, intervals
└─ main.py # Scheduler and pipeline runner
Step‑by‑Step
- Configure credentials – create config.yaml:
data:
  provider: yfinance # or ccxt
  symbols:
    - AAPL
    - MSFT
  interval: 1m
  lookback: 100
broker:
  name: alpaca # or paper_trading for simulation
  api_key: YOUR_ALPACA_KEY
  api_secret: YOUR_ALPACA_SECRET
  base_url: https://paper-api.alpaca.markets
llm:
  model: gpt-4o-mini
  api_key: YOUR_OPENAI_KEY
- Implement the data layer (data.py). The example below calls yfinance and pandas-ta directly:
import yfinance as yf
import pandas as pd
from pathlib import Path

CACHE_DIR = Path("./cache")
CACHE_DIR.mkdir(exist_ok=True)


def fetch(symbol: str, interval: str, limit: int = 100) -> pd.DataFrame:
    """Download OHLCV bars and cache them locally as Parquet."""
    cache_file = CACHE_DIR / f"{symbol}_{interval}.parquet"
    # Note: the cache is reused whenever it holds enough rows; it is never refreshed.
    if cache_file.exists():
        df = pd.read_parquet(cache_file)
        if len(df) >= limit:
            return df.iloc[-limit:]
    # Request bars at the given interval; period="5d" stays within Yahoo's
    # limit on 1-minute history.
    df = yf.download(tickers=symbol, interval=interval, period="5d", progress=False)
    # Recent yfinance versions return (field, ticker) MultiIndex columns;
    # keep only the field level so df["close"] is a Series.
    if isinstance(df.columns, pd.MultiIndex):
        df.columns = df.columns.get_level_values(0)
    df = df.rename(columns=str.lower)
    df.to_parquet(cache_file)
    return df.iloc[-limit:]


# Optional: add technical indicators using pandas-ta
def enrich(df: pd.DataFrame) -> pd.DataFrame:
    import pandas_ta as ta

    df = df.copy()
    df["sma_20"] = ta.sma(df["close"], length=20)
    df["rsi_14"] = ta.rsi(df["close"], length=14)
    return df
- Define CrewAI agents (agents.py):
from io import StringIO

import pandas as pd
# Import paths follow recent CrewAI releases; older releases expose `tool`
# through the separate crewai_tools package.
from crewai import Agent, Task, Crew, Process, LLM
from crewai.tools import tool

from data import fetch, enrich

# gpt-4o-mini is a chat model; the API key is read from OPENAI_API_KEY.
llm = LLM(model="gpt-4o-mini", temperature=0.0)


# Tool wrappers that agents can call (the docstring is the tool description)
@tool("get_data")
def get_data(symbol: str) -> str:
    """Fetch recent 1-minute bars for a symbol and return the last rows as CSV."""
    df = fetch(symbol, interval="1m", limit=50)
    df = enrich(df)
    return df.tail().to_csv()


@tool("compute_signal")
def compute_signal(df_csv: str) -> str:
    """Return BUY or SELL from CSV rows that contain close and sma_20 columns."""
    # In a real setup you would pass the DataFrame to an LLM prompt
    # Here we illustrate a simple rule: buy if close > sma_20
    df = pd.read_csv(StringIO(df_csv))
    last = df.iloc[-1]
    signal = "BUY" if last["close"] > last["sma_20"] else "SELL"
    return signal


# Agents
DataAgent = Agent(
    role="Data Collector",
    goal="Fetch latest market data for the given symbols",
    backstory="You are responsible for pulling fresh candles and caching them.",
    llm=llm,
    tools=[get_data],
    verbose=True,
)
StrategyAgent = Agent(
    role="Signal Generator",
    goal="Produce a trading signal based on the latest features",
    backstory="You analyze the data and decide whether to go long, short, or flat.",
    llm=llm,
    tools=[compute_signal],
    verbose=True,
)
# Risk and Execution agents omitted for brevity; they would follow the same pattern.

# Tasks: each needs an expected_output, and context takes Task objects, not strings
data_task = Task(
    description="Get data for AAPL",
    expected_output="CSV text with the latest OHLCV rows and indicators",
    agent=DataAgent,
)
signal_task = Task(
    description="Compute signal",
    expected_output="BUY or SELL",
    agent=StrategyAgent,
    context=[data_task],
)

# Assemble the crew
crew = Crew(
    agents=[DataAgent, StrategyAgent],
    tasks=[data_task, signal_task],
    process=Process.sequential,
    verbose=True,
)
- Scheduler and main loop (main.py):
import yaml
from apscheduler.schedulers.blocking import BlockingScheduler

from agents import crew

with open("config.yaml", "r") as f:
    cfg = yaml.safe_load(f)  # loaded for later use; the crew still hard-codes AAPL


def run_pipeline():
    print("=== Starting pipeline ===")
    result = crew.kickoff()
    print("Crew output:", result)
    # Here you would inspect result and send orders via your execution agent


scheduler = BlockingScheduler()
scheduler.add_job(run_pipeline, "interval", minutes=1)
try:
    scheduler.start()
except (KeyboardInterrupt, SystemExit):
    pass
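The run_pipeline function leaves order handling and trade recording open. As an illustration of the TradeLogger stage from the architecture section, the sketch below appends one row per trade to a CSV file. The column set and the `log_trade` name are assumptions; a real fill record would likely carry broker order IDs and fees as well.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

FIELDS = ["timestamp", "symbol", "side", "qty", "price"]


def log_trade(path: Path, symbol: str, side: str, qty: float, price: float) -> None:
    """Append one trade row, writing the header only when the file is new."""
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "symbol": symbol,
            "side": side,
            "qty": qty,
            "price": price,
        })
```

Calling `log_trade(Path("trades.csv"), "AAPL", "BUY", 10, 187.5)` after each confirmed fill yields a file that pandas can read directly for post‑mortem analysis.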
- Run the bot
python main.py
You should see log output from each agent, the fetched data frame, and the computed signal. Replace the simple rule in compute_signal with an LLM prompt that asks the model to interpret the indicator values and output a JSON signal.
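If the model is asked for a JSON signal, its reply still needs validation before it reaches the execution stage, since LLM output can include extra prose or malformed values. The following is a minimal sketch, assuming a reply shaped like `{"signal": 1, "reason": "..."}`; that schema and the `parse_signal` name are hypothetical.

```python
import json

VALID_SIGNALS = {-1, 0, 1}


def parse_signal(raw: str) -> int:
    """Extract a signal in {-1, 0, 1} from model output; fall back to 0 (flat)."""
    # Tolerate prose around the JSON by slicing from the first "{" to the last "}".
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return 0
    try:
        payload = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return 0
    signal = payload.get("signal") if isinstance(payload, dict) else None
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(signal, bool) or not isinstance(signal, int):
        return 0
    return signal if signal in VALID_SIGNALS else 0
```

Defaulting to flat on any parsing problem is a conservative design choice: a malformed reply results in no trade rather than an unintended one.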
Next Steps
- Persist agent memory using CrewAI’s memory component to enable learning across runs.
- Swap the Yahoo Finance provider for CCXT to trade crypto pairs.
- Connect the ExecutionAgent to Alpaca’s submit_order method or a simulated Slack webhook for paper trading.
- Add a RiskAgent that computes volatility‑adjusted position sizing using the ATR indicator.
- Deploy the scheduler to a lightweight VM or a Cloud Run job for 24/7 operation.
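The ATR‑based sizing mentioned in the list above can be sketched without any dependencies. This version uses a simple average of true ranges rather than Wilder's smoothing, and the `risk_fraction` and `atr_multiple` parameters are illustrative assumptions rather than values from the guide.

```python
def true_range(high: float, low: float, prev_close: float) -> float:
    """Largest of the bar range and the gaps from the previous close."""
    return max(high - low, abs(high - prev_close), abs(low - prev_close))


def atr(highs: list[float], lows: list[float], closes: list[float],
        period: int = 14) -> float:
    """Simple-average ATR over the last `period` bars (Wilder smoothing omitted)."""
    if not (len(highs) == len(lows) == len(closes)):
        raise ValueError("highs, lows, and closes must have equal length")
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 bars")
    n = len(closes)
    trs = [true_range(highs[i], lows[i], closes[i - 1]) for i in range(n - period, n)]
    return sum(trs) / period


def position_size(equity: float, risk_fraction: float, atr_value: float,
                  atr_multiple: float = 2.0) -> int:
    """Whole shares sized so a move of atr_multiple * ATR costs about risk_fraction of equity."""
    if atr_value <= 0:
        return 0
    return int((equity * risk_fraction) / (atr_value * atr_multiple))
```

A RiskAgent tool could call `position_size(equity, 0.01, atr(highs, lows, closes))` so that order size shrinks automatically as volatility rises.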
By following these steps you have a working skeleton that marries Phidata’s data‑handling strengths with CrewAI’s multi‑agent reasoning. From here you can iterate on strategy complexity, risk controls, and execution speed to match your trading objectives.
This article reflects the state of Phidata v0.4.2 and CrewAI v0.9.1 as of September 2026. Always backtest any strategy thoroughly before allocating capital.