Building a Quant Trading Bot with FinGPT and Smolagents
Mei-Lin Zhang
Overview
FinGPT is an open‑source large language model family fine‑tuned on financial text, news, and market data. Smolagents is a lightweight agent framework from Hugging Face that lets you equip an LLM with tools, memory, and simple planning loops. Combining the two lets you create a trading bot that can read market data, generate trade ideas, and execute orders through a broker API—all driven by a single language model.
This article walks through what the stack does, its core features, how the pieces fit together, realistic use‑cases, honest strengths and weaknesses, a quick comparison with alternative agent tools, and a step‑by‑step guide to get a prototype running.
Key Features and Capabilities
FinGPT
- Domain‑specific pretraining: Base models (e.g., FinGPT‑v2) are further trained on datasets such as Bloomberg news, SEC filings, and Twitter finance chatter.
- Instruction tuning: The model follows prompts like "Given the latest price action for AAPL, suggest a short‑term entry point" and returns structured JSON.
- Open weights: Available on the Hugging Face Hub under permissive licenses (e.g., FinGPT/fingpt-mt).
- Size options: Ranges from 7B to 13B parameters, allowing deployment on a single GPU or via quantization (4‑bit) for CPU inference.
Smolagents
- Tool interface: Define Python functions (e.g., get_ohlcv(symbol, interval)) that the agent can call.
- Memory buffer: Short‑term conversation history is stored in a list; long‑term memory can be persisted to a JSON file or SQLite.
- Planning loop: The agent iterates: think → act → observe → think again, up to a configurable max‑steps limit.
- Minimal dependencies: Core library is ~30 KB; only transformers, accelerate, and torch are required for the LLM backend.
- Async support: Built‑in async/await for non‑blocking tool calls, useful when querying REST APIs.
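The think → act → observe cycle can be illustrated without any framework. The sketch below is not Smolagents' implementation: run_loop, fake_policy, and the TOOLS registry are hypothetical names, and the "model" is a stub function so the example runs offline.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tool registry: tool name -> plain Python callable.
TOOLS: Dict[str, Callable[..., object]] = {
    "get_price": lambda symbol: {"AAPL": 190.0}.get(symbol, 0.0),
}


def fake_policy(history: List[str]) -> Tuple[str, tuple]:
    """Stand-in for the LLM: pick the next action from the history so far."""
    if not any(h.startswith("observe:") for h in history):
        return "get_price", ("AAPL",)  # no data yet, so call a tool
    return "final", ("HOLD",)          # enough information, stop


def run_loop(policy, tools, max_steps: int = 5) -> Tuple[Optional[str], List[str]]:
    """Iterate think -> act -> observe until the policy finishes or max_steps is hit."""
    history: List[str] = []
    for _ in range(max_steps):
        action, args = policy(history)                      # think
        if action == "final":
            return args[0], history
        observation = tools[action](*args)                  # act
        history.append(f"observe:{action}={observation}")   # observe
    return None, history                                    # step budget exhausted


answer, trace = run_loop(fake_policy, TOOLS)
print(answer, trace)
```

The max_steps bound is what keeps a confused model from looping forever; when the budget runs out the caller gets None and can decide to skip the tick.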
Combined Bot Capabilities
- Market perception – Pull real‑time or historical price data via a broker or data vendor (e.g., Alpaca, Polygon).
- Reasoning – FinGPT interprets the data, news sentiment, or technical indicators and outputs a trade signal.
- Action – Smolagents executes the signal by calling a broker’s order‑placement tool.
- Reflection – After order fill, the bot logs the outcome and updates its memory for future iterations.
Architecture and How It Works
The bot follows a classic agent loop, illustrated below:
+----------------+      +----------------+      +----------------+
|   FinGPT LLM   | ---> |   Smolagents   | ---> |   Tool Layer   |
|  (reasoning)   |      | (orchestration)|      | (data, broker) |
+----------------+      +----------------+      +----------------+
        ^                       |                       |
        |                       v                       v
        |               +----------------+      +----------------+
        |               |     Memory     |      |   Event Loop   |
        |               |  (short/long)  |      |    (async)     |
        +---------------+----------------+      +----------------+
- Input gathering – A scheduler (e.g., cron or asyncio.sleep) triggers a data‑fetch tool that returns OHLCV candles and the latest headlines.
- Prompt construction – The bot builds a prompt template:
    You are a quantitative analyst. Given the following data for {symbol}:
    - OHLCV (last 20 bars): {ohlcv}
    - Recent news headlines: {news}
    Suggest a trade action (BUY, SELL, HOLD) with size and stop‑loss.
    Respond in JSON: {{"action": "BUY|SELL|HOLD", "qty": float, "stop": float}}
- LLM inference – FinGPT generates a JSON response. The output is parsed with json.loads; malformed outputs trigger a retry or a fallback to a rule‑based heuristic.
- Tool execution – Smolagents passes the parsed action to a broker tool (e.g., alpaca.submit_order), which sends the order to the market.
- Observation – The broker tool returns the order status (filled, partially filled, rejected). This observation is appended to memory.
- Loop control – If the max‑steps limit has not been reached and a new signal is needed, the cycle repeats; otherwise the bot waits for the next tick.
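The parse, retry, and fallback behaviour in the LLM inference step could look roughly like this. It is a sketch under stated assumptions: get_signal, parse_signal, and rule_based_fallback are hypothetical helpers, the fallback simply holds, and generate stands in for whatever function calls the model.

```python
import json
from typing import Callable, Dict, List

VALID_ACTIONS = {"BUY", "SELL", "HOLD"}


def rule_based_fallback(closes: List[float]) -> Dict:
    """Hypothetical heuristic: stay flat when the model output is unusable."""
    return {"action": "HOLD", "qty": 0.0, "stop": None}


def parse_signal(raw: str) -> Dict:
    """Parse and minimally validate the model reply; raise ValueError if unusable."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    action = str(data.get("action", "")).upper()
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return {"action": action, "qty": float(data.get("qty", 0)), "stop": data.get("stop")}


def get_signal(generate: Callable[[], str], closes: List[float], max_retries: int = 2) -> Dict:
    """Ask the model up to max_retries + 1 times, then fall back to the heuristic."""
    for _ in range(max_retries + 1):
        try:
            return parse_signal(generate())
        except (ValueError, TypeError):
            continue  # malformed output: try again
    return rule_based_fallback(closes)


# Simulated model: the first reply is chatter, the second is valid JSON.
replies = iter(["Sure! Here is my answer", '{"action": "buy", "qty": 10, "stop": 187.5}'])
signal = get_signal(lambda: next(replies), closes=[189.0, 190.0])
print(signal)
```

Bounding the retries matters because each retry costs a full inference pass; after the budget is spent the bot degrades to a deterministic rule instead of stalling.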
All components run in a single Python process; scaling to multiple symbols is achieved by spawning independent agent instances or using an async pool.
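One way to read "async pool" is a set of per‑symbol coroutines gated by a semaphore, so that only a few ticks run at once. In this sketch, tick and run_all are hypothetical names and the sleep stands in for data fetches and model inference.

```python
import asyncio
from typing import Dict, List


async def tick(symbol: str) -> str:
    """Hypothetical per-symbol tick; the sleep stands in for I/O and inference."""
    await asyncio.sleep(0.01)
    return f"{symbol}:HOLD"


async def run_all(symbols: List[str], max_concurrent: int = 2) -> Dict[str, str]:
    """Run one tick per symbol with at most max_concurrent in flight."""
    sem = asyncio.Semaphore(max_concurrent)

    async def guarded(symbol: str) -> str:
        async with sem:
            return await tick(symbol)

    # gather preserves input order, so results line up with symbols
    results = await asyncio.gather(*(guarded(s) for s in symbols))
    return dict(zip(symbols, results))


results = asyncio.run(run_all(["AAPL", "MSFT", "NVDA"]))
print(results)
```

The concurrency cap is the useful part: with a single local model, unbounded parallel ticks would queue on the GPU anyway, and broker APIs typically rate‑limit requests.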
Real‑World Use Cases
- Intraday mean‑reversion – The bot reads 5‑minute bars, computes a z‑score of price vs. 20‑period mean, and asks FinGPT whether the deviation is likely to reverse based on recent news sentiment.
- Event‑driven scalping – Around earnings releases, the bot pulls the latest headline, asks FinGPT for a directional bias, and places a market‑order with a tight stop.
- Portfolio rebalancing assistant – A weekly job queries current holdings, asks FinGPT for target weights given macro factors, and generates limit orders to drift toward the target.
- Research prototype – Academics use the stack to quickly test whether adding a sentiment‑aware LLM improves Sharpe ratios over a pure technical baseline.
These examples are not hypothetical; public repositories show similar patterns (e.g., the fingpt-trading-bot demo on GitHub).
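For the intraday mean‑reversion case, the z‑score that gets handed to the model can be computed with the standard library alone. This is a minimal sketch: zscore_last is a hypothetical helper, and the population standard deviation over the window is an assumption, since the article does not say which estimator it uses.

```python
from statistics import mean, pstdev
from typing import List


def zscore_last(closes: List[float], window: int = 20) -> float:
    """Z-score of the latest close against the mean of the last `window` closes."""
    if len(closes) < window:
        raise ValueError(f"need at least {window} closes, got {len(closes)}")
    recent = closes[-window:]
    mu = mean(recent)
    sigma = pstdev(recent)  # population std dev over the window (an assumption)
    if sigma == 0:
        return 0.0          # flat prices: no deviation to report
    return (recent[-1] - mu) / sigma


# 19 flat bars followed by a 5-point jump.
closes = [100.0] * 19 + [105.0]
z = zscore_last(closes)
print(round(z, 2))  # 4.36
```

A single outlier in an otherwise flat window of n bars always scores sqrt(n - 1), which is why the example lands near 4.36 regardless of the size of the jump.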
Strengths and Limitations
Strengths
- Explainability – Because the LLM outputs natural‑language reasoning (even if you request JSON), you can inspect why a trade was suggested.
- Low barrier to entry – No need to train a custom RL model; you start with a pretrained FinGPT checkpoint.
- Modular tools – Swapping the data source or broker only requires changing the tool functions; the agent loop stays the same.
- Resource‑efficient – With 4‑bit quantization, a 7B FinGPT model fits in ~4 GB GPU memory, enabling cheap cloud instances.
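The ~4 GB figure follows from simple arithmetic: at 4 bits, each weight takes half a byte, so 7B parameters need about 3.5 GB for the weights alone. The helper below is a hypothetical back‑of‑the‑envelope calculator; it ignores quantization constants, the KV cache, and activations, which account for the remainder.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Memory for the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
# 16-bit: 14.0 GB
# 8-bit: 7.0 GB
# 4-bit: 3.5 GB
```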
Limitations
- Latency – A single forward pass of a 7B model takes ~200 ms on an A10G; adding network calls can push decision latency to >500 ms, which may be too slow for high‑frequency tactics.
- Hallucination risk – The model may output invalid JSON or suggest impossible trade sizes; robust parsing and fallback logic are essential.
- Limited long‑term planning – Smolagents’ memory is simple; complex multi‑day strategies need external state management (e.g., a database).
- Regulatory scrutiny – Using an LLM to generate trade advice may fall under financial‑advice rules in some jurisdictions; users must ensure compliance.
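The external state management mentioned under long‑term planning can start as a single SQLite table. The following is a sketch, not part of Smolagents: open_store, remember, and recall are hypothetical helpers, and the schema is an assumption chosen for illustration.

```python
import json
import sqlite3
from typing import List


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the observation store; pass a file path to survive restarts."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observations ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "symbol TEXT NOT NULL, "
        "payload TEXT NOT NULL)"
    )
    return conn


def remember(conn: sqlite3.Connection, symbol: str, observation: dict) -> None:
    """Append one observation as a JSON payload."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO observations (symbol, payload) VALUES (?, ?)",
            (symbol, json.dumps(observation)),
        )


def recall(conn: sqlite3.Connection, symbol: str, limit: int = 10) -> List[dict]:
    """Return up to `limit` most recent observations for a symbol, oldest first."""
    rows = conn.execute(
        "SELECT payload FROM observations WHERE symbol = ? ORDER BY id DESC LIMIT ?",
        (symbol, limit),
    ).fetchall()
    return [json.loads(row[0]) for row in reversed(rows)]


conn = open_store()
remember(conn, "AAPL", {"action": "BUY", "status": "filled"})
remember(conn, "AAPL", {"action": "SELL", "status": "rejected"})
history = recall(conn, "AAPL")
print(history)
```

Storing the payload as JSON text keeps the schema stable while the shape of observations is still changing; a production system would likely promote frequently queried fields to real columns.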
Comparison with Alternatives
| Feature | FinGPT + Smolagents | LangGraph + GPT‑4 | AutoGen + GPT‑4 | CrewAI + Claude 3 |
|---|---|---|---|---|
| Model size needed | 7‑13B (open) | Undisclosed (closed) | Undisclosed (closed) | Undisclosed (closed) |
| Inference cost/hr | ~$0.30 (GPU) | $2‑$5 (API) | $2‑$5 (API) | $1‑$3 (API) |
| Tool integration | Simple Python func | Graph nodes | Conversational | Role‑based agents |
| Memory handling | List/JSON optional | Built‑in state | Shared chat | Shared memory |
| Latency (per step) | 200‑500 ms | 500‑1500 ms (API) | 500‑1500 ms (API) | 400‑1200 ms (API) |
| License | MIT/Apache | Commercial API | Commercial API | Commercial API |
| Ease of debugging | High (local logs) | Medium | Medium | Medium |
The table shows that the FinGPT‑Smolagents stack trades some raw language power for drastically lower cost, full control over the model, and easier auditing—important for quantitative teams that need to justify each trade.
Getting Started Guide
Below is a minimal prototype that trades a single symbol using Alpaca’s paper‑trading API. Set APCA_API_KEY_ID, APCA_API_SECRET_KEY, and APCA_API_BASE_URL for your account.
1. Install dependencies
pip install torch transformers accelerate bitsandbytes smolagents alpaca-trade-api yfinance
2. Load FinGPT (7B) with 4‑bit quantization
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "FinGPT/fingpt-mt"  # replace with actual HF repo if different

# 4-bit weights with float16 compute; double quantization also compresses the quantization constants
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model.eval()
3. Define tools for Smolagents
from datetime import datetime
from typing import Optional

import alpaca_trade_api as tradeapi
import yfinance as yf

# Alpaca paper trading credentials
APCA_API_KEY_ID = "YOUR_KEY"
APCA_API_SECRET_KEY = "YOUR_SECRET"
APCA_API_BASE_URL = "https://paper-api.alpaca.markets"

alpaca = tradeapi.REST(APCA_API_KEY_ID, APCA_API_SECRET_KEY, APCA_API_BASE_URL, api_version="v2")


def get_ohlcv(symbol: str, interval: str = "5Min", limit: int = 20) -> list:
    """Return a list of dicts with OHLCV data, one per bar."""
    bars = alpaca.get_bars(symbol, interval, limit=limit)
    return [{
        "time": str(bar.t),
        "open": bar.o,
        "high": bar.h,
        "low": bar.l,
        "close": bar.c,
        "volume": bar.v,
    } for bar in bars]


def get_recent_news(symbol: str, limit: int = 5) -> list:
    """Fetch latest headlines via yfinance (free, for demo)."""
    news = yf.Ticker(symbol).news[:limit]
    items = []
    for n in news:
        # Key names have varied between yfinance releases, so missing fields become None
        ts = n.get("providerPublishTime")
        items.append({
            "headline": n.get("title"),
            "publisher": n.get("publisher"),
            "time": datetime.fromtimestamp(ts).isoformat() if ts else None,
        })
    return items


def place_order(symbol: str, qty: float, side: str, stop_price: Optional[float] = None) -> dict:
    """Submit a market order; an optional stop-loss is attached as an OTO order."""
    extra = {}
    if stop_price is not None:
        # OCO is meant for exits with both take-profit and stop-loss legs;
        # a single stop-loss attached to an entry order uses the OTO class.
        extra = {"order_class": "oto", "stop_loss": {"stop_price": str(stop_price)}}
    order = alpaca.submit_order(
        symbol=symbol,
        qty=qty,
        side=side,
        type="market",
        time_in_force="day",
        **extra,
    )
    return {
        "id": order.id,
        "status": order.status,
        "filled_qty": float(order.filled_qty) if order.filled_qty else 0,
    }
4. Build the Smolagents agent
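from smolagents import Tool, ToolCallingAgent, TransformersModel


class OhlcvTool(Tool):
    name = "get_ohlcv"
    description = "Fetch recent OHLCV candles for a symbol."
    # Every input needs a description; optional inputs are marked nullable
    inputs = {
        "symbol": {"type": "string", "description": "Ticker symbol, e.g. AAPL."},
        "interval": {"type": "string", "description": "Bar size, e.g. 5Min.", "nullable": True},
        "limit": {"type": "integer", "description": "Number of bars to return.", "nullable": True},
    }
    output_type = "any"

    def forward(self, symbol, interval=None, limit=None):
        return get_ohlcv(symbol, interval or "5Min", limit or 20)


class NewsTool(Tool):
    name = "get_recent_news"
    description = "Fetch latest news headlines for a symbol."
    inputs = {
        "symbol": {"type": "string", "description": "Ticker symbol, e.g. AAPL."},
        "limit": {"type": "integer", "description": "Number of headlines to return.", "nullable": True},
    }
    output_type = "any"

    def forward(self, symbol, limit=None):
        return get_recent_news(symbol, limit or 5)


class OrderTool(Tool):
    name = "place_order"
    description = "Place a market order via Alpaca."
    inputs = {
        "symbol": {"type": "string", "description": "Ticker symbol, e.g. AAPL."},
        "qty": {"type": "number", "description": "Number of shares."},
        "side": {"type": "string", "description": "Either 'buy' or 'sell'."},
        "stop_price": {"type": "number", "description": "Optional stop-loss price.", "nullable": True},
    }
    output_type = "any"

    def forward(self, symbol, qty, side, stop_price=None):
        return place_order(symbol, qty, side, stop_price)


# smolagents expects its own model wrapper rather than a raw transformers model.
# TransformersModel loads the checkpoint by ID; check the documentation of your
# installed version for how to pass the quantization settings from step 2.
llm = TransformersModel(model_id=model_id, device_map="auto")

# Assemble agent
agent = ToolCallingAgent(
    tools=[OhlcvTool(), NewsTool(), OrderTool()],
    model=llm,
    max_steps=5,  # think-act-observe loops per tick
)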
5. Define the trading tick
import asyncio
import json


async def trading_tick(symbol="AAPL"):
    # Gather observations. The tool functions block on network I/O,
    # so run them in worker threads (asyncio.to_thread needs Python 3.9+).
    ohlcv = await asyncio.to_thread(get_ohlcv, symbol)
    news = await asyncio.to_thread(get_recent_news, symbol)

    # Build prompt
    prompt = f"""
    You are a quantitative analyst. Given the following data for {symbol}:
    - OHLCV (last 20 bars): {json.dumps(ohlcv)}
    - Recent news headlines: {json.dumps(news)}
    Suggest a trade action (BUY, SELL, HOLD) with size and stop‑loss.
    Respond in JSON: {{"action": "BUY|SELL|HOLD", "qty": float, "stop": float}}
    """

    # Let the agent reason; agent.run is synchronous, so keep it off the event loop
    result = await asyncio.to_thread(agent.run, prompt)

    # `result` is the final assistant message; extract and validate the JSON
    try:
        action_json = json.loads(result)
        action = str(action_json.get("action", "HOLD")).upper()
        qty = float(action_json.get("qty") or 0)
        stop = action_json.get("stop")
    except (ValueError, TypeError, AttributeError):
        print("Failed to parse LLM output:", result)
        return

    if action in ("BUY", "SELL") and qty > 0:
        print(f"Placing {action} {qty} {symbol} with stop {stop}")
        order = await asyncio.to_thread(place_order, symbol, qty, action.lower(), stop)
        print("Order status:", order["status"])
    else:
        print("No trade suggested.")


# Run loop every 5 minutes
async def main():
    while True:
        await trading_tick()
        await asyncio.sleep(300)  # 5 minutes


if __name__ == "__main__":
    asyncio.run(main())
6. Run the bot
python trading_bot.py
You should see log lines indicating fetched data, the LLM’s raw output, and any orders submitted to Alpaca’s paper environment. Monitor the Alpaca dashboard to verify fills.
Next steps for a production‑grade system:
- Persist agent memory to a SQLite database for crash recovery.
- Add a risk‑management layer that checks max position size and daily loss limits before calling place_order.
- Replace the yfinance news tool with a paid feed (e.g., Benzinga) for higher quality and lower latency.
- Experiment with FinGPT‑v2‑13B or a quantized Mixture‑of‑Experts variant if you need deeper reasoning.
- Deploy the script as a Kubernetes cronjob or a long‑running service with health checks.
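As a starting point for the risk‑management layer, a pre‑trade check can be a pure function that runs before any order is sent. Everything here is illustrative: RiskLimits, approve_order, and the default limits are hypothetical, and a real system would read positions and P&L from the broker.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RiskLimits:
    max_position_qty: float = 100.0  # absolute share count allowed per symbol
    max_daily_loss: float = 500.0    # stop trading after losing this much in a day


def approve_order(side: str, qty: float, current_position: float,
                  realized_pnl_today: float, limits: RiskLimits) -> Tuple[bool, str]:
    """Return (approved, reason) for a proposed order."""
    if side not in ("buy", "sell"):
        return False, f"unknown side: {side!r}"
    if qty <= 0:
        return False, "quantity must be positive"
    if realized_pnl_today <= -limits.max_daily_loss:
        return False, "daily loss limit reached"
    new_position = current_position + qty if side == "buy" else current_position - qty
    if abs(new_position) > limits.max_position_qty:
        return False, "position limit exceeded"
    return True, "ok"


limits = RiskLimits()
print(approve_order("buy", 50, 0, 0.0, limits))     # (True, 'ok')
print(approve_order("buy", 60, 50, 0.0, limits))    # (False, 'position limit exceeded')
print(approve_order("buy", 10, 0, -600.0, limits))  # (False, 'daily loss limit reached')
```

Keeping the check free of I/O makes it easy to unit‑test and ensures the model's output never reaches the broker without passing a deterministic gate.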
Final Thoughts
FinGPT gives you a finance‑tuned language model you can run on modest hardware, while Smolagents supplies the minimal scaffolding to turn that model into an acting agent. Together they enable rapid prototyping of quant strategies that benefit from natural‑language reasoning without the steep cost or vendor lock‑in of larger, closed‑source LLMs. The main trade‑offs are latency and the need for careful output validation, but for many intraday, event‑driven, or research‑oriented workflows the stack provides a practical, transparent starting point.