Building a Quant Trading Bot with FinGPT and Smolagents

Overview

FinGPT is an open‑source large language model family fine‑tuned on financial text, news, and market data. Smolagents is a lightweight agent framework from Hugging Face that lets you equip an LLM with tools, memory, and simple planning loops. Combining the two lets you create a trading bot that can read market data, generate trade ideas, and execute orders through a broker API—all driven by a single language model.

This article walks through what the stack does, its core features, how the pieces fit together, realistic use‑cases, honest strengths and weaknesses, a quick comparison with alternative agent tools, and a step‑by‑step guide to get a prototype running.

Key Features and Capabilities

FinGPT

Domain‑specific pretraining: Base models (e.g., FinGPT‑v2) are further trained on datasets such as Bloomberg news, SEC filings, and Twitter finance chatter.
Instruction tuning: The model follows prompts like "Given the latest price action for AAPL, suggest a short‑term entry point" and returns structured JSON.
Open weights: Available on the Hugging Face Hub under permissive licenses (e.g., FinGPT/fingpt-mt).
Size options: Ranges from 7B to 13B parameters, allowing deployment on a single GPU or via quantization (4‑bit) for CPU inference.

Smolagents

Tool interface: Define Python functions (e.g., get_ohlcv(symbol, interval)) that the agent can call.
Memory buffer: The step history of a run is kept in an in‑memory list; persisting it long‑term (e.g., to a JSON file or SQLite) is left to your own code.
Planning loop: The agent iterates: think → act → observe → think again, up to a configurable max‑steps limit.
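Minimal dependencies: The core agent logic is small (the project describes it as roughly a thousand lines of code); running a local model additionally needs transformers, accelerate, and torch.
Async usage: The agent's run loop is synchronous, so inside an asyncio application you typically push agent.run and blocking tool calls onto a worker thread (e.g., with asyncio.to_thread) to keep REST queries from stalling other coroutines.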
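The think → act → observe cycle can be pictured with a few lines of plain Python. This is only a sketch of the control flow and does not use the Smolagents API: run_loop, stub_policy, and the tool registry are hypothetical names chosen for illustration, and the stub policy stands in for the LLM.

```python
from typing import Callable

def run_loop(policy: Callable[[list], dict], tools: dict, max_steps: int = 5) -> list:
    """Run think -> act -> observe until the policy finishes or max_steps is hit."""
    memory = []  # short-term history of (action, observation) pairs
    for _ in range(max_steps):
        action = policy(memory)  # think: choose the next action from history
        if action["tool"] == "final_answer":
            memory.append((action, action["args"]))
            break
        observation = tools[action["tool"]](**action["args"])  # act
        memory.append((action, observation))                   # observe
    return memory

# Hypothetical stub standing in for the LLM: fetch a price once, then stop.
def stub_policy(memory: list) -> dict:
    if not memory:
        return {"tool": "get_price", "args": {"symbol": "AAPL"}}
    return {"tool": "final_answer", "args": {"action": "HOLD"}}

history = run_loop(stub_policy, {"get_price": lambda symbol: 100.0})
```

The max_steps bound is what keeps a confused model from looping forever; once it is reached, the loop simply returns what it has.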

Combined Bot Capabilities

Market perception – Pull real‑time or historical price data via a broker or data vendor (e.g., Alpaca, Polygon).
Reasoning – FinGPT interprets the data, news sentiment, or technical indicators and outputs a trade signal.
Action – Smolagents executes the signal by calling a broker’s order‑placement tool.
Reflection – After order fill, the bot logs the outcome and updates its memory for future iterations.

Architecture and How It Works

The bot follows a classic agent loop, illustrated below:

+----------------+      +----------------+      +----------------+
|   FinGPT LLM   | ---> |   Smolagents   | ---> |   Tool Layer   |
| (reasoning)    |      | (orchestration)|      | (data, broker) |
+----------------+      +----------------+      +----------------+
        ^                         |                         |
        |                         v                         v
        |                +----------------+      +----------------+
        |                |   Memory       |      |   Event Loop   |
        |                | (short/long)   |      | (async)        |
        +----------------+----------------+      +----------------+

Input gathering – A scheduler (e.g., cron or asyncio.sleep) triggers a data‑fetch tool that returns OHLCV candles and latest headlines.

Prompt construction – The bot builds a prompt template:

You are a quantitative analyst. Given the following data for {symbol}:
- OHLCV (last 20 bars): {ohlcv}
- Recent news headlines: {news}
Suggest a trade action (BUY, SELL, HOLD) with size and stop‑loss.
Respond in JSON: {{"action": "BUY|SELL|HOLD", "qty": float, "stop": float}}

LLM inference – FinGPT generates a JSON response. The output is parsed with json.loads; malformed outputs trigger a retry or fallback to a rule‑based heuristic.
Tool execution – Smolagents passes the parsed action to a broker tool (e.g., alpaca.create_order) which sends the order to the market.
Observation – The broker tool returns order status (filled, partially filled, rejected). This observation is appended to memory.
Loop control – If max steps not reached and a new signal is needed, the cycle repeats; otherwise the bot waits for the next tick.
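The parsing step is where most failures show up in practice, so it helps to see one way of handling it. The sketch below assumes the JSON schema from the prompt template above; parse_signal, the max_qty cap, and the choice of HOLD as the fallback are illustrative assumptions, not part of FinGPT or Smolagents.

```python
import json
import re

HOLD = {"action": "HOLD", "qty": 0.0, "stop": None}

def parse_signal(raw: str, max_qty: float = 100.0) -> dict:
    """Extract and validate a trade signal; fall back to HOLD on any problem."""
    # Models often wrap JSON in extra prose, so grab the span from the
    # first '{' to the last '}' before parsing.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        return dict(HOLD)
    try:
        data = json.loads(match.group(0))
        action = str(data.get("action", "HOLD")).upper()
        qty = float(data.get("qty", 0))
        stop = data.get("stop")
        stop = float(stop) if stop is not None else None
    except (json.JSONDecodeError, TypeError, ValueError):
        return dict(HOLD)
    if action not in ("BUY", "SELL"):
        return dict(HOLD)  # explicit HOLD or an unknown action
    if not (0 < qty <= max_qty):
        return dict(HOLD)  # reject zero, negative, NaN, or oversized quantities
    return {"action": action, "qty": qty, "stop": stop}

signal = parse_signal('Sure! {"action": "buy", "qty": 10, "stop": 187.5}')
```

Falling back to HOLD rather than retrying is the conservative choice; a retry with a stricter prompt could be layered on top of the same function.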

All components run in a single Python process; scaling to multiple symbols is achieved by spawning independent agent instances or using an async pool.
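One way to picture the async pool is a semaphore‑bounded asyncio.gather over per‑symbol ticks. In this sketch tick is a hypothetical placeholder for a full fetch, reason, act cycle, and max_concurrent is an assumed knob; note that with a single local GPU the inference calls would still be serialized by the model.

```python
import asyncio

async def tick(symbol: str) -> str:
    """Hypothetical placeholder for one symbol's fetch -> reason -> act cycle."""
    await asyncio.sleep(0)  # stands in for non-blocking I/O
    return f"{symbol}: HOLD"

async def run_all(symbols: list[str], max_concurrent: int = 3) -> list[str]:
    """Run one tick per symbol, with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(symbol: str) -> str:
        async with semaphore:
            return await tick(symbol)

    # gather returns results in the same order as the input symbols
    return await asyncio.gather(*(guarded(s) for s in symbols))

results = asyncio.run(run_all(["AAPL", "MSFT", "NVDA", "TSLA"]))
```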

Real‑World Use Cases

Intraday mean‑reversion – The bot reads 5‑minute bars, computes a z‑score of price vs. 20‑period mean, and asks FinGPT whether the deviation is likely to reverse based on recent news sentiment.
Event‑driven scalping – Around earnings releases, the bot pulls the latest headline, asks FinGPT for a directional bias, and places a market‑order with a tight stop.
Portfolio rebalancing assistant – A weekly job queries current holdings, asks FinGPT for target weights given macro factors, and generates limit orders to drift toward the target.
Research prototype – Academics use the stack to quickly test whether adding a sentiment‑aware LLM improves Sharpe ratios over a pure technical baseline.

These examples are not hypothetical; public repositories show similar patterns (e.g., the fingpt-trading-bot demo on GitHub).
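For the mean‑reversion case, the z‑score itself is plain arithmetic that can be computed before the LLM is consulted. The sketch below is one reasonable reading of "price vs. 20‑period mean": it assumes the window includes the latest bar and uses the population standard deviation; zscore is a hypothetical helper name.

```python
from statistics import mean, pstdev

def zscore(closes: list[float], window: int = 20) -> float:
    """z-score of the latest close against the last `window` closes."""
    recent = closes[-window:]
    if len(recent) < window:
        raise ValueError(f"need at least {window} closes, got {len(recent)}")
    sigma = pstdev(recent)
    if sigma == 0:
        return 0.0  # flat prices: no deviation to measure
    return (recent[-1] - mean(recent)) / sigma

# 19 flat bars followed by a jump gives a large positive z-score.
z = zscore([100.0] * 19 + [110.0])
```

A bot following this pattern would only query the model when abs(z) exceeds some threshold, which also cuts down on inference calls.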

Strengths and Limitations

Strengths

Explainability – Because the LLM outputs natural‑language reasoning (even if you request JSON), you can inspect why a trade was suggested.
Low barrier to entry – No need to train a custom RL model; you start with a pretrained FinGPT checkpoint.
Modular tools – Swapping the data source or broker only requires changing the tool functions; the agent loop stays the same.
Resource‑efficient – With 4‑bit quantization, a 7B FinGPT model fits in ~4 GB GPU memory, enabling cheap cloud instances.
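The memory figure follows from simple arithmetic on the weights alone. This sketch counts only parameter storage (weight_memory_gb is a hypothetical helper); the gap between the result and the ~4 GB quoted above would be taken up by quantization metadata, activations, and the KV cache, whose size depends on the setup.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory needed to store the weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

fp16_gb = weight_memory_gb(7e9, 16)  # 14.0 GB at half precision
int4_gb = weight_memory_gb(7e9, 4)   # 3.5 GB at 4-bit
```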

Limitations

Latency – A single forward pass of a 7B model takes ~200 ms on an A10G; adding network calls can push decision latency to >500 ms, which may be too slow for high‑frequency tactics.
Hallucination risk – The model may output invalid JSON or suggest impossible trade sizes; robust parsing and fallback logic are essential.
Limited long‑term planning – Smolagents’ memory is simple; complex multi‑day strategies need external state management (e.g., a database).
Regulatory scrutiny – Using an LLM to generate trade advice may fall under financial‑advice rules in some jurisdictions; users must ensure compliance.
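External state management does not have to be elaborate. As a rough sketch, the standard‑library sqlite3 module is enough to keep observations across restarts; the table layout and the open_store, remember, and recall helpers are assumptions made for illustration.

```python
import json
import sqlite3

def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the observation store; pass a file path to persist."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observations ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "ts TEXT DEFAULT CURRENT_TIMESTAMP, "
        "symbol TEXT NOT NULL, "
        "payload TEXT NOT NULL)"
    )
    return conn

def remember(conn: sqlite3.Connection, symbol: str, payload: dict) -> None:
    """Append one observation (e.g., an order result) as JSON."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO observations (symbol, payload) VALUES (?, ?)",
            (symbol, json.dumps(payload)),
        )

def recall(conn: sqlite3.Connection, symbol: str, limit: int = 10) -> list[dict]:
    """Return the most recent observations for a symbol, oldest first."""
    rows = conn.execute(
        "SELECT payload FROM observations WHERE symbol = ? ORDER BY id DESC LIMIT ?",
        (symbol, limit),
    ).fetchall()
    return [json.loads(row[0]) for row in reversed(rows)]

conn = open_store()
for status in ("accepted", "partially_filled", "filled"):
    remember(conn, "AAPL", {"status": status})
recent = recall(conn, "AAPL", limit=2)
```

The recalled rows can be summarized into the next prompt, which gives the agent multi‑day context without relying on its in‑process memory.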

Comparison with Alternatives

Feature	FinGPT + Smolagents	LangGraph + GPT‑4	AutoGen + GPT‑4	CrewAI + Claude 3
Model size needed	7‑13B (open)	Undisclosed (closed)	Undisclosed (closed)	Undisclosed (closed)
Inference cost/hr	~$0.30 (GPU)	$2‑$5 (API)	$2‑$5 (API)	$1‑$3 (API)
Tool integration	Simple Python func	Graph nodes	Conversational	Role‑based agents
Memory handling	List/JSON optional	Built‑in state	Shared chat	Shared memory
Latency (per step)	200‑500 ms	500‑1500 ms (API)	500‑1500 ms (API)	400‑1200 ms (API)
License	MIT/Apache	Commercial API	Commercial API	Commercial API
Ease of debugging	High (local logs)	Medium	Medium	Medium

The table shows that the FinGPT‑Smolagents stack trades some raw language power for drastically lower cost, full control over the model, and easier auditing—important for quantitative teams that need to justify each trade.

Getting Started Guide

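Below is a minimal prototype that trades a single symbol using Alpaca’s paper‑trading API. Save the snippets from steps 2 to 5 in one file, trading_bot.py, and set APCA_API_KEY_ID and APCA_API_SECRET_KEY for your account; APCA_API_BASE_URL already points at the paper‑trading endpoint.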

1. Install dependencies

pip install torch transformers accelerate bitsandbytes smolagents alpaca-trade-api yfinance

2. Load FinGPT (7B) with 4‑bit quantization

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "FinGPT/fingpt-mt"  # replace with actual HF repo if different
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model.eval()

3. Define tools for Smolagents

import alpaca_trade_api as tradeapi
import yfinance as yf
from datetime import datetime

# Alpaca paper trading credentials
APCA_API_KEY_ID = "YOUR_KEY"
APCA_API_SECRET_KEY = "YOUR_SECRET"
APCA_API_BASE_URL = "https://paper-api.alpaca.markets"

alpaca = tradeapi.REST(APCA_API_KEY_ID, APCA_API_SECRET_KEY, APCA_API_BASE_URL, api_version='v2')

def get_ohlcv(symbol: str, interval: str = "5Min", limit: int = 20) -> list:
    """Return a list of dicts with OHLCV data."""
    # Without start/end the API picks its own default window; pass start=
    # explicitly if you need the most recent bars.
    bars = alpaca.get_bars(symbol, interval, limit=limit)
    # Each bar exposes t (timestamp), o, h, l, c, v attributes.
    return [{
        "time": str(bar.t),
        "open": bar.o,
        "high": bar.h,
        "low": bar.l,
        "close": bar.c,
        "volume": bar.v
    } for bar in bars]

def get_recent_news(symbol: str, limit: int = 5) -> list:
    """Fetch latest headlines via yfinance (free, for demo)."""
    ticker = yf.Ticker(symbol)
    items = []
    for n in ticker.news[:limit]:
        # yfinance has changed its news schema between releases, so missing
        # keys are tolerated here rather than raising KeyError.
        ts = n.get("providerPublishTime")
        items.append({
            "headline": n.get("title"),
            "publisher": n.get("publisher"),
            "time": datetime.fromtimestamp(ts).isoformat() if ts else None,
        })
    return items

def place_order(symbol: str, qty: float, side: str, stop_price: float = None) -> dict:
    """Submit a market order; an optional stop-loss is attached as an OTO leg."""
    # "oto" (one-triggers-other) attaches a single exit leg to the entry order;
    # "oco" is meant for exit orders on a position you already hold.
    order = alpaca.submit_order(
        symbol=symbol,
        qty=qty,
        side=side,
        type="market",
        time_in_force="day",
        order_class="oto" if stop_price else None,
        stop_loss={"stop_price": str(stop_price)} if stop_price else None,
    )
    return {
        "id": order.id,
        "status": order.status,
        "filled_qty": float(order.filled_qty) if order.filled_qty else 0,
    }

4. Build the Smolagents agent

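from smolagents import Tool, ToolCallingAgent, TransformersModel

class OhlcvTool(Tool):
    name = "get_ohlcv"
    description = "Fetch recent OHLCV candles for a symbol."
    # Every input needs a description; arguments with defaults are marked nullable.
    inputs = {
        "symbol": {"type": "string", "description": "Ticker symbol, e.g. AAPL."},
        "interval": {"type": "string", "description": "Bar size, e.g. 5Min.", "nullable": True},
        "limit": {"type": "integer", "description": "Number of bars to return.", "nullable": True},
    }
    output_type = "any"

    def forward(self, symbol, interval="5Min", limit=20):
        return get_ohlcv(symbol, interval or "5Min", limit or 20)

class NewsTool(Tool):
    name = "get_recent_news"
    description = "Fetch latest news headlines for a symbol."
    inputs = {
        "symbol": {"type": "string", "description": "Ticker symbol, e.g. AAPL."},
        "limit": {"type": "integer", "description": "Number of headlines to return.", "nullable": True},
    }
    output_type = "any"

    def forward(self, symbol, limit=5):
        return get_recent_news(symbol, limit or 5)

class OrderTool(Tool):
    name = "place_order"
    description = "Place a market order via Alpaca."
    inputs = {
        "symbol": {"type": "string", "description": "Ticker symbol, e.g. AAPL."},
        "qty": {"type": "number", "description": "Number of shares."},
        "side": {"type": "string", "description": "Either 'buy' or 'sell'."},
        "stop_price": {"type": "number", "description": "Optional stop-loss price.", "nullable": True},
    }
    output_type = "any"

    def forward(self, symbol, qty, side, stop_price=None):
        return place_order(symbol, qty, side, stop_price)

# Smolagents wraps the LLM in its own model class. TransformersModel loads the
# checkpoint itself from model_id rather than taking the `model`/`tokenizer`
# objects from step 2; check your installed version for how to forward a
# quantization config if you want the 4-bit setup here as well.
llm = TransformersModel(model_id=model_id, max_new_tokens=512)

# Assemble agent
agent = ToolCallingAgent(
    tools=[OhlcvTool(), NewsTool(), OrderTool()],
    model=llm,
    max_steps=5,          # think‑act‑observe loops per tick
)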

5. Define the trading tick

import json
import asyncio

async def trading_tick(symbol="AAPL"):
    # Gather observations. The helpers from step 3 are blocking, so run them
    # in worker threads instead of awaiting them directly.
    ohlcv = await asyncio.to_thread(get_ohlcv, symbol)
    news = await asyncio.to_thread(get_recent_news, symbol)

    # Build prompt
    prompt = f"""
You are a quantitative analyst. Given the following data for {symbol}:
- OHLCV (last 20 bars): {json.dumps(ohlcv)}
- Recent news headlines: {json.dumps(news)}
Suggest a trade action (BUY, SELL, HOLD) with size and stop‑loss.
Do not place any order yourself.
Respond in JSON: {{"action": "BUY|SELL|HOLD", "qty": float, "stop": float}}
"""

    # Let the agent reason. agent.run is synchronous, so it also runs in a thread.
    result = await asyncio.to_thread(agent.run, prompt)
    # `result` is the agent's final answer; extract JSON
    try:
        action_json = result if isinstance(result, dict) else json.loads(str(result))
        action = str(action_json.get("action", "HOLD")).upper()
        qty = float(action_json.get("qty", 0))
        stop = action_json.get("stop")
    except (json.JSONDecodeError, AttributeError, TypeError, ValueError):
        print("Failed to parse LLM output:", result)
        return

    if action in ("BUY", "SELL") and qty > 0:
        print(f"Placing {action} {qty} {symbol} with stop {stop}")
        order = await asyncio.to_thread(place_order, symbol, qty, action.lower(), stop)
        print("Order status:", order)
    else:
        print("No trade suggested.")

# Run loop every 5 minutes
async def main():
    while True:
        await trading_tick()
        await asyncio.sleep(300)  # 5 minutes

if __name__ == "__main__":
    asyncio.run(main())

6. Run the bot

python trading_bot.py

On each tick you should see a log line for any order the bot places, "No trade suggested." when it holds, or the raw LLM output when parsing fails. Monitor the Alpaca dashboard to verify fills in the paper environment.

Next steps for a production‑grade system:

Persist agent memory to a SQLite database for crash recovery.
Add a risk‑management layer that checks max position size and daily loss limits before calling place_order.
Replace the yfinance news tool with a paid feed (e.g., Benzinga) for higher quality and lower latency.
Experiment with FinGPT‑v2‑13B or a quantized Mixture‑of‑Experts variant if you need deeper reasoning.
Deploy the script as a Kubernetes cronjob or a long‑running service with health checks.
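Of these steps, the risk‑management layer is the one worth sketching first, since it sits between the model and real orders. The example below is a minimal pre‑trade check under assumed limits; RiskLimits, approve_order, and the default thresholds are hypothetical and would need to be adapted to your account.

```python
from dataclasses import dataclass

@dataclass
class RiskLimits:
    max_position_qty: float = 100.0  # largest absolute position, in shares
    max_daily_loss: float = 500.0    # stop trading after losing this much today

def approve_order(side: str, qty: float, current_position: float,
                  realized_pnl_today: float, limits: RiskLimits) -> bool:
    """Return True only if the order keeps the account inside its limits."""
    if qty <= 0 or side not in ("buy", "sell"):
        return False
    if realized_pnl_today <= -limits.max_daily_loss:
        return False  # daily loss limit already hit
    signed_qty = qty if side == "buy" else -qty
    return abs(current_position + signed_qty) <= limits.max_position_qty

limits = RiskLimits()
ok = approve_order("buy", 10, current_position=50, realized_pnl_today=-120.0, limits=limits)
blocked = approve_order("buy", 60, current_position=50, realized_pnl_today=-120.0, limits=limits)
```

Calling approve_order immediately before place_order means a hallucinated quantity is rejected by deterministic code rather than by the model's own judgment.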

Final Thoughts

FinGPT gives you a finance‑tuned language model you can run on modest hardware, while Smolagents supplies the minimal scaffolding to turn that model into an acting agent. Together they enable rapid prototyping of quant strategies that benefit from natural‑language reasoning without the steep cost or vendor lock‑in of larger, closed‑source LLMs. The main trade‑offs are latency and the need for careful output validation, but for many intraday, event‑driven, or research‑oriented workflows the stack provides a practical, transparent starting point.
