loadchange/ai-hedge-fund: A GitHub Repository Worth Watching

Diego Herrera

May 7, 2026 · 11 min read

AI Hedge Fund: Where Warren Buffett Meets LangGraph — A Multi-Agent Trading System That Thinks Like 14 Legendary Investors

What if you could clone Warren Buffett's brain, Charlie Munger's temperament, Cathie Wood's growth obsession, and Nassim Taleb's skepticism — then run them all in parallel on the same stock ticker?

That's not a thought experiment. That's loadchange/ai-hedge-fund, an open-source, multi-agent trading system that orchestrates fourteen LLM persona investors alongside a quantitative stack to produce BUY / SELL / HOLD / SHORT decisions across US, Hong Kong, and China A-share markets, with bilingual output in English and Simplified Chinese.

And the best part? Every single data source is completely free.


🏗️ Architecture: A System That Thinks in Layers

This isn't a toy wrapper around ChatGPT. The architecture reads like a miniature institutional trading desk:

   CLI · Issue bot
        │
   DataSourceManager  (US: yfinance→akshare · HK: tencent/yfinance/akshare
        │             · CN: baostock/akshare/tencent)
        ├──────────────────────────────┐
        ▼                              ▼
   LLM persona agents            Quant signals (BaseSignal)
   (LangGraph; Buffett /         trend · mean_reversion · momentum
   Munger / Wood / …)            volatility · stat_arb · value
        │                        quality · earnings_surprise
        └────────────┬───────────┘
                     ▼
          Risk manager → Portfolio manager
          (vol / corr / drawdown caps; optional
          cvxpy MVO · risk parity · Black-Litt.)
                     ▼
          BUY / SELL / HOLD / SHORT  (10 bps default cost)
                     ▼
          Backtester · Validation (CPCV + PBO + Deflated Sharpe)

Every layer is modular. The LLM agents and the quantitative signals are completely independent — you can use either one, or both together. The risk manager and portfolio optimizer sit downstream, enforcing discipline regardless of where the alpha originated.


🧠 The Agent Roster: 14 Legendary Investors in Your Terminal

Here's where the project gets genuinely fascinating. Each "persona" isn't just a prompt tweak; it's a fully realized agent that reasons about a ticker the way its real-world counterpart would.

LLM-Powered Investor Personas

| Agent | Philosophy | Style |
|---|---|---|
| 🟢 Warren Buffett | Value, moats, long-term compounding | Conservative, quality-focused |
| 🟢 Charlie Munger | Mental models, rationality, simplicity | Contrarian, margin-of-safety |
| 🟢 Cathie Wood | Disruptive innovation, exponential growth | Aggressive, tech-forward |
| 🟢 Duan Yongping | Concentrated bets, consumer tech | China-savvy, pragmatic |
| 🟢 Stanley Druckenmiller | Macro-driven, asymmetric risk | Aggressive macro trader |
| 🟢 Michael Burry | Deep value, contrarian, data-heavy | Short-seller, forensic analysis |
| 🟢 Aswath Damodaran | DCF valuation, narrative + numbers | Academic, quantitative value |
| 🟢 Ben Graham | Net-net, margin of safety | Ultra-conservative, cigar-butt |
| 🟢 Bill Ackman | Activist, concentrated positions | High-conviction, catalyst-driven |
| 🟢 Nassim Taleb | Antifragile, tail-risk awareness | Skeptical, barbell strategy |
| 🟢 Peter Lynch | Growth at a reasonable price | Bottom-up, storytelling |
| 🟢 Phil Fisher | Scuttlebutt, qualitative growth | Long-term, quality growth |
| 🟢 Rakesh Jhunjhunwala | Indian market legend, momentum | Bold, emerging market focus |
| 🟢 Mohnish Pabrai | Cloning Buffett, asymmetric bets | Concentrated, low-fee thinking |

Technical Analysts (No LLM Required)

Alongside the personas, the system includes six generic analysts that delegate to the pure quantitative stack:

  • valuation_analyst — DCF and relative valuation
  • sentiment_analyst — Sentiment scoring
  • news_sentiment_analyst — News-driven sentiment
  • fundamentals_analyst — Financial statement analysis
  • growth_analyst — Growth metrics and trajectories
  • technical_analyst — Delegates to src/signals/, zero LLM calls

💡 Key Insight: The risk_management_agent and portfolio_manager are always on. The risk manager is LLM-free (pure vol/correlation math), while the portfolio manager uses an LLM to synthesize all inputs into a final BUY / SELL / HOLD / SHORT / COVER decision.


📊 Quantitative Modules: Six Standalone Packages

The quantitative backbone isn't an afterthought — it's a first-class citizen. Every module is importable independently, no LangGraph dependency required.

| Module | Purpose |
|---|---|
| src/signals/ | BaseSignal ABC + 8 signals (5 technical / 3 fundamental); SignalResult score in [-1, +1] |
| src/risk/ | Vol / correlation, drawdown, scenario stress (2008, COVID, etc.) |
| src/portfolio/ | MVO, risk parity, Black-Litterman via cvxpy |
| src/validation/ | CPCV + PBO + Deflated Sharpe Ratio |
| src/event_study/ | Market-model α/β, AR/CAR, t-statistics |
| src/features/ | Feature engineering pipeline |
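
The event-study entry is the least self-explanatory of the six, so here is a minimal sketch of the underlying arithmetic: fit a market model on an estimation window, then measure abnormal returns (AR) and their cumulative sum (CAR) over the event window. The function names and the toy numbers are assumptions for illustration, not the API of src/event_study/, and the t-statistics are left out to keep it short.

```python
from statistics import mean


def fit_market_model(stock, market):
    """OLS fit of stock = alpha + beta * market over the estimation window."""
    mx, my = mean(market), mean(stock)
    cov = sum((x - mx) * (y - my) for x, y in zip(market, stock))
    var = sum((x - mx) ** 2 for x in market)
    beta = cov / var
    alpha = my - beta * mx
    return alpha, beta


def abnormal_returns(stock, market, alpha, beta):
    """Actual return minus the return the market model predicts."""
    return [y - (alpha + beta * x) for x, y in zip(market, stock)]


# Estimation window: toy data where the stock tracks the market exactly.
est_market = [0.01, -0.02, 0.005, 0.015, -0.01]
est_stock = [0.001 + 1.5 * m for m in est_market]
alpha, beta = fit_market_model(est_stock, est_market)

# Event window: same relationship plus a 2% jump on the second day.
evt_market = [0.002, -0.004, 0.003]
evt_stock = [0.001 + 1.5 * m for m in evt_market]
evt_stock[1] += 0.02

ar = abnormal_returns(evt_stock, evt_market, alpha, beta)
car = sum(ar)  # cumulative abnormal return over the event window
```

With this toy data the fit recovers the planted alpha and beta, and the CAR isolates the 2% jump because everything the market explains has been subtracted out.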

Signal Library at a Glance

The BaseSignal abstract base class defines a clean interface. Each signal returns a SignalResult with a normalized score between -1 (strong sell) and +1 (strong buy):

Technical Signals:

  • trend — Trend-following indicators
  • mean_reversion — Mean reversion strategies
  • momentum — Price momentum
  • volatility — Volatility-based signals
  • stat_arb — Statistical arbitrage pairs

Fundamental Signals:

  • value — Value factor scoring
  • quality — Quality factor scoring
  • earnings_surprise — Earnings surprise detection
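
To make the interface concrete, here is a minimal sketch of what a BaseSignal-style contract could look like. Only the names BaseSignal and SignalResult and the [-1, +1] range come from the project description; the field names, the compute method, and the toy momentum rule are assumptions for illustration and may not match the code in src/signals/.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SignalResult:
    """Hypothetical result container; the real fields may differ."""
    name: str
    score: float  # -1 = strong sell, +1 = strong buy

    def __post_init__(self) -> None:
        if not -1.0 <= self.score <= 1.0:
            raise ValueError(f"score {self.score} is outside [-1, +1]")


class BaseSignal(ABC):
    """Hypothetical abstract base: every signal maps prices to a SignalResult."""
    name: str = "base"

    @abstractmethod
    def compute(self, closes: Sequence[float]) -> SignalResult:
        ...


class ToyMomentum(BaseSignal):
    """Illustrative rule: lookback return, scaled and clipped to [-1, +1]."""
    name = "toy_momentum"

    def __init__(self, lookback: int = 20, scale: float = 0.10) -> None:
        self.lookback = lookback
        self.scale = scale  # a return of +/- scale maps to a score of +/- 1

    def compute(self, closes: Sequence[float]) -> SignalResult:
        if len(closes) <= self.lookback:
            return SignalResult(self.name, 0.0)  # not enough history: neutral
        ret = closes[-1] / closes[-1 - self.lookback] - 1.0
        score = max(-1.0, min(1.0, ret / self.scale))
        return SignalResult(self.name, score)
```

Clipping inside the signal keeps every score on the same scale, which is what lets a downstream layer combine technical and fundamental signals without knowing how each one was computed.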

🌏 Multi-Market, Multi-Language: A Truly Global System

Most open-source trading tools are stuck in the US equity universe. This project breaks free.

| Market | Data Sources | Notes |
|---|---|---|
| 🇺🇸 US | yfinance → akshare (fallback) | NYSE, NASDAQ, AMEX |
| 🇭🇰 Hong Kong | tencent / yfinance / akshare | HKEX tickers |
| 🇨🇳 China A-share | baostock / akshare / tencent | Shanghai + Shenzhen |

All data sources are completely free — no Bloomberg terminal, no paid API keys, no subscription tiers. The DataSourceManager handles fallback chains automatically, so if one source is down, the system gracefully degrades.
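
A fallback chain like this is simple to picture in code. The sketch below is not the project's DataSourceManager; the function name, the exception class, and the two stand-in fetchers are hypothetical, and real providers would return richer data than a list of prices.

```python
from typing import Callable, Sequence

Fetcher = Callable[[str], list]


class AllSourcesFailed(RuntimeError):
    """Raised when every source in the chain errors out or returns nothing."""


def fetch_with_fallback(ticker: str, sources: Sequence[tuple]) -> tuple:
    """Try each (name, fetcher) pair in order; return the first non-empty result."""
    errors = []
    for name, fetch in sources:
        try:
            data = fetch(ticker)
        except Exception as exc:  # any provider failure moves us down the chain
            errors.append(f"{name}: {exc}")
            continue
        if data:
            return name, data
        errors.append(f"{name}: empty result")
    raise AllSourcesFailed(f"no data for {ticker}: " + "; ".join(errors))


def flaky_primary(ticker: str) -> list:
    """Stand-in for a provider that is currently down."""
    raise ConnectionError("timeout")


def working_backup(ticker: str) -> list:
    """Stand-in for a healthy provider returning closing prices."""
    return [189.5, 190.1, 191.3]


source, prices = fetch_with_fallback(
    "AAPL", [("primary", flaky_primary), ("backup", working_backup)]
)
```

Collecting the per-source errors rather than discarding them means that when the whole chain fails, the message says why each provider was skipped.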

Bilingual output is built in: pass --lang zhCN for Simplified Chinese output, or leave it at the default for English.


🚀 Quick Start: From Clone to First Trade Signal in 60 Seconds

1. Install

git clone https://github.com/loadchange/ai-hedge-fund.git
cd ai-hedge-fund
curl -LsSf https://astral.sh/uv/install.sh | sh   # if you don't have uv
uv sync
cp .env.example .env                              # set ONE LLM key

2. Run a Single-Day Multi-Agent Decision

uv run python src/main.py --tickers AAPL,MSFT --model mimo-v2.5-pro \
  --analysts warren_buffett,duan_yongping --lang zhCN

This runs Buffett and Duan Yongping through the full pipeline: data fetch → persona analysis → risk assessment → portfolio decision. Output arrives in Simplified Chinese.

3. Backtest Across a Date Range

uv run python src/backtester.py --tickers AAPL --model mimo-v2.5-pro \
  --start-date 2025-01-01 --end-date 2025-02-01 \
  --analysts warren_buffett,duan_yongping

The backtester re-runs the entire multi-agent workflow per business day, applying a configurable cost model:

# Custom transaction costs
uv run python src/backtester.py --tickers AAPL --model mimo-v2.5-pro \
  --cost-model spread --cost-bps 15
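
For reference, a cost quoted in basis points converts to currency as notional × bps / 10,000. The sketch below assumes the cost is charged on the traded notional of each fill; how the repository's spread model actually applies --cost-bps is not spelled out here, so treat the function as illustrative.

```python
def transaction_cost(notional: float, cost_bps: float) -> float:
    """Cost in currency units for trading `notional` at `cost_bps` basis points."""
    return abs(notional) * cost_bps / 10_000


default_cost = transaction_cost(10_000, 10)  # the 10 bps default
custom_cost = transaction_cost(10_000, 15)   # the --cost-bps 15 example
```

On a $10,000 trade that is $10 at the default and $15 at the custom setting. Small per trade, but it compounds when a strategy trades every day.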

⚠️ Cost Warning: Backtester cost scales as analysts × tickers × days. For experimentation, prefer 1 ticker × 2–3 analysts × 1–2 weeks. The system warns you if you're about to exceed the 400-LLM-call cap.
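
A quick back-of-the-envelope estimate helps before launching a run. The sketch below assumes one LLM call per analyst per ticker plus one portfolio-manager call per day, and counts Monday to Friday as business days with holidays ignored. The helper names are hypothetical and the repository's own accounting may differ.

```python
from datetime import date, timedelta

LLM_CALL_CAP = 400  # the cap quoted in the warning above


def business_days(start: date, end: date) -> int:
    """Count Monday-Friday dates in [start, end]; exchange holidays are ignored."""
    days = 0
    current = start
    while current <= end:
        if current.weekday() < 5:
            days += 1
        current += timedelta(days=1)
    return days


def estimate_llm_calls(n_analysts: int, n_tickers: int, start: date, end: date) -> int:
    """Assumed model: (analysts x tickers + 1 portfolio-manager call) per day."""
    per_day = n_analysts * n_tickers + 1
    return per_day * business_days(start, end)


# The backtest example above: 2 analysts, 1 ticker, 2025-01-01 to 2025-02-01.
calls = estimate_llm_calls(2, 1, date(2025, 1, 1), date(2025, 2, 1))
fits = calls <= LLM_CALL_CAP
```

Under these assumptions the example run covers 23 weekdays at 3 calls each, so 69 calls in total. The same window with 5 tickers and 6 analysts would be 31 calls a day, or 713 in total, which is well past the cap.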

4. Validate Signals (No LLM Required)

uv run python -m src.validation.cli evaluate \
  --signal momentum,trend --ticker AAPL,MSFT \
  --start 2023-01-01 --end 2025-04-01 --rolling-window 180

This runs Combinatorial Purged Cross-Validation (CPCV), Probability of Backtest Overfitting (PBO), and Deflated Sharpe Ratio analysis without a single LLM call, so it adds nothing to your LLM bill.


🤖 The Issue Bot: Turn GitHub Issues Into Trading Jobs

This is one of the most creative features in the entire repository. The Hedge Fund Issue Bot is a GitHub Actions workflow that transforms issues into runnable analysis jobs.

How It Works

  1. Open an issue from a template
  2. Fill in the body in free-form natural language (the LLM extracts arguments)
  3. Submit → bot acknowledges within seconds
  4. Wait → final reply arrives in 30 seconds to 5 minutes
  5. Issue auto-closes → subscribers get email notifications via GitHub's native flow

Available Modes

| Mode | Label | LLM? | Output |
|---|---|---|---|
| 📈 Ticker analysis | bot-ticker | ✅ Yes | Single-day BUY/SELL/HOLD/SHORT per ticker |
| 📉 Backtester | bot-backtester | ✅ Per day | Multi-day equity curve + Sharpe + costs |
| 🔬 Signal validation | bot-validate | ❌ No | CPCV IS/OOS Sharpe + PBO + DSR |
| 📰 Event study | bot-event-study | ❌ No | Market-model α/β + AR/CAR + t-stat |

🎯 Pro Tip: Don't like the result? Just edit the issue body to retrigger. The bot is designed for rapid iteration.

Failure Handling That Actually Helps

The bot's failure replies are bilingual and actionable:

  • Missing fields → get an example body with the correct format
  • Over the 400-LLM-call cap → receive a full breakdown plus a parameter-combo table that would fit within limits
  • Fundamental signals on bot-validate → redirected to the five technical signals (CPCV is daily-rolling)

⚙️ The LLM Backbone: Powered by Xiaomi MiMo v2.5 Pro

The system is powered by Xiaomi MiMo v2.5 Pro. New users receive $2 of free credit with invite code FU5PSQ.

But the system isn't locked to a single provider. The --ollama flag enables local model execution, and the src/llm/api_models.json configuration file supports custom API endpoints. Bring your own model — the architecture is provider-agnostic.

# Use local Ollama model
uv run python src/main.py --tickers AAPL --model llama3 --analysts warren_buffett --ollama

🔬 Validation: Where Most AI Trading Projects Stop, This One Starts

The validation suite alone makes this repository worth studying. While most "AI trading" projects stop at backtest charts, ai-hedge-fund implements three established checks against backtest overfitting:

Combinatorial Purged Cross-Validation (CPCV)

Splits time-series data into combinatorial folds with purging to prevent information leakage. Reports in-sample and out-of-sample Sharpe ratios.
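
The splitting logic can be sketched in a few lines. This is a simplified illustration, not the code in src/validation/: the function name and parameters are hypothetical, and purging is approximated by dropping a fixed number of training observations on each side of every test group rather than by tracking label overlap.

```python
from itertools import combinations


def cpcv_splits(n_samples: int, n_groups: int, n_test_groups: int, purge: int = 0):
    """Yield (train_idx, test_idx) for every combination of test groups."""
    # Cut the timeline into contiguous groups of near-equal size.
    bounds = [i * n_samples // n_groups for i in range(n_groups + 1)]
    groups = [range(bounds[i], bounds[i + 1]) for i in range(n_groups)]

    for test_ids in combinations(range(n_groups), n_test_groups):
        test_idx = [i for g in test_ids for i in groups[g]]
        blocked = set(test_idx)
        # Purge: also drop training points adjacent to each test group.
        for g in test_ids:
            lo, hi = groups[g].start, groups[g].stop
            blocked.update(range(max(0, lo - purge), lo))
            blocked.update(range(hi, min(n_samples, hi + purge)))
        train_idx = [i for i in range(n_samples) if i not in blocked]
        yield train_idx, test_idx


# 12 observations, 6 groups, 2 test groups per split: C(6, 2) = 15 splits.
splits = list(cpcv_splits(12, 6, 2, purge=1))
```

Because every pair of groups serves as the test set once, one dataset yields many out-of-sample paths instead of the single path a plain walk-forward test gives you.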

Probability of Backtest Overfitting (PBO)

Implements Bailey et al.'s framework to estimate the probability that a strategy's performance is due to overfitting rather than genuine alpha.
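
In practice the estimate comes from combinatorially symmetric cross-validation: pick the best strategy in-sample, check where it ranks out-of-sample, and repeat over every in-sample/out-of-sample partition. The sketch below is a stripped-down version of that idea in pure Python. The function names are hypothetical, the Sharpe ratio is not annualized, and ties at a logit of exactly zero are counted as overfit, which is a judgment call.

```python
from itertools import combinations
from math import log
from statistics import mean, pstdev


def sharpe(returns):
    """Per-period Sharpe ratio; 0.0 when the series has no variation."""
    sd = pstdev(returns)
    return mean(returns) / sd if sd > 0 else 0.0


def pbo(returns_by_strategy, n_blocks=4):
    """Share of partitions where the in-sample winner lands at or below the OOS median."""
    n_obs = len(returns_by_strategy[0])
    n_strats = len(returns_by_strategy)
    bounds = [i * n_obs // n_blocks for i in range(n_blocks + 1)]
    blocks = [range(bounds[i], bounds[i + 1]) for i in range(n_blocks)]

    logits = []
    for is_ids in combinations(range(n_blocks), n_blocks // 2):
        is_rows = [t for b in is_ids for t in blocks[b]]
        oos_rows = [t for b in range(n_blocks) if b not in is_ids for t in blocks[b]]
        is_perf = [sharpe([r[t] for t in is_rows]) for r in returns_by_strategy]
        oos_perf = [sharpe([r[t] for t in oos_rows]) for r in returns_by_strategy]
        best = max(range(n_strats), key=is_perf.__getitem__)
        # Rank of the in-sample winner out-of-sample: 1 = worst, n_strats = best.
        rank = sum(p <= oos_perf[best] for p in oos_perf)
        omega = rank / (n_strats + 1)
        logits.append(log(omega / (1 - omega)))
    return sum(value <= 0 for value in logits) / len(logits)


# A strategy that is good everywhere vs. one that is bad everywhere: no overfitting.
stable = pbo([[0.01, 0.02] * 4, [-0.01, -0.02] * 4], n_blocks=4)

# Two strategies whose fortunes flip between halves: the IS winner always loses OOS.
flipped = pbo([[0.01, 0.02, -0.01, -0.02], [-0.01, -0.02, 0.01, 0.02]], n_blocks=2)
```

The two toy cases bracket the range: a PBO near 0 says in-sample winners keep winning, while a PBO near 1 says selection on the backtest is picking noise.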

Deflated Sharpe Ratio (DSR)

Adjusts the observed Sharpe ratio for the number of trials conducted, as well as for sample length and non-normal returns. It plays a role similar to a multiple-testing correction such as Bonferroni, applied to backtests.
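
For readers who want to see the adjustment, here is a compact sketch of the Deflated Sharpe Ratio as published by Bailey and López de Prado, using only the standard library. The function names are mine and not the repository's. Sharpe ratios are per-period (not annualized), sharpe_std is the standard deviation of Sharpe ratios across the trials, and kurt is regular kurtosis (3 for a normal distribution).

```python
from math import e, sqrt
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
_STD_NORMAL = NormalDist()


def expected_max_sharpe(n_trials: int, sharpe_std: float) -> float:
    """Expected best Sharpe among n_trials strategies that have no real skill."""
    if n_trials < 2:
        return 0.0  # a single trial needs no multiple-testing hurdle
    return sharpe_std * (
        (1 - EULER_GAMMA) * _STD_NORMAL.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * _STD_NORMAL.inv_cdf(1 - 1 / (n_trials * e))
    )


def deflated_sharpe(sr: float, n_obs: int, n_trials: int, sharpe_std: float,
                    skew: float = 0.0, kurt: float = 3.0) -> float:
    """Probability that the true Sharpe exceeds the hurdle set by the trial count."""
    hurdle = expected_max_sharpe(n_trials, sharpe_std)
    # Assumes the term under the root is positive, which holds for moderate skew.
    denom = sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    return _STD_NORMAL.cdf((sr - hurdle) * sqrt(n_obs - 1) / denom)


# Same observed Sharpe, one year of daily data, one trial vs. one hundred trials.
single = deflated_sharpe(0.10, n_obs=252, n_trials=1, sharpe_std=0.05)
many = deflated_sharpe(0.10, n_obs=252, n_trials=100, sharpe_std=0.05)
```

The more configurations you tried before settling on one, the higher the hurdle and the lower the resulting probability, which is exactly the penalty a raw backtest Sharpe hides.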

# Full validation pipeline — no LLM calls
uv run python -m src.validation.cli evaluate \
  --signal momentum,trend,value,quality \
  --ticker AAPL,MSFT,GOOGL \
  --start 2020-01-01 --end 2025-04-01 \
  --rolling-window 252

📋 Command Reference

| Command | LLM? | Purpose |
|---|---|---|
| src/main.py | ✅ Yes | One call per persona × ticker + portfolio manager |
| src/backtester.py | ✅ Per business day | Full backtest with cost modeling |
| python -m src.validation.cli evaluate | ❌ No | CPCV / PBO / Deflated Sharpe |
| from src.signals import ... | ❌ No | Import quant modules directly |
| from src.risk import ... | ❌ No | Risk analytics |
| from src.portfolio import ... | ❌ No | Portfolio optimization |

🏆 Verdict: The Most Thoughtful Open-Source AI Trading Project We've Seen

What makes ai-hedge-fund exceptional:

  • 🎭 Multi-persona architecture — Not just one LLM call, but 14 distinct investor personas reasoning independently, then synthesized through a disciplined risk and portfolio layer
  • 📊 Quantitative rigor — CPCV, PBO, and Deflated Sharpe Ratio validation that would make Marcos López de Prado proud
  • 🌏 Global reach — US, Hong Kong, and China A-share markets with bilingual output
  • 💰 Zero data cost — Every data source is free (yfinance, akshare, baostock, tencent)
  • 🤖 Issue bot innovation — Turn GitHub issues into trading analysis jobs with zero setup
  • 🧩 Modular design — Six standalone quantitative packages, importable without LangGraph
  • 🔌 Provider-agnostic — Xiaomi MiMo, Ollama, or any OpenAI-compatible API

What to keep in mind:

  • This is explicitly educational and research-only — no real trades, no investment advice
  • Backtesting costs scale quickly; start small (1 ticker, 2–3 analysts, 1–2 weeks)
  • LLM persona quality depends on the underlying model's reasoning ability

🎬 Final Thoughts

In a landscape littered with "ChatGPT but for stocks" repositories that amount to thin API wrappers, ai-hedge-fund stands apart as a genuinely engineered system. The multi-agent architecture isn't gimmickry. It's a thoughtful implementation of ensemble reasoning, where diverse investment philosophies analyze the same ticker independently and a risk and portfolio layer synthesizes their views into one decision.

The inclusion of proper validation methodology (CPCV, PBO, Deflated Sharpe) signals that the authors understand a fundamental truth: in quantitative finance, the backtest is not the strategy. The strategy is what survives rigorous out-of-sample testing.

Whether you're a:

  • Student learning how institutional trading systems work
  • Researcher studying multi-agent LLM architectures
  • Quant developer looking for a modular signal generation framework
  • Curious engineer who wants to see what happens when Buffett and Taleb argue about Apple

This repository delivers.

"The best thing a human being can do is to help another human being know more." — Charlie Munger

⭐ Star it. Fork it. Learn from it. Just don't bet your retirement on it.


Built with Python, LangGraph, and the collective wisdom of 14 legendary investors. Market data: all free. Knowledge: priceless.
