We run a production AI crypto trading system — built from scratch in Python. Over two years of building, we have used nearly every major library in this space. This guide is not a copy-paste list from documentation. These are the libraries that survived real production use.
TL;DR: Use ccxt for exchange connectivity, pandas-ta for indicators, vectorbt for fast backtesting, backtrader for realistic simulation, and scikit-learn + lightgbm for ML signals.
1. CCXT — Exchange Connectivity
CCXT
CryptoCurrency eXchange Trading Library
CCXT is the unified API layer for 100+ crypto exchanges. Without it, you'd write a custom HTTP client for every exchange, each with different auth, rate limits, and response shapes. CCXT abstracts all of that. CCXT Pro adds WebSocket support for streaming order books and trade feeds.
Basic CCXT usage
import ccxt
# Connect to Bybit
exchange = ccxt.bybit({
    'apiKey': 'YOUR_API_KEY',
    'secret': 'YOUR_SECRET',
    'options': {'defaultType': 'future'},
})
# Fetch OHLCV data (1h candles)
ohlcv = exchange.fetch_ohlcv('BTC/USDT:USDT', '1h', limit=200)
# Place a market order
order = exchange.create_order(
    symbol='BTC/USDT:USDT',
    type='market',
    side='buy',
    amount=0.001
)
One caveat: CCXT normalizes exchange APIs, but each exchange still has quirks. Position indexing (OneWay vs Hedge mode on Bybit), order type naming, and fee structures all differ. Read the exchange-specific documentation even when using CCXT.
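As a small illustration of the unified response shape: fetch_ohlcv returns a list of rows ordered as [timestamp in milliseconds, open, high, low, close, volume]. The sketch below turns such rows into named records with UTC timestamps using only the standard library. The Candle type and the to_candles helper are hypothetical names chosen for this example, and the sample rows are made up.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Candle(NamedTuple):
    """One OHLCV bar (hypothetical helper type, not part of CCXT)."""
    time: datetime
    open: float
    high: float
    low: float
    close: float
    volume: float


def to_candles(rows):
    """Convert CCXT-style [ms, o, h, l, c, v] rows into Candle records."""
    candles = []
    for ts_ms, o, h, l, c, v in rows:
        # CCXT timestamps are milliseconds since the Unix epoch, in UTC
        ts = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
        candles.append(Candle(ts, float(o), float(h), float(l), float(c), float(v)))
    return candles


# Made-up sample rows in the shape fetch_ohlcv returns (two 1h candles)
sample = [
    [1700000000000, 100.0, 101.0, 99.5, 100.5, 12.0],
    [1700003600000, 100.5, 102.0, 100.0, 101.5, 8.5],
]
candles = to_candles(sample)
print(candles[0].time.isoformat(), candles[-1].close)
```

Named fields make downstream code less sensitive to column-order mistakes than indexing raw lists.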
2. Pandas-TA — Technical Indicators
Pandas-TA
Technical Analysis for Pandas DataFrames
Pandas-TA adds 130+ technical indicators directly to Pandas DataFrames via a .ta accessor. It covers everything from RSI and MACD to Hurst exponent and entropy. In our production system, we use 24 features from Pandas-TA as inputs to the ML model.
Computing indicators with Pandas-TA
import pandas as pd
import pandas_ta as ta
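# ohlcv is the list returned by exchange.fetch_ohlcv in the CCXT example
# df has columns: open, high, low, close, volume
df = pd.DataFrame(ohlcv, columns=['timestamp','open','high','low','close','volume'])
# CCXT timestamps are milliseconds since the epoch; convert them to a DatetimeIndex
df['timestamp'] = pd.to_datetime(df['timestamp'], unit='ms')
df.set_index('timestamp', inplace=True)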
# Add RSI(14)
df.ta.rsi(length=14, append=True)
# Add ATR(14)
df.ta.atr(length=14, append=True)
# Add MACD
df.ta.macd(fast=12, slow=26, signal=9, append=True)
# Add Bollinger Bands
df.ta.bbands(length=20, std=2.0, append=True)
# Add all at once with a strategy
MyStrategy = ta.Strategy(
    name="Core Features",
    ta=[
        {"kind": "rsi"},
        {"kind": "atr"},
        {"kind": "macd"},
        {"kind": "obv"},
    ]
)
df.ta.strategy(MyStrategy)
TA-Lib alternative: TA-Lib is C-based and faster, but requires a compiled binary that often fails on non-Linux systems. Pandas-TA is slower but installs cleanly everywhere with pip install pandas-ta. For production servers, TA-Lib's speed advantage matters; for development, Pandas-TA wins on convenience.
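To make concrete what an indicator such as RSI(14) computes, here is a plain-pandas sketch using Wilder smoothing (an exponential average with alpha = 1 / length). It is an illustration, not Pandas-TA's source code, and its values may differ slightly from library output during the warm-up period. The function name wilder_rsi is hypothetical.

```python
import pandas as pd


def wilder_rsi(close: pd.Series, length: int = 14) -> pd.Series:
    """RSI with Wilder smoothing (an EMA with alpha = 1 / length)."""
    delta = close.diff()
    gain = delta.clip(lower=0)        # positive moves, zeros elsewhere
    loss = (-delta).clip(lower=0)     # magnitude of negative moves
    avg_gain = gain.ewm(alpha=1 / length, min_periods=length, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1 / length, min_periods=length, adjust=False).mean()
    rs = avg_gain / avg_loss          # inf when there were no losses
    return 100 - 100 / (1 + rs)


# A steadily rising series has no losses, so RSI saturates at the top of its range
rising = pd.Series([float(p) for p in range(1, 41)])
print(wilder_rsi(rising).iloc[-1])
```

Writing one indicator by hand like this is also a cheap way to sanity-check library output before trusting it as a model feature.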
3. VectorBT — Fast Backtesting
VectorBT
Vectorized Backtesting with NumPy
VectorBT runs backtests as NumPy array operations rather than Python loops. For simple strategies, this can make it 100-1000x faster than event-driven frameworks. In our research pipeline, we run 500+ parameter combinations in minutes using VectorBT. It also includes visualization via Plotly.
Running a parameter sweep with VectorBT
import vectorbt as vbt
import numpy as np
# Price data
price = vbt.YFData.download('BTC-USD', period='2y').get('Close')
# Test RSI strategy across 50 parameter combinations
rsi_periods = np.arange(5, 55, 1) # 50 values
rsi = vbt.RSI.run(price, window=rsi_periods, short_name='rsi')
entries = rsi.rsi_crossed_below(30) # oversold entry
exits = rsi.rsi_crossed_above(70) # overbought exit
pf = vbt.Portfolio.from_signals(price, entries, exits, freq='1D')
# Get stats for all 50 combinations at once
print(pf.total_return())
print(pf.sharpe_ratio())
VectorBT's weakness is complexity: order management, realistic fills, and slippage are harder to model than in Backtrader. For initial research and parameter optimization, VectorBT is unbeatable. For final strategy validation before going live, switch to Backtrader.
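The speed-up comes from evaluating every parameter value in one array operation instead of looping over bars. The sketch below shows the idea in plain NumPy (it does not use VectorBT's API): a "long while close is above its SMA" rule is evaluated for 50 window lengths at once. The function name sweep_total_return is hypothetical, the prices are a synthetic random walk, and fees and slippage are ignored.

```python
import numpy as np


def sweep_total_return(price: np.ndarray, windows: np.ndarray) -> np.ndarray:
    """Total return of 'long while close > SMA(window)' for every window."""
    n = len(price)
    csum = np.concatenate(([0.0], np.cumsum(price)))
    end = np.arange(1, n + 1)[None, :]          # exclusive end of each window
    start = end - windows[:, None]              # inclusive start, shape (k, n)
    sma = (csum[end] - csum[np.clip(start, 0, None)]) / windows[:, None]
    sma[start < 0] = np.nan                     # window not filled yet
    in_market = price[None, :] > sma            # NaN compares as False
    bar_ret = np.diff(price) / price[:-1]       # return from bar t to t+1
    # The position decided at bar t earns the return of bar t+1 (no look-ahead)
    strat_ret = in_market[:, :-1] * bar_ret[None, :]
    return np.prod(1.0 + strat_ret, axis=1) - 1.0


rng = np.random.default_rng(0)                  # synthetic data, illustration only
price = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 1000)))
windows = np.arange(5, 55)                      # 50 parameter values
results = sweep_total_return(price, windows)
print(results.shape)
```

Shifting the position by one bar before multiplying by returns is the detail that keeps a vectorized backtest from trading on information it would not have had yet.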
4. Backtrader — Realistic Simulation
Backtrader
Event-Driven Backtesting Framework
Backtrader simulates strategies bar-by-bar, making it easy to model realistic order management: limit orders, OCO orders, slippage, commission tiers, and multiple data feeds. The event-driven architecture closely mirrors how real exchange execution works. It is slower than VectorBT but far more realistic for final validation.
Backtrader strategy skeleton
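import backtrader as bt
class RsiStrategy(bt.Strategy):
    params = (('rsi_period', 14), ('oversold', 30), ('overbought', 70),)
    def __init__(self):
        self.rsi = bt.indicators.RSI(self.data, period=self.p.rsi_period)
        self.order = None
        self.bracket = None
    def notify_order(self, order):
        # Clear the pending marker once an order is no longer active
        if not order.alive():
            self.order = None
    def next(self):
        if self.order:
            return  # Wait for pending order
        if not self.position:
            if self.rsi < self.p.oversold:
                # Market entry with a 1% stop-loss and a 2% take-profit
                price = self.data.close[0]
                self.bracket = self.buy_bracket(
                    size=1,
                    exectype=bt.Order.Market,
                    stopprice=price * 0.99,
                    limitprice=price * 1.02,
                )
                self.order = self.bracket[0]
        elif self.rsi > self.p.overbought:
            # Cancel the stop; its take-profit sibling is cancelled with it
            self.cancel(self.bracket[1])
            self.close()
cerebro = bt.Cerebro()
cerebro.addstrategy(RsiStrategy)
# df: an OHLCV DataFrame with a DatetimeIndex and open/high/low/close/volume columns
cerebro.adddata(bt.feeds.PandasData(dataname=df))
cerebro.broker.set_cash(10000)
cerebro.broker.setcommission(commission=0.001)
cerebro.run()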
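To show what "bar-by-bar" simulation of a bracket order involves, here is a plain-Python sketch. It is not how Backtrader is implemented internally: the simulate_bracket helper is a hypothetical name, the bars are made up, and the rule that the stop fills first when both levels are touched in one bar is a deliberately pessimistic assumption.

```python
def simulate_bracket(bars, entry_price, stop_pct=0.01, target_pct=0.02):
    """Walk bars after a long entry and return (exit_reason, exit_price).

    bars: iterable of (open, high, low, close) tuples following the entry bar.
    If stop and target are both touched in one bar, the stop is assumed to
    fill first, which is the pessimistic choice.
    """
    stop = entry_price * (1 - stop_pct)
    target = entry_price * (1 + target_pct)
    for o, h, l, c in bars:
        if l <= stop:
            return "stop", min(o, stop)      # a gap down fills at the open
        if h >= target:
            return "target", max(o, target)  # a gap up fills at the open
    return "open", None                      # neither level reached yet


bars = [
    (100.2, 100.8, 99.6, 100.5),   # stays inside both levels
    (100.5, 102.5, 100.4, 102.1),  # high crosses the 2% target
]
print(simulate_bracket(bars, entry_price=100.0))
```

A vectorized backtest has to approximate these intrabar decisions, which is one reason an event-driven pass is worth running before going live.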
5. ML Stack — scikit-learn + LightGBM
scikit-learn + LightGBM
Machine Learning Signal Generation
Pure technical analysis rules are brittle. ML models that learn from historical feature-regime relationships are more adaptive. We use scikit-learn for preprocessing, pipeline management, and calibration, and LightGBM as the core classifier (fast gradient boosting, handles sparse features well). Our production MoE (Mixture of Experts) system runs 6 LightGBM experts gated by a regime classifier.
Simple ML trading signal with LightGBM
import lightgbm as lgb
import pandas_ta as ta
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import precision_score
# Build features
df.ta.rsi(length=14, append=True)
df.ta.atr(length=14, append=True)
df.ta.macd(append=True)
df.ta.obv(append=True)
# Target: 1 if price is up more than 0.5% four bars ahead, else 0
horizon = 4
df['future_ret'] = df['close'].shift(-horizon) / df['close'] - 1
# Drop indicator warm-up rows and the last `horizon` rows, which have no label yet
df.dropna(inplace=True)
df['target'] = (df['future_ret'] > 0.005).astype(int)
# future_ret must never be used as a feature: it is the label's source
excluded = ['open','high','low','close','volume','future_ret','target']
feature_cols = [c for c in df.columns if c not in excluded]
X, y = df[feature_cols], df['target']
# Walk-forward validation; the gap keeps label look-ahead out of each test fold
tscv = TimeSeriesSplit(n_splits=5, gap=horizon)
for train_idx, test_idx in tscv.split(X):
    X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
    y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]
    model = lgb.LGBMClassifier(n_estimators=200, learning_rate=0.05, num_leaves=31)
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    print(f"Precision: {precision_score(y_test, preds, zero_division=0):.3f}")
Walk-forward validation is mandatory. Training on the full dataset and testing on a subset is data leakage — your results will not survive contact with live markets. See our full guide on building an AI trading system.
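The folds of a walk-forward split can be written out by hand to see why they avoid leakage: every test block lies strictly after its training block, and a gap at least as long as the label horizon (4 bars in the example above) keeps forward-looking labels from overlapping the test period. This is a minimal sketch with an expanding training window; walk_forward_splits is a hypothetical helper and is not guaranteed to match scikit-learn's fold boundaries exactly.

```python
def walk_forward_splits(n_samples, n_splits=5, gap=0):
    """Yield (train_range, test_range) pairs with an expanding training window."""
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    first_test = n_samples - n_splits * test_size
    for test_start in range(first_test, n_samples, test_size):
        train_end = test_start - gap          # leave `gap` bars unused
        if train_end <= 0:
            raise ValueError("gap leaves no training data in the first fold")
        yield range(0, train_end), range(test_start, test_start + test_size)


for train, test in walk_forward_splits(100, n_splits=4, gap=4):
    print(f"train 0..{train[-1]}  test {test[0]}..{test[-1]}")
```

Each later fold trains on more history and tests on a later block, which is the same order in which a live system would see the data.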
Which Library When?
| Task | Best Library | Alternative |
|---|---|---|
| Exchange connectivity | CCXT | Exchange SDK |
| Technical indicators | Pandas-TA | TA-Lib (C, faster) |
| Fast parameter sweep | VectorBT | NumPy manual |
| Realistic backtesting | Backtrader | Zipline (dated) |
| ML signals | LightGBM | XGBoost, CatBoost |
| Regime detection | hmmlearn HMM (scikit-learn-style API) | BOCD (Bayesian online changepoint detection) |
| Data storage | SQLite + pandas | TimescaleDB |
| Visualization | Plotly | Matplotlib |
A Production Architecture Example
Here is how we combine these libraries in a production trading system:
- Data Layer: CCXT Pro WebSocket → raw OHLCV → SQLite
- Feature Layer: Pandas-TA on cached OHLCV → 24-dim feature vector
- Signal Layer: Regime classifier (sklearn) → route to expert LightGBM model
- Research Layer: VectorBT for parameter search → Backtrader for final validation
- Execution Layer: CCXT for orders → SQLite for position tracking
- Monitoring: Python watchdog → Telegram alerts
The key principle is separation of research and execution. Research (VectorBT parameter sweeps) runs offline. Execution (live CCXT orders) uses only the validated model output. They share the same SQLite database schema, making it easy to compare live performance against backtest predictions.
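One way to picture the shared schema is a single trades table with a source column that distinguishes backtest rows from live rows. The sketch below uses the standard-library sqlite3 module with an in-memory database. The table name, column names, and all sample rows are hypothetical and only illustrate the idea; they are not our actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path in a real deployment
conn.execute(
    """
    CREATE TABLE trades (
        id        INTEGER PRIMARY KEY,
        source    TEXT NOT NULL CHECK (source IN ('backtest', 'live')),
        symbol    TEXT NOT NULL,
        opened_at TEXT NOT NULL,   -- ISO 8601 timestamp, UTC
        pnl_pct   REAL NOT NULL    -- net of fees
    )
    """
)

# Made-up rows: research and execution write to the same table
rows = [
    ("backtest", "BTC/USDT:USDT", "2024-01-01T00:00:00Z", 0.8),
    ("backtest", "BTC/USDT:USDT", "2024-01-01T04:00:00Z", -0.4),
    ("backtest", "BTC/USDT:USDT", "2024-01-01T08:00:00Z", 0.5),
    ("live", "BTC/USDT:USDT", "2024-01-01T00:00:00Z", 0.6),
    ("live", "BTC/USDT:USDT", "2024-01-01T04:00:00Z", -0.5),
]
conn.executemany(
    "INSERT INTO trades (source, symbol, opened_at, pnl_pct) VALUES (?, ?, ?, ?)",
    rows,
)

# Because both sources share one schema, the comparison is a single query
summary = dict(
    conn.execute("SELECT source, AVG(pnl_pct) FROM trades GROUP BY source").fetchall()
)
print(summary)
```

The CHECK constraint is a cheap guard against a research script accidentally writing rows that look like live fills.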
Want to go deeper?
Read our full guide on building a production AI trading system from research to live deployment.
Read: How to Build an AI Trading System →
FAQ
Is CCXT safe to use with real API keys?
Yes, CCXT is widely used in production. Store API keys in environment variables, never in code. Use IP allowlists on your exchange account, and create API keys with only the minimum required permissions (trade, but not withdrawal). Rotate keys if you suspect exposure.
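A minimal sketch of the environment-variable approach, using only the standard library. The variable names BYBIT_API_KEY and BYBIT_API_SECRET and the load_exchange_config helper are assumptions made for this example; the returned dict uses the apiKey and secret keys that CCXT exchange constructors accept, plus CCXT's enableRateLimit option.

```python
import os


def load_exchange_config(env=os.environ, prefix="BYBIT"):
    """Build a CCXT-style config dict from environment variables."""
    try:
        api_key = env[f"{prefix}_API_KEY"]
        secret = env[f"{prefix}_API_SECRET"]
    except KeyError as missing:
        # Fail fast with the variable name, never with a key's value
        raise RuntimeError(f"Missing environment variable: {missing.args[0]}") from None
    return {"apiKey": api_key, "secret": secret, "enableRateLimit": True}


# Demonstration with a fake environment; real code would call load_exchange_config()
fake_env = {"BYBIT_API_KEY": "demo-key", "BYBIT_API_SECRET": "demo-secret"}
config = load_exchange_config(fake_env)
print(sorted(config))
# With CCXT installed: exchange = ccxt.bybit(load_exchange_config())
```

Failing at startup when a variable is missing is preferable to discovering it on the first order attempt.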
Do I need machine learning to build a profitable crypto bot?
No. Simple rule-based systems can be profitable. However, pure technical analysis rules (crossovers, RSI thresholds) tend to work in specific regimes and fail in others. ML models that condition on market regime tend to be more robust — at the cost of significantly more development complexity. Start simple. Graduate to ML when you have evidence that the simple approach is hitting a wall.
How much capital do I need to test a live trading bot?
Start with $50-$200 on a perpetual futures exchange. At 5-10x leverage, this gives you enough exposure to see real results while limiting catastrophic loss. Run in shadow mode (track signals without executing) for 2-4 weeks first. Understand your fee drag: a round-trip fee of 0.11% is charged on position value, so at 10x leverage it costs 1.1% of your margin per trade. At 30 trades/day with the whole account committed as margin, fees alone consume about 33% of the account per day, so a $200 account is wiped out in roughly 3 days.
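The fee-drag arithmetic can be checked in a few lines. This sketch assumes every trade commits the full account as margin, so position value equals equity times leverage, and it ignores trading P&L entirely. The function name daily_fee_drag is hypothetical.

```python
def daily_fee_drag(round_trip_fee, leverage, trades_per_day):
    """Fraction of account equity paid in fees per day.

    Assumes every trade commits the full account as margin, so
    position value = equity * leverage.
    """
    per_trade = round_trip_fee * leverage   # fee as a fraction of equity
    return per_trade * trades_per_day


drag = daily_fee_drag(round_trip_fee=0.0011, leverage=10, trades_per_day=30)
days_to_zero = 1 / drag   # linear approximation, ignoring P&L and shrinking size
print(f"{drag:.1%} of equity per day, about {days_to_zero:.1f} days of fees")
```

Halving either the trade count or the leverage halves the drag, which is usually the cheapest improvement available to a new bot.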
What is the hardest part of building a real trading bot?
Not the code, but the psychology of watching real money move. The hardest part is maintaining the discipline not to override the bot when it goes through a drawdown. The second hardest is writing rigorous walk-forward backtests that do not overfit. The third is handling exchange connectivity issues (rate limits, WebSocket drops, unexpected response shapes). All three require more work than writing the strategy logic itself.
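For the connectivity issues mentioned above, a common pattern is to wrap exchange calls in a retry with exponential backoff. The sketch below is a generic standard-library version, not our production code; call_with_retry is a hypothetical helper, and the flaky function only simulates a dropped connection. With CCXT, one would likely pass ccxt.NetworkError as the retry_on exception type.

```python
import time


def call_with_retry(fn, retries=3, base_delay=0.5,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call fn(); on a transient error wait base_delay * 2**attempt and retry."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == retries:
                raise                      # out of retries: surface the error
            sleep(base_delay * 2 ** attempt)


# Simulated flaky call: fails twice, then succeeds
calls = {"count": 0}

def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated drop")
    return "ok"


delays = []
result = call_with_retry(flaky_fetch, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```

Only transient errors should be retried; an authentication or insufficient-funds error will not fix itself and should stop the bot instead.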