Wang Jiachen
Quantitative Strategy · 2025.10 – 2026.01

LLM-Driven Factor Mining & ETF Rotation Strategy

End-to-end quant platform: LLM Agent factor discovery (GLM-4.5), multi-tier signal evaluation, rolling XGBoost/linear combination, and live ETF rotation trading.

Python, GLM-4.5, pandas, Polars, NumPy, XGBoost, LightGBM, Plotly, Joblib

A complete quantitative trading platform integrating LLM-driven factor discovery, signal evaluation, combination optimization, and live ETF rotation trading.

LLM Agent Factor Mining: Designed a multi-model Agent system (GLM-4.5, Gemini) that simulates senior quant researchers. The system builds context-aware prompts from market state analysis, a knowledge base of historical high-quality factors (IC > 0.03, ICIR > 0.5, Sharpe > 1.5), and identified innovation directions. The LLM returns factor expressions as structured JSON, which are parsed and executed in a safe_eval sandbox with AST-level security checks (look-ahead "future function" detection, code injection prevention, complexity limits).
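
The AST-level checks described above can be illustrated with a minimal sketch. It is not the project's safe_eval: the operator names, field names, the 50-node limit, and the rule that a lag must be a non-negative integer literal are all assumptions made for illustration.

```python
import ast

# Hypothetical whitelists; the project's real operator set is not shown in the source.
ALLOWED_FUNCS = {"rank", "ts_mean", "ts_std", "delta", "delay"}
ALLOWED_FIELDS = {"open", "high", "low", "close", "volume"}
LAGGED_FUNCS = {"delay", "delta"}  # second argument is a look-back lag
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Call, ast.Name, ast.Load,
    ast.Constant, ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub,
)
MAX_NODES = 50  # assumed complexity limit


def _lag_is_valid(call: ast.Call) -> bool:
    """A lag must be a non-negative integer literal, so it cannot reach into the future."""
    if len(call.args) != 2:
        return False
    lag = call.args[1]
    return isinstance(lag, ast.Constant) and isinstance(lag.value, int) and lag.value >= 0


def validate_factor(expr: str) -> list[str]:
    """Return a list of violations; an empty list means the expression passed."""
    try:
        tree = ast.parse(expr, mode="eval")
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    problems = []
    nodes = list(ast.walk(tree))
    if len(nodes) > MAX_NODES:
        problems.append(f"too complex: {len(nodes)} nodes")

    for node in nodes:
        if not isinstance(node, ALLOWED_NODES):
            # Attribute access, subscripts, lambdas, keywords, etc. are all rejected.
            problems.append(f"disallowed syntax: {type(node).__name__}")
        elif isinstance(node, ast.Call):
            if not isinstance(node.func, ast.Name) or node.func.id not in ALLOWED_FUNCS:
                problems.append("call to non-whitelisted function")
            elif node.func.id in LAGGED_FUNCS and not _lag_is_valid(node):
                problems.append(f"look-ahead risk: invalid lag in {node.func.id}()")
        elif isinstance(node, ast.Name):
            if node.id not in ALLOWED_FUNCS | ALLOWED_FIELDS:
                problems.append(f"unknown name: {node.id}")
        elif isinstance(node, ast.Constant):
            if not isinstance(node.value, (int, float)):
                problems.append("non-numeric constant")
    return problems
```

Under these assumptions, `validate_factor("rank(ts_mean(close, 5) / ts_std(close, 5))")` returns an empty list, while `delay(close, -1)` and `__import__('os').system('ls')` are both rejected before anything is evaluated.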

Signal Evaluation Pipeline: Built a multi-tier evaluation framework: Rank IC analysis, 5-layer backtesting, and comprehensive scoring (IC mean/std/IR, Sharpe, max drawdown, Calmar, turnover, coverage). Implemented strict screening: IC > 0.02, ICIR > 0.1, Sharpe > 1.0, coverage > 80%, inter-factor correlation < 0.7. OOS validation filters factors by IC/IR decay rate (50% threshold) and checks that the signal direction is consistent between the in-sample (pre-2020) and out-of-sample (post-2021) periods.

Signal Combination: Supports equal-weight, ICIR-weighted, rolling linear regression, rolling XGBoost, and rolling LightGBM ensemble methods. Rolling models retrain periodically to adapt to market regime changes. Portfolio weights are optimized with HRP (Hierarchical Risk Parity), using Ledoit-Wolf shrinkage for covariance estimation.
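
The ICIR-weighted method is the simplest of these to show in code. The sketch below is one plausible reading of it: weights proportional to each factor's trailing ICIR, normalized by the sum of absolute values. The 60-day window, the signed weights, and the function names are assumptions.

```python
import numpy as np
import pandas as pd


def icir_weights(ic_history: pd.DataFrame, window: int = 60) -> pd.Series:
    """Weights proportional to trailing ICIR; ic_history has one column per factor."""
    recent = ic_history.tail(window)
    icir = recent.mean() / recent.std()
    icir = icir.replace([np.inf, -np.inf], np.nan).fillna(0.0)
    total = icir.abs().sum()
    if total == 0:
        # No usable IC information: fall back to equal weights.
        return pd.Series(1.0 / len(icir), index=icir.index)
    # Signed weights, so a factor with negative IC is flipped rather than dropped.
    return icir / total


def combine(factors: dict[str, pd.Series], weights: pd.Series) -> pd.Series:
    """Weighted sum of cross-sectionally z-scored factor values for a single date."""
    z = pd.DataFrame({name: (s - s.mean()) / s.std() for name, s in factors.items()})
    return z.mul(weights, axis=1).sum(axis=1)
```

Recomputing `icir_weights` on each rebalance date over the trailing window gives a rolling variant in the same spirit as the periodically retrained models.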

ETF Rotation Strategy: Daily rebalancing across A-share on-exchange ETFs with flexible price execution (open, close, VWAP, open-VWAP 1-5min, cross-day). Supports top-N selection, inverse volatility weighting, and multiple benchmarks (HS300, CSI500, CSI1000). Complete live trading system with automated data pipeline, scheduler, and WeChat notifications.
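
Top-N selection with inverse volatility weighting can be sketched as follows. The function name, the 20-day volatility window, and the use of the plain standard deviation of daily returns are assumptions; execution prices and benchmarks are outside the scope of the sketch.

```python
import numpy as np
import pandas as pd


def rotation_weights(
    score: pd.Series, returns: pd.DataFrame, top_n: int = 3, vol_window: int = 20
) -> pd.Series:
    """Pick the top_n ETFs by score and weight them by inverse trailing volatility.

    score: combined signal per ETF for the rebalance date.
    returns: daily returns up to (not including) the rebalance date, one column per ETF.
    """
    picks = score.dropna().nlargest(top_n).index
    vol = returns[picks].tail(vol_window).std()
    # Drop ETFs with zero or undefined volatility to avoid division by zero.
    inv_vol = (1.0 / vol.replace(0.0, np.nan)).dropna()
    weights = inv_vol / inv_vol.sum()
    # ETFs that were not selected get an explicit zero weight.
    return weights.reindex(score.index, fill_value=0.0)
```

For two selected ETFs where one is twice as volatile as the other, the less volatile one receives two thirds of the capital and the other one third.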

System Architecture: Modular design with operators registry, Parquet storage, Joblib parallel evaluation, and automated data update pipeline.
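
An operator registry of the kind mentioned above is often built with a decorator that maps names to functions. The sketch below shows that pattern only; `OPERATORS`, `register`, and the two example operators are hypothetical names, not the project's actual API.

```python
from typing import Callable

import pandas as pd

# Hypothetical registry: operator name -> implementation.
OPERATORS: dict[str, Callable] = {}


def register(name: str):
    """Decorator that adds a function to the registry under the given name."""
    def decorator(fn: Callable) -> Callable:
        if name in OPERATORS:
            raise ValueError(f"operator already registered: {name}")
        OPERATORS[name] = fn
        return fn
    return decorator


@register("ts_mean")
def ts_mean(x: pd.Series, window: int) -> pd.Series:
    """Rolling mean over the trailing window (uses past data only)."""
    return x.rolling(window).mean()


@register("delay")
def delay(x: pd.Series, periods: int) -> pd.Series:
    """Value from `periods` rows earlier."""
    return x.shift(periods)
```

A factor expression evaluator can then resolve operator names through `OPERATORS` instead of Python's global namespace, which keeps the set of callable functions explicit.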

Highlights

  • LLM Agent factor mining (GLM-4.5 / Gemini)
  • Multi-tier evaluation: IC → layer backtest → OOS validation
  • Rolling XGBoost/linear/ICIR combination methods
  • Live trading with scheduler & notifications