Wang Jiachen
Quantitative Research · 2023.11 – 2025.01

Index Enhancement Strategy with XGBoost

Full-stack index enhancement: factor engineering, XGBoost with linear leaf nodes, and adaptive market-style model fusion.

Python · XGBoost · pandas · NumPy · Numba · Polars · cuDF

Index enhancement strategy developed during an internship at a quantitative asset management firm.

Factor Engineering: Extracted alpha factors from heterogeneous, multi-source data: daily and minute K-lines, tick-by-tick trades, limit order books, financial statements, and consensus estimates. Used pandas, NumPy, and Numba for vectorized and parallel computation, and Polars and cuDF for large-scale data processing. Encapsulated every factor as a reusable class module behind a standardized API.
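A minimal sketch of what a standardized factor API could look like. The names `Factor`, `Momentum`, `raw`, and `compute`, the column names `date`, `symbol`, and `close`, and the per-date z-score step are all assumptions made for illustration; the project's actual interface is not described here.

```python
from abc import ABC, abstractmethod

import pandas as pd


class Factor(ABC):
    """Hypothetical base class: every factor exposes the same compute() call."""

    name: str = "factor"

    @abstractmethod
    def raw(self, panel: pd.DataFrame) -> pd.Series:
        """Return the unnormalized factor value, indexed like `panel`."""

    def compute(self, panel: pd.DataFrame) -> pd.Series:
        """Standardize raw values to a z-score within each date's cross-section."""
        values = self.raw(panel)
        by_date = values.groupby(panel["date"])
        zscore = (values - by_date.transform("mean")) / by_date.transform("std")
        return zscore.rename(self.name)


class Momentum(Factor):
    """Illustrative factor: trailing return over `window` rows per symbol."""

    def __init__(self, window: int = 5):
        self.window = window
        self.name = f"momentum_{window}"

    def raw(self, panel: pd.DataFrame) -> pd.Series:
        # Rows are assumed to be sorted by date within each symbol.
        previous_close = panel.groupby("symbol")["close"].shift(self.window)
        return panel["close"] / previous_close - 1.0
```

With this shape, a new factor only has to implement `raw`; normalization and naming stay in one place, so downstream code can call `compute` on any factor without knowing its internals.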

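Model Innovation: Used XGBoost as the core model, with hyperparameters tuned by Bayesian optimization. Modified the model structure so that each leaf outputs a linear regression prediction instead of a constant, with the aim of improving accuracy. Implemented incremental learning for rolling training in two modes: Refresh Tree, which updates existing trees on new data, and Build Tree, which adds new trees to the existing ensemble.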

Style Adaptation: Identified reversals in market-cap style and built a sample-adaptive partition to train separate models for large-cap and small-cap stocks. Fusing the two models' predictions with probability weights significantly improved cross-sectional prediction capability.
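One way to picture probability-weighted fusion is sketched below. How the project derives the weights is not stated, so the logistic function of the z-scored log market cap, the `sharpness` parameter, and the function names `large_cap_probability` and `fuse` are assumptions for illustration only.

```python
import numpy as np


def large_cap_probability(log_mcap: np.ndarray, sharpness: float = 2.0) -> np.ndarray:
    """Soft membership in the large-cap group.

    Assumed form: logistic function of the cross-sectional z-score of log market cap.
    """
    z = (log_mcap - log_mcap.mean()) / log_mcap.std()
    return 1.0 / (1.0 + np.exp(-sharpness * z))


def fuse(pred_large: np.ndarray, pred_small: np.ndarray, p_large: np.ndarray) -> np.ndarray:
    """Blend the two models' predictions, stock by stock, using the membership weight."""
    return p_large * pred_large + (1.0 - p_large) * pred_small
```

Compared with a hard large/small split, a soft weight avoids a discontinuity in the prediction for stocks near the size boundary, since both models contribute there.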

Highlights

  • Multi-source alpha factor engineering
  • XGBoost with linear leaf nodes
  • Adaptive large/small-cap model fusion
  • Incremental learning for rolling training