Data is updated weekly via an automated pipeline. Each update appends the latest week of trading data for all 1,391 tickers in each of the three cleaning versions.
v1.0 — April 9, 2026 (Initial release)
- 1,391 U.S. equities and ETFs
- Three cleaning versions: Raw, Clean, Filled
- Coverage: January 1991 through April 2026 (majority from December 2002)
- Primary source: PiTrading consolidated tape (pre-March 2022)
- Secondary source: IEX Exchange HIST via Alpaca (post-March 2022)
- Nine-step cleaning pipeline
- 27 pre-computed academic variables per ticker per day
- Parquet file format
- REST API with JSON, CSV, and Parquet response formats
Data processing timeline
| Date | Event |
| --- | --- |
| 2026-03-25 | IEX/Alpaca data backfill completed (March 2022 – March 2026) |
| 2026-03-30 | Three-version analysis pipeline built (Raw, Clean, Filled) |
| 2026-04-03 | Holistic audit and cleaning pipeline finalized |
| 2026-04-07 | Brownian bridge sigma estimation bug fixed; Monte Carlo recalibrated |
| 2026-04-08 | Full-dataset bridge grid (1,391 tickers × 5 sigmas) completed |
| 2026-04-08 | Full-dataset 4-method filling comparison completed |
| 2026-04-09 | v1.0 released |
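
The timeline refers to a Brownian bridge and its sigma parameter. As a rough illustration of the general technique only, and not the dataset's actual filling code, the sketch below samples missing log prices between two known observations. The function name `brownian_bridge_fill`, its parameters, and the assumption that sigma is a known per-step standard deviation of log returns are all hypothetical.

```python
import math
import random


def brownian_bridge_fill(start_log_price, end_log_price, n_missing, sigma, rng=None):
    """Sample n_missing log prices between two known endpoints.

    Hypothetical helper: sigma is assumed to be the per-step standard
    deviation of log returns, estimated elsewhere.
    """
    if rng is None:
        rng = random.Random()
    total_steps = n_missing + 1  # steps from the start point to the end point
    path = []
    current = start_log_price
    for k in range(1, n_missing + 1):
        remaining = total_steps - (k - 1)  # steps left from the current point to the end
        # Conditional mean: move 1/remaining of the way toward the pinned endpoint.
        mean = current + (end_log_price - current) / remaining
        # Conditional variance of one step, given that the endpoint is fixed.
        var = sigma ** 2 * (remaining - 1) / remaining
        current = rng.gauss(mean, math.sqrt(var))
        path.append(current)
    return path


if __name__ == "__main__":
    # Fill three missing days between closes of 100 and 104 (illustrative numbers).
    filled = brownian_bridge_fill(math.log(100.0), math.log(104.0), 3, 0.01,
                                  random.Random(42))
    print([round(math.exp(x), 2) for x in filled])
```

With `sigma=0` the sampled path reduces to linear interpolation in log price, which is a convenient sanity check. Larger sigma values spread the simulated path more widely around that line while still ending at the known endpoint.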