Calculates 103 firm characteristics from CRSP + Compustat directly in Python – no WRDS SAS cloud
Updated Feb 9, 2023 · Python
Data matching for corporate governance research
Data collection and analysis for the paper “Regulatory Fragmentation” by Kalmenovitz, Lowry, and Volkova, The Journal of Finance (forthcoming)
Fuzzy match entity names (primarily persons and companies) across databases
Pipeline for WRDS (Wharton Research Data Services) datasets, including CRSP, master, etc., that builds a large combined database for market microstructure research at scale
Repository for CQA: How much new information is there in earnings? Reproducible Empirical Accounting Research. Ass III
Replication code for "The Shape of Beta: Industry Factor Structure and Crisis Risk Premium" (Woo & Kim, 2026)
End-to-end Python implementation of Mo et al.'s (2025) ACT-Tensor methodology, a tensor completion framework for financial dataset imputation. Implements cluster-based CP decomposition, HOSVD factor extraction, temporal smoothing (CMA/EMA/Kalman), and downstream asset pricing evaluation. Transforms sparse data into dense, machine-readable data.
Academically rigorous implementation of the Fama-French (2015) five-factor model using WRDS (CRSP + Compustat) data.
Empirical analysis of ESG performance and financial returns using CRSP, Compustat, and Refinitiv data. Panel of 18K+ U.S. firm-years (2013–2023). Covers multi-database merging, OLS/panel regressions with fixed effects, and industry-level double materiality classification. Python · pandas · statsmodels · WRDS
Academically rigorous implementation of the Fama-French (1993) three-factor model using WRDS (CRSP + Compustat) data.
Rolling-window XGBoost cross-sectional return prediction model for US equities (1995-2024). Annualized Sharpe 2.53, monthly alpha 4.36% (t=13.24), market beta 0.73 over 300 months out-of-sample.
Intelligent data preparation system that matches the agent's mode of intervention to the difficulty of each decision | Intelligent Data Preparation Agent
ML pipeline for monthly U.S. equity return prediction using CRSP / Compustat / JKP factor characteristics. Implements OLS, Ridge, Lasso, XGBoost, and MLP models with rolling-window evaluation and IC analysis.
Empirical study of idiosyncratic volatility and abnormal returns during VIX spike events, using a survivorship-bias-free CRSP universe (2005–2024)
Minimal PEAD (post-earnings announcement drift) backtest using Wharton Research Data Services (IBES + CRSP) — Python pipeline for research & plots.
End-to-end Python replication of 'The Value of Information: A Puzzle' (Kadan et al., 2026). Estimates the equilibrium dollar value of private information in US equity markets via discrete quadratic covariation of 1-min NYSE TAQ price changes and signed order flow. Implements CLNV trade signing, Amihud filtering, 2-way FE regressions, and SDF entropy bounds.
CRSP-based analysis of long-run US stock returns, measuring 10-year, 30-year, and full-life holding-period outcomes while explicitly accounting for delistings and survivorship bias.