Retail Sentiment Trading Signals
Python NLP FinBERT Time Series
Context
Can retail investor sentiment on social media predict short-term equity price movements?
Data & Modeling
Built an NLP pipeline with VADER and FinBERT to score Reddit/Twitter posts, then trained classifiers on sentiment-price lag features.
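As a minimal sketch of what "sentiment-price lag features" could look like, the snippet below turns a daily sentiment series into lagged feature rows with an up/down label for a forward return. It assumes the posts have already been scored and aggregated to one number per trading day (for example by VADER or FinBERT). The function name `make_lag_features` and the parameters `n_lags` and `horizon` are illustrative, not taken from the project.

```python
from typing import List, Sequence, Tuple


def make_lag_features(
    sentiment: Sequence[float],
    returns: Sequence[float],
    n_lags: int = 3,
    horizon: int = 1,
) -> Tuple[List[List[float]], List[int]]:
    """Build lagged sentiment features and forward-return direction labels.

    Assumption: `sentiment[t]` and `returns[t]` are aligned daily values
    for the same ticker, ordered oldest to newest.
    """
    if len(sentiment) != len(returns):
        raise ValueError("sentiment and returns must have the same length")

    features: List[List[float]] = []
    labels: List[int] = []
    # Start once n_lags days of history exist; stop while a forward return exists.
    for t in range(n_lags - 1, len(sentiment) - horizon):
        # Sentiment on day t, t-1, ..., t-n_lags+1 (most recent first).
        features.append([sentiment[t - k] for k in range(n_lags)])
        # Label is 1 if the return `horizon` days after day t is positive.
        labels.append(1 if returns[t + horizon] > 0 else 0)
    return features, labels


# Toy data, purely illustrative.
X, y = make_lag_features(
    sentiment=[0.1, 0.2, 0.3, 0.4, 0.5],
    returns=[0.0, 0.01, -0.02, 0.03, -0.01],
)
```

Each row only uses sentiment observed on or before day t, so the label's forward return never leaks into the features.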
Results
Found a statistically significant predictive signal for 1–3 day forward returns; the backtested strategy earned ~12% annualized alpha on selected tickers.
Takeaways
Next steps: incorporate real-time streaming data and test on a broader universe of mid-cap stocks.
Evaluation
- Target: 1–3 day forward returns; data: Reddit/Twitter posts (2020–2023)
- Split: 70% train / 15% validation / 15% test, in chronological order (walk-forward)
- Metrics: annualized alpha (~12%), Sharpe, directional hit rate
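The evaluation setup above can be sketched roughly as follows: a time-ordered 70/15/15 split plus two of the listed metrics. This is an illustration under stated assumptions, not the project's actual code. The helper names are hypothetical, the Sharpe ratio assumes a zero risk-free rate and 252 trading days per year, and the rolling re-fit of a full walk-forward scheme and the alpha regression are not shown.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def chronological_split(
    n: int, train_pct: int = 70, val_pct: int = 15
) -> Tuple[range, range, range]:
    """Split n time-ordered observations into train/validation/test index ranges.

    No shuffling: earlier observations train, later ones evaluate.
    """
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return range(0, train_end), range(train_end, val_end), range(val_end, n)


def hit_rate(predicted: Sequence[float], realized: Sequence[float]) -> float:
    """Fraction of periods where the predicted direction matches the realized one."""
    if len(predicted) != len(realized) or not realized:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, r in zip(predicted, realized) if (p > 0) == (r > 0))
    return hits / len(realized)


def annualized_sharpe(
    daily_returns: Sequence[float], periods_per_year: int = 252
) -> float:
    """Annualized Sharpe ratio of daily returns (assumption: zero risk-free rate)."""
    sd = stdev(daily_returns)
    if sd == 0:
        return float("nan")  # undefined when returns never vary
    return mean(daily_returns) / sd * math.sqrt(periods_per_year)


# Toy usage with made-up numbers.
train_idx, val_idx, test_idx = chronological_split(100)
rate = hit_rate([1, -1, 1, -1], [0.01, -0.02, -0.01, -0.03])
sharpe = annualized_sharpe([0.01, -0.01, 0.02, 0.0])
```

Integer percentages keep the split boundaries exact, which avoids off-by-one surprises from floating-point rounding.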