Retail Sentiment Trading Signals

Python NLP FinBERT Time Series

Context

Can retail investor sentiment on social media predict short-term equity price movements?

Data & Modeling

Built an NLP pipeline with VADER and FinBERT to score Reddit/Twitter posts, then trained classifiers on sentiment-price lag features.
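
As a rough illustration of the lag-feature step, the sketch below pairs lagged daily sentiment with the direction of the forward return. It assumes that per-day sentiment scores (for example, a daily mean of VADER or FinBERT scores for one ticker) are already aligned with closing prices. The function name `build_lag_features` and the `n_lags` and `horizon` parameters are hypothetical and not taken from the project code.

```python
from typing import List, Tuple


def build_lag_features(
    sentiment: List[float],
    prices: List[float],
    n_lags: int = 3,
    horizon: int = 1,
) -> Tuple[List[List[float]], List[int]]:
    """Pair lagged daily sentiment with the sign of the forward return.

    sentiment[t] and prices[t] are assumed to be aligned daily values for
    a single ticker. The feature row for day t uses sentiment from days
    t - n_lags + 1 through t only, so nothing after day t leaks in.
    """
    if len(sentiment) != len(prices):
        raise ValueError("sentiment and prices must have the same length")

    features: List[List[float]] = []
    labels: List[int] = []
    # Start once n_lags days of history exist; stop while a forward price exists.
    for t in range(n_lags - 1, len(prices) - horizon):
        lags = sentiment[t - n_lags + 1 : t + 1]
        forward_return = prices[t + horizon] / prices[t] - 1.0
        features.append(list(lags))
        labels.append(1 if forward_return > 0 else 0)  # 1 = price went up
    return features, labels
```

The resulting rows and labels could be passed to any standard classifier. Setting `horizon` to 1, 2, or 3 would correspond to the 1–3 day return targets.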

Results

Found a statistically significant predictive signal for 1–3 day forward returns; the backtested strategy returned ~12% annualized alpha on selected tickers.

Takeaways

Next steps: incorporate real-time streaming data and test on a broader universe of mid-cap stocks.

Evaluation

  • Target: 1–3 day forward returns; data: Reddit/Twitter posts (2020–2023)
  • Split: 70% train / 15% validation / 15% test, walk-forward
  • Metrics: annualized alpha (~12%), Sharpe, directional hit rate
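
The split and two of the metrics above can be sketched as follows. This is one possible reading, not the project's actual evaluation code: it treats the 70/15/15 split as a single chronological cut (a full walk-forward scheme would repeat it over rolling windows), and it assumes a zero risk-free rate and 252 trading days per year for the Sharpe ratio. All function names are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def chronological_split(n: int) -> Tuple[range, range, range]:
    """Index ranges for a 70/15/15 split that preserves time order."""
    train_end = n * 70 // 100
    val_end = n * 85 // 100
    return range(0, train_end), range(train_end, val_end), range(val_end, n)


def hit_rate(predicted_up: Sequence[int], realized: Sequence[float]) -> float:
    """Fraction of periods where the predicted direction matched the return sign."""
    if len(predicted_up) != len(realized) or not realized:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, r in zip(predicted_up, realized) if (p == 1) == (r > 0))
    return hits / len(realized)


def annualized_sharpe(daily_returns: Sequence[float], periods: int = 252) -> float:
    """Annualized Sharpe ratio of daily returns, assuming a zero risk-free rate."""
    # stdev is the sample standard deviation and needs at least two returns.
    return mean(daily_returns) / stdev(daily_returns) * math.sqrt(periods)
```

Because the test indices always come after the training and validation indices, the model is never scored on data that precedes what it was fit on.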