YouTube API · Gemini API · Sentiment Analysis · ELM Theory · Bayesian Updating
Analyzed 5,700+ YouTube comments on Taylor Swift’s engagement news to explore how public sentiment evolves online. This project combined machine learning with persuasion theory to examine the dynamics of digital discourse.
Through this project, I realized that working with social media data requires both technical rigor and theoretical grounding. While the YouTube API provided large-scale comments efficiently, significant effort was needed for cleaning and normalization (duplicates, slang, emojis). This reinforced the importance of data preprocessing as a decisive stage that shapes model reliability. On the modeling side, the Elaboration Likelihood Model (ELM) proved valuable in structuring the classification between central and peripheral persuasion pathways. Yet the lower accuracy on pathway labeling (88%) compared to sentiment classification (99%) highlighted that theoretical constructs are often harder to operationalize in noisy real-world data. It reminded me that applying social science theories to digital discourse is not only a technical task but also requires careful mapping between constructs and observable signals. Finally, the temporal analysis using Bayesian updating and herding tests showed how public sentiment can resemble financial market dynamics: initial overreaction, herd formation, and eventual correction. This analogy broadened my perspective: social data analysis is not limited to describing trends, but can also generate insights for policy, media strategy, and platform governance. In the future, I would extend the dataset to cross-platform comparisons (e.g., TikTok, Twitter) and refine the labeling scheme with multi-annotator validation to strengthen the robustness of the conclusions.
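To illustrate the Bayesian updating idea mentioned above, here is a minimal sketch. It assumes a Beta-Binomial model over the share of positive comments, updated one time window at a time; the project's actual model may differ. The names `BetaBelief` and `track_sentiment`, the uniform Beta(1, 1) prior, and the window counts are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Beta(alpha, beta) belief about the share of positive comments.

    Assumption: a uniform Beta(1, 1) prior; not taken from the project.
    """
    alpha: float = 1.0
    beta: float = 1.0

    def update(self, positives: int, negatives: int) -> "BetaBelief":
        # Conjugate update: add observed counts to the prior pseudo-counts.
        return BetaBelief(self.alpha + positives, self.beta + negatives)

    @property
    def mean(self) -> float:
        # Posterior mean of the positive-comment share.
        return self.alpha / (self.alpha + self.beta)


def track_sentiment(windows):
    """Return the posterior mean after each (positives, negatives) window."""
    belief = BetaBelief()
    means = []
    for positives, negatives in windows:
        belief = belief.update(positives, negatives)
        means.append(belief.mean)
    return means


# Hypothetical counts per time window (not project data): an enthusiastic
# first window followed by more mixed ones.
windows = [(90, 10), (60, 40), (50, 50)]
means = track_sentiment(windows)
for i, m in enumerate(means, start=1):
    print(f"window {i}: posterior mean positive share = {m:.3f}")
```

With counts like these, the posterior mean starts high and drifts downward as later windows arrive, which is one simple way to picture the "overreaction then correction" pattern. Because every window is pooled with equal weight, this sketch reacts slowly to late shifts; discounting older windows would be one possible variation.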