Python Sentiment Analysis of YouTube Comments

YouTube removed public dislike counts from videos last year. The change has been controversial, but does it matter?

Last year, YouTube removed a long-standing feature on the platform: public dislike counts. The decision was controversial, to say the least. YouTube tested the change with select accounts throughout 2021 and began rolling it out officially on November 10 of that year. The company justified the change by claiming, in both a blog post and a video, that the number of dislikes did not seriously affect the number of views on videos and that removing the public dislike counter greatly decreased “dislike attacks.”

However, not everyone agreed with YouTube’s reasoning.

Lots of people posted opinions arguing that the change made it difficult for users to discern the usefulness of how-to tutorial videos. YouTube has become a powerful tool for developers, and how-to videos are widely used by developers to troubleshoot and practice new skills. Many suggested that removing the dislike count forces users to watch multiple videos to find an appropriate how-to, wasting valuable time that a visible dislike counter previously saved.

To fix the problem, some suggested reading the comments to judge how useful the video was based on how others responded to it. For this reason, I wanted to answer the question: Does the general sentiment in the comments of a video affect the video’s view count?

Comment Collection

To answer this question, I wrote a Python script using the Selenium Python package to automate the Chrome browser and scroll down a few times on the YouTube Trending page. I did this on a browser that was not logged in to simulate the experience of a YouTube user without any watch history.
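
The full script isn’t reproduced here, but a minimal sketch of the scrolling step, assuming Selenium 4 with a local Chrome install, looks something like this; the scroll count and delays are illustrative rather than the exact values I used.

```python
# Minimal sketch of the trending-page scroll; scroll count and delays are illustrative.
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()  # fresh browser profile, so no login or watch history
driver.get("https://www.youtube.com/feed/trending")
time.sleep(3)  # let the first batch of videos render

body = driver.find_element(By.TAG_NAME, "body")
for _ in range(5):  # scroll down a few times to load more videos
    body.send_keys(Keys.END)
    time.sleep(2)
```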

After the script scrolled down a few times, it collected all video links, excluding YouTube Shorts, that were visible on the trending page. I then used Google’s YouTube API to collect 100 comments from each video. In total, I collected about 8,000 comments from 80 different videos that were trending on October 18, 2022.
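
For reference, a hedged sketch of the comment download using the google-api-python-client library follows; YT_API_KEY is a placeholder credential, and video_ids stands in for the IDs parsed from the non-Shorts links the Selenium script collected.

```python
# Sketch of the comment download with google-api-python-client; YT_API_KEY and
# video_ids are placeholders, and error handling is omitted for brevity.
from googleapiclient.discovery import build

YT_API_KEY = "YOUR_API_KEY"               # placeholder credential
video_ids = ["VIDEO_ID_1", "VIDEO_ID_2"]  # placeholder IDs from the trending links

youtube = build("youtube", "v3", developerKey=YT_API_KEY)

def top_comments(video_id, max_comments=100):
    """Return up to max_comments top-level comment strings for one video."""
    response = youtube.commentThreads().list(
        part="snippet",
        videoId=video_id,
        maxResults=max_comments,  # a single page of results is capped at 100
        textFormat="plainText",
    ).execute()
    return [
        item["snippet"]["topLevelComment"]["snippet"]["textDisplay"]
        for item in response.get("items", [])
    ]

comments_by_video = {vid: top_comments(vid) for vid in video_ids}
```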

Python sentiment analysis can be done using several pip packages (which are essentially add-on libraries for Python). I chose the vaderSentiment package to categorize comments as positive, negative, or neutral. The model provides a compound polarity score between -1 and 1, with -1 being overwhelmingly negative, 1 being overwhelmingly positive, and 0 being neutral. To check how accurate the vaderSentiment model was, an independent coder and I hand-classified 30 comments selected with a random number generator. We labeled each of these comments as positive, negative, or neutral and compared the results to the vaderSentiment model’s output.
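
A sketch of the scoring step is below, assuming the vaderSentiment package is installed via pip; the ±0.05 cutoffs are the thresholds commonly recommended for VADER’s compound score, not values specific to this project.

```python
# Sketch of the VADER scoring step; the 0.05 cutoffs are the commonly
# recommended thresholds for the compound score, not project-specific values.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def label_comment(text):
    """Map a comment to 'positive', 'negative', or 'neutral'."""
    compound = analyzer.polarity_scores(text)["compound"]  # ranges from -1 to 1
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"

print(label_comment("This tutorial saved me hours, thank you!"))  # positive
```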

Evaluation of the vaderSentiment Model

Unfortunately, the Python sentiment analysis agreed with the human raters only about 67% of the time, for a Cohen’s kappa of 0.379 across the two human coders. The human coders were about 87% consistent with each other, with a Cohen’s kappa of 0.708.
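
For anyone replicating the check, both the raw agreement and Cohen’s kappa can be computed with scikit-learn; the two label lists below are placeholders standing in for the 30 hand-coded comments.

```python
# Sketch of the agreement check with scikit-learn; the label lists are
# placeholders standing in for the 30 hand-coded comments.
from sklearn.metrics import cohen_kappa_score

human_labels = ["positive", "negative", "neutral", "positive", "neutral"]
vader_labels = ["positive", "neutral", "neutral", "positive", "negative"]

agreement = sum(h == v for h, v in zip(human_labels, vader_labels)) / len(human_labels)
kappa = cohen_kappa_score(human_labels, vader_labels)

print(f"Raw agreement: {agreement:.0%}, Cohen's kappa: {kappa:.3f}")
```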

The fact that two human coders agreed with each other about which comments were positive only 87% of the time does suggest that it is easier to gauge public sentiment using dislike counts than by reading comments. Even though human raters were more consistent with each other than with Python sentiment analysis, I still see value in using the model to analyze many comments very quickly. A correlation between average comment sentiment and the number of views on comparable videos would suggest that users are generally using comments effectively to gauge sentiment.
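
A sketch of that per-video check follows, assuming SciPy and vaderSentiment; the comments_by_video and view_counts dictionaries are tiny placeholders standing in for the real 80-video dataset.

```python
# Sketch of the per-video correlation check; comments_by_video and view_counts
# are small placeholders standing in for the real 80-video dataset.
from scipy.stats import pearsonr
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

comments_by_video = {
    "video_a": ["Great walkthrough, fixed my issue!", "Super clear, thank you."],
    "video_b": ["This skips the hard part.", "The audio is terrible."],
    "video_c": ["Decent overview.", "Helped a bit, thanks."],
}
view_counts = {"video_a": 1_200_000, "video_b": 90_000, "video_c": 400_000}

# Average compound score per video, then correlate the averages with view counts.
avg_sentiment = {
    vid: sum(analyzer.polarity_scores(c)["compound"] for c in comments) / len(comments)
    for vid, comments in comments_by_video.items()
}
videos = sorted(avg_sentiment)
r, p_value = pearsonr([avg_sentiment[v] for v in videos],
                      [view_counts[v] for v in videos])
print(f"Pearson r = {r:.2f} (p = {p_value:.3f})")
```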

Results

The graph below shows the number of views of each video on the y-axis, with the ratio of negative comments to positive comments plotted on the x-axis. As you can see, videos with fewer negative comments relative to positive ones do tend to get more views.

Figure: Scatter plot of negative-to-positive comment ratio vs. number of views. The data points cluster more tightly as the ratio approaches zero, and videos with fewer negative comments generally get more views.
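
A sketch of how the ratio and the scatter plot can be produced with matplotlib is below; labels_by_video and view_counts are placeholders for the VADER labels and view counts gathered earlier.

```python
# Sketch of the ratio computation and scatter plot; labels_by_video and
# view_counts are placeholders for the labeled comments and view counts.
import matplotlib.pyplot as plt

labels_by_video = {
    "video_a": ["positive", "positive", "negative", "neutral"],
    "video_b": ["negative", "negative", "positive", "neutral"],
}
view_counts = {"video_a": 1_200_000, "video_b": 150_000}  # placeholder views

ratios, views = [], []
for vid, labels in labels_by_video.items():
    positives = labels.count("positive")
    if positives == 0:
        continue  # avoid dividing by zero when a video has no positive comments
    ratios.append(labels.count("negative") / positives)
    views.append(view_counts[vid])

plt.scatter(ratios, views)
plt.xlabel("Negative-to-positive comment ratio")
plt.ylabel("Number of views")
plt.title("Comment sentiment vs. views for trending videos")
plt.tight_layout()
plt.show()
```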

These results give at least a little support to YouTube’s claim that you can gauge the usefulness of a video based on its comments. Logic tells us that less useful videos tend to get fewer views and more negative comments. However, this analysis would be better served by collecting more videos, and a more varied set of them. The videos I collected for this Python sentiment analysis all came from the trending page, and their comments did tend to be positive. In the future, collecting all of the videos tied to a specific hashtag or query, such as “How to change a tire,” could be beneficial.

As it stands, we can take YouTube’s word for it that simply reading a few comments can help you decide which video to watch. I’m still a little skeptical, but it is certainly an interesting question amid the constant spread of negativity on social media. However, taking the time to read comments makes using YouTube as a troubleshooting tool more time-intensive for developers. Developers should consider using automated sentiment analysis to judge the usefulness of videos; at the very least, it saves them the time of reading the comments.