
Hello world! Welcome to this report, where I use machine learning to analyze tweets about a specified topic and present the results as a variety of easy-to-understand charts. This sentiment analysis algorithm was developed as part of my Master's thesis in 2017/2018.
This report is currently being published exclusively here on Steemit.

Parameters
Today's analysis was run on tweets which contain the word "McGregor" and were published between 2013-01-01 and 2017-12-31. The detailed specification of the data is shown in the following list:
- Keyword: McGregor
- From: 2013-01-01
- To: 2017-12-31
- Number of analyzed tweets: 60000
- Language: en
- Geographical location: Not specified

Results
Sentiment
After downloading 60000 tweets published between the specified dates, I ran sentiment analysis on each and every one of them. The sentiment scores were then aggregated over weeks and months, to make the results coarser on the time axis, and plotted as the following line chart.
Sentiment of tweets for keyword "McGregor"
My subjective comment on the chart: I think there's no arguing here that the sentiment of tweets is declining steadily. I personally believe half of it is envy, as Conor got rich and successful, and half is his trash talk, which really got offensive and over the line in the last couple of years.
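The weekly and monthly aggregation described above can be sketched roughly as follows. This is only a minimal illustration, not the code from the thesis: the sample scores, the `average_by` helper and the use of a plain mean per ISO week and per calendar month are all assumptions.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical sample: (publication date, sentiment score between 0 and 1).
scored_tweets = [
    (date(2016, 10, 3), 0.80),
    (date(2016, 10, 5), 0.70),
    (date(2016, 10, 20), 0.75),
    (date(2017, 8, 26), 0.40),
    (date(2017, 8, 27), 0.60),
]


def average_by(tweets, key):
    """Group scores by key(date) and return the mean score per group."""
    groups = defaultdict(list)
    for day, score in tweets:
        groups[key(day)].append(score)
    return {k: mean(v) for k, v in sorted(groups.items())}


def month_key(day):
    # Calendar month, e.g. (2016, 10)
    return (day.year, day.month)


def week_key(day):
    # ISO year and ISO week number (weeks run Monday to Sunday)
    iso = day.isocalendar()
    return (iso[0], iso[1])


monthly = average_by(scored_tweets, month_key)
weekly = average_by(scored_tweets, week_key)

for (year, month), score in monthly.items():
    print(f"{year}-{month:02d}: {score:.2f}")
```

Each value in `monthly` or `weekly` then becomes one point on the line chart.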
Aggregation using heatmaps
A line chart works great for showing the general trend/pattern in the sentiment. We can see the bigger timeframe and estimate the long-term direction. But if you're interested in a particular month or week, it's hard, and in the case of weeks actually impossible, to see the change. Has an athlete put in a great performance in a particular match? Has the brand/company released a new product line? To see such low-level changes, the following 2 heatmaps are used.

Chart shows average sentiment per month where 0.50 is the worst and 0.78 the best achieved score
My subjective comment on the chart: Oh wow, I looove this chart. Look where the lowest and highest sentiment occurred. Exactly around his 2 legendary fights. The most positive score is in October 2016, as a huuuge bout between him and Eddie Alvarez was anticipated. It was the first time ever that a UFC fighter could potentially hold 2 belts simultaneously. The worst sentiment can be seen in August 2017. Why? Well, he fought Floyd Mayweather and there were probably many, many tweets saying he will "lose", "get his ass kicked" etc. This chart doesn't lie ;) Another interesting pattern I've noticed is that sentiment always rises one month before his fight - e.g. in 2016, he fought in March, August and November. Sentiment peaked 3 times - in February, July and October :) What's also worth noticing is that with his boxing bout, the opposite occurred. Sentiment was bad in the months leading up to the fight, but once it happened, sentiment rose.

Chart shows average sentiment per week where 0.50 is the worst and 0.79 the best achieved score
My subjective comment on the chart: This chart basically "zooms" in. It might be that the granularity is a bit too fine. We can clearly see an increase when the May-Mac fight was announced, and the week of his Aldo fight for the featherweight belt also has very high sentiment.
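To give an idea of how such a heatmap can be prepared, here is a small sketch that arranges monthly averages into a year-by-month grid and finds the worst and best score. The `to_grid` helper and the sample values are made up for illustration (only the 0.50 and 0.78 extremes and their months come from the chart above), so treat it as an assumption about the approach rather than the actual implementation.

```python
# Illustrative monthly averages keyed by (year, month); the 2016-11 value is invented.
monthly = {(2016, 10): 0.78, (2016, 11): 0.70, (2017, 8): 0.50}


def to_grid(monthly_avg):
    """Return (years, grid) with one row per year and 12 month columns.

    Months without any tweets stay None so they can be drawn as empty cells.
    """
    years = sorted({year for year, _ in monthly_avg})
    grid = [
        [monthly_avg.get((year, month)) for month in range(1, 13)]
        for year in years
    ]
    return years, grid


years, grid = to_grid(monthly)

# The colour scale of the heatmap spans the worst and best observed score.
scores = [s for row in grid for s in row if s is not None]
worst, best = min(scores), max(scores)

for year, row in zip(years, grid):
    cells = " ".join("  . " if s is None else f"{s:.2f}" for s in row)
    print(year, cells)
```

The same idea works for the weekly heatmap, only with week numbers as columns instead of months.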
Most frequently used words
Another very interesting aspect to look into is which words are used repeatedly, shown using wordclouds. Even more interesting is comparing two wordclouds generated from different time periods - usually before and after some event/change happened. If you give this a second thought, the problem is that many short words (like "and", "or", "with" and so on) are used in almost every sentence and would also show up in the wordclouds. To mitigate this, I've removed a list of 153 so-called stopwords. Additionally, I've also removed words typical for this area, listed at the end of the report*.

My subjective comment on the chart: <3 I totally love data science :D We can very nicely see that before December 2015, two names were often used with Conor's - Aldo and Frankie (two featherweight kings). After 2015, when Conor left the division, Nate Diaz and Floyd Mayweather are the big names. It's also interesting to see the word "will" in the before-2015 wordcloud. I think it comes from tweets which were talking about Conor as a future champion.
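The word counting behind a wordcloud can be sketched like this. The stopword set below is only a tiny stand-in for the real list of 153 words, and `TOPIC_WORDS`, `word_counts` and the sample tweets are hypothetical names and data used purely for illustration.

```python
import re
from collections import Counter

# Tiny stand-in for the real stopword list (assumption, not the 153-word list).
STOPWORDS = {"and", "or", "with", "the", "a", "is", "to"}
# Hypothetical topic-specific words that would dominate every wordcloud.
TOPIC_WORDS = {"mcgregor", "conor"}


def word_counts(tweets):
    """Count words across tweets, skipping stopwords and topic words."""
    counts = Counter()
    for text in tweets:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS and word not in TOPIC_WORDS:
                counts[word] += 1
    return counts


tweets = [
    "McGregor will fight Aldo",
    "Conor and Aldo is the fight to watch",
    "McGregor will be champion",
]
counts = word_counts(tweets)
print(counts.most_common(3))
```

The resulting counts decide how big each word is drawn in the wordcloud.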
Most frequently used UNIQUE words
As we can see in the previous wordcloud, there are many words which are actually shared by both wordclouds. That makes perfect sense, as there are many areas which will be forever connected with McGregor. But I went one step further and decided to create wordclouds which contain only unique words which don't appear in the opposite wordcloud.
Most frequently used UNIQUE words in tweets containing the word "McGregor" before and after 2015-12-12.
My subjective comment on the chart: This chart shows similar results to the previous one, only a bit more amplified. I can clearly see the following points:
- The biggest change is obviously two words - boxing & suck. There's no doubt these talk about Conor's bout and skills against Mayweather.
- Positive words such as truly, love, star and genius are very popular during Conor's rise to stardom.
- When he really became a household name because of the Mayweather boxing match, words like ESPN and Payout appeared, as they're closely connected to boxing and his big payday for the fight.
- The word bottle shows how huge a topic his incident with Nate Diaz was. Conor threw a bottle and a Monster can at Nate.
- Another huge word is refuse, which definitely occurred in tweets about Conor being pulled from UFC200 as he refused to fly over to the US for a press conference.
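Keeping only the words which don't appear in the opposite wordcloud boils down to a simple filter over the two word counts. The sketch below assumes the counts are already available as `Counter` objects; the `unique_words` helper and the numbers are invented for illustration.

```python
from collections import Counter


def unique_words(own, other):
    """Keep only the words (with their counts) that never occur in `other`."""
    return Counter({word: n for word, n in own.items() if word not in other})


# Hypothetical word counts before and after the split date.
before = Counter({"aldo": 5, "love": 3, "fight": 7})
after = Counter({"boxing": 6, "fight": 9, "bottle": 2})

unique_before = unique_words(before, after)
unique_after = unique_words(after, before)

print(sorted(unique_before))  # words seen only before the split date
print(sorted(unique_after))   # words seen only after the split date
```

A shared word like "fight" drops out of both sides, which is why the unique wordclouds look more "amplified" than the regular ones.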

Get your report - Christmas present!
- Comment below with the parameters of the report you're interested in
- Resteem this post
- No upvote required :)
Thanks for reading! Matko.
