Who “Won” Tomorrowland 2017?

Some of you may have read my last article, where I analyzed 59 sets from EDC Las Vegas 2017. This time, I put on my big boy pants and analyzed 236 sets from both weekends of Tomorrowland in Belgium. This means I had more than 14,000 tracks played by ~200 DJs over the course of 2 weekends to feed my analysis. As before, the data was scraped from 1001tracklists.com, and I have made the code available on my GitHub page if you wish to do something similar yourself. I also created a summary dashboard on Tableau Public if you wish to explore the data in more detail.

*Note: I have added links to some songs along the way for your enjoyment, so please read on!

1. DJ Snake

DJ Snake had the most tracks played at Tomorrowland of any DJ by far (65), which was 23 more than the next highest artist (Axwell /\ Ingrosso). In addition, his songs were played by 38 different artists, far and away the broadest reach of anyone at the festival (the next highest was Calvin Harris with 23). The main drivers behind his popularity were his two biggest hits, “Propaganda” and “Let Me Love You”, which were the #2 and #5 most played songs overall.

For those who may be interested in a new spin on these (already overplayed) hits, the most popular remixes were “Propaganda (Nom de Strip & TJR Remix)” and “Let Me Love You (Don Diablo Remix)”.

2. Ed Sheeran

As amazing as it would be to see Ed Sheeran working the turntables and screaming for the crowd to “put their hands up in the air”, sadly he did not play a set. However, he still ended up having his songs played (in one form or another) by 22 different artists at the festival. This puts him in a tie for third alongside Axwell /\ Ingrosso and Valentino Khan. Much like Kendrick Lamar was for EDC, Ed is the most popular artist to have his tracks played without appearing in person (given that they are such similar artists this should come as no surprise *sarcasm*).

“Shape of You” led the way as his most played song, but if anyone is looking to impress their friends with their knowledge of the Ed Sheeran discography, I would recommend checking out the most popular remix of “Castle on the Hill” by Gareth Emery & Ashley Wallbridge.

3. Hardstyle

While trap music may be slowly taking hold of the Americas, its older cousin hardstyle is alive and well in Europe. By using Python scikit-learn K-Means clustering and my limited knowledge of a few hardstyle artists, I was able to decipher which other artists fell into this genre. For me personally, exploring cluster 12 on the Tableau dashboard led to some entertaining artist (“Phuture Noize”) and song (“Destination”) discoveries. Worth noting: hardstyle is very high energy and is definitely not for everyone.

Note: I plan on writing a post that goes into the clustering in more detail, drawing a few more insights from the data and explaining the methodology.

4. Heads Will Roll (A-Trak Remix)

While mining this data set for new and exciting remixes I ran across this track, which was tied for 2nd as the most commonly played remix at Tomorrowland 2017 (12 plays across 11 DJs). Personally, I found this incredibly amusing since it was released 8 years ago (2009, if you cannot find your calculator). Therefore, this track is a winner for its popularity and longevity at a festival well known for revealing tracks never heard before.

It is also worth noting that Don Diablo remixes were incredibly popular (as can be seen on the left).

What is Next?

As I alluded to before, I tested out whether I could use Python to cluster various DJs based upon the tracks and artists they played. I plan on providing a more detailed analysis of this output soon.
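For the curious, here is a minimal sketch of what that kind of clustering could look like with scikit-learn K-Means. It is not the code behind the dashboard: the DJ names, artist names, play counts, and the choice of two clusters are all made up for illustration, and representing each DJ as a vector of "plays per track artist" is my assumption about one reasonable way to set up the features.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction import DictVectorizer

# Toy data (hypothetical): how many times each DJ played each track artist.
dj_artist_counts = {
    "DJ A": {"Artist 1": 3, "Artist 2": 2},
    "DJ B": {"Artist 1": 2, "Artist 2": 3},
    "DJ C": {"Artist 3": 4, "Artist 4": 1},
    "DJ D": {"Artist 3": 2, "Artist 4": 3},
}

djs = sorted(dj_artist_counts)

# Turn the nested dicts into a DJ x artist matrix (missing artists become 0).
vectorizer = DictVectorizer(sparse=False)
X = vectorizer.fit_transform([dj_artist_counts[dj] for dj in djs])

# Two clusters is an arbitrary choice for this toy example.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X)

# Map each DJ to its cluster id (the ids themselves carry no meaning).
clusters = {dj: int(label) for dj, label in zip(djs, labels)}
print(clusters)
```

DJs that lean on the same artists end up in the same cluster. Knowing a few artists from a genre up front (as with hardstyle above) is then enough to label the whole cluster.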

Spoiler Alert: The clusters are on the Tableau Dashboard already… if you agree/disagree, leave a comment!

A Full Analysis of Every Song Played at EDC 2017

Every year I enjoy the wealth of full live sets that get released on SoundCloud around this time. However, what I have noticed over the years is that a number of the sets sound quite similar, and there are a few songs/artists each year that dominate the airtime across a wide variety of sets.

So I set out to answer the following questions:

  1. Who were the hottest artists at EDC 2017?
  2. What were the hottest songs?
  3. How can I investigate the relationships between all the sets that were played?

Getting the Tracklist Data

Since there is no central repository or database where you can simply download structured tracklist data, I was forced to web scrape the data from 1001tracklists.com using Python. Luckily, the BeautifulSoup module helped simplify the data extraction process for the 59 sets that I scraped. Once I had all the CSVs, I used Pandas to combine them and parse out the important fields, such as: track artist, simple track title (without any of the remix information), all the featured artists, and the set it was played in. Parts of this code have been made publicly available on my GitHub page.
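To give a feel for the parsing step, here is a rough sketch using only Python's standard library rather than Pandas. It assumes each scraped track comes through as a single string shaped like "Artist ft. Guest - Title (Someone Remix)"; that format, the parse_track helper, and the output keys are illustrative assumptions, and real tracklist text is messier than this.

```python
import re

def parse_track(raw, set_name):
    """Split one raw track string into the fields described above (hypothetical helper)."""
    artist_part, _, title_part = raw.partition(" - ")

    # Anything after "ft." / "feat." in the artist part is treated as featured artists.
    pieces = re.split(r"\s+(?:ft\.?|feat\.?)\s+", artist_part, maxsplit=1, flags=re.IGNORECASE)
    track_artist = pieces[0].strip()
    featured = []
    if len(pieces) > 1:
        featured = [a.strip() for a in re.split(r"\s*(?:,|&)\s*", pieces[1]) if a.strip()]

    # A trailing parenthetical such as "(Don Diablo Remix)" is the remix info;
    # the simple title is whatever comes before it.
    match = re.search(r"\s*\(([^()]*)\)\s*$", title_part)
    remix = match.group(1) if match else None
    track_basic = title_part[:match.start()].strip() if match else title_part.strip()

    return {
        "track_artist": track_artist,
        "track_basic": track_basic,
        "featured_artists": featured,
        "remix": remix,
        "set": set_name,
    }

row = parse_track("DJ Snake ft. Justin Bieber - Let Me Love You (Don Diablo Remix)", "Example Set")
print(row)
```

Keeping the simple title separate from the remix info is what makes it possible to count a song once across all of its versions, and then drill into the individual remixes later.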

Initial Insights

Link to interactive dashboard on Tableau Public: HERE

I was surprised to see that Boombox Cartel got the most plays of any artist at EDC (28 plays across 11 sets, with the most popular song being “Jefe”).

Additionally, I found it amusing that the second most played artist at EDC was not even there… Kendrick Lamar got a whopping 26 plays across 15 separate sets.
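Numbers like "28 plays across 11 sets" come from two simple counts per artist: total plays, and the number of distinct sets those plays appeared in. A small sketch of that calculation is below. The rows are toy placeholders (not the real EDC data), and the summarize function and tuple layout are assumptions for illustration.

```python
from collections import defaultdict

# Toy rows (hypothetical): (set_name, track_artist, track_basic)
plays = [
    ("Set A", "Boombox Cartel", "Jefe"),
    ("Set B", "Boombox Cartel", "Jefe"),
    ("Set B", "Kendrick Lamar", "M.A.A.D City"),
    ("Set C", "Kendrick Lamar", "M.A.A.D City"),
    ("Set C", "Kendrick Lamar", "Track X"),
]

def summarize(rows):
    """Return (artist, total plays, distinct sets), most played first."""
    play_count = defaultdict(int)
    sets_seen = defaultdict(set)
    for set_name, artist, _title in rows:
        play_count[artist] += 1
        sets_seen[artist].add(set_name)
    # Sort by plays descending, then by name so ties are stable.
    ranked = sorted(play_count, key=lambda a: (-play_count[a], a))
    return [(a, play_count[a], len(sets_seen[a])) for a in ranked]

summary = summarize(plays)
for artist, n_plays, n_sets in summary:
    print(f"{artist}: {n_plays} plays across {n_sets} sets")
```

Tracking both numbers matters: an artist can rack up plays mostly inside their own set, while a wide spread across sets is a better sign that other DJs are reaching for their tracks.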

When the track_basic field is expanded within the “Most Played Songs” chart, you can find the most commonly played remixes. I personally have found this a gold mine for new twists on old classics, especially for some of the songs which were beginning to feel a bit overplayed as originals (I’m looking at you Propaganda).

Going One Deeper

Although the summary values were interesting, I wanted to explore the connections between all the sets, songs, and artists a bit more. To do this, I used Python NetworkX, which let me model every DJ, artist, and song as a node in a graph and explore the relationships between them.

(Above) Details from the Noisecontrollers set

Nodes:

  • DJ (red icon): The artist playing the set at EDC (can also create songs)
  • Track Artist (blue icon): An artist not at EDC that contributed in some way to creating the song
  • Song (gray icon): The track that was played in the set at EDC

Connections (aka Edges):

  • Played (red edge): When a DJ plays a track they did not create in their set
  • Created/Played (purple edge): When a DJ plays a track they created in their own set
  • Created (blue edge): When an Artist contributes to the creation of a song that was played during an EDC set
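The node and edge types above map naturally onto NetworkX attributes. The sketch below shows one way to build such a graph and run a "who played this artist's songs?" lookup. The DJ names, the "kind" and "relation" attribute names, the djs_who_played helper, and the use of an undirected graph are all assumptions for illustration, not the exact structure behind the dashboard.

```python
import networkx as nx

G = nx.Graph()

# Nodes carry a "kind" attribute: dj, track_artist, or song.
G.add_node("DJ One", kind="dj")
G.add_node("DJ Two", kind="dj")
G.add_node("Kendrick Lamar", kind="track_artist")
G.add_node("M.A.A.D City", kind="song")
G.add_node("DJ One Original", kind="song")

# Edges carry a "relation" attribute matching the three connection types.
G.add_edge("DJ One", "M.A.A.D City", relation="played")
G.add_edge("DJ Two", "M.A.A.D City", relation="played")
G.add_edge("Kendrick Lamar", "M.A.A.D City", relation="created")
G.add_edge("DJ One", "DJ One Original", relation="created_played")

def djs_who_played(graph, artist):
    """Find every DJ who played a song that `artist` is connected to."""
    found = set()
    for song in graph.neighbors(artist):
        if graph.nodes[song]["kind"] != "song":
            continue
        for other in graph.neighbors(song):
            is_dj = graph.nodes[other]["kind"] == "dj"
            played = graph.edges[other, song]["relation"] in {"played", "created_played"}
            if other != artist and is_dj and played:
                found.add(other)
    return sorted(found)

print(djs_who_played(G, "Kendrick Lamar"))
```

Because songs sit between DJs and artists, most questions become a two-hop walk: artist to song, then song to DJ.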

Link to the interactive dashboard on Tableau Public: HERE

This type of view allows me to search “Kendrick Lamar” and see all the DJs who played his songs, which songs they played, and any other artists that may have collaborated with Kendrick on the various songs. (See example below)

I could also search the song “M.A.A.D City”, to see which sets it was played during, and which artists helped create it.

The views become increasingly complex if you search an artist (like Flosstradamus) who played a full set, but also had their songs played in many other sets (see below).

This is where some of the summary values at the top of the dashboard help. We can see Flosstradamus songs were involved in one way or another in 18 sets, and that there were 50 artists that either collaborated on Flosstradamus songs, or created songs that Flosstradamus played during their set. I find that this is illustrative of the fact that each of the sets played does not happen in a bubble, and that it really involves many members of the community to make it happen (everyone from Darude to Pitbull).

This network view is quite dynamic and there are many more interesting nuggets that are still undiscovered… check it out and let me know what you find!