A Full Analysis of Every Song Played at EDC 2017

Every year I enjoy the wealth of full live sets that get released on SoundCloud around this time. Over the years, though, I have noticed that many of the sets sound quite similar, and that each year a few songs and artists dominate the airtime across a wide variety of sets.

So I set out to answer the following questions:

  1. Who were the hottest artists at EDC 2017?
  2. What were the hottest songs?
  3. How can I investigate the relationships between all the sets that were played?

Getting the Tracklist Data

Since there is no central repository or database where you can simply download structured tracklist data, I had to scrape the data from 1001tracklists.com using Python. Luckily, the BeautifulSoup module simplified the data extraction for the 59 sets that I scraped. Once I had all the CSVs, I used Pandas to combine them and parse out the important fields: track artist, simple track title (without any of the remix information), all the featured artists, and the set it was played in. Parts of this code are publicly available on my GitHub page.
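To give a feel for the parsing step, here is a minimal sketch of splitting one raw tracklist entry into those fields. It is illustrative only and not the code from the GitHub page: the input format ("Artist ft. Other - Title (Someone Remix)"), the function name parse_track, and every output field name except track_basic are assumptions, and real 1001tracklists strings may need different patterns.

```python
import re

# Assumed input format: "Artist A ft. Artist B - Some Title (Artist C Remix)".
# These patterns are a guess at common separators, not a complete list.
ARTIST_SPLIT = re.compile(r"\s+(?:ft\.|feat\.|&|vs\.)\s+", flags=re.IGNORECASE)
TRAILING_PARENS = re.compile(r"\s*\([^)]*\)\s*$")


def parse_track(raw, set_name):
    """Split one raw tracklist entry into the fields used for analysis."""
    # Everything before the first " - " is treated as the artist credit.
    artist_part, _, title_part = raw.partition(" - ")
    artists = [a.strip() for a in ARTIST_SPLIT.split(artist_part) if a.strip()]
    # Drop a trailing "(... Remix)" style suffix to get the simple title.
    simple_title = TRAILING_PARENS.sub("", title_part).strip()
    return {
        "track_artist": artists[0] if artists else "",
        "featured_artists": artists[1:],
        "track_basic": simple_title,
        "track_full": title_part.strip(),
        "set": set_name,
    }


example = parse_track("Artist A ft. Artist B - Some Title (Artist C Remix)", "DJ One")
print(example)
```

A function like this could be applied row by row to the combined Pandas DataFrame; keeping both the full and the simple title makes it possible to count a song once while still listing its remixes.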

Initial Insights

Link to interactive dashboard on Tableau Public: HERE

I was surprised to see that Boombox Cartel got the most plays of any artist at EDC (28 plays across 11 sets, with the most popular song being “Jefe”).

Additionally, I found it amusing that the second most played artist at EDC was not even there… Kendrick Lamar got a whopping 26 plays across 15 separate sets.

When the track_basic field is expanded within the “Most Played Songs” chart, you can find the most commonly played remixes. I have personally found this to be a gold mine for new twists on old classics, especially for some of the songs that were beginning to feel a bit overplayed as originals (I’m looking at you, Propaganda).
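The "plays across sets" numbers and the remix breakdown boil down to simple counting. The sketch below shows one way to compute them with the standard library; the rows are made-up placeholders rather than real EDC data, and the tuple layout is an assumption about how the parsed fields might be arranged.

```python
from collections import Counter, defaultdict

# Hypothetical rows of (set, track_artist, track_basic, full title); not real EDC data.
rows = [
    ("DJ One", "Artist A", "Song X", "Song X"),
    ("DJ One", "Artist A", "Song Y", "Song Y (Artist C Remix)"),
    ("DJ Two", "Artist A", "Song X", "Song X (Artist D Remix)"),
    ("DJ Two", "Artist B", "Song Z", "Song Z"),
]

plays = Counter()                      # total plays per artist
sets_by_artist = defaultdict(set)      # distinct sets each artist appeared in
versions_by_song = defaultdict(Counter)  # remix/version counts per simple title

for set_name, artist, basic, full in rows:
    plays[artist] += 1
    sets_by_artist[artist].add(set_name)
    versions_by_song[basic][full] += 1

# Artists ranked by total plays, with the number of distinct sets alongside.
for artist, n in plays.most_common():
    print(f"{artist}: {n} plays across {len(sets_by_artist[artist])} sets")

# Versions of one song, most played first.
print(versions_by_song["Song X"].most_common())
```

Counting plays and distinct sets separately is what makes a statement like "26 plays across 15 separate sets" possible: an artist can be played several times within a single set.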

Going One Deeper

Although the summary values were interesting, I wanted to explore the connections between all the sets, songs, and artists in more depth. To do this, I used the Python library NetworkX, which let me model those entities as a graph and analyze the relationships between them.

(Above) Details from the Noisecontrollers set

Nodes:

  • DJ (red icon): The artist playing the set at EDC (can also create songs)
  • Track Artist (blue icon): An artist not at EDC that contributed in some way to creating the song
  • Song (gray icon): The track that was played in the set at EDC

Connections (aka Edges):

  • Played (red edge): When a DJ plays a track they did not create in their set
  • Created/Played (purple edge): When a DJ plays a track they created in their own set
  • Created (blue edge): When an Artist contributes to the creation of a song that was played during an EDC set
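The node and edge types above can be sketched in NetworkX roughly as follows. This is an illustration under assumptions, not the code behind the dashboard: the input shape (one tuple of DJ, song, and list of creators per play), the attribute names kind and relation, and the tuple node IDs are all choices made for this example.

```python
import networkx as nx


def build_graph(plays):
    """plays: list of (dj, song, creators), where creators is a list of artist names."""
    djs = {dj for dj, _, _ in plays}
    G = nx.Graph()
    for dj, song, creators in plays:
        # Tuple IDs keep a song and an artist with the same name from colliding.
        dj_node, song_node = ("artist", dj), ("song", song)
        G.add_node(dj_node, kind="dj")
        G.add_node(song_node, kind="song")
        # A DJ playing their own track gets the combined created/played edge.
        relation = "created_played" if dj in creators else "played"
        G.add_edge(dj_node, song_node, relation=relation)
        for artist in creators:
            artist_node = ("artist", artist)
            # Anyone who also played a set counts as a DJ, otherwise a track artist.
            G.add_node(artist_node, kind="dj" if artist in djs else "track_artist")
            # Never overwrite an existing edge, so created/played is not downgraded.
            if not G.has_edge(artist_node, song_node):
                G.add_edge(artist_node, song_node, relation="created")
    return G


# Hypothetical plays, not real EDC data.
G = build_graph([
    ("DJ One", "Song X", ["DJ One", "Artist A"]),
    ("DJ Two", "Song X", ["DJ One", "Artist A"]),
    ("DJ Two", "Song Y", ["Artist B"]),
])
print(G.number_of_nodes(), G.number_of_edges())
```

An undirected graph is enough here because the edge type already records who played and who created; a directed graph would also work if the direction itself mattered for the analysis.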

Link to the interactive dashboard on Tableau Public: HERE

This type of view allows me to search “Kendrick Lamar” and see all the DJs who played his songs, which songs they played, and any other artists who collaborated with Kendrick on those songs (see example below).

I could also search the song “M.A.A.D City” to see which sets it was played in and which artists helped create it.

The views become increasingly complex if you search an artist (like Flosstradamus) who played a full set, but also had their songs played in many other sets (see below).

This is where some of the summary values at the top of the dashboard help. We can see that Flosstradamus songs were involved in one way or another in 18 sets, and that 50 artists either collaborated on Flosstradamus songs or created songs that Flosstradamus played during their set. I find this illustrative of the fact that no set happens in a bubble: it takes many members of the community to make each one happen (everyone from Darude to Pitbull).
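Summary values like these can be read off the graph by walking two hops out from an artist: first to their songs, then to everyone else attached to those songs. The sketch below is one plausible way to do that in NetworkX; the exact counting rules behind the dashboard numbers are not spelled out here, so the definitions in summarize (a hypothetical helper) are assumptions, and the tiny graph is placeholder data.

```python
import networkx as nx

# Hypothetical graph, not real EDC data. Edge attribute names are assumptions.
G = nx.Graph()
edges = [
    (("artist", "DJ One"), ("song", "Song X"), "created_played"),
    (("artist", "Artist A"), ("song", "Song X"), "created"),
    (("artist", "DJ Two"), ("song", "Song X"), "played"),
    (("artist", "DJ Two"), ("song", "Song Y"), "played"),
    (("artist", "Artist B"), ("song", "Song Y"), "created"),
]
for u, v, relation in edges:
    G.add_edge(u, v, relation=relation)


def summarize(G, name):
    """Return (songs, sets involved, other creators) for one artist."""
    node = ("artist", name)
    songs = set(G.neighbors(node))  # every song the artist created or played
    sets_involved, other_artists = set(), set()
    for song in songs:
        for other in G.neighbors(song):
            relation = G.edges[other, song]["relation"]
            # Whoever played the song did so in their own set.
            if relation in ("played", "created_played"):
                sets_involved.add(other)
            # Anyone else with a creator credit on the song.
            if relation in ("created", "created_played") and other != node:
                other_artists.add(other)
    return songs, sets_involved, other_artists


songs, sets_involved, other_artists = summarize(G, "DJ Two")
print(len(songs), len(sets_involved), len(other_artists))
```

Searching a song instead of an artist is the one-hop version of the same idea: the neighbors of the song node are the DJs who played it and the artists who created it.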

This network view is quite dynamic and there are many more interesting nuggets that are still undiscovered… check it out and let me know what you find!