EDA on Attribute Variation Across Spotify Subgenres

Different Approaches:

  • Content-Based: Songs are recommended based on their content similarity to other songs
  • Collaborative: The model recommends songs based on the preferences of similar users

Data Preprocessing

Identify the relevant columns and decide how to weight them

  • Numeric
    track_popularity, year, danceability, energy, key, loudness, mode, speechiness, acousticness, instrumentalness, liveness, valence, tempo, duration_ms
  • String / Categorical
    track_artist, playlist_genre, playlist_subgenre

Normalize / Standardize the numeric data
Vectorize the categorical data
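
The two preprocessing steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes z-score standardization for the numeric columns and one-hot encoding as the vectorization of a categorical column. The helper names `standardize` and `one_hot` and the toy values are hypothetical.

```python
import numpy as np

def standardize(X):
    """Z-score each column: subtract the mean, divide by the standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns become all zeros instead of dividing by zero
    return (X - mean) / std

def one_hot(values):
    """Map category labels to a 0/1 matrix with one column per distinct label."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = np.zeros((len(values), len(categories)))
    for row, value in enumerate(values):
        encoded[row, index[value]] = 1.0
    return encoded, categories

# Toy rows with made-up values: danceability, energy, tempo
numeric = [[0.75, 0.92, 122.0],
           [0.48, 0.31, 84.0],
           [0.66, 0.70, 128.0]]
genres = ["pop", "rock", "pop"]  # stands in for playlist_genre

numeric_scaled = standardize(numeric)
genre_encoded, genre_labels = one_hot(genres)

# One feature row per song: scaled numeric columns followed by the one-hot columns
features = np.hstack([numeric_scaled, genre_encoded])
print(features.shape)
```

Column weights, if used, could be applied at this point by multiplying each scaled column by a chosen factor before the rows are stacked.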

Data Processing

Perform principal component analysis (PCA)
Save each song as its vector of principal component scores
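
One way to carry out this step is PCA through a singular value decomposition of the centered feature matrix. The sketch below assumes that approach. The function name `pca`, the number of components, and the random placeholder matrix are illustrative assumptions, not values taken from the project.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto their first n_components principal directions."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T  # one row of component scores per song
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, components, explained[:n_components]

# Placeholder feature matrix (random numbers, not real songs): 50 songs, 6 features
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 6))

scores, components, explained = pca(features, n_components=3)
print(scores.shape)     # each song is now a 3-value vector
print(explained.sum())  # fraction of total variance kept by the 3 components
```

Each row of `scores` is the reduced representation of one song. The explained-variance ratios give a rough guide for choosing how many components to keep.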

Finding Similarity

Compute the cosine similarity between song vectors and return the most similar songs
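
The similarity lookup can be sketched as below, assuming each song is stored as a row vector (for example its PCA scores). The function name `most_similar`, the `top_k` parameter, and the toy vectors are hypothetical.

```python
import numpy as np

def most_similar(vectors, query_index, top_k=3):
    """Return (index, cosine similarity) pairs for the songs closest to the query."""
    V = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # all-zero vectors get similarity 0 instead of NaN
    unit = V / norms
    # Dot product of unit vectors equals the cosine of the angle between them
    sims = unit @ unit[query_index]
    # Rank from most to least similar, skipping the query song itself
    ranked = [int(i) for i in np.argsort(-sims) if i != query_index]
    return [(i, float(sims[i])) for i in ranked[:top_k]]

# Toy 2-D song vectors (made-up values)
song_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
result = most_similar(song_vectors, query_index=0, top_k=2)
print(result)
```

Cosine similarity compares only the direction of two vectors and ignores their length, so it ranges from -1 (opposite) to 1 (same direction).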