Spotify Integration: Playlists and Listening History
Spotify.RmdThis vignette covers the Spotify side of lyricsLab: authenticating, pulling your listening data, analyzing entire playlists, and visualizing listening patterns.
Authentication
You need a Spotify app before you start. Go to https://developer.spotify.com/dashboard, create an app,
and set the redirect URI to exactly
http://127.0.0.1:1410/.
Store your credentials in your environment — the easiest way is to
add them to ~/.Renviron so they load automatically every
session:
SPOTIFY_CLIENT_ID=your_client_id
SPOTIFY_CLIENT_SECRET=your_client_secret
GENIUS_API_TOKEN=your_genius_token
Then authenticate:
This opens a browser window. Log in, authorize the app, and you’ll be
redirected back automatically. The token is saved to your global
environment for the session. It expires after about an hour — just run
spotify_auth() again if you hit auth errors.
The scopes requested are: recently played, top tracks, saved library, and private playlists.
Your listening history
get_recently_played() returns your 50 most recent plays
(the maximum Spotify allows per request):
history <- get_recently_played()
#>
#> === Recently Played ===
#> Tracks retrieved: 50
#> From: 2024-01-10 08:14
#> To: 2024-01-12 23:47
#> Time span: 63.5 hoursThe returned data frame has these columns: played_at,
track_name, track_id,
artist_name, artist_id,
album_name, album_id,
duration_ms, popularity.
Quick summaries directly from the data frame:
# Most played artists
sort(table(history$artist_name), decreasing = TRUE)
# Most played tracks
sort(table(history$track_name), decreasing = TRUE)
# Average popularity of what you've been listening to
mean(history$popularity, na.rm = TRUE)Listening patterns
plot_listening_patterns() takes the history data frame
and a type argument:
# By hour of day
plot_listening_patterns(history, type = "hourly")
# By date (trend over time)
plot_listening_patterns(history, type = "daily")
# By day of week
plot_listening_patterns(history, type = "weekly")Listening diversity
calculate_listening_diversity() measures how spread out
your listening is across artists using Shannon entropy, producing a
0–100 diversity score:
calculate_listening_diversity(history)
#>
#> === Listening Diversity ===
#> Unique artists: 23
#> Unique tracks: 47
#> Total plays: 50
#> Top artist: Taylor Swift (18.0% of plays)
#> Diversity score: 71.4 / 100
#> Interpretation: Moderately diverse - good mix of variety and favoritesA score above 80 means you explore broadly; below 40 means you mostly stick to a handful of artists.
Your top tracks
get_top_tracks() pulls your Spotify top tracks. The
time_range argument controls the window:
# Last 4 weeks
recent_tops <- get_top_tracks(time_range = "short_term", limit = 20)
# Last 6 months (default)
medium_tops <- get_top_tracks(time_range = "medium_term", limit = 20)
# All time
all_time <- get_top_tracks(time_range = "long_term", limit = 50)
# View ranked list
all_time[, c("rank", "track_name", "artist_name", "popularity")]Playlist tracks
get_playlist_tracks() accepts a full Spotify URL or just
the playlist ID:
playlist <- get_playlist_tracks(
"https://open.spotify.com/playlist/37i9dQZF1DXcBWIGoYBM5M"
)
# Or just the ID
playlist <- get_playlist_tracks("37i9dQZF1DXcBWIGoYBM5M")This works for playlists of any size. The returned data frame
includes track_name, artist_name,
album_name, popularity,
duration_ms, added_at, and
added_by.
Playlist lyrics analysis
analyze_playlist() combines playlist fetching with
Genius lyrics retrieval and full analysis for every song. It calls
get_lyrics(), sentiment_score(),
complexity_score(), and theme_detection() for
each track, with rate limiting between requests to avoid hitting API
limits.
analysis <- analyze_playlist(
"https://open.spotify.com/playlist/37i9dQZF1DXcBWIGoYBM5M",
max_songs = 20, # limit for speed, default 50
delay = 2 # seconds between requests
)Progress is printed as each song is processed:
Analyzing lyrics for 20 songs...
[1/20] Anti-Hero by Taylor Swift
+ Sentiment: -0.21 | Complexity: 28.4/100 | Theme: Self-Reflection
[2/20] Flowers by Miley Cyrus
+ Sentiment: 0.84 | Complexity: 31.2/100 | Theme: Empowerment
...
=== PLAYLIST ANALYSIS COMPLETE ===
Playlist: Today's Top Hits
Analyzed: 17/20 tracks
Avg sentiment: 0.12
Avg complexity: 29.8/100
Most common theme: Love/Romance
Songs that can’t be found on Genius (instrumentals, very new releases, non-English tracks) are skipped and listed at the end.
Playlist plots
Each plot function takes the analysis result directly:
# Distribution of sentiment scores across the playlist
plot_sentiment(analysis)
# Distribution of complexity scores
plot_complexity(analysis)
# Which themes appear most across the playlist
plot_themes(analysis)
# Is there a relationship between complexity and sentiment?
plot_scatter(analysis)
# The 10 most lyrically complex songs
plot_top_complex(analysis)