Work · 2026 · active

SpotifyScraper

Extract public Spotify data in Python (tracks, albums, playlists, and podcasts) without an account, API key, or OAuth. Lyrics are supported too, but need a Spotify account cookie.

260 Stars
29 Forks
396 Commits

SpotifyScraper turns Spotify’s public web player into a typed Python interface. Point it at any public Spotify URL and it returns structured data — tracks, albums, artists, playlists, and podcasts, along with cover art and preview audio — without an account, an API key, or OAuth. No app registration, no client ID and secret: just the data the web player already shows any browser.
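To make "point it at any public Spotify URL" concrete, here is a minimal sketch of the first step any such tool needs: turning a web-player link into a resource kind and an ID. The names `SpotifyRef` and `parse_spotify_url` are hypothetical and are not the library's API; the sketch only assumes the familiar `open.spotify.com/<kind>/<id>` link shape, with an optional `intl-xx` locale prefix.

```python
from typing import NamedTuple
from urllib.parse import urlparse

# Resource kinds named on this page (hypothetical constant, not the library's).
KINDS = {"track", "album", "artist", "playlist", "show", "episode"}


class SpotifyRef(NamedTuple):
    kind: str
    id: str


def parse_spotify_url(url: str) -> SpotifyRef:
    """Split a public web-player URL into (kind, id), or raise ValueError."""
    parsed = urlparse(url)
    if parsed.netloc != "open.spotify.com":
        raise ValueError(f"not a Spotify web-player URL: {url!r}")
    parts = [p for p in parsed.path.split("/") if p]
    # Drop a locale prefix such as "intl-de" when present.
    if parts and parts[0].startswith("intl-"):
        parts = parts[1:]
    if len(parts) != 2 or parts[0] not in KINDS or not parts[1].isalnum():
        raise ValueError(f"unrecognized Spotify URL: {url!r}")
    return SpotifyRef(kind=parts[0], id=parts[1])
```

Query strings such as `?si=...` are ignored because only the path is inspected.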

Try it live: paste any track, album, or playlist link and watch it resolve in your browser, alongside the exact Python that does it. No install required.

The problem

Spotify’s official Web API is capable, but gated: it expects app registration, OAuth tokens, and rate-limited credentials, and it withholds some of what the public web player renders freely. For research, analytics, and small tools that only need to read public metadata, that overhead is a poor fit. SpotifyScraper closes the gap — the data the player already shows a browser, available as clean Python objects.

The approach

One early decision set the project’s character: read the public web player directly over HTTP rather than driving a headless browser. Browser automation is heavy, slow, and brittle under load; a focused HTTP client is faster, lighter, and easier to reason about. SpotifyScraper is built on httpx, with sync and async clients sharing one sans-io core, a single runtime dependency, and typed, frozen models. Extraction is two-tier: the primary path uses an anonymous access token plus Spotify’s GraphQL pathfinder, and an embed-page fallback takes over when the pathfinder changes. Both tiers sit behind a typed API and a command-line interface. Search rides the same anonymous tier; the authenticated extras (lyrics, transcripts, account state) ride a cookie-derived web token. Going direct means understanding the web player’s structure closely rather than rendering it blindly; that depth-over-brute-force tradeoff is the point.
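The shape described above, one sans-io core driven by both a sync and an async client with a two-tier fallback, can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: every name here (`Request`, `Response`, `track_plan`, `run_sync`, `run_async`) is hypothetical, the URLs are placeholders, and a fake transport stands in for httpx so the example runs offline.

```python
import asyncio
import json
from dataclasses import dataclass
from typing import Awaitable, Callable, Generator


@dataclass(frozen=True)
class Request:
    url: str


@dataclass(frozen=True)
class Response:
    status: int
    body: str


@dataclass(frozen=True)
class Track:
    id: str
    name: str


class ExtractionError(RuntimeError):
    """Raised when both tiers fail."""


Plan = Generator[Request, Response, Track]


def track_plan(track_id: str) -> Plan:
    """Sans-io core: yields requests, receives responses, does no I/O itself."""
    # Tier 1: placeholder for the primary (GraphQL-style) endpoint.
    primary = yield Request(f"https://primary.invalid/track/{track_id}")
    if primary.status == 200:
        return Track(id=track_id, name=json.loads(primary.body)["name"])
    # Tier 2: placeholder for the embed-page fallback.
    fallback = yield Request(f"https://fallback.invalid/embed/track/{track_id}")
    if fallback.status == 200:
        return Track(id=track_id, name=json.loads(fallback.body)["title"])
    raise ExtractionError(f"both tiers failed for track {track_id}")


def run_sync(plan: Plan, send: Callable[[Request], Response]) -> Track:
    """Blocking driver: performs each yielded request with `send`."""
    try:
        request = next(plan)
        while True:
            request = plan.send(send(request))
    except StopIteration as done:
        return done.value


async def run_async(plan: Plan, send: Callable[[Request], Awaitable[Response]]) -> Track:
    """Async driver: same core, but the transport is awaited."""
    try:
        request = next(plan)
        while True:
            request = plan.send(await send(request))
    except StopIteration as done:
        return done.value


def fake_send(request: Request) -> Response:
    """Offline stand-in transport: the primary tier is down, the fallback works."""
    if "primary" in request.url:
        return Response(503, "")
    return Response(200, json.dumps({"title": "Example Song"}))


async def fake_send_async(request: Request) -> Response:
    return fake_send(request)


if __name__ == "__main__":
    print(run_sync(track_plan("abc"), fake_send))
    print(asyncio.run(run_async(track_plan("abc"), fake_send_async)))
```

Because the generator never touches the network, the extraction logic and the fallback order are written once, and only the thin drivers differ between sync and async.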

What it does

  • Reads tracks, albums, artists, playlists, shows, and podcast episodes — plus an anonymous aggregate search across all of them.
  • Downloads cover art and ~30-second previews; fetches time-synced lyrics and podcast transcripts with a Spotify account cookie.
  • Exposes a typed API and a command-line interface, with sync and async clients sharing one sans-io core.
  • Speaks to agents and LLMs through an MCP server (spotifyscraper-mcp) — the same reads as batch tools, plus a one-call get_track_visuals for cover art, palette, and Canvas — shipped as a package extra and a container image.
  • Resolves many inputs at once through batch helpers (get_tracks, get_albums, …) with per-item error handling, and keeps repeat reads cheap with an optional on-disk response cache.
  • Adds localized display names (BCP-47), account-awareness (get_account / is_premium), polite rate-limit handling, and browser-assisted login when a cookie isn’t enough.
  • Ships to PyPI as spotifyscraper (MIT-licensed), with optional media / browser / cli extras and reference docs on Read the Docs.
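Two of the items above, batch reads with per-item error handling and an on-disk response cache, follow a pattern that is easy to show in isolation. The sketch below is an assumption about the general technique, not the library's implementation: `BatchResult`, `fetch_many`, and `disk_cached` are hypothetical names, and the fetch function is passed in so the example needs no network.

```python
import hashlib
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Callable, List, Optional


@dataclass(frozen=True)
class BatchResult:
    """Outcome for one input: either a value or the error it raised."""
    input: str
    value: Optional[Any] = None
    error: Optional[Exception] = None


def fetch_many(inputs: List[str], fetch_one: Callable[[str], Any]) -> List[BatchResult]:
    """Resolve every input; one failure never aborts the rest of the batch."""
    results = []
    for item in inputs:
        try:
            results.append(BatchResult(item, value=fetch_one(item)))
        except Exception as exc:  # recorded per item instead of propagating
            results.append(BatchResult(item, error=exc))
    return results


def disk_cached(fetch_one: Callable[[str], Any], cache_dir: Path) -> Callable[[str], Any]:
    """Wrap a fetcher so JSON-serializable results are stored on disk by key."""
    cache_dir.mkdir(parents=True, exist_ok=True)

    def wrapper(key: str) -> Any:
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        path = cache_dir / f"{digest}.json"
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))
        value = fetch_one(key)  # failures are not cached
        path.write_text(json.dumps(value), encoding="utf-8")
        return value

    return wrapper
```

A caller would inspect each `BatchResult` and retry or report only the failed inputs. Cache expiry is deliberately left out of this sketch.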

Reliability is the hard part

Scraping a product you do not control means the target moves: the web player changes, and a scraper that worked yesterday can break tomorrow. SpotifyScraper treats that as an ongoing discipline rather than a one-time build — typed parsing that fails loudly instead of silently, documented error recovery, and a public issue history that keeps the project honest. The latest round shipped as v3.7.0: an MCP server that hands the whole library to Claude and other LLMs — batch tools plus a one-call get_track_visuals — on top of the anonymous search, podcast transcripts, and response cache added along the way, all tracking the current web player. That maintenance loop, not the first release, is what the project is really about.
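"Typed parsing that fails loudly" is worth a small illustration. The following is a minimal sketch of the idea, assuming a made-up payload shape; `ParseError`, `require`, and `parse_track` are hypothetical names, and the field names are illustrative rather than Spotify's actual response format.

```python
from dataclasses import dataclass
from typing import Any, Mapping


class ParseError(ValueError):
    """Raised when a payload no longer matches the expected shape."""


def require(payload: Mapping[str, Any], key: str, expected_type: type) -> Any:
    """Return payload[key], or raise ParseError naming exactly what is wrong."""
    if key not in payload:
        raise ParseError(f"missing field {key!r}")
    value = payload[key]
    if not isinstance(value, expected_type):
        raise ParseError(
            f"field {key!r} should be {expected_type.__name__}, "
            f"got {type(value).__name__}"
        )
    return value


@dataclass(frozen=True)
class Track:
    id: str
    name: str
    duration_ms: int


def parse_track(payload: Mapping[str, Any]) -> Track:
    """Build a frozen Track; any drift in the payload surfaces as ParseError."""
    return Track(
        id=require(payload, "id", str),
        name=require(payload, "name", str),
        duration_ms=require(payload, "duration_ms", int),
    )
```

The alternative, filling missing fields with `None` or empty strings, would let a web-player change pass unnoticed until bad data showed up downstream; raising at the parse boundary turns that change into an immediate, reportable error.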

Where it is used

Music analytics, academic and data-science research, content tooling, and personal projects — anywhere a typed, credential-free read of public Spotify data is simpler than the official API.

Common questions

Can I get Spotify data without an account or API key? Yes. SpotifyScraper reads Spotify’s public web player, so it pulls track, album, artist, playlist, and podcast data without an account, an API key, app registration, or OAuth.

How do I download a Spotify track preview or cover art? The library can download cover art and the ~30-second preview clip whenever Spotify publishes one — see the live demo, which shows the exact Python for any link.

Does it use the official Spotify API? No. It’s an independent, unofficial reader of the public web player, not affiliated with or endorsed by Spotify, intended for public metadata, research, and small tools.

boiler room — ali@aliakhtari.com