
An activist archiving group says it has scraped a massive portion of Spotify’s music catalog, potentially creating one of the largest unauthorized music archives ever assembled.
Anna’s Archive, which describes itself as an “open source search engine for shadow libraries,” says the project is meant as a long-term “preservation archive” of modern music rather than a commercial piracy effort. Spotify says it’s actively investigating.
The numbers are massive
According to Anna’s Archive, the scrape includes roughly 86 million audio tracks, compared with Spotify’s stated library of more than 100 million songs. The group has already released a database containing 256 million rows of metadata, with plans to distribute the actual audio files later via peer-to-peer networks.
If fully released, the archive could total hundreds of terabytes, dwarfing existing open music databases like MusicBrainz, which contains around five million tracks.
Why this is bigger than piracy
Beyond copyright concerns, the scale of the data immediately raised eyebrows across the AI world. A publicly accessible corpus of tens of millions of labeled music files would be an enormous prize for companies training generative audio and music models.
Some industry observers noted that, in theory, anyone with enough storage could recreate a personal, offline version of Spotify, with the main barriers being copyright enforcement and legal risk.
Spotify responds
Spotify confirmed it identified and disabled accounts involved in the scrape, saying the group used illicit tactics to bypass protections. The company added it has implemented new safeguards and reiterated its stance against piracy, emphasizing protection of artists and rights holders.
As of now, only metadata has been publicly released. But if the audio files follow, this could become one of the most consequential copyright clashes the streaming industry has faced in years.