4000 Middlefield Road, Room H-1
Palo Alto, California 94105


Gracenote: A Case Study in Massively Scalable B2B2C System Architecture

Gracenote is the home of CDDB. When you insert a CD into your player, nothing on the disc records its name, track titles, or artists. The only identifying information is the length of each track. But the sequence of track lengths turns out to be a virtual fingerprint -- almost unique -- and Gracenote has built a database and recognition technology that can automatically identify virtually any CD ever made. Their technology is used in almost every CD and MP3 player application, including Apple's iTunes, as well as in many consumer electronics devices. Since the 1996 debut of their CD recognition service, then known as CDDB, Gracenote has expanded its technology into new forms of media recognition, including music search engines and audio file recognition.
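
To make the fingerprint idea concrete, here is a minimal sketch of looking up a disc by its track lengths alone. It is an illustration only, not Gracenote's or CDDB's actual algorithm: the hashing scheme, the function names (disc_fingerprint, register, lookup), and the in-memory catalog are all assumptions made for this example.

```python
import hashlib

# Hypothetical in-memory "database" mapping fingerprints to disc titles.
catalog = {}


def disc_fingerprint(track_lengths):
    """Derive a short hex key from the ordered list of track lengths.

    Assumption: lengths are integers (for example, in CD frames). The
    real service's ID scheme is not described in the text above.
    """
    payload = ",".join(str(n) for n in track_lengths)
    return hashlib.sha1(payload.encode("ascii")).hexdigest()[:16]


def register(track_lengths, title):
    """Store a title under the fingerprint of its track lengths."""
    catalog[disc_fingerprint(track_lengths)] = title


def lookup(track_lengths):
    """Return the stored title, or None if the disc is unknown."""
    return catalog.get(disc_fingerprint(track_lengths))
```

For example, after register([15000, 18230, 20111], "Example Album"), calling lookup with the same three lengths returns the title, while a disc with even one different track length is not found. An exact hash like this tolerates no measurement variation between drives, so a real recognition service presumably needs something more forgiving.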

Gracenote has built a system that can scale easily to serve a massive number of users (essentially without practical limit), though their current design doesn't allow for their dataset to scale without restriction.

How did they do that? One of their simplifying assumptions is that their dataset, which is bounded in size (there are only so many albums and movies in the world), will fit comfortably on each individual server of a given type (there are many types, however). Data is deliberately segmented by type so that it fits this way, rather than arbitrarily divided across machines as in a cluster. While they're a B2B company, most of what they do is direct end-user interaction -- most of their traffic comes through partners but is initiated by end users. Thus B2B2C.
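
One way to picture this design: if every server of a given type holds that type's entire dataset, any replica can answer any request of that type, and serving more users is just a matter of adding replicas. The sketch below illustrates that idea under stated assumptions; the class name, server names, request types, and round-robin policy are hypothetical and are not taken from Gracenote's system.

```python
class ReplicaPool:
    """A set of interchangeable servers that each hold one full dataset."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self._count = 0  # number of requests routed so far

    def pick(self):
        """Choose the next replica in simple round-robin order."""
        server = self.replicas[self._count % len(self.replicas)]
        self._count += 1
        return server

    def add_replica(self, name):
        """Scale out for more users; no data needs to be re-divided."""
        self.replicas.append(name)


# Hypothetical server types, one pool per type of data.
pools = {
    "cd": ReplicaPool(["cd-1", "cd-2"]),
    "audio_file": ReplicaPool(["file-1"]),
}


def route(request_type):
    """Send a request to any replica of the matching server type."""
    return pools[request_type].pick()
```

The trade-off matches the one described above: user load scales by calling add_replica, but each type's dataset is still limited to what a single server can hold.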

Free for SDForum Members, $15.00 for non-members.

