Curt Monash recently attended a “bit of a bash” with, among others, Ken Rudin of Zynga, the social gaming company. Of particular interest was Zynga’s approach to analytic database design:
- Data is divided into two parts. One part has a pretty ordinary schema; the other is just stored as a huge list of name-value pairs. (This is much like eBay’s approach with its Teradata-based Singularity, except that eBay puts the name-value pairs into long character strings.) About half the data is in each part, but I don’t think that’s by deliberate choice.
- Zynga adds data into the real schema when it’s clear it will be needed for a while. This isn’t a matter of query volumes, for the most part; rather, it’s when Zynga’s tests (e.g. of new games?) have determined that the data will keep being collected and used for a while.
- Zynga only adds columns to its analytic database; it never goes through the more complex process of deleting them.
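To make the pattern in the bullets above concrete, here is a minimal sketch in Python, with SQLite standing in for the analytic database. Everything in it is an assumption made for illustration: the table names (`events`, `event_attrs`), the column names, the sample values, and the helpers `create_schema`, `record_event`, and `promote_attribute` are hypothetical. The post does not say which database Zynga uses or how it actually moves data into the real schema.

```python
import sqlite3


def create_schema(conn):
    # "Ordinary" part: a conventional table with fixed columns (hypothetical).
    conn.execute(
        "CREATE TABLE events ("
        "event_id INTEGER PRIMARY KEY, user_id INTEGER, game TEXT)"
    )
    # Flexible part: one row per name-value pair (hypothetical).
    conn.execute(
        "CREATE TABLE event_attrs (event_id INTEGER, name TEXT, value TEXT)"
    )


def record_event(conn, user_id, game, **attrs):
    # Attributes that already have a real column go there; the rest are
    # stored as name-value pairs.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(events)")}
    cur = conn.execute(
        "INSERT INTO events (user_id, game) VALUES (?, ?)", (user_id, game)
    )
    event_id = cur.lastrowid
    for name, value in attrs.items():
        if name in columns:
            conn.execute(
                f'UPDATE events SET "{name}" = ? WHERE event_id = ?',
                (str(value), event_id),
            )
        else:
            conn.execute(
                "INSERT INTO event_attrs (event_id, name, value) "
                "VALUES (?, ?, ?)",
                (event_id, name, str(value)),
            )
    return event_id


def promote_attribute(conn, name):
    # Add-only schema change: create a real column and backfill it from the
    # existing name-value pairs. Nothing is ever dropped or deleted.
    if not name.isidentifier():
        raise ValueError(f"unsafe column name: {name!r}")
    conn.execute(f'ALTER TABLE events ADD COLUMN "{name}" TEXT')
    conn.execute(
        f'UPDATE events SET "{name}" = ('
        "SELECT value FROM event_attrs a "
        "WHERE a.event_id = events.event_id AND a.name = ?)",
        (name,),
    )


conn = sqlite3.connect(":memory:")
create_schema(conn)
record_event(conn, 1, "game_a", gift_type="seed")   # stored as a pair
record_event(conn, 2, "game_a", level=3)            # stored as a pair
promote_attribute(conn, "gift_type")                # now a real column
record_event(conn, 3, "game_a", gift_type="tree")   # stored in the column
rows = conn.execute(
    "SELECT event_id, gift_type FROM events ORDER BY event_id"
).fetchall()
```

In this sketch the old name-value rows stay where they are after promotion, which mirrors the add-only habit described above: the schema only grows, and no cleanup step is needed.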