PNUTS is a geographically distributed database developed and in use at Yahoo!. PNUTS works off a simple relational model in which data is organized as tables and columns. It allows modifications to the schema without halting queries and updates but the current version does not seem to support constraints.
Consistency Model: It is based on a consistency model that could be placed between the two extremes of serializability and eventual consistency. PNUTS provides per record timeline consistency where each record is designated a master replica and all updates to that record are forwarded to the master. Based on workload characteristics the master replica for a record can adaptively shift.
Each record has a sequence number which is made of a generation identifier and a version number of that record. Whenever a new record is inserted the generation changes where as version numbers change on updates. These two together form the basis for a timeline scheme. In this record level mastering mechanism mastership is assigned on a per record basis and this implies that different records of the same table can be mastered in very different machines/clusters. This is desirable because the web workloads showed high write locality on per record basis. About 85 percent of the writes, within a week, to a record originated in the same data center.
A read of this record from any of the replicas will return a version of the record that is consistent with this timeline. Using this consistency model a number of APIs with varying consistency guarantees. Read-any (returns any version of the record), Read-critical(required-version) (returns a version of the record that is newer than the supplied version number), Read-latest, Write etc are some of the APIs that it offers.
It also has interesting publish-subscribe features by which applications can subscribe to updates happening on a table. Its good to see a database engine offer such a feature natively. In PNUTS this topic based pub/sub mechanism (they use Yahoo! Message Broker developed in house) forms the backbone for reliability, replication and recovery.
In case of multi-record requests, a specialized component called the Scatter Gather Engine, is responsible for orchestrating all the activity that includes issuing the calls to different storage units in parallel, aggregating their responses and responding back to the client. When a request deals with only a single record then it is forwarded directly to the storage unit that holds the record.
When Range queries are issued the top K results are returned from the server along with a Continuation object. This object, essentially a forward pointer, contains a modified range query which when issued restarts the range scan at the point where the previous query left off. As of now no complex queries are supported.
Its currently offered as a centrally hosted service. They seem to workaround the problems created by varying workloads of different applications by assigning applications to different storage units. Applications within Yahoo! that use PNUTs include social applications, comparison shopping applications. In a few cases its also used as a generic session state server.
Abstract: We describe PNUTS, a massively parallel and geographically distributed database system for Yahoo!â€™s web applications. PNUTS provides data storage organized as hashed or ordered tables, low latency for large numbers of concurrent requests including updates and queries, and novel per-record consistency guarantees. It is a hosted, centrally managed, and geographically distributed service, and utilizes automated load-balancing and failover to reduce operational complexity. The ï¬rst version of the system is currently serving in production. We describe the motivation for PNUTS and the design and implementation of its table storage and replication layers, and then present experimental results.
Previewing from http://research.yahoo.com/files/pnuts.pdf