Abstract:
We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clusters of unreliable object storage devices (OSDs). We leverage device intelligence by distributing data replication, failure detection and recovery to semi-autonomous OSDs running a specialized local object file system. A dynamic distributed metadata cluster provides extremely efficient metadata management and seamlessly adapts to a wide range of general purpose and scientific computing file system workloads. Performance measurements under a variety of workloads show that Ceph has excellent I/O performance and scalable metadata management, supporting more than 250,000 metadata operations per second.
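To illustrate what replacing allocation tables with a distribution function means in practice, the sketch below computes an object's replica locations directly from its name and a list of device weights, so any client can find the data without consulting a lookup table. It is not CRUSH: it uses weighted rendezvous hashing as a simpler stand-in, and it ignores the cluster hierarchy and placement rules that CRUSH takes into account. The function names, device names, and weights are all hypothetical.

```python
import hashlib
import math


def _unit_hash(object_id: str, device: str) -> float:
    """Map an (object, device) pair to a deterministic float strictly inside (0, 1)."""
    digest = hashlib.sha256(f"{object_id}:{device}".encode()).digest()
    # Keep 52 bits so that value + 0.5 is exactly representable as a double.
    value = int.from_bytes(digest[:8], "big") >> 12
    return (value + 0.5) / 2**52


def place(object_id: str, weights: dict[str, float], replicas: int = 3) -> list[str]:
    """Pick `replicas` distinct devices for an object by weighted rendezvous hashing.

    Illustrative stand-in for a pseudo-random placement function, not CRUSH itself.
    """
    scores = {
        device: -weight / math.log(_unit_hash(object_id, device))
        for device, weight in weights.items()
        if weight > 0
    }
    # Highest score wins; heavier devices win proportionally more often.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:replicas]


# Hypothetical cluster: device name -> relative capacity weight.
weights = {"osd0": 1.0, "osd1": 1.0, "osd2": 2.0, "osd3": 1.0, "osd4": 0.5}
placement = place("obj-1", weights)
```

Because each device's score depends only on the object name and that device, removing a device that does not hold a given object leaves that object's placement unchanged, which is the kind of limited data movement a table-free scheme needs when the cluster changes.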