Haystack was the primary storage system designed initially for Facebook’s Photos application. It’s been around for almost 7 years … More
Category: Distributed Storage
From research to practice: experiences engineering a production metadata database for a scale out file system
Abstract HP’s StoreAll with Express Query is a scalable commercial file archiving product that offers sophisticated file metadata management and … More
Replicated Data Consistency Explained Through Baseball
A key feature of all distributed storage systems is their ability to replicate data not just across machines within a … More
Split Query Processing in Polybase
Abstract This paper presents Polybase, a feature of SQL Server PDW V2 that allows users to manage and query data … More
F1: A Distributed SQL Database That Scales
With both the F1 and Spanner papers out, it’s now possible to understand their interplay a bit more holistically. So let’s … More
imGraph: A distributed in-memory graph database
Abstract Diverse applications including cyber security, social networks, protein networks, recommendation systems or citation networks work with inherently graph-structured data. … More
Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams
Abstract A geographically distributed system for joining multiple continuously flowing streams of data in real-time with high scalability and low … More
Avatara: OLAP for Web-scale Analytics Products
The highlight of this system is a clear separation of the cube computation engine and the query serving engine of … More
An Efficient Multi-Tier Tablet Server Storage Architecture
This work presents a new, highly scalable, and efficient TSSL architecture called the General Tablet Server Storage Layer or GTSSL. … More
To BLOB or Not To BLOB
To decide on a mechanism for storing a large number of files and querying them based on metadata, we have … More