Systems We Make

Curating Complex Systems

Category: Distributed Storage

F4: Facebook's Warm BLOB Storage System

Haystack was the primary storage system, initially designed for Facebook’s Photos application. It’s been around for almost 7 years … More

From research to practice: experiences engineering a production metadata database for a scale out file system

Abstract: HP’s StoreAll with Express Query is a scalable commercial file archiving product that offers sophisticated file metadata management and … More

Replicated Data Consistency Explained Through Baseball

A key feature of all distributed storage systems is their ability to replicate data not just across machines within a … More

distributed database, distributed storage system, eventual consistency

Split Query Processing in Polybase

Abstract: This paper presents Polybase, a feature of SQL Server PDW V2 that allows users to manage and query data … More

distributed data warehouse, distributed database, hadoop data warehouse, split query processing, sql on hadoop

F1: A Distributed SQL Database That Scales

With both the F1 and Spanner papers out, it’s now possible to understand their interplay a bit more holistically. So let’s … More

distributed database, massively parallel databases

imGraph: A distributed in-memory graph database

Abstract: Diverse applications including cyber security, social networks, protein networks, recommendation systems or citation networks work with inherently graph-structured data. … More

distributed graph processing

Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams

Abstract: A geographically distributed system for joining multiple continuously flowing streams of data in real-time with high scalability and low … More

fault tolerance, paxos, stream processing

Avatara: OLAP for Web-scale Analytics Products

The highlight of this system is a clear separation of the cube computation engine and the query serving engine of … More

distributed data warehouse, distributed database, hadoop data warehouse

An Efficient Multi-Tier Tablet Server Storage Architecture

This work presents a new, highly scalable, and efficient TSSL architecture called the General Tablet Server Storage Layer or GTSSL. … More

distributed database

To BLOB or Not To BLOB

To decide on a mechanism for storing a large number of files and querying them based on metadata, we have … More

blob storage

Blog at WordPress.com.