Systems We Make

Curating Complex Systems

F1: A Distributed SQL Database That Scales

With both the F1 and Spanner papers out, it's now possible to understand their interplay a bit more holistically. So let's…

distributed database, massively parallel databases

MillWheel: Fault-Tolerant Stream Processing at Internet Scale

MillWheel is a framework for building low-latency data-processing applications that is widely used at Google. Users specify a directed computation…

imGraph: A distributed in-memory graph database

Abstract: Diverse applications including cyber security, social networks, protein networks, recommendation systems or citation networks work with inherently graph-structured data.…

distributed graph processing

Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams

Abstract: A geographically distributed system for joining multiple continuously flowing streams of data in real-time with high scalability and low…

fault tolerance, paxos, stream processing

Phoenix, an implementation of MapReduce for shared-memory systems

Abstract: Phoenix, an implementation of MapReduce for shared-memory systems that includes a programming API and an efficient runtime system.…

map reduce

Avatara: OLAP for Web-scale Analytics Products

The highlight of this system is a clear separation of the cube computation engine and the query serving engine of…

distributed data warehouse, distributed database, hadoop data warehouse

Ceph: A Scalable, High-Performance Distributed File System

Abstract: We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the…

distributed file systems

SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets

Companies providing cloud-scale services have an increasing need to store and analyze massive data sets such as search logs and…

data-parallel programming

An Efficient Multi-Tier Tablet Server Storage Architecture

This work presents a new, highly scalable, and efficient TSSL architecture called the General Tablet Server Storage Layer or GTSSL.…

distributed database

Iterative Map Reduce – Prior Art

There have been several attempts in the recent past at extending Hadoop to support efficient iterative data processing on clusters.…

data-parallel programming, distributed programming
