Systems We Make

Curating Complex Systems

Author: Hari

F1: A Distributed SQL Database That Scales

With both the F1 and Spanner papers out, it's now possible to understand their interplay a bit more holistically. So let's … More

distributed database, massively parallel databases

MillWheel: Fault-Tolerant Stream Processing at Internet Scale

MillWheel is a framework for building low-latency data-processing applications that is widely used at Google. Users specify a directed computation … More

imGraph: A distributed in-memory graph database

Abstract: Diverse applications, including cyber security, social networks, protein networks, recommendation systems, and citation networks, work with inherently graph-structured data. … More

distributed graph processing

Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams

Abstract: A geographically distributed system for joining multiple continuously flowing streams of data in real time with high scalability and low … More

fault tolerance, paxos, stream processing

Phoenix, an implementation of MapReduce for shared-memory systems

Abstract: Phoenix, an implementation of MapReduce for shared-memory systems that includes a programming API and an efficient runtime system. … More

map reduce

Avatara: OLAP for Web-scale Analytics Products

The highlight of this system is a clear separation of the cube computation engine and the query serving engine of … More

distributed data warehouse, distributed database, hadoop data warehouse

Ceph: A Scalable, High-Performance Distributed File System

Abstract: We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the … More

distributed file systems

SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets

Companies providing cloud-scale services have an increasing need to store and analyze massive data sets such as search logs and … More

data-parallel programming

An Efficient Multi-Tier Tablet Server Storage Architecture

This work presents a new, highly scalable, and efficient TSSL architecture called the General Tablet Server Storage Layer or GTSSL. … More

distributed database

Iterative Map Reduce – Prior Art

There have been several attempts in the recent past at extending Hadoop to support efficient iterative data processing on clusters. … More

data-parallel programming, distributed programming
