I work on high-performance and scalable systems with a focus on storage and database infrastructure. You can find some of my open source code on GitHub and in the Apache projects mentioned below. Most of my academic publications are indexed at DBLP; a less complete list is at the bottom of this page. I also serve on various program committees.
I am currently at Apple, working on CloudKit and FoundationDB.
I was an early member of the FlashBlade team at Pure Storage. FlashBlade is an all-flash scale-out storage appliance for big data workloads, with native support for S3 and a POSIX-compliant NFSv3 server.
Instead of using commodity flash translation layers, we built a custom variant of NVMe that makes it much easier to get predictable performance and implement distributed transactions. Filesystem metadata is stored in LSM-Tree indexes that are managed transactionally.
Our team also designed custom boards, chassis, and network switches to simplify administration and to reduce hardware costs and energy consumption.
While at Microsoft, I worked on various parts of Apache REEF, which provides infrastructure for applications that run atop Hadoop and other cloud computing systems. As part of that project, I designed and implemented Tang, a Java / C# dependency injector. Unlike most dependency injectors, it supports strong typing for application configurations and allows applications to be written in multiple languages.
In grad school at UC Berkeley, and at Yahoo! Research, I built Stasis, a flexible implementation of the ARIES recovery protocol, and bLSM ("blossom"), a read-optimized log-structured index that targets low-latency serving environments and provides extremely high random write throughput.
While at Yahoo!, I also contributed to the YCSB benchmarking framework and to MapKeeper, a standardized key-value storage API.
Other publications (DBLP)