I work on high-performance, scalable systems, with a focus on storage and database infrastructure. You can find some of my open source code on GitHub and in the Apache projects mentioned below. Most of my academic publications are indexed at DBLP, or scroll to the bottom of this page for a less complete list. I also serve on various program committees.
I'm currently working on FlashBlade at Pure Storage. It is an all-flash, scale-out storage appliance for big data workloads. Instead of using commodity flash translation layers, we have a custom variant of NVMe that makes it much easier to get predictable performance and to implement distributed transactions. Our team also designed custom boards, chassis, and network switches to simplify administration and to reduce hardware costs and energy consumption.
While at Microsoft, I worked on various parts of Apache REEF, which provides infrastructure for applications that run atop Hadoop and other cloud computing systems. As part of that project, I designed and implemented Tang, a Java / C# dependency injector. Unlike most dependency injectors, it supports strong typing for application configurations and allows applications to be written in multiple languages.
In grad school at UC Berkeley, and at Yahoo! Research, I built two systems: Stasis, a flexible implementation of the ARIES recovery protocol, and bLSM ("blossom"), a read-optimized log-structured index that targets low-latency serving environments and provides extremely high random write throughput.
While at Yahoo!, I also contributed to the YCSB benchmarking framework and to MapKeeper, a standardized key-value storage API.
Other publications (DBLP)