Second International Workshop on Composable Data Management Systems, 2023

Workshop Venue: Vancouver, Canada - Co-located with VLDB 2023

Workshop Date: 28th August 2023


Keynote Talks


Title: Horizons of Composability
Speaker: Orri Erling, Software Engineer, Meta Inc.

Speaker Bio: Erling co-founded the Velox composable query execution project at Meta. Prior to this, he worked on Google's F1, and before then created OpenLink Virtuoso, a relational/graph store best known for its applications in linked data and knowledge graphs. His research interests include benchmarking and generalizing query processing to fuse with the neighboring graph, AI and HPC domains. His mission is to create a line of components spanning execution, query optimization and distributed computing.

Abstract: Composability is coming up in many parts of the data management stack. At the same time, data volumes keep growing and AI is becoming the main customer of data. Running data management on accelerators has been on the table for years. Are we approaching an inflection point where data management becomes cost competitive on accelerators? What about colocating it with GPUs already used for AI? Composability, for its part, is perhaps best established in query optimization with Calcite and is coming to execution with projects like Arrow, Velox and Tril. How does composability address the inflections in data center architectures, AI and workloads? We discuss the composability efforts at Meta, including Velox and Velox Wave, a new approach to portable, composable hardware acceleration for query processing. We briefly cover Verax, an early concept for a query optimizer companion for Velox. We point out interesting research outcomes and future and ongoing collaborations for strengthening the composability field, for example in file formats and in abstracting query engine design. Drawing on our experience and insight into workloads and the evolution of the data center, we outline wins, challenges and opportunities for the composability movement.



 
Title: Unbundling of the DBMS stack
Speakers: Mosha Pasumansky, CTO, Firebolt & Benjamin Wagner, Engineering Manager, Firebolt

Speaker Bios:
Mosha Pasumansky is the CTO of Firebolt Analytics. Previously he worked on BigQuery at Google, on Cosmos at Microsoft Bing, and on Microsoft Analysis Services. Mosha is the co-inventor of the MDX query language. He received an M.Sc. in Computer Science from the University of Washington.

Benjamin leads the query processing teams at Firebolt. The teams are working on query optimization, distributed query execution, and Firebolt’s single-node runtime. Benjamin first fell in love with database systems while studying computer science at the Technical University of Munich.

Abstract: There has been an explosion in the number of new databases in recent years: it is easier than ever to build a new database system. In this talk we will look at the components of the DBMS stack and at how they can be reused or replaced. Using examples of existing commercial and academic databases, we will see how a database system can be composed from existing components, and how novel ideas can be implemented by replacing one or more components in an existing system.



Title: Hybrid Query Execution; What is a database client, anyway?
Speaker: Jordan Tigani, Co-Founder & CEO, MotherDuck

Speaker Bio: Jordan is co-founder and chief duck-herder at MotherDuck, a startup building a serverless analytics platform based on DuckDB. He spent a decade working on Google BigQuery as a founding engineer, book author, engineering leader, and product leader. More recently, as SingleStore’s Chief Product Officer, Jordan helped the company build a cloud-native SaaS business. Jordan has also worked at Microsoft Research, on the Windows Kernel team, and at a handful of star-crossed startups. His biggest claim to fame is predicting World Cup matches using machine learning with a better record than Paul the Octopus.

Abstract: Running a full-fledged analytical database inside the client opens up new ways of executing your query: you can run part of your query locally and part remotely. Once you can split the query plan into two pieces, the same mechanism works with N stages, which can be arranged in series or as a tree. This talk will discuss the hybrid execution system based on DuckDB that we've built at MotherDuck, as well as some further query topologies that this pattern enables.



Title: Taking Postgres into the 21st Century
Speaker: Nikita Shamgunov, CEO, Neon Database

Speaker Bio: Nikita co-founded SingleStore, a unicorn data and analytics company valued at over $1.3 billion. He served as its founding CTO and then CEO, successfully scaling the company to over 40 million in ARR and near profitability. For the first nine months, Nikita lived in the office, coding next to the servers. Prior to founding SingleStore, Nikita worked as a senior engineer at Facebook, and before that at Microsoft on the SQL Server product.
Nikita is incubating Neon – a new database company building serverless Postgres – that raised $104 million.
Nikita has a Ph.D. in computer science from St. Petersburg. During his college years, Nikita received a bronze medal in the ACM ICPC, an international student programming competition.

Abstract: Postgres continues to punch way above its weight in 2023. Despite being one of the oldest open-source databases in the world, recent surveys place it among the most popular databases for modern developers.
As we showcase the advancements of Postgres as open-source software, we also show that, by separating storage and compute, we can deliver Serverless Postgres in the cloud and scale it up and down with the load, without manual intervention by the user, all the way from 0 to infinity and back to 0.
With Postgres being a ubiquitous platform, we will explore cloud architectures for Postgres in multi-cloud and edge deployments. These changes show that a modern platform can maintain its open-source roots and still provide the utility that is so desired in today's cloud computing.