
Dominic Burkart is a software engineer at Datadog. Interests: backend, full-stack, and data engineering.



Turbolift 🚡

(source code)

Turbolift is a Rust distribution manager currently in development. Turbolift lets you automatically turn functions into microservices using a custom macro. These microservices are automatically distributed and managed, providing a new way to distribute Rust programs.

Turbolift is designed to make distribution an afterthought. Instead of forcing developers to refactor their code into multiple small repos, Turbolift automatically extracts the relevant source code at compile-time and internally handles communication with the distribution platform. This mitigates the architecture work and cognitive burden associated with microservice development and orchestration, at the expense of increased compile times.
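As a rough illustration of the idea, the sketch below hand-writes the kind of wrapper a macro could generate for a single function. This is not Turbolift's API or its generated code: `Target`, `square_local`, `square`, and `fake_service` are hypothetical names, and the "remote" path is a plain function call standing in for a network request to a deployed service.

```rust
// Hypothetical illustration only: NOT Turbolift's API or generated code.

/// The function as the developer writes it.
fn square_local(x: u64) -> u64 {
    x * x
}

/// Where a call should run. `Remote` holds a transport callback that stands
/// in for a request to a deployed service (an assumption for this sketch).
enum Target {
    Local,
    Remote(fn(&str) -> String),
}

/// The wrapper a macro could generate: same argument, but the call is
/// serialized and dispatched when a remote target is configured.
fn square(target: &Target, x: u64) -> Result<u64, std::num::ParseIntError> {
    match target {
        Target::Local => Ok(square_local(x)),
        Target::Remote(send) => {
            let request = x.to_string(); // serialize the argument
            let response = send(&request); // round trip through the "service"
            response.parse::<u64>() // deserialize the result
        }
    }
}

/// Stand-in for the extracted microservice: parses the request, runs the
/// original function, and serializes the answer.
fn fake_service(request: &str) -> String {
    let x: u64 = request.parse().expect("request should be a u64");
    square_local(x).to_string()
}
```

Calling `square(&Target::Local, 7)` and `square(&Target::Remote(fake_service), 7)` gives the same result, which is the point of the approach: the call site does not change when the function moves behind a service boundary.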

Turbolift development currently targets only Kubernetes, but Turbolift was built to be extensible to other platforms without significant API changes. Swarm, AWS Lambda, and SLURM are all viable targets for future development.

Keywords: Rust, Async Rust, Metaprogramming, Orchestration, Distributed Computing, API Design, DevOps, Infrastructure as Code, Kubernetes, K8s, Docker, Open Source, Flagrant Macro Abuse.

Wikipedia Revisions Server 🗃

(source code)

Download, compactly store, and quickly serve every edit to Wikipedia. By leveraging Brotli compression and manual storage management, this project reduces storage requirements from ~60 TB (when stored in a Postgres database) to less than 6 TB.

Revisions can be requested by time period or by revision ID. Using Actix, a highly performant web framework written in Rust, the server can yield multiple compressed JSON data streams concurrently.

When the server is run with Docker, the user can specify a "fast" and a "large" directory. The index files can then be stored on a drive with faster I/O than the primary data drive, allowing finer-tuned performance control.
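To show why a small, fast index paired with a large data store works well, here is a minimal sketch of index lookups by revision ID and by time period. The record layout and all names (`IndexEntry`, `find_by_id`, `find_in_period`) are assumptions made for illustration, not the project's actual on-disk format.

```rust
// Hypothetical sketch: the entry layout and names are assumptions, not the
// project's actual index format.

/// One index record pointing at a compressed revision in the large data file.
#[derive(Debug, Clone, Copy, PartialEq)]
struct IndexEntry {
    revision_id: u64,
    timestamp: u64, // seconds since the Unix epoch
    offset: u64,    // byte offset of the compressed blob in the data file
    length: u32,    // compressed size in bytes
}

/// Find a revision by ID. Assumes `by_id` is sorted by `revision_id`.
fn find_by_id(by_id: &[IndexEntry], revision_id: u64) -> Option<IndexEntry> {
    by_id
        .binary_search_by_key(&revision_id, |e| e.revision_id)
        .ok()
        .map(|i| by_id[i])
}

/// Return entries with `start <= timestamp < end`.
/// Assumes `by_time` is sorted by `timestamp`.
fn find_in_period(by_time: &[IndexEntry], start: u64, end: u64) -> &[IndexEntry] {
    let lo = by_time.partition_point(|e| e.timestamp < start);
    // Clamp so an inverted period yields an empty slice instead of a panic.
    let hi = by_time.partition_point(|e| e.timestamp < end).max(lo);
    &by_time[lo..hi]
}
```

Both lookups touch only the index, so they benefit from the faster drive; the large drive is read only once the byte range of a revision is known.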

The decreased storage and memory requirements reduce hardware needs considerably. The server can run on a Raspberry Pi 4 with 4 GB RAM and an external hard drive.

Keywords: Data Pipeline, Data Engineering, Docker, Rust, Actix, Python, PyPy, Wikipedia, Open Data, Open Source, Optimization, Ode to the File System.

Birdie 🐦

The New York City Council oversees the city's budget ($77 billion in 2017). As stewards of the most populous city in the United States, the 51 New York City Council members have significant legislative authority.

Birdie is a command-line tool that generates static webpage reports on proposed council legislation, using open data to find similar prior bills. These bills give policy advocates a starting point for researching long-time supporters, successful past strategies, and previous failures.
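The description above does not say how similarity between bills is measured. As one simple stand-in, the sketch below ranks prior bills by Jaccard similarity over the words in their titles. The function names (`tokens`, `jaccard`, `most_similar`) are hypothetical, and Birdie's actual method may be entirely different.

```rust
use std::collections::HashSet;

// Hypothetical sketch: Jaccard similarity is a stand-in technique chosen for
// illustration, not necessarily what Birdie uses.

/// Lowercased set of alphanumeric words in `text`.
fn tokens(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Jaccard similarity: |A ∩ B| / |A ∪ B|, or 0.0 when both texts are empty.
fn jaccard(a: &str, b: &str) -> f64 {
    let (ta, tb) = (tokens(a), tokens(b));
    let union = ta.union(&tb).count();
    if union == 0 {
        return 0.0;
    }
    ta.intersection(&tb).count() as f64 / union as f64
}

/// The `top_n` prior bills most similar to `proposed`, best match first.
fn most_similar<'a>(proposed: &str, prior: &[&'a str], top_n: usize) -> Vec<(&'a str, f64)> {
    let mut scored: Vec<(&'a str, f64)> = prior
        .iter()
        .map(|&p| (p, jaccard(proposed, p)))
        .collect();
    // Sort by descending score; the sort is stable, so ties keep input order.
    scored.sort_by(|x, y| y.1.total_cmp(&x.1));
    scored.truncate(top_n);
    scored
}
```

A real report would likely compare full bill text and metadata rather than titles alone, but the ranking step would look much the same.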

Birdie also attempts to estimate the likelihood that a bill will succeed, leveraging several forecasting algorithms to predict which council members are likely to sponsor the bill. While imperfect, these predictions are useful for setting priorities and expectations within advocacy organizations, and include cross‑validation analyses to help advocates understand how reliable each prediction is for their specific use case.

Keywords: Data Pipeline, Data Engineering, Machine Learning, Contagion Modeling, Sequence Prediction, CLI, command line interface, Docker, Python, Open Data, Civics.