Open Source - MIT

Mako

Real-time data pipelines

Declarative framework for orchestrating data pipelines. Configure your sources, transforms and sinks in YAML — Mako handles the rest.

go install github.com/Stefen-Taime/mako@latest

Features

Declarative

Everything is configured in YAML. No code needed for standard pipelines.

Real-time

Native support for Kafka, PostgreSQL CDC, and HTTP streaming.

WASM Transforms

Go, Rust, and TinyGo plugins compiled to WebAssembly for fast transformations.

Observability

Built-in Prometheus metrics, Grafana dashboards and Slack alerts.
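
Once the engine is running, these metrics can be collected with a standard Prometheus scrape job, as in the sketch below. The target address and metrics path are assumptions for illustration, not documented Mako defaults.

# prometheus.yml (sketch; the Mako target address and path are assumed)
scrape_configs:
  - job_name: mako
    metrics_path: /metrics            # assumed default path
    static_configs:
      - targets: ["localhost:2112"]   # assumed metrics port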

Quick Start

# pipeline.yaml
source:
  type: http
  url: https://api.example.com/data
  format: json

transforms:
  - type: field
    operations:
      - rename: { from: "old_name", to: "new_name" }
      - drop: ["unused_field"]

sink:
  type: postgres
  connection: postgres://user:pass@host/db
  table: events
  mode: upsert

Sources

  • HTTP/REST APIs (pagination, OAuth2, rate limiting)
  • JSON, CSV, and Parquet files (gzip supported)
  • Apache Kafka
  • PostgreSQL CDC (Change Data Capture)
  • DuckDB (embedded SQL queries)
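
For example, swapping the Quick Start's HTTP source for Kafka might look like the sketch below. The broker, topic, and group keys are assumed names modeled on common conventions, not confirmed Mako schema.

# Hypothetical Kafka source (key names are assumptions)
source:
  type: kafka
  brokers: ["localhost:9092"]
  topic: events
  group_id: mako-demo
  format: json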

Sinks

  • PostgreSQL, Snowflake, BigQuery, ClickHouse
  • S3, Google Cloud Storage
  • DuckDB, Kafka, stdout
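
Sinks use the same declarative shape as sources. A sketch of an S3 sink follows, assuming bucket, prefix, and format keys that mirror the Quick Start (not confirmed schema).

# Hypothetical S3 sink (key names are assumptions)
sink:
  type: s3
  bucket: my-data-lake
  prefix: events/
  format: parquet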

Transforms

  • SQL enrichment via DuckDB
  • WASM plugins (Go/Rust/TinyGo)
  • Schema validation (Confluent Schema Registry)
  • Data quality checks
  • PII masking (SHA-256)
  • Field operations (rename, drop, cast, flatten, deduplicate)
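
Transforms can be chained in the order they are declared. The sketch below mixes a DuckDB SQL step, a WASM plugin, and PII masking; the key names are assumptions, with only the transform categories taken from the list above.

# Hypothetical transform chain (key names are assumptions)
transforms:
  - type: sql
    query: "SELECT *, upper(country) AS country FROM input"
  - type: wasm
    module: ./plugins/enrich.wasm   # Go/Rust/TinyGo plugin compiled to WASM
  - type: mask
    fields: ["email", "phone"]      # SHA-256 PII masking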

Orchestration

DAG-based workflow engine with parallel execution and SQL quality gates.
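
A minimal sketch of what a DAG definition with a SQL quality gate could look like, assuming a jobs/depends_on layout that Mako may not use verbatim:

# dag.yaml (hypothetical layout)
jobs:
  ingest:
    pipeline: pipelines/ingest.yaml
  publish:
    pipeline: pipelines/publish.yaml
    depends_on: [ingest]
    quality_gate:
      sql: "SELECT count(*) = 0 FROM staging.errors"   # gate passes only when no errors remain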

MIT License - Maintained by mcsEdition

Tags: go, data-pipelines, kafka, yaml, wasm

Go 1.21+ required

In brief

How does Mako orchestrate a real-time data pipeline?

Mako is an open-source Go framework that describes real-time data pipelines as declarative YAML files, with no code required. A Mako pipeline has three sections: sources (Kafka, Change Data Capture on Postgres or MySQL, HTTP endpoints), transforms (WASM modules compiled from Go, Rust, or TinyGo), and sinks (Kafka, S3, relational databases, webhooks). At runtime, Mako builds a directed acyclic graph (DAG) in memory and applies automatic backpressure between stages: if a sink slows down, the sources are throttled so memory does not saturate. Mako exposes 14 standard Prometheus metrics (throughput, p50/p95/p99 latency, queue sizes, errors per stage) and ships with a reference Grafana dashboard. The binary is under 30 MB and starts in under 200 ms.
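
Putting the three sections together, a streaming pipeline from Postgres CDC to Kafka could be sketched as follows. The CDC and Kafka keys are assumptions modeled on the Quick Start, not verified schema.

# Hypothetical CDC-to-Kafka pipeline (key names are assumptions)
source:
  type: postgres_cdc
  connection: postgres://user:pass@host/db
  tables: ["public.orders"]

transforms:
  - type: wasm
    module: ./plugins/normalize.wasm

sink:
  type: kafka
  brokers: ["localhost:9092"]
  topic: orders.normalized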

How does Mako differ from Kafka Streams and Airflow?

Kafka Streams is a Java library for stream processing inside a JVM, tightly coupled to Kafka. Airflow and Dagster orchestrate scheduled batch jobs, not continuous streaming. Mako sits in the middle: it handles continuous streaming like Kafka Streams or Apache Flink, but with a declarative YAML definition (no Java compilation required) and a standalone Go binary that runs on a bare server, in a Kubernetes container, or on an edge node. WASM transforms let teams write business logic in Go, Rust, or TinyGo without recompiling the engine. Mako is distributed under the MIT license on GitHub and is designed for teams of one to five engineers who do not want to operate a full Flink cluster.

Frequently asked questions

Is Mako an alternative to Kafka Streams, Flink, or Airflow?

Mako sits between Kafka Streams/Flink (streaming) and Airflow/Dagster (DAG orchestration). It defines real-time pipelines in YAML with WASM transforms and Prometheus observability, under the MIT license.

Which sources does Mako support?

Apache Kafka, Change Data Capture (Postgres, MySQL), HTTP/REST APIs, JSON/CSV/Parquet files, and DuckDB as sources; WASM transforms; configurable sinks.