Selective replication for Kafka, with Redpanda as a compatibility target. Filter, transform, redact, and route data between topics and clusters without Kafka Connect.
StreamForge helps data teams move only the records and fields downstream systems actually need. Instead of mirroring whole topics, StreamForge lets you filter, reshape, redact, and route messages before they land in analytics, lake, or lower-trust environments.
- Replicate only analytics-safe fields instead of whole topics
- Split one source topic into multiple downstream topics
- Hash or drop PII before data crosses trust boundaries
- Keep the deployment surface small with a single binary, operator, and Helm chart
This UI-driven demo shows the path most teams actually want to see first: install with Helm, create a pipeline in the browser, review the generated YAML, deploy the CRD, then verify transformed output on Kafka.
Use StreamForge when you need:
- selective replication to analytics or data lake pipelines
- PII-safe replication across environments
- topic fan-out with payload shaping
- a smaller operational footprint than Kafka Connect
StreamForge is not intended as:
- a full replacement for MirrorMaker 2 active-active or offset-sync workflows
- a general-purpose stateful stream processor
For concrete usage patterns and configs, see docs/USAGE.md and examples/README.md.
- Start the local Redpanda demo broker:
docker compose -f examples/redpanda/docker-compose.yml up -d
- Validate the selective replication config:
cargo run --quiet --bin streamforge-validate -- examples/redpanda/selective-replication.yaml
- Run StreamForge with the same config, and leave it running in this terminal:
CONFIG_FILE=examples/redpanda/selective-replication.yaml \
  cargo run --release --bin streamforge
- Open a second terminal and follow docs/QUICKSTART.md to create the demo topics, produce a sample order, and inspect analytics-orders and pii-safe-orders.
If you want the Kubernetes + UI path instead of the local CLI path, use docs/UI_MINIKUBE_DEMO.md.
- At-least-once delivery with retry and DLQ support
- Native Prometheus metrics and lag monitoring
- Kubernetes operator, Helm chart, and web UI
- Kafka-first examples for standalone configs and Kubernetes pipelines
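To make "at-least-once delivery with retry and DLQ support" concrete, here is a minimal Rust sketch of the general pattern: retry a send a bounded number of times, then hand the record to a dead-letter sink. It is not StreamForge's actual implementation. The names `deliver`, `Outcome`, `send`, and `dlq` are hypothetical, and the sketch omits backoff between attempts.

```rust
/// Where a record ended up after delivery was attempted.
/// (Hypothetical type for illustration; not part of StreamForge's API.)
#[derive(Debug, PartialEq)]
enum Outcome {
    Delivered { attempts: u32 },
    DeadLettered { attempts: u32, last_error: String },
}

/// Call `send` up to `max_attempts` times. If every attempt fails,
/// pass the record and the last error to `dlq` instead of dropping it.
fn deliver<S, D>(record: &str, max_attempts: u32, mut send: S, mut dlq: D) -> Outcome
where
    S: FnMut(&str) -> Result<(), String>,
    D: FnMut(&str, &str),
{
    let mut last_error = String::new();
    for attempt in 1..=max_attempts {
        match send(record) {
            // Success: the caller may now commit the source offset.
            Ok(()) => return Outcome::Delivered { attempts: attempt },
            // Failure: remember the error and try again.
            Err(e) => last_error = e,
        }
    }
    // Retries exhausted: route to the dead-letter sink so the pipeline can move on.
    dlq(record, &last_error);
    Outcome::DeadLettered { attempts: max_attempts, last_error }
}
```

In an at-least-once design, the source offset is typically committed only after `deliver` returns. A crash between a successful send and the commit therefore causes a redelivery rather than a loss, so downstream consumers should tolerate duplicates.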
StreamForge is built for Kafka-compatible brokers. Kafka is the primary target in current docs and examples, and Redpanda is documented here as a compatibility target for the selective replication workflows covered in this repo.
- Content-based filtering across payload, key, headers, and timestamps
- Field extraction, reshaping, and PII hashing before downstream delivery
- Topic fan-out from one source topic to multiple destination topics
- At-least-once delivery with retry, DLQ handling, and observability hooks
- Standalone binary and Kubernetes operator deployment modes
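The filtering, field shaping, PII hashing, and fan-out capabilities listed above can be pictured with a small conceptual sketch in Rust. It is not StreamForge's DSL or internal code. `Record`, `Route`, `shape`, and `fan_out` are hypothetical names, records are modeled as flat string maps, and `DefaultHasher` stands in for a real hash: it is not cryptographic and must not be used for actual PII protection.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// A flat record: field name -> value. Real payloads are usually JSON or Avro.
type Record = BTreeMap<String, String>;

/// Hypothetical route: one destination topic plus the rules applied on the way.
struct Route {
    topic: &'static str,
    /// Content-based filter: the record is routed only if this returns true.
    filter: fn(&Record) -> bool,
    /// Fields copied through unchanged.
    keep: &'static [&'static str],
    /// Fields replaced by a hash of their value.
    hash: &'static [&'static str],
}

/// Stand-in hash (assumption: illustration only, NOT cryptographic).
fn pseudonymize(value: &str) -> String {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Project a record onto a route: hash listed PII fields, keep listed
/// fields, and drop everything else.
fn shape(record: &Record, route: &Route) -> Record {
    let mut out = Record::new();
    for (name, value) in record {
        if route.hash.iter().any(|f| *f == name.as_str()) {
            out.insert(name.clone(), pseudonymize(value));
        } else if route.keep.iter().any(|f| *f == name.as_str()) {
            out.insert(name.clone(), value.clone());
        }
    }
    out
}

/// Fan one source record out to every route whose filter accepts it.
fn fan_out(record: &Record, routes: &[Route]) -> Vec<(&'static str, Record)> {
    routes
        .iter()
        .filter(|route| (route.filter)(record))
        .map(|route| (route.topic, shape(record, route)))
        .collect()
}
```

For example, one route could send only `order_id` and `status` of paid orders to analytics-orders, while a second route sends `order_id` plus a hashed `email` to pii-safe-orders. Any field not listed in `keep` or `hash` never leaves the source cluster. The field names here are illustrative, not taken from the demo config.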
- examples/configs/config.example.yaml for a minimal standalone pipeline
- examples/redpanda/selective-replication.yaml for the validated local Redpanda selective replication demo
- examples/production/pii-redaction.yaml for analytics-safe redaction
- examples/production/cdc-to-datalake.yaml for CDC-to-lake shaping
- examples/pipelines/README.md for operator-backed Kubernetes manifests
- docs/DEPLOYMENT.md for deployment patterns
- docs/OPERATIONS.md for production runbooks
- docs/OBSERVABILITY_QUICKSTART.md for Prometheus and lag monitoring
- docs/SECURITY_CONFIGURATION.md for TLS and SASL setup
- helm/streamforge-operator/README.md for Helm-based installs
- docs/QUICKSTART.md for the first local run
- docs/USAGE.md for deployment patterns and use cases
- docs/YAML_CONFIGURATION.md for config structure and format guidance
- docs/ADVANCED_DSL_GUIDE.md for the full filtering and transform DSL
- docs/DOCUMENTATION_INDEX.md for the broader doc set
Contribution and development setup are documented in docs/CONTRIBUTING.md.
Apache License 2.0. See LICENSE for details.
