Elastickv is an experimental project taking on the challenge of building a distributed key-value store optimized for cloud environments, similar in spirit to DynamoDB. The project is currently in the planning and development phase, with the goal of incorporating advanced features such as Raft-based data replication, dynamic node scaling, and automatic hot spot re-allocation. Elastickv aspires to be a next-generation cloud data storage solution, combining efficiency with scalability.
THIS PROJECT IS CURRENTLY UNDER DEVELOPMENT AND IS NOT READY FOR PRODUCTION USE.
- Raft-based Data Replication: KV state replication is implemented on Raft, with leader-based commit and follower forwarding paths.
- Shard-aware Data Plane: Static shard ranges across multiple Raft groups with shard routing/coordinator are implemented.
- Durable Route Control Plane (Milestone 1): Durable route catalog, versioned route snapshot apply, watcher-based route refresh, and manual `ListRoutes`/`SplitRange` (same-group split) are implemented.
- Protocol Adapters: gRPC (`RawKV`/`TransactionalKV`), Redis (core commands plus `MULTI`/`EXEC` and list operations), and a DynamoDB-compatible API (`PutItem`/`GetItem`/`DeleteItem`/`UpdateItem`/`Query`/`Scan`/`BatchWriteItem`/`TransactWriteItems`) are available; runtime exposure depends on the selected server entrypoint/configuration.
- DynamoDB Compatibility Scope: `CreateTable`/`DeleteTable`/`DescribeTable`/`ListTables`/`PutItem`/`GetItem`/`DeleteItem`/`UpdateItem`/`Query`/`Scan`/`BatchWriteItem`/`TransactWriteItems` are implemented.
- Basic Consistency Behaviors: Write-after-read checks, leader redirection/forwarding paths, and OCC conflict detection for transactional writes are covered by tests.
- Dynamic Node Scaling: Automatic node/range scaling based on load is not yet implemented (current sharding operations are configuration/manual driven).
- Automatic Hot Spot Re-allocation: Automatic hotspot detection/scheduling and cross-group relocation are not yet implemented (Milestone 1 currently provides manual same-group split).
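To illustrate what the shard-aware data plane does, the sketch below maps a key to the Raft group that owns it by scanning half-open key ranges. This is a minimal illustration using hypothetical types (`route`, `lookup`), not the project's actual catalog or routing code:

```go
package main

import "fmt"

// route is a simplified stand-in for a catalog entry: a half-open
// key range [start, end) owned by one Raft group. An empty end means
// the range is unbounded on the right.
type route struct {
	start, end  string
	raftGroupID uint64
}

// lookup returns the Raft group that owns key, scanning routes sorted
// by start key. It assumes the routes cover the whole keyspace.
func lookup(routes []route, key string) (uint64, bool) {
	for _, r := range routes {
		if key >= r.start && (r.end == "" || key < r.end) {
			return r.raftGroupID, true
		}
	}
	return 0, false
}

func main() {
	// Two ranges produced by splitting the keyspace at "g".
	routes := []route{
		{start: "", end: "g", raftGroupID: 1},
		{start: "g", end: "", raftGroupID: 2},
	}
	g, _ := lookup(routes, "apple")
	fmt.Println(g) // prints 1: "apple" < "g", so the left range owns it
}
```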
Elastickv is in an experimental, developmental phase, aiming to deliver features comparable to industry standards like DynamoDB, tailored for cloud infrastructure. We welcome contributions, ideas, and feedback as we work through the intricacies of building a scalable, efficient, cloud-optimized distributed key-value store.
Architecture diagrams are available in:

- `docs/architecture_overview.md`

Deployment/runbook documents:

- `docs/docker_multinode_manual_run.md` (manual `docker run`, 4-5 node cluster on multiple VMs, no Docker Compose)
Elastickv now exposes Prometheus metrics on `--metricsAddress` (default: `localhost:9090` in `main.go`, `127.0.0.1:9090` in `cmd/server/demo.go` single-node mode). The built-in 3-node demo binds metrics on `0.0.0.0:9091`, `0.0.0.0:9092`, and `0.0.0.0:9093`, and uses the bearer token `demo-metrics-token` unless `--metricsToken` is set.
The exported metrics cover:
- DynamoDB-compatible API request rate, success/error split, latency, request/response size, and per-table read/write item counts
- Raft local state, leader identity, current members, commit/applied index, and leader contact lag
Provisioned monitoring assets live under:
- `monitoring/prometheus/prometheus.yml`
- `monitoring/grafana/dashboards/elastickv-cluster-overview.json`
- `monitoring/grafana/provisioning/`
- `monitoring/docker-compose.yml`
If you bind `--metricsAddress` to a non-loopback address, `--metricsToken` is required. Prometheus must send the same bearer token, for example:

```yaml
scrape_configs:
  - job_name: elastickv
    authorization:
      type: Bearer
      credentials: YOUR_METRICS_TOKEN
```

To scrape a multi-node deployment, bind `--metricsAddress` to each node's private IP and set `--metricsToken`, for example `--metricsAddress "10.0.0.11:9090" --metricsToken "YOUR_METRICS_TOKEN"`.
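For a multi-node deployment, the scrape job can list each node's metrics endpoint as explicit static targets. A sketch, where the IP addresses are placeholders for your nodes' private IPs:

```yaml
scrape_configs:
  - job_name: elastickv
    authorization:
      type: Bearer
      credentials: YOUR_METRICS_TOKEN
    static_configs:
      - targets:
          - "10.0.0.11:9090"
          - "10.0.0.12:9090"
          - "10.0.0.13:9090"
```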
For the local 3-node demo, start Grafana and Prometheus with:
```shell
cd monitoring
docker compose up -d
```

`monitoring/prometheus/prometheus.yml` assumes the demo token `demo-metrics-token`. If you override `--metricsToken` when running `go run ./cmd/server/demo.go`, update `authorization.credentials` in that file to match.
This section provides sample commands to demonstrate how to use the project. Make sure you have the necessary dependencies installed before running these commands.
To start the server, use the following command:
```shell
go run cmd/server/demo.go
```

To expose metrics on a dedicated port:
```shell
go run . \
  --address "127.0.0.1:50051" \
  --redisAddress "127.0.0.1:6379" \
  --dynamoAddress "127.0.0.1:8000" \
  --metricsAddress "127.0.0.1:9090" \
  --raftId "n1"
```

To start the client, use this command:

```shell
go run cmd/client/client.go
```

To start the Redis client:

```shell
redis-cli -p 63791
```

To set a key-value pair and retrieve it:

```
set key value
get key
quit
```

To connect to a follower node:
```
redis-cli -p 63792
get key
```

```
redis-cli -p 63792
set bbbb 1234
get bbbb
quit
```

```
redis-cli -p 63793
get bbbb
quit
```

```
redis-cli -p 63791
get bbbb
quit
```

Milestone 1 includes manual control-plane APIs on `proto.Distribution`:
- `ListRoutes`
- `SplitRange` (same-group split only)
Use grpcurl against a running node:
```shell
# 1) Read current durable route catalog
grpcurl -plaintext -d '{}' localhost:50051 proto.Distribution/ListRoutes

# 2) Split route 1 at user key "g" (bytes are base64 in grpcurl JSON: "g" -> "Zw==")
grpcurl -plaintext -d '{
  "expectedCatalogVersion": 1,
  "routeId": 1,
  "splitKey": "Zw=="
}' localhost:50051 proto.Distribution/SplitRange
```

Example `SplitRange` response:
```json
{
  "catalogVersion": "2",
  "left": {
    "routeId": "3",
    "start": "",
    "end": "Zw==",
    "raftGroupId": "1",
    "state": "ROUTE_STATE_ACTIVE",
    "parentRouteId": "1"
  },
  "right": {
    "routeId": "4",
    "start": "Zw==",
    "end": "bQ==",
    "raftGroupId": "1",
    "state": "ROUTE_STATE_ACTIVE",
    "parentRouteId": "1"
  }
}
```

Notes:
- `expectedCatalogVersion` must match the latest `ListRoutes.catalogVersion`.
- `splitKey` must be strictly inside the parent range (not equal to the range start/end).
- Milestone 1 split keeps both children in the same Raft group as the parent.
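Because grpcurl represents protobuf `bytes` fields as base64 in JSON, you need the encoded form of any split key you pass to `SplitRange`. A small standard-library sketch (the helper `splitKeyJSON` is illustrative, not part of the Elastickv codebase):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// splitKeyJSON returns the base64 form that grpcurl expects
// for a protobuf bytes field in a JSON request body.
func splitKeyJSON(key string) string {
	return base64.StdEncoding.EncodeToString([]byte(key))
}

func main() {
	// "g" encodes to "Zw==", matching the SplitRange example above.
	fmt.Println(splitKeyJSON("g"))
}
```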
Jepsen tests live in `jepsen/`. Install Leiningen and run the tests locally:

```shell
curl -L https://raw.githubusercontent.com/technomancy/leiningen/stable/bin/lein > ~/lein
chmod +x ~/lein
(cd jepsen && ~/lein test)
```

These Jepsen tests execute concurrent read and write operations while a nemesis injects random network partitions; Jepsen's linearizability checker then verifies the history.
To enable the repository's Git hooks for local development:

```shell
git config --local core.hooksPath .githooks
```