ClusterService
Cluster management operations.
Transport
All methods are available via:
- gRPC on port 7080 — native high-performance API
- REST/JSON on port 7081 — HTTP/JSON transcoding via embedded structured-proxy
Methods
GetClusterStatus
Get the current cluster status.
HTTP: GET /v1/admin/cluster/status
Request: GetClusterStatusRequest
Empty request (no fields).
Response: ClusterStatus
| Field | Type | Description |
|---|---|---|
| nodes | ClusterNode[] | Nodes in the cluster. |
| leader_id | string | Node ID of the current Raft leader. Empty string if no leader elected. |
| raft_term | uint64 | Current Raft term. Monotonically increasing across leader elections. |
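
As an illustration, a client could condense the response as follows. This is a minimal sketch, assuming the REST endpoint returns the ClusterStatus message as JSON. Whether keys arrive as proto names (`leader_id`) or in the protobuf JSON default lowerCamelCase (`leaderId`) depends on how the proxy is configured, so the helper accepts both. The names `summarize_status` and `fetch_status`, the default base URL, and the sample payload are hypothetical.

```python
import json
from urllib.request import urlopen


def _get(obj, snake, default=None):
    """Read a field by its proto name or its lowerCamelCase JSON name."""
    head, *rest = snake.split("_")
    camel = head + "".join(part.capitalize() for part in rest)
    if snake in obj:
        return obj[snake]
    return obj.get(camel, default)


def summarize_status(payload: str) -> dict:
    """Extract leader, term and node count from a ClusterStatus JSON document."""
    status = json.loads(payload)
    leader_id = _get(status, "leader_id", "")
    return {
        "has_leader": leader_id != "",  # empty string means no leader elected
        "leader_id": leader_id,
        # protobuf JSON encodes uint64 as a string, so normalise to int
        "raft_term": int(_get(status, "raft_term", 0)),
        "node_count": len(_get(status, "nodes", [])),
    }


def fetch_status(base_url: str = "http://localhost:7081") -> dict:
    """GET /v1/admin/cluster/status and summarise it (host is an assumption)."""
    with urlopen(f"{base_url}/v1/admin/cluster/status") as resp:
        return summarize_status(resp.read().decode("utf-8"))


# Hypothetical payload, for illustration only.
sample = '{"nodes": [{"node_id": "1"}, {"node_id": "2"}], "leaderId": "1", "raftTerm": "7"}'
print(summarize_status(sample))
```

An empty `leader_id` is worth checking before calling leader-only RPCs such as JoinNode.
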
JoinNode
Initiate a cluster join lifecycle for a new node. Adds the target node as a Learner (non-voting) and starts a background monitor that promotes it to Voter once replication lag drops below READINESS_LAG_THRESHOLD (1000 entries). Must be called on the leader. Returns immediately after the Learner is added. Use JoinProgress to stream phase events until COMPLETE or FAILED.
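
The promotion rule described above reduces to a threshold comparison. The sketch below assumes the monitor observes a sequence of lag samples. Only the threshold value of 1000 entries comes from this page; the function names and the sampling model are hypothetical, and the server's actual monitor is not documented here.

```python
from typing import Iterable, Optional

READINESS_LAG_THRESHOLD = 1000  # entries; value taken from the JoinNode description


def ready_for_promotion(lag_entries: int) -> bool:
    """A Learner is eligible for promotion once its lag drops below the threshold."""
    return lag_entries < READINESS_LAG_THRESHOLD


def first_ready_sample(lag_samples: Iterable[int]) -> Optional[int]:
    """Return the index of the first lag sample that allows promotion, or None."""
    for index, lag in enumerate(lag_samples):
        if ready_for_promotion(lag):
            return index
    return None


# A lag of exactly 1000 is not yet below the threshold, so index 3 is the first hit.
print(first_ready_sample([52_000, 9_400, 1_000, 640]))
```
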
HTTP: POST /v1/admin/cluster/join
Request: JoinNodeRequest
| Field | Type | Description |
|---|---|---|
| node_id | uint64 | Numeric node ID to assign. Must be unique across the cluster. The target node must already be running and listening on address. |
| address | string | gRPC address of the new node in "host:port" format. The leader will replicate log entries to this address. |
| pre_seeded | bool | If true, the new node has pre-seeded data from an offline snapshot backup. Used as a hint: the leader logs "Tier 3 skip expected" in the initial progress event. Actual Tier-3 skip behavior runs on the joining node side. |
Response: JoinNodeResponse
| Field | Type | Description |
|---|---|---|
| node_id | uint64 | Numeric node ID of the joining node (echoed from request). |
| status | string | Human-readable status. Currently always "JOIN_INITIATED". |
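
A request for this endpoint might be assembled as in the sketch below, which builds the HTTP request without sending it. It assumes the proxy accepts the proto field names as JSON keys and a plain JSON number for `node_id`; the helper name `build_join_request`, the host `leader.example`, and the sample values are hypothetical.

```python
import json
from urllib.request import Request


def build_join_request(base_url: str, node_id: int, address: str,
                       pre_seeded: bool = False) -> Request:
    """Build (but do not send) a POST /v1/admin/cluster/join request."""
    if ":" not in address:
        raise ValueError('address must be in "host:port" format')
    body = {"node_id": node_id, "address": address, "pre_seeded": pre_seeded}
    return Request(
        url=f"{base_url}/v1/admin/cluster/join",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The request must target the leader; this host name is a placeholder.
req = build_join_request("http://leader.example:7081", 4, "10.0.0.14:7080")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the result to `urllib.request.urlopen` would send it. A successful call returns as soon as the Learner is added, so the response alone does not indicate that the node has become a Voter.
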
JoinProgress
Stream join progress events for a node join in progress. Returns a server-streaming response of JoinStatus messages until the join reaches COMPLETE or FAILED phase. The stream closes automatically after either terminal phase is emitted. Returns NOT_FOUND if JoinNode was not called for this node_id or if the join has already completed.
Transport: Server-streaming gRPC only (no HTTP transcoding)
Request: JoinProgressRequest
| Field | Type | Description |
|---|---|---|
| node_id | uint64 | Node ID whose join lifecycle to observe (must match a JoinNode call). |
Response: stream JoinStatus
| Field | Type | Description |
|---|---|---|
| node_id | uint64 | Node ID of the joining node. |
| phase | JoinPhase | Current join phase. |
| lag_entries | uint64 | Number of Raft log entries behind the leader. Zero when unknown or complete. |
| percent | uint32 | Estimated completion percentage (0–100). 100 means complete. |
| message | string | Human-readable status message for display in CLI --follow output. |
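
The consumption pattern for this stream can be sketched without a live server. The example below uses a local dataclass as a stand-in for the generated JoinStatus message and a plain list in place of the gRPC stream; both are assumptions for illustration, as are the `follow_join` name and the sample events.

```python
from dataclasses import dataclass
from typing import Iterable

TERMINAL_PHASES = {"JOIN_PHASE_COMPLETE", "JOIN_PHASE_FAILED"}


@dataclass
class JoinStatus:
    """Local stand-in for the JoinStatus message (not the generated class)."""
    node_id: int
    phase: str
    lag_entries: int = 0
    percent: int = 0
    message: str = ""


def follow_join(events: Iterable[JoinStatus]) -> bool:
    """Print each event; return True on COMPLETE and False on FAILED.

    Raises RuntimeError if the stream ends before a terminal phase.
    """
    for event in events:
        print(f"[{event.percent:3d}%] {event.phase} lag={event.lag_entries} {event.message}")
        if event.phase in TERMINAL_PHASES:
            return event.phase == "JOIN_PHASE_COMPLETE"
    raise RuntimeError("stream closed before COMPLETE or FAILED")


# Hypothetical event sequence standing in for a real server stream.
demo = [
    JoinStatus(4, "JOIN_PHASE_LEARNER", lag_entries=8200, percent=10, message="replicating"),
    JoinStatus(4, "JOIN_PHASE_READY_CHECK", lag_entries=640, percent=80, message="lag below threshold"),
    JoinStatus(4, "JOIN_PHASE_PROMOTING", lag_entries=12, percent=95, message="changing membership"),
    JoinStatus(4, "JOIN_PHASE_COMPLETE", percent=100, message="node is a Voter"),
]
print(follow_join(demo))
```

With a real client, the iterable would be the server-streaming call object returned by the generated gRPC stub.
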
AddNode
Add a node to the cluster (low-level, auto-assigned ID). Deprecated: use JoinNode with an explicit node_id and address instead. This RPC is retained for API compatibility but returns UNIMPLEMENTED.
HTTP: POST /v1/admin/cluster/nodes
Request: AddNodeRequest
| Field | Type | Description |
|---|---|---|
| address | string | Address of the node to add. |
Response: ClusterNode
| Field | Type | Description |
|---|---|---|
| node_id | string | Numeric node ID as string (matches --id flag in CLI). |
| address | string | gRPC address of the node (host:port). May be empty for follower self-report. |
| role | NodeRole | Raft role of the node. See NodeRole. |
| state | NodeState | Health state of the node. See NodeState. |
| lag_entries | uint64 | Number of Raft log entries this node is behind the leader. Zero for the leader itself. Used for read routing staleness checks. |
RemoveNode
Remove a node from the cluster.
HTTP: DELETE /v1/admin/cluster/nodes/{node_id}
Request: RemoveNodeRequest
| Field | Type | Description |
|---|---|---|
| node_id | string | ID of the node to remove. |
Response: RemoveNodeResponse
Empty response (no fields).
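
Because `node_id` travels in the URL path here rather than in a body, a client has to substitute and escape it. A minimal sketch that builds the request without sending it; the helper name `build_remove_request` and the host `leader.example` are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request


def build_remove_request(base_url: str, node_id: str) -> Request:
    """Build (but do not send) DELETE /v1/admin/cluster/nodes/{node_id}."""
    if not node_id:
        raise ValueError("node_id must not be empty")
    # safe="" also escapes "/", so the ID always stays a single path segment
    path = f"/v1/admin/cluster/nodes/{quote(node_id, safe='')}"
    return Request(url=base_url.rstrip("/") + path, method="DELETE")


req = build_remove_request("http://leader.example:7081/", "3")
print(req.get_method(), req.full_url)
```
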
DecommissionNode
Gracefully decommission a node from the cluster. Implements the Phase 0-2 decommission protocol:

- Phase 0: Quorum gate. Verify that removing node_id still leaves ≥ 2 voters. Aborts with FAILED_PRECONDITION if quorum would be lost.
- Phase 1: Leadership transfer. If node_id is the current Raft leader, leadership is transferred to a peer before removal. For self-decommission (leader removing itself), the request is internally forwarded to the new leader after transfer.
- Phase 2: Membership remove. change_membership(remove: node_id). The node stops receiving log replication and is removed from quorum.

CE behaviour: single Raft group, all nodes hold full data. Pruning (Phase 3) is advisory: the operator must delete data on the decommissioned node manually.

Must be called on the current Raft leader, or on the node being decommissioned when that node is the leader (self-decommission path). Use force=true for emergency removal of an unreachable node (may lose data).
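
The phase ordering can be modelled in a few lines. This is a simplified sketch of the graceful path only (no force handling), assuming the voter set and leader ID are known to the caller. The function names and the phase labels are hypothetical, and the server reports a failed quorum gate as FAILED_PRECONDITION rather than a Python exception.

```python
from typing import List, Set


def quorum_gate(voters: Set[int], node_id: int) -> bool:
    """Phase 0: removal must leave at least 2 voters."""
    remaining = voters - {node_id}
    return len(remaining) >= 2


def plan_decommission(voters: Set[int], leader_id: int, node_id: int) -> List[str]:
    """Return the ordered phases a graceful decommission would run.

    Raises ValueError when the node is not a voter or the quorum gate fails.
    """
    if node_id not in voters:
        raise ValueError(f"node {node_id} is not in the voter set")
    if not quorum_gate(voters, node_id):
        raise ValueError("quorum would be lost")
    phases = ["quorum_gate"]
    if node_id == leader_id:
        phases.append("leadership_transfer")  # Phase 1 applies only to the leader
    phases.append("membership_remove")  # Phase 2
    return phases


print(plan_decommission({1, 2, 3}, leader_id=1, node_id=1))
print(plan_decommission({1, 2, 3}, leader_id=1, node_id=3))
```

Under this rule a two-voter cluster cannot gracefully decommission either node, since only one voter would remain.
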
HTTP: POST /v1/admin/cluster/decommission
Request: DecommissionNodeRequest
| Field | Type | Description |
|---|---|---|
| node_id | uint64 | Numeric node ID to decommission. Must be in the current voter set. The node will be removed from Raft membership after quorum and drain checks pass. |
| pruning | bool | If true, the decommissioned node's data should be wiped after removal. In CE, this is advisory: the operator must delete data on node_id manually. The response will set operator_cleanup_required=true as a reminder. In EE, storage is freed progressively as each shard move completes (Phase 1). |
| force | bool | Emergency decommission: skip quorum gate and drain, and force the membership remove even if node_id is unreachable. May cause permanent data loss if node_id held the only copy of any shard. Requires skip_confirmation=true to guard against accidental invocation. |
| skip_confirmation | bool | Must be set to true when force=true to confirm awareness of potential data loss. Prevents accidental use of --force without understanding the consequences. Ignored when force=false. |
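
The coupling between `force` and `skip_confirmation` can also be checked on the client before a request is sent. The sketch below is one possible pre-flight check, not the server's validation logic; the helper name `build_decommission_body` is hypothetical, and it assumes the proxy accepts proto field names as JSON keys.

```python
import json


def build_decommission_body(node_id: int, *, pruning: bool = False,
                            force: bool = False,
                            skip_confirmation: bool = False) -> str:
    """Serialise a DecommissionNodeRequest body, rejecting unconfirmed force."""
    if force and not skip_confirmation:
        raise ValueError("force=true requires skip_confirmation=true")
    body = {
        "node_id": node_id,
        "pruning": pruning,
        "force": force,
        # ignored by the server when force is false, sent unchanged here
        "skip_confirmation": skip_confirmation,
    }
    return json.dumps(body, sort_keys=True)


print(build_decommission_body(3, pruning=True))
print(build_decommission_body(3, force=True, skip_confirmation=True))
```
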
Response: DecommissionNodeResponse
| Field | Type | Description |
|---|---|---|
| node_id | uint64 | Numeric node ID that was decommissioned. |
| message | string | Human-readable status message describing the phases executed. |
| operator_cleanup_required | bool | Set when pruning=true was requested. In CE, the operator must manually delete data on the decommissioned node. In EE, storage is freed automatically during Phase 1 shard moves. |
Types
AddNodeRequest
| Field | Type | Description |
|---|---|---|
| address | string | Address of the node to add. |
ClusterNode
| Field | Type | Description |
|---|---|---|
| node_id | string | Numeric node ID as string (matches --id flag in CLI). |
| address | string | gRPC address of the node (host:port). May be empty for follower self-report. |
| role | NodeRole | Raft role of the node. See NodeRole. |
| state | NodeState | Health state of the node. See NodeState. |
| lag_entries | uint64 | Number of Raft log entries this node is behind the leader. Zero for the leader itself. Used for read routing staleness checks. |
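
To illustrate how `lag_entries` could feed a staleness check, the sketch below filters ClusterNode records by health and lag. The server's actual read-routing policy is not described on this page, so the `read_candidates` name, the `max_lag_entries` parameter, the decision to require NODE_STATE_HEALTHY, and the sample nodes are all assumptions.

```python
from typing import Dict, List


def read_candidates(nodes: List[Dict], max_lag_entries: int) -> List[str]:
    """Return IDs of nodes fresh enough to serve a read, least-lagged first."""
    eligible = [
        node for node in nodes
        if node.get("state") == "NODE_STATE_HEALTHY"
        # uint64 may arrive as a JSON string, so normalise to int
        and int(node.get("lag_entries", 0)) <= max_lag_entries
    ]
    eligible.sort(key=lambda node: int(node.get("lag_entries", 0)))
    return [node["node_id"] for node in eligible]


# Hypothetical ClusterNode records, for illustration only.
nodes = [
    {"node_id": "2", "role": "NODE_ROLE_FOLLOWER", "state": "NODE_STATE_HEALTHY", "lag_entries": "120"},
    {"node_id": "1", "role": "NODE_ROLE_LEADER", "state": "NODE_STATE_HEALTHY", "lag_entries": "0"},
    {"node_id": "3", "role": "NODE_ROLE_FOLLOWER", "state": "NODE_STATE_DEGRADED", "lag_entries": "5000"},
    {"node_id": "4", "role": "NODE_ROLE_FOLLOWER", "state": "NODE_STATE_DOWN", "lag_entries": "0"},
]
print(read_candidates(nodes, max_lag_entries=500))
```
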
ClusterStatus
| Field | Type | Description |
|---|---|---|
| nodes | ClusterNode[] | Nodes in the cluster. |
| leader_id | string | Node ID of the current Raft leader. Empty string if no leader elected. |
| raft_term | uint64 | Current Raft term. Monotonically increasing across leader elections. |
DecommissionNodeRequest
Request to gracefully decommission a node from the cluster.
| Field | Type | Description |
|---|---|---|
| node_id | uint64 | Numeric node ID to decommission. Must be in the current voter set. The node will be removed from Raft membership after quorum and drain checks pass. |
| pruning | bool | If true, the decommissioned node's data should be wiped after removal. In CE, this is advisory: the operator must delete data on node_id manually. The response will set operator_cleanup_required=true as a reminder. In EE, storage is freed progressively as each shard move completes (Phase 1). |
| force | bool | Emergency decommission: skip quorum gate and drain, and force the membership remove even if node_id is unreachable. May cause permanent data loss if node_id held the only copy of any shard. Requires skip_confirmation=true to guard against accidental invocation. |
| skip_confirmation | bool | Must be set to true when force=true to confirm awareness of potential data loss. Prevents accidental use of --force without understanding the consequences. Ignored when force=false. |
DecommissionNodeResponse
Response from DecommissionNode.
| Field | Type | Description |
|---|---|---|
| node_id | uint64 | Numeric node ID that was decommissioned. |
| message | string | Human-readable status message describing the phases executed. |
| operator_cleanup_required | bool | Set when pruning=true was requested. In CE, the operator must manually delete data on the decommissioned node. In EE, storage is freed automatically during Phase 1 shard moves. |
GetClusterStatusRequest
No fields.
JoinNodeRequest
Request to initiate a cluster join for a new node.
| Field | Type | Description |
|---|---|---|
| node_id | uint64 | Numeric node ID to assign. Must be unique across the cluster. The target node must already be running and listening on address. |
| address | string | gRPC address of the new node in "host:port" format. The leader will replicate log entries to this address. |
| pre_seeded | bool | If true, the new node has pre-seeded data from an offline snapshot backup. Used as a hint: the leader logs "Tier 3 skip expected" in the initial progress event. Actual Tier-3 skip behavior runs on the joining node side. |
JoinNodeResponse
Response to JoinNode — the join lifecycle has been initiated.
| Field | Type | Description |
|---|---|---|
| node_id | uint64 | Numeric node ID of the joining node (echoed from request). |
| status | string | Human-readable status. Currently always "JOIN_INITIATED". |
JoinProgressRequest
Request to subscribe to join progress events.
| Field | Type | Description |
|---|---|---|
| node_id | uint64 | Node ID whose join lifecycle to observe (must match a JoinNode call). |
JoinStatus
A single join progress event, emitted during the join lifecycle. The stream emits an event at each phase transition and on each lag poll (roughly every 500 ms). The stream closes after COMPLETE or FAILED is emitted.
| Field | Type | Description |
|---|---|---|
| node_id | uint64 | Node ID of the joining node. |
| phase | JoinPhase | Current join phase. |
| lag_entries | uint64 | Number of Raft log entries behind the leader. Zero when unknown or complete. |
| percent | uint32 | Estimated completion percentage (0–100). 100 means complete. |
| message | string | Human-readable status message for display in CLI --follow output. |
RemoveNodeRequest
| Field | Type | Description |
|---|---|---|
| node_id | string | ID of the node to remove. |
RemoveNodeResponse
No fields.
Enums
JoinPhase
Phase progression for a node join lifecycle.
| Value | Number | Description |
|---|---|---|
| JOIN_PHASE_UNSPECIFIED | 0 | Unspecified (default value). |
| JOIN_PHASE_LEARNER | 1 | Node added as Learner: receiving log replication while replication lag closes. |
| JOIN_PHASE_READY_CHECK | 2 | Lag below threshold (1000 entries); promotion to Voter is imminent. |
| JOIN_PHASE_PROMOTING | 3 | change_membership in progress; the node is becoming a Voter. |
| JOIN_PHASE_COMPLETE | 4 | Node is now a Voter. Join complete. Stream closes after this event. |
| JOIN_PHASE_FAILED | 5 | Join failed (see message). Stream closes after this event. |
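
Clients that do not use generated protobuf code may want a local mirror of this enum. The sketch below copies the names and numbers from the table above; the `is_terminal` property and the `parse_phase` helper are hypothetical conveniences. Accepting both forms is useful because protobuf JSON normally emits enum names while binary-derived tooling often surfaces numbers.

```python
from enum import IntEnum
from typing import Union


class JoinPhase(IntEnum):
    """Local mirror of the JoinPhase enum (values copied from the table)."""
    JOIN_PHASE_UNSPECIFIED = 0
    JOIN_PHASE_LEARNER = 1
    JOIN_PHASE_READY_CHECK = 2
    JOIN_PHASE_PROMOTING = 3
    JOIN_PHASE_COMPLETE = 4
    JOIN_PHASE_FAILED = 5

    @property
    def is_terminal(self) -> bool:
        """True for the two phases after which the stream closes."""
        return self in (JoinPhase.JOIN_PHASE_COMPLETE, JoinPhase.JOIN_PHASE_FAILED)


def parse_phase(value: Union[int, str]) -> JoinPhase:
    """Accept either the enum name or its number."""
    if isinstance(value, str):
        return JoinPhase[value]  # raises KeyError for unknown names
    return JoinPhase(value)  # raises ValueError for unknown numbers


for raw in ("JOIN_PHASE_LEARNER", 3, "JOIN_PHASE_COMPLETE"):
    phase = parse_phase(raw)
    print(phase.name, int(phase), phase.is_terminal)
```
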
NodeRole
| Value | Number | Description |
|---|---|---|
| NODE_ROLE_UNSPECIFIED | 0 | Unspecified (default value). |
| NODE_ROLE_LEADER | 1 | Leader. Handles all writes and coordinates replication. |
| NODE_ROLE_FOLLOWER | 2 | Voting follower. Participates in elections and quorum. |
| NODE_ROLE_LEARNER | 3 | Non-voting learner. Receives replication but does not vote. |
NodeState
| Value | Number | Description |
|---|---|---|
| NODE_STATE_UNSPECIFIED | 0 | Unspecified (default value). |
| NODE_STATE_HEALTHY | 1 | Healthy. Lag is within the acceptable threshold. |
| NODE_STATE_DEGRADED | 2 | Degraded. Lag exceeds the threshold or the heartbeat is delayed. |
| NODE_STATE_DOWN | 3 | Down. Not reachable or no heartbeat received. |
