Architecture
Overview
mc-operator is a Kubernetes Operator built with .NET 10 and KubeOps 10. It manages MinecraftServer and MinecraftServerCluster custom resources through continuous reconciliation loops, ensuring the actual cluster state matches the desired state expressed in the CRD specs.
API group
The operator uses the API group mc-operator.dhv.sh. This domain is owned and maintained under the dhv.sh umbrella.
Repository layout
mc-operator/
├── .github/
│   ├── dependabot.yml          # Automated dependency updates
│   └── workflows/
│       ├── ci.yml              # Build + test on push/PR
│       ├── release-image.yml   # Publish container image on tag
│       └── release-chart.yml   # Package and publish Helm chart on tag
├── charts/
│   └── mc-operator/            # Helm chart (OCI-published to GHCR)
├── docs/                       # Astro Starlight documentation site
├── examples/                   # Example MinecraftServer manifests
├── manifests/
│   ├── crd/                    # CustomResourceDefinition YAML (MinecraftServer + MinecraftServerCluster)
│   ├── rbac/                   # ClusterRole + ClusterRoleBinding
│   └── operator/               # Deployment, Service, webhook configs (Kustomize)
└── src/
    ├── McOperator/             # Main operator application
    └── McOperator.Tests/       # Unit tests (TUnit)

Core components
MinecraftServer CRD
The MinecraftServer CRD (mc-operator.dhv.sh/v1alpha1) is the operator’s public API. Every field in the spec maps directly to a Kubernetes resource or container configuration. The CRD is fully documented in the CRD reference.
Controller (reconciler)
MinecraftServerController implements IEntityController<MinecraftServer>. On each reconcile it:
- Sets the status phase to Provisioning
- Reconciles the ConfigMap (audit-visible server.properties)
- Reconciles the Service (ClusterIP/NodePort/LoadBalancer)
- Pre-pulls the new image and server jar when spec.prePull: true (see Pre-pull Jobs below)
- Reconciles the StatefulSet (the actual server workload)
- Reads StatefulSet status to determine the current phase
- Updates the status subresource with endpoint and PVC info
All child resources carry owner references that point at the parent MinecraftServer, so Kubernetes garbage-collects them automatically when the parent is deleted.
Admission webhooks
Validating webhook (MinecraftServerValidationWebhook): Rejects invalid specs at admission time, before resources are created or updated. It validates:
- EULA acceptance
- Minecraft version not blank
- JVM memory values parseable and maxMemory >= initialMemory
- NodePort value only provided for NodePort service type
- NodePort in the 30000–32767 range
- Storage size a valid Kubernetes resource quantity
- Mount path absolute
- Immutable fields not changed on update (storage.enabled, storage.size, storage.storageClassName)
- Server port and view distance in range
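Two of the checks above can be illustrated with a minimal sketch. It is written in Python for brevity rather than the operator's own C#, and the function names, the accepted memory suffixes (K, M, G), and the error strings are all assumptions, not the operator's actual code.

```python
import re

# Hypothetical unit table for JVM-style memory strings such as "512M" or "2G".
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_jvm_memory(value: str) -> int:
    """Convert a memory string to bytes, raising ValueError if it cannot be parsed."""
    match = re.fullmatch(r"(\d+)([KMG])", value.strip().upper())
    if match is None:
        raise ValueError(f"unparseable memory value: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def validate_spec(initial_memory, max_memory, service_type, node_port=None):
    """Return a list of validation errors; an empty list means the spec passed."""
    errors = []
    try:
        if parse_jvm_memory(max_memory) < parse_jvm_memory(initial_memory):
            errors.append("maxMemory must be >= initialMemory")
    except ValueError as exc:
        errors.append(str(exc))
    if node_port is not None:
        if service_type != "NodePort":
            errors.append("nodePort is only allowed for the NodePort service type")
        elif not 30000 <= node_port <= 32767:
            errors.append("nodePort must be in the 30000-32767 range")
    return errors
```

Collecting every error instead of stopping at the first lets a rejected request report all problems in one admission response; whether the real webhook behaves this way is not stated in this document.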
Mutating webhook (MinecraftServerMutationWebhook): Normalizes specs on create/update:
- Applies a default MOTD based on server name and type if not set
- Normalizes levelType to uppercase
- Trims whitespace from string fields
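The three normalization steps can be sketched as a pure function over a spec dictionary. This is an illustrative Python sketch only: the flat dictionary shape, the motd key, and the default MOTD format are assumptions, since this document does not state them.

```python
def mutate_spec(name: str, server_type: str, spec: dict) -> dict:
    """Return a normalized copy of spec without modifying the input."""
    # Trim whitespace from every top-level string field.
    result = {k: v.strip() if isinstance(v, str) else v for k, v in spec.items()}
    # Normalize levelType to uppercase when it is present.
    if result.get("levelType"):
        result["levelType"] = result["levelType"].upper()
    # Apply a default MOTD only when none was supplied (format is hypothetical).
    if not result.get("motd"):
        result["motd"] = f"{name} ({server_type})"
    return result
```

For example, mutate_spec("survival", "Paper", {"levelType": " flat "}) yields a spec whose levelType is FLAT and whose motd is filled in from the name and type.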
Finalizer
MinecraftServerFinalizer runs when a MinecraftServer is deleted. When spec.storage.deleteWithServer: true, it:
- Identifies the PVC created by the StatefulSet (data-<name>-0)
- Deletes it explicitly (PVCs from VolumeClaimTemplates are not owned by the MinecraftServer, so they won’t be garbage-collected automatically)
When deleteWithServer: false (the default), the finalizer runs but intentionally does nothing — the PVC is retained.
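The finalizer's decision reduces to a small function, sketched here in Python. The function name is hypothetical; the PVC name follows the data-<name>-0 pattern given above.

```python
def pvc_to_delete(server_name: str, delete_with_server: bool):
    """Return the PVC name the finalizer would delete, or None to retain the PVC."""
    if not delete_with_server:
        return None  # default behaviour: world data is kept
    # StatefulSet PVCs are named <claim-template>-<statefulset>-<ordinal>;
    # "data" and ordinal 0 match the data-<name>-0 pattern described above.
    return f"data-{server_name}-0"
```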
Key design decisions
StatefulSet over Deployment
Minecraft is a stateful singleton. StatefulSets provide:
- Stable pod identity (server-name-0): deterministic PVC naming
- VolumeClaimTemplates: native pod-to-PVC relationship management
- Ordered updates: prevents concurrent pod replacement that could corrupt state
- WhenScaled: Retain: PVC not deleted on scale-down
itzg/minecraft-server as the base image
Rather than maintaining separate images per distribution, the operator sets the TYPE and VERSION environment variables on the itzg/minecraft-server image. This image is the community standard, handles all distributions, and is actively maintained.
Users can override with spec.image for full control (e.g. custom images with pre-installed plugins).
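A minimal Python sketch of how the container image and environment might be derived from the spec. The function name is hypothetical. Passing TYPE and VERSION through unchanged, and setting them even when a custom image is used, are both assumptions that this document does not confirm.

```python
DEFAULT_IMAGE = "itzg/minecraft-server"  # image tag handling is omitted in this sketch


def container_image_and_env(server_type: str, version: str, image_override=None):
    """Return (image, env) for the server container in Kubernetes env-var form."""
    image = image_override or DEFAULT_IMAGE
    env = [
        {"name": "TYPE", "value": server_type},
        {"name": "VERSION", "value": version},
    ]
    return image, env
```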
Data safety defaults
PVCs use ReadWriteOnce access mode by default. When spec.prePull: true is set, the access mode is switched to ReadWriteMany so the running server pod and a short-lived pre-pull Job can both mount the same volume simultaneously during a version upgrade. Shared-filesystem storage such as NFS, Longhorn, and Rook-Ceph supports ReadWriteMany, but many block-storage volumes allow only ReadWriteOnce, so confirm that your storage class offers ReadWriteMany before enabling pre-pull.
PVCs are retained by default on deletion (deleteWithServer: false). World data is irreplaceable; the operational cost of a retained PVC is negligible compared to the risk of accidental deletion.
Storage fields (enabled, size, storageClassName) are immutable after creation, enforced by the validating webhook. This mirrors StatefulSet semantics: a StatefulSet's volumeClaimTemplates cannot be modified once the StatefulSet exists.
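The immutability rule amounts to comparing the old and new storage blocks on update. A small Python sketch follows, with a hypothetical function name and a plain dictionary standing in for the storage spec:

```python
IMMUTABLE_STORAGE_FIELDS = ("enabled", "size", "storageClassName")


def changed_immutable_fields(old_storage: dict, new_storage: dict) -> list:
    """List the immutable storage fields whose values differ between two specs."""
    return [
        f"storage.{field}"
        for field in IMMUTABLE_STORAGE_FIELDS
        if old_storage.get(field) != new_storage.get(field)
    ]
```

An update would be rejected whenever the returned list is non-empty; other storage fields, such as deleteWithServer, are not compared here.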
API version (v1alpha1)
The API is at v1alpha1 even though the software is a working v1. This signals that the spec may evolve before it stabilizes. The graduation path is: v1alpha1 → v1beta1 → v1.
Reconciliation model
Reconcile(server):
  1. status.phase = Provisioning
  2. ConfigMap: get-or-create / update
  3. Service: get-or-create / update
  4. Pre-pull: if spec.prePull is true and image changed on an active server
       → create Job (mounts PVC, pulls image + downloads jar)
       requeue 30s until Job complete, then proceed
  5. StatefulSet: get-or-create / update
  6. Read StatefulSet.status.readyReplicas
       - If replicas==0 → phase=Paused
       - If readyReplicas==1 → phase=Running
       - Else → phase=Provisioning
  7. Update status (phase, endpoint, PVC info, conditions)
  8. Requeue after 5m (drift detection)

On error, the controller requeues after 30 seconds.
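Step 6 of the loop above is a pure mapping from replica counts to a phase. It can be sketched in Python as follows; the function name is hypothetical, and the branch order follows the pseudocode.

```python
def determine_phase(spec_replicas: int, ready_replicas: int) -> str:
    """Map the desired and ready replica counts to a status phase."""
    if spec_replicas == 0:
        return "Paused"        # server intentionally scaled to zero
    if ready_replicas == 1:
        return "Running"       # the single server pod is ready
    return "Provisioning"      # pod not ready yet (starting or restarting)
```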
Pre-pull Jobs
When spec.prePull: true is set and a MinecraftServer version or image is changed, the operator creates a short-lived batch/v1 Job (<server-name>-prepull) before updating the StatefulSet. This minimises downtime by ensuring the new image layers and server jar are already present when the rolling update begins. Pre-pull is disabled by default.
The Job runs in one of two modes:
| Mode | Condition | Behaviour |
|---|---|---|
| Jar-download | Default itzg image + storage.enabled: true | Mounts the data PVC, runs the itzg /start script with a fake java stub — startup scripts download the server jar to the PVC, then the stub exits 0 |
| OCI-only | Custom spec.image or storage.enabled: false | Runs sh -c "exit 0" to force OCI layers onto the node’s image cache; no jar download |
The Job targets the specific node where the server pod is running (spec.nodeName), so cached image layers are available on exactly the right node. SKIP_SERVER_PROPERTIES=true prevents the startup scripts from overwriting config files that the live server is actively using.
Pre-pull is skipped when spec.prePull is false (the default), status.currentImage is unset (fresh server), the desired image matches the current image, or spec.replicas == 0 (paused server). If the Job fails, the upgrade proceeds anyway — pre-pull failure is a warning, not a blocker.
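The skip conditions and the mode table can be combined into one decision function. This is a Python sketch under stated assumptions: the function and parameter names are hypothetical, and the mode strings simply mirror the table labels.

```python
from typing import Optional


def prepull_mode(pre_pull: bool, current_image: Optional[str], desired_image: str,
                 replicas: int, custom_image: bool, storage_enabled: bool) -> Optional[str]:
    """Return "jar-download" or "oci-only", or None when pre-pull is skipped."""
    if not pre_pull:                      # disabled (the default)
        return None
    if current_image is None:             # fresh server, nothing to upgrade from
        return None
    if current_image == desired_image:    # no image change
        return None
    if replicas == 0:                     # paused server
        return None
    if not custom_image and storage_enabled:
        return "jar-download"             # default itzg image with a data PVC
    return "oci-only"                     # custom image or no storage
```

Job failure is not modelled here: per the text above, a failed Job does not block the upgrade.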
MinecraftServerCluster CRD
The MinecraftServerCluster CRD (mc-operator.dhv.sh/v1alpha1) manages a fleet of backend MinecraftServer instances behind a Velocity proxy. It is fully documented in the CRD reference.
Cluster controller (reconciler)
MinecraftServerClusterController implements IEntityController<MinecraftServerCluster>. On each reconcile it:
- Determines the desired server count from the scaling configuration
- Reconciles backend MinecraftServer instances (create, update, or delete to match the desired count)
- Builds the server address list for the Velocity configuration
- Reconciles the proxy ConfigMap (velocity.toml + forwarding.secret)
- Reconciles the proxy Service (the player-facing endpoint)
- Reconciles the proxy Deployment (the Velocity proxy workload)
- Updates the status subresource with backend server status, proxy endpoint, and conditions
This order ensures backend servers exist before the proxy is configured to route to them.
Velocity proxy deployment strategy
The Velocity proxy runs as a Kubernetes Deployment, not a StatefulSet. This is because the proxy is stateless: it maintains only in-memory player sessions and can be replaced at any time without data loss. A Deployment provides:
- Rolling restarts without complex orchestration
- Standard horizontal scaling (though the operator currently deploys 1 replica)
- No need for PVCs or stable pod identities
Server auto-registration in velocity.toml
When backend servers are created or removed, the cluster controller regenerates the velocity.toml configuration in a ConfigMap. The [servers] section lists all backend server addresses (using Kubernetes internal DNS), and the [servers.try] list controls the order players are assigned to servers.
The proxy Deployment mounts this ConfigMap as a volume. Configuration changes trigger a pod restart via annotation-based rollout.
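A Python sketch of how the server list might be rendered. In Velocity's configuration file the try list is a key inside the [servers] table, which is the layout produced here. The function names, the Service DNS pattern, and the default port 25565 are assumptions; the operator's real generator is not shown in this document.

```python
def backend_address(name: str, namespace: str, port: int = 25565) -> str:
    """Build a cluster-internal address, assuming one Service per backend server."""
    # Assumed DNS pattern: <service>.<namespace>.svc.cluster.local
    return f"{name}.{namespace}.svc.cluster.local:{port}"


def render_servers_section(servers: dict, try_order: list) -> str:
    """Render the [servers] table of velocity.toml from a name -> address mapping."""
    lines = ["[servers]"]
    for name, address in servers.items():
        lines.append(f'{name} = "{address}"')
    quoted = ", ".join(f'"{name}"' for name in try_order)
    lines.append(f"try = [{quoted}]")
    return "\n".join(lines)
```

Server names are written as bare TOML keys, which is valid as long as they contain only letters, digits, hyphens, and underscores; Kubernetes resource names that contain dots would need quoting.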
Forwarding secret management
When the player forwarding mode is Modern or BungeeGuard, the operator generates a random forwarding secret (UUID) and stores it in the proxy ConfigMap as forwarding.secret. The same ConfigMap is mounted into the proxy pod. Backend servers are configured with onlineMode: false so that Velocity handles player authentication.
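Secret generation can be sketched in a few lines of Python. Reusing an existing secret so that it stays stable across reconciles is an assumption about sensible behaviour, not something this document states, and the function name is hypothetical.

```python
import uuid


def forwarding_secret(mode: str, existing=None):
    """Return the forwarding secret to store, or None if the mode needs no secret."""
    if mode not in ("Modern", "BungeeGuard"):
        return None
    # Assumption: keep a previously generated secret rather than rotating it
    # on every reconcile; otherwise generate a fresh random UUID.
    return existing or str(uuid.uuid4())
```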
Cluster reconciliation model
Reconcile(cluster):
  1. desiredCount = scaling.mode == Static ? scaling.replicas : scaling.minReplicas
  2. Reconcile backend MinecraftServers (scale up/down, update template)
  3. Build server address list from live MinecraftServer instances
  4. ConfigMap: velocity.toml + forwarding.secret → get-or-create / update
  5. Service: proxy service → get-or-create / update
  6. Deployment: Velocity proxy → get-or-create / update
  7. Read backend server phases + proxy Deployment status
       - If all servers Running + proxy ready → phase=Running
       - If some servers not ready → phase=Degraded
       - Else → phase=Provisioning
  8. Update status (phase, endpoint, server statuses, conditions)
  9. Requeue after 5m (drift detection)

On error, the cluster controller requeues after 30 seconds.
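Steps 1 and 7 of the cluster loop are pure functions and can be sketched in Python. The function names and the dictionary shape are hypothetical, and reporting Provisioning for a cluster with no backend servers yet is an assumption.

```python
def desired_count(scaling: dict) -> int:
    """Step 1: Static mode uses replicas; any other mode starts at minReplicas."""
    if scaling["mode"] == "Static":
        return scaling["replicas"]
    return scaling["minReplicas"]


def cluster_phase(server_phases: list, proxy_ready: bool) -> str:
    """Step 7: derive the cluster phase from backend phases and proxy readiness."""
    all_running = bool(server_phases) and all(p == "Running" for p in server_phases)
    if all_running and proxy_ready:
        return "Running"
    if any(p != "Running" for p in server_phases):
        return "Degraded"
    # Servers are all running but the proxy is not ready, or there are no servers yet.
    return "Provisioning"
```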
Cluster admission webhooks
Validating webhook (MinecraftServerClusterValidationWebhook): Validates the template (same rules as MinecraftServer), scaling configuration (replicas ≥ 1 for Static, minReplicas ≤ maxReplicas for Dynamic, policy required for Dynamic), and proxy settings (port range, NodePort rules).
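The scaling rules listed above translate directly into a small check, sketched here in Python. The function name, the dictionary shape, and the error messages are assumptions; template and proxy validation are left out.

```python
def validate_scaling(scaling: dict) -> list:
    """Return validation errors for the scaling block; an empty list means valid."""
    errors = []
    mode = scaling.get("mode")
    if mode == "Static":
        if scaling.get("replicas", 0) < 1:
            errors.append("replicas must be >= 1 for Static scaling")
    elif mode == "Dynamic":
        if scaling.get("minReplicas", 0) > scaling.get("maxReplicas", 0):
            errors.append("minReplicas must be <= maxReplicas")
        if not scaling.get("policy"):
            errors.append("policy is required for Dynamic scaling")
    return errors
```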
Mutating webhook (MinecraftServerClusterMutationWebhook): Applies default MOTD, normalizes levelType, and trims whitespace — the same normalization as the MinecraftServer webhook, applied to the cluster template and proxy spec.