Node Management
Cordon, drain, and migrate workloads between K8s nodes with zero data loss
Overview
PodWarden provides node-level operations for K8s cluster maintenance — cordoning, draining, and workload migration. These are available from the cluster detail page and through the API.
Cordon & Uncordon
Cordoning a node marks it as unschedulable. Existing pods continue running, but the Kubernetes scheduler won't place new pods on it.
| Operation | Effect |
|---|---|
| Cordon | Node becomes unschedulable — new pods go elsewhere |
| Uncordon | Node becomes schedulable again |
Cordoning is non-disruptive. Use it as the first step before draining or when you want to temporarily stop scheduling without evicting running workloads.
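As a rough illustration, a cordon or uncordon call through the API could be prepared as follows. The route shape is taken from the API reference on this page; the base URL, the bearer-token header, and the helper name `node_action_request` are assumptions made for this sketch, not confirmed PodWarden behavior.

```python
from urllib.parse import quote
from urllib.request import Request

ALLOWED_ACTIONS = ("cordon", "uncordon")


def node_action_request(base_url: str, cluster_id: str, node_name: str,
                        action: str, token: str) -> Request:
    """Build (but do not send) a POST request for a cordon/uncordon call."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    # Percent-encode path segments so unusual names cannot alter the route.
    url = (f"{base_url.rstrip('/')}/clusters/{quote(cluster_id, safe='')}"
           f"/nodes/{quote(node_name, safe='')}/{action}")
    # Bearer auth is an assumption; use the scheme your installation expects.
    return Request(url, method="POST",
                   headers={"Authorization": f"Bearer {token}"})
```

The returned object can be passed to `urllib.request.urlopen` to send it. Building the request separately from sending it keeps the route logic easy to check without a live cluster.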
Draining a Node
Draining evicts all pods from a node, moving them to other nodes in the cluster. PodWarden automatically cordons the node before draining.
Options
| Option | Default | Description |
|---|---|---|
| force | false | Force eviction even for unmanaged pods |
| ignore_daemonsets | true | Skip DaemonSet-managed pods (expected on every node) |
| delete_emptydir_data | true | Allow deletion of pods using emptyDir volumes |
| timeout_seconds | 120 | Maximum wait time for graceful eviction |
What happens during drain
- Node is cordoned (if not already)
- All non-DaemonSet pods are evicted via the Kubernetes eviction API
- Kubernetes respects PodDisruptionBudgets during eviction
- Evicted pods are rescheduled to other available nodes
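The drain options can be assembled into a request body along these lines. The option names and defaults come from the table above; the helper name `drain_body`, the rejection of unknown keys, and the positive-timeout check are assumptions of this sketch rather than documented server-side validation.

```python
import json

# Defaults as listed in the drain options table.
DRAIN_DEFAULTS = {
    "force": False,
    "ignore_daemonsets": True,
    "delete_emptydir_data": True,
    "timeout_seconds": 120,
}


def drain_body(**overrides) -> str:
    """Return a JSON drain body: documented defaults plus caller overrides."""
    unknown = set(overrides) - set(DRAIN_DEFAULTS)
    if unknown:
        # Catch typos locally instead of sending an option the API ignores.
        raise ValueError(f"unknown drain options: {sorted(unknown)}")
    options = {**DRAIN_DEFAULTS, **overrides}
    # Assumption: a non-positive timeout is never what the caller intended.
    if options["timeout_seconds"] <= 0:
        raise ValueError("timeout_seconds must be positive")
    return json.dumps(options)
```

For example, `drain_body(timeout_seconds=300)` keeps the other three defaults and only lengthens the graceful-eviction window.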
Workload Migration
To move a deployment from one node to another:
- Cordon the source node
- Drain the source node (evicts all pods)
- Update placement on the deployment to target the new node
- Redeploy the workload
Persistent volume handling
With Longhorn distributed storage (the default for PodWarden-managed clusters), persistent volumes are replicated across nodes. Migration does not require manual data copying — Longhorn handles volume availability on the target node.
Without Longhorn (e.g., local-path provisioner), volumes are pinned to a single node. Migration then requires either copying the data manually or running undeploy → change placement → redeploy, which creates new, empty PVCs on the target node.
Pre-flight checks
Before deploying to a new node, PodWarden checks:
- PV node affinity: If the persistent volume has a node affinity constraint, PodWarden verifies the target node is in the allowed list. Returns a 409 Conflict if the node is incompatible.
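To make the affinity check concrete, here is a minimal sketch of how a PV's node affinity can be evaluated against a node's labels. It applies standard Kubernetes node-selector semantics to `spec.nodeAffinity.required` and is not PodWarden's actual implementation; the function names are hypothetical, and `matchFields` and the `Gt`/`Lt` operators are deliberately left out.

```python
def expression_matches(expr: dict, labels: dict) -> bool:
    """Evaluate one matchExpressions entry against a node's labels."""
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "In":
        return labels.get(key) in values
    if op == "NotIn":
        # A node without the label also satisfies NotIn.
        return labels.get(key) not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"operator not handled in this sketch: {op}")


def pv_allows_node(pv: dict, node_labels: dict) -> bool:
    """True if the PV has no required node affinity or the node satisfies it."""
    required = pv.get("spec", {}).get("nodeAffinity", {}).get("required")
    if not required:
        return True  # unconstrained volume
    for term in required.get("nodeSelectorTerms", []):
        expressions = term.get("matchExpressions", [])
        # Terms are ORed; expressions inside a term are ANDed.
        # A term with no expressions is treated as matching nothing.
        if expressions and all(expression_matches(e, node_labels)
                               for e in expressions):
            return True
    return False
```

A volume from a node-local provisioner typically carries an `In` expression on `kubernetes.io/hostname`, so only the node that holds the data passes this check.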
Verify Longhorn CSI is healthy on the target node before migrating stateful workloads. A node can show as Ready in Kubernetes while its Longhorn CSI registration is still initializing. If the driver.longhorn.io storage driver is not registered on the target node, PVC attachment will fail and the workload will not start — even though PodWarden reports migration as successful.
Before migrating, verify Longhorn is ready on the target node:
kubectl get pods -n longhorn-system -o wide | grep <target-node-name>
All Longhorn pods on the target node should be in Running state (not CrashLoopBackOff or Init). If they are not, wait for them to stabilize before migrating stateful workloads.
Symptom of this issue: Ingress stays active but backend traffic returns 503. The workload appears deployed in PodWarden but pods are not running because the PVC cannot attach.
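The manual check above can also be scripted. The sketch below inspects the parsed output of `kubectl get pods -n longhorn-system -o json` and lists pods on the target node that are not fully ready. The function name is hypothetical. The readiness rule (phase `Running` and every container reporting `ready`) is an assumption that is stricter than checking the phase alone, because a pod in CrashLoopBackOff can still report phase `Running`.

```python
def unhealthy_pods_on_node(pod_list: dict, node_name: str) -> list:
    """Return names of pods scheduled on node_name that are not fully ready."""
    problems = []
    for pod in pod_list.get("items", []):
        if pod.get("spec", {}).get("nodeName") != node_name:
            continue  # pod runs on a different node
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        healthy = (
            status.get("phase") == "Running"
            and bool(containers)
            and all(c.get("ready") for c in containers)
        )
        if not healthy:
            problems.append(pod["metadata"]["name"])
    return problems
```

An empty result suggests the Longhorn components on that node have stabilized. Pods that legitimately run to completion would also be flagged, so treat a non-empty list as a prompt to look closer rather than a definitive failure.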
Post-deploy health check
After deploying to the new node, PodWarden polls pod readiness for up to 90 seconds. If pods don't reach a healthy state, the deployment status transitions to error with a diagnostic message.
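The polling behavior described above follows a common deadline-based pattern, sketched here. Only the 90-second limit comes from this page; the 5-second interval, the function name, and the injectable `clock` and `sleep` parameters are assumptions of the sketch, and PodWarden's internal implementation may differ.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout_seconds: float = 90.0,
    interval_seconds: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_ready until it returns True or the deadline passes."""
    deadline = clock() + timeout_seconds
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            # Caller would mark the deployment as errored at this point.
            return False
        sleep(interval_seconds)
```

Passing the clock and sleep functions as parameters lets the loop be exercised with a fake clock instead of waiting in real time.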
Force-Delete Stuck Namespaces
Occasionally a Kubernetes namespace gets stuck in the Terminating state, typically because a finalizer can't complete (e.g., the Longhorn namespace after an unclean uninstall). PodWarden provides a force-delete operation that removes finalizers and deletes the namespace.
When to use: Only when a namespace has been stuck in Terminating for several minutes and you've confirmed the underlying resources are gone or no longer needed.
Protected namespaces: default, kube-system, kube-public, and kube-node-lease cannot be force-deleted.
DELETE /clusters/{cluster_id}/namespaces/{namespace}?force=true
Requires the admin role.
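A client script might mirror the protected-namespace rule locally before issuing the call, as in this sketch. The protected list and route come from this section; the helper name `force_delete_path` and the decision to raise `ValueError` on the client side are assumptions, and the server remains the authority on what is rejected.

```python
from urllib.parse import quote

# Namespaces this page lists as not force-deletable.
PROTECTED_NAMESPACES = frozenset(
    {"default", "kube-system", "kube-public", "kube-node-lease"}
)


def force_delete_path(cluster_id: str, namespace: str) -> str:
    """Return the force-delete route, refusing protected namespaces early."""
    if namespace in PROTECTED_NAMESPACES:
        raise ValueError(f"namespace {namespace!r} is protected")
    return (f"/clusters/{quote(cluster_id, safe='')}"
            f"/namespaces/{quote(namespace, safe='')}?force=true")
```

Failing fast on the client avoids a round trip for a request that would be refused anyway.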
API Reference
All node management endpoints require the operator role. Force-deleting a namespace requires the admin role.
POST /clusters/{cluster_id}/nodes/{node_name}/cordon
POST /clusters/{cluster_id}/nodes/{node_name}/uncordon
POST /clusters/{cluster_id}/nodes/{node_name}/drain
Body (drain only):
{
"force": false,
"ignore_daemonsets": true,
"delete_emptydir_data": true,
"timeout_seconds": 120
}
DELETE /clusters/{cluster_id}/nodes/{node_name}
DELETE /clusters/{cluster_id}/namespaces/{namespace}?force=true
Related docs
- Clusters — Cluster detail page with node list
- Storage — Volume types and StorageClass provisioners
- Infrastructure Canvas — Visual topology with pod placement edges
- Networking — Mixed network considerations for node placement