Previously, modifying resource requests or limits required destroying the pod and creating a new one with updated specs. Applications went offline during the transition. Network connections dropped. The process required maintenance windows for routine operational tasks.
The new implementation modifies cgroup (control group) settings directly on running containers. When resource specs change, Kubernetes updates the existing cgroup rather than recreating the pod. Applications continue running without interruption.
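To make the cgroup update concrete, here is a minimal sketch of how a CPU limit could translate into the value of the cgroup v2 `cpu.max` file, which holds a quota and a period in microseconds. The function names `cpu_max_value` and `resize_plan` are hypothetical and exist only for illustration. In a real cluster the kubelet and container runtime perform this update, and their actual logic may differ.

```python
# Illustrative sketch only: not the kubelet's implementation.
# cgroup v2 expresses a CPU limit in the file "cpu.max" as "<quota> <period>",
# both in microseconds, or "max <period>" when there is no limit.

DEFAULT_PERIOD_US = 100_000  # 100 ms, the default cpu.max period


def cpu_max_value(limit_millicores, period_us=DEFAULT_PERIOD_US):
    """Render a cpu.max string for a CPU limit given in millicores.

    Pass None for "no limit". Hypothetical helper for illustration.
    """
    if limit_millicores is None:
        return f"max {period_us}"
    # 1000 millicores = one full CPU = one full period of runtime per period.
    quota_us = limit_millicores * period_us // 1000
    return f"{quota_us} {period_us}"


def resize_plan(old_millicores, new_millicores):
    """Describe the single in-place change a CPU resize amounts to."""
    return {
        "file": "cpu.max",
        "from": cpu_max_value(old_millicores),
        "to": cpu_max_value(new_millicores),
    }


if __name__ == "__main__":
    # Raising a limit from half a CPU to two CPUs changes one value;
    # the container's process keeps running throughout.
    print(resize_plan(500, 2000))
```

The sketch shows why no restart is needed: the resize comes down to rewriting a value on a cgroup that already exists, not to building a new container.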
The feature particularly benefits AI training workloads and edge computing deployments. Training jobs can now scale vertically without restarts. Edge environments gain resource flexibility without the complexity of pod recreation.
“For AI, that’s a really big training job that can be scaled and adjusted vertically, and then for edge computing, that’s really big to where there’s added complexity and actually adjusting these workloads,” Hagen said.
The feature requires cgroups v2 on the underlying Linux nodes. Kubernetes 1.35 deprecates cgroups v1 support. Most current enterprise Linux distributions include cgroups v2, but older deployments may need OS upgrades before using in-place resource adjustments.
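One quick way to check a node is to look for the `cgroup.controllers` file at the cgroup mount point, which is present on a cgroups v2 unified hierarchy. The sketch below assumes the conventional mount path `/sys/fs/cgroup`. The function name `uses_cgroup_v2` is hypothetical, and a node with a nonstandard mount would need a different path.

```python
from pathlib import Path


def uses_cgroup_v2(root="/sys/fs/cgroup"):
    """Return True if `root` looks like a cgroups v2 unified hierarchy.

    Assumption: cgroups are mounted at `root`. On v2 the mount root contains
    a "cgroup.controllers" file; on v1 it contains per-controller
    directories instead (cpu, memory, ...), so the file is absent.
    """
    return (Path(root) / "cgroup.controllers").is_file()


if __name__ == "__main__":
    if uses_cgroup_v2():
        print("cgroups v2 detected")
    else:
        print("cgroups v2 not detected (v1, hybrid, or nonstandard mount)")
```

A False result does not distinguish cgroups v1 from a missing or relocated mount, so treat it as a prompt to investigate rather than a definitive answer.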
Gang Scheduling helps distributed AI workloads
Among the preview features in the new release is a capability known as gang scheduling. The feature (tracked as KEP-4671) is intended to help distributed applications that require multiple pods to start simultaneously.
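The core idea of gang scheduling is all-or-nothing placement: either every pod in the group gets a slot, or none of them is placed. The toy sketch below illustrates that rule only. It is not the KEP-4671 API or the Kubernetes scheduler's algorithm, and the function name `gang_schedule` and the per-node "free slots" model are assumptions made for illustration.

```python
def gang_schedule(pods_needed, free_slots):
    """Place a group of pods all at once, or not at all.

    pods_needed: number of pods that must start together.
    free_slots:  dict mapping node name -> number of pods it can still host
                 (a deliberately simplified capacity model).

    Returns a list with one node name per pod, or None if the whole group
    cannot fit. Toy illustration, not the real scheduler.
    """
    if sum(free_slots.values()) < pods_needed:
        # Partial placement would leave some workers waiting on peers
        # that can never start, so place nothing.
        return None

    placement = []
    for node, slots in free_slots.items():
        remaining = pods_needed - len(placement)
        placement.extend([node] * min(slots, remaining))
    return placement


if __name__ == "__main__":
    nodes = {"node-a": 2, "node-b": 2}
    print(gang_schedule(3, nodes))  # the whole group fits
    print(gang_schedule(5, nodes))  # the group does not fit, nothing is placed
```

Refusing partial placement is what keeps a distributed job from holding resources while it waits for workers that cannot be scheduled.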
