July 27th, 2016 by Guest Author
Container Orchestration And The Case for Advanced Scheduling
Scheduling containers in a container orchestration framework such as Kubernetes or Docker Swarm can be described as simply allocating runtime resources to workloads. In a world with unlimited (or at least sufficient) resources and equal workloads, current container scheduling systems would be considered adequate. However, in the real world of enterprise computing, not all workloads are equal and most organizations have resource constraints. Further, large organizations have unique needs and want to run their workloads in a specific manner. That is the business case for advanced scheduling.
When experimenting with containers or running a couple of small pilot projects, container scheduling really can be very simple. But once you move beyond trivial use cases, scheduling containers becomes much more complex and turns into a crucial challenge.
The following sections discuss inherent complexities in more detail and illustrate the need for powerful, automated container scheduling.
Container environments are dynamic, not static
While it might seem that service-based architectures are static and involve long-running services that don’t change much at all, this is far from the case. Because components such as replication controllers and dynamic load balancing are themselves dynamic, the number and nature of executing service components may change numerous times throughout a day. Requirements and boundary conditions such as those listed below can change at any time. This will commonly require a shift in workload placement due to:
- System failures and thereby a need to reschedule a service or part of its components
- Re-allocation, addition or removal of resources, e.g. in a cloud environment
- Changes in the demand and priority of workloads, especially in relation to other workloads, e.g. time-of-day shifts in demand
Software updates to a specific service or some of its components can cause dynamic changes to an environment too. Also, non-service, more transient workloads can require priority and create resource constraints. Workloads like batch jobs, interactive work, software builds (e.g. triggered through CI/CD frameworks) or test suite tasks (again stemming from CI/CD frameworks) can also create contention and may make it necessary to rebalance workloads.
Container scheduling is highly complex
With containers, there are many more (and much smaller) moving parts than with traditional, more monolithic application development approaches. For example, microservice application components have interdependencies and diverse needs, and they utilize replicas of service components to achieve scale. There are also more dependencies on software-defined layers for networking and storage. Every check-in by an engineer can trigger an avalanche of build/deploy/test steps, and automated orchestration ensuring service readiness and health will create or wind down service components on demand. With DevOps involved, there are more stakeholders influencing operations: every applications engineer and any single development step can impact operations directly. Consistent and dependable service delivery based on these moving puzzle pieces becomes a challenge, especially when scaling up and out with higher service rates and more services.
Container scheduling needs to handle resource constraints
The world of cloud computing creates the illusion of limitless resource availability, but cloud resources are often inherently constrained. They are commonly constrained by budget or by the availability and horsepower of the desired nodes. Limitations can also be introduced by the scalability of workloads, and cost/benefit trade-offs may limit economic return beyond a certain threshold. If you need to run more work than you have resources to accommodate, then scheduling becomes a tough decision-making process of “who runs and who doesn’t?”, “who goes first, who goes next?” and “who gets more and who gets less?”
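To make the "who runs and who doesn't?" question concrete, here is a minimal Python sketch of one possible policy: admit workloads in priority order until the available CPU capacity is used up. The `Workload` type, the `admit` function, the priority scale and the sample numbers are all hypothetical and chosen for illustration only; real schedulers weigh many more factors than this.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    priority: int  # hypothetical scale: higher value runs first
    cpus: float    # requested CPU cores


def admit(workloads, capacity_cpus):
    """Greedy admission: highest priority first, skip whatever does not fit."""
    running, waiting = [], []
    free = capacity_cpus
    for w in sorted(workloads, key=lambda w: -w.priority):
        if w.cpus <= free:
            running.append(w.name)
            free -= w.cpus
        else:
            waiting.append(w.name)
    return running, waiting


# Illustrative mix of a service and transient CI/CD and batch work.
demo = [
    Workload("web-frontend", 100, 4),
    Workload("nightly-batch", 10, 6),
    Workload("ci-build", 50, 3),
    Workload("test-suite", 40, 2),
]
running, waiting = admit(demo, capacity_cpus=8)
print("running:", running)
print("waiting:", waiting)
```

With 8 cores available, the two highest-priority workloads are admitted and the other two wait. Even this toy policy shows the trade-off: one core stays idle because no waiting workload fits into it.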
A need for powerful and automated scheduling
It is no secret that containers are being adopted at a record-setting rate. Most organizations are still in pilot, dev or test modes, but as deployments move to production and ultimately to scale, constraints will dictate the need for advanced and automated scheduling tools.
- Automating the decision-making to handle any combination of challenges like resource constraints, complex container use cases, or dynamically changing environments and boundary conditions is the biggest challenge in container orchestration
- Automated scheduling requires powerful policies and an efficient implementation of them
- Without powerful and automated scheduling you will be forced either to accept resource wastage or to make manual adjustments all the time, while still never achieving an optimal outcome
So without advanced scheduling you’d be wasting time, effort and money. And the more dynamic your environment becomes (think cloud) and the more advanced your use cases, the more you are going to waste.
Introducing Navops Command – Superpowers for Kubernetes
The engineers at Google involved in the creation of Kubernetes saw in the very early days that advanced scheduling would be required. They designed the modular Kubernetes architecture so that the Kubernetes scheduler can readily be replaced or multiple schedulers can be used in parallel. Univa has many years of scheduling experience in high performance computing and technical computing applications. We are building upon our proven expertise and have architected the Navops Command scheduling capability to be Kubernetes-aware, with a slick container-oriented web interface as well as APIs and a command line interface. The system brings to the world of containers concepts like proportional resource share management, resource quotas, access control lists and resource interleaving. Visit www.navops.io/command.html to learn more and to apply for early access.
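As a rough illustration of what "proportional resource share management" can mean in general, the following Python sketch splits a fixed capacity among tenants according to weights and redistributes whatever a tenant cannot use. It is not Navops Command's algorithm or API; the function name, tenant names, weights and demands are all assumptions made up for this example.

```python
def proportional_shares(capacity, weights, demands):
    """Split capacity by weight; hand unused share to tenants that still need it."""
    alloc = {t: 0.0 for t in weights}
    active = {t for t in weights if demands[t] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[t] for t in active)
        satisfied = set()
        spent = 0.0
        for t in active:
            # Offer each tenant its weighted slice of what is left this round.
            offer = remaining * weights[t] / total_w
            grant = min(offer, demands[t] - alloc[t])
            alloc[t] += grant
            spent += grant
            if alloc[t] >= demands[t] - 1e-9:
                satisfied.add(t)
        remaining -= spent
        if not satisfied:
            break  # every offer was fully taken, nothing left to redistribute
        active -= satisfied
    return alloc


# Hypothetical tenants: "prod" has twice the weight of "dev" and "ci".
shares = proportional_shares(
    capacity=100,
    weights={"prod": 2, "dev": 1, "ci": 1},
    demands={"prod": 80, "dev": 10, "ci": 40},
)
print(shares)
```

In this example "dev" needs less than its weighted slice, so the surplus is divided between "prod" and "ci" in a 2:1 ratio rather than being left idle.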
By Fritz Ferstl, CTO and Business Development, EMEA, Univa
Fritz brings more than 23 years of experience as a leading expert in distributed workload and resource management. His experience ranges from grid computing to cloud computing, high performance computing, virtualization and container optimization. As the CTO of Univa Corporation, he defines Univa’s product and technology direction in support of several hundred large enterprise customers across all industry verticals and their workload management requirements. These customers include many of the world’s largest enterprise computing environments.