In this article we explain what virtualization actually is and what types it comes in, and we present the design principles proposed by Red Hat in 2017 for containers deployed in orchestrated environments, including cloud environments. Similar sets of principles can be found throughout computing, for example DRY, KISS, and SOLID in programming. If you're interested in reading about clean code, we encourage you to explore our series on Clean Code.

 

What is a Hypervisor?

To understand the essence of virtualization in operating systems, we need to explain the fundamental concept behind it: the hypervisor, also known as a virtual machine monitor (VMM). It is a software component that enables the creation and management of virtual machines (VMs) on a physical machine. The hypervisor sits between the physical hardware and the virtual machines, allowing multiple operating systems to run concurrently on a single physical machine.

 

There are two primary types of hypervisors:

Type 1 Hypervisor (Bare Metal Hypervisor) - more commonly found in server solutions, including data centers. It runs directly on the physical hardware, without the need for a host operating system, and therefore has direct access to hardware resources. Examples include Microsoft Hyper-V, Citrix XenServer, and VMware ESXi.

 

Type 2 Hypervisor (Hosted Hypervisor) - more common on user computers. It runs on top of a host operating system (host OS) and relies on it to manage hardware resources, which adds an extra layer of abstraction. Examples include VMware Workstation, Oracle VirtualBox, and Parallels Desktop.


Figure 1 - Type 2 architecture vs. Type 1 architecture, source: vgyan.in

 

What is Virtualization and What are its Types?

Virtualization is a common technique in computing that involves creating virtual instances of hardware or software on a single physical computer system. In other words, it's about presenting logical resources that are backed by underlying physical components. The goal of virtualization is to use hardware resources efficiently and to isolate different environments and applications from one another, which can improve utilization, security, and ease of management. Virtualization comes in several kinds, including paravirtualization and full virtualization.

 

In full virtualization, the hypervisor emulates the hardware. The main benefit of this method is the ability to run the original operating system without any modifications: the guest operating system remains completely unaware that it is virtualized. This approach combines two techniques. With direct execution, simple, non-sensitive CPU instructions run directly on the processor without involving the hypervisor. With binary translation, sensitive CPU instructions are translated dynamically into safe equivalents. To improve performance, the hypervisor can store recently translated instructions in a cache.
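The interplay of direct execution, binary translation, and the translation cache can be illustrated with a toy model. This is only a sketch under our own assumptions: the instruction names and the `ToyHypervisor` class are hypothetical, and a real hypervisor works on machine code, not on strings.

```python
from typing import Dict, List

# Hypothetical instruction names; a real instruction set is far larger.
SENSITIVE = {"write_cr3", "halt", "out_port"}


class ToyHypervisor:
    """Toy model of full virtualization: direct execution plus cached translation."""

    def __init__(self) -> None:
        self.translation_cache: Dict[str, str] = {}
        self.translations_performed = 0
        self.log: List[str] = []

    def _translate(self, instruction: str) -> str:
        # Stand-in for binary translation: produce a safe, emulated equivalent.
        self.translations_performed += 1
        return f"emulated_{instruction}"

    def run(self, guest_code: List[str]) -> None:
        for instruction in guest_code:
            if instruction not in SENSITIVE:
                # Direct execution: non-sensitive instructions run unmodified.
                self.log.append(f"direct:{instruction}")
                continue
            if instruction not in self.translation_cache:
                # Translate once, then reuse the cached result.
                self.translation_cache[instruction] = self._translate(instruction)
            self.log.append(f"translated:{self.translation_cache[instruction]}")
```

Running the same sensitive instruction twice triggers only one translation, because the second occurrence is served from the cache.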

 

In paravirtualization, on the other hand, the hypervisor does not simulate the underlying hardware. Instead, it provides a hypercall mechanism. The guest operating system uses these hypercalls to execute sensitive CPU instructions. This technique is not as universal as full virtualization because it requires modifications in the guest operating system, but it provides better performance. It's worth noting that the guest operating system is aware that it's virtualized.
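The hypercall idea can also be shown as a toy model. Again, this is a simplified sketch: the class names and the `set_page_table` hypercall are hypothetical and only illustrate that the modified guest kernel asks the hypervisor explicitly instead of executing a privileged instruction itself.

```python
class ToyParavirtHypervisor:
    """Toy model: the hypervisor exposes named hypercalls instead of emulating hardware."""

    def __init__(self) -> None:
        self.page_table_base = 0
        # Table of hypercalls the guest is allowed to invoke (hypothetical names).
        self._hypercalls = {"set_page_table": self._set_page_table}

    def _set_page_table(self, base: int) -> int:
        self.page_table_base = base
        return 0  # 0 signals success in this sketch

    def hypercall(self, name: str, *args: int) -> int:
        handler = self._hypercalls.get(name)
        if handler is None:
            return -1  # unknown hypercall
        return handler(*args)


class ParavirtGuestKernel:
    """A guest kernel that has been modified to know about the hypervisor."""

    def __init__(self, hypervisor: ToyParavirtHypervisor) -> None:
        self.hypervisor = hypervisor

    def switch_address_space(self, base: int) -> bool:
        # Instead of running a privileged instruction, issue a hypercall.
        return self.hypervisor.hypercall("set_page_table", base) == 0
```

The guest code has to be written against the hypercall interface, which is exactly why paravirtualization requires a modified guest operating system.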

 

Figure 2 - Difference between Virtualization and Paravirtualization

 

What is Containerization?

Containerization, on the other hand, is a type of operating system-level virtualization, where containers share the resources of the same operating system while maintaining some level of isolation between them. Compared to other forms of virtualization, containers introduce minimal overhead (due to the shared host system), which means the consumption of additional resources is much lower than, for example, virtual machines. Resources can be allocated and released dynamically by the host operating system. Unfortunately, the isolation of containers is limited compared to classical virtualization solutions.
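On Linux, this sharing is typically built on kernel features such as control groups (cgroups), which account for and limit resources per group of processes. A containerized process is an ordinary process of the host kernel, and its cgroup membership is visible in `/proc/<pid>/cgroup`, where each line has the form `hierarchy-ID:controller-list:cgroup-path`. The sketch below parses that format; the function and type names are our own.

```python
from typing import List, NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: List[str]
    path: str


def parse_proc_cgroup(text: str) -> List[CgroupEntry]:
    """Parse the contents of /proc/<pid>/cgroup into structured entries."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at most twice: the cgroup path itself may contain colons.
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append(
            CgroupEntry(
                hierarchy_id=int(hierarchy_id),
                # cgroup v2 lines have an empty controller list ("0::/path").
                controllers=[c for c in controllers.split(",") if c],
                path=path,
            )
        )
    return entries


def read_own_cgroups(proc_file: str = "/proc/self/cgroup") -> List[CgroupEntry]:
    """Read the cgroup membership of the current process (Linux only)."""
    with open(proc_file, encoding="utf-8") as handle:
        return parse_proc_cgroup(handle.read())
```

Calling `read_own_cgroups()` on the host and inside a container usually yields different paths, which shows that the kernel simply tracks the container's processes in their own group.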




Figure 3 - Virtualization vs. Containerization, source: blackmagicboxes.com

 

The most popular ways of utilizing containerization technology are:

 

  • System Containers (Infrastructure Containers) - similar to virtual machines, they offer an environment of a specific operating system along with installed libraries and tools. Just like systems on virtual machines, their purpose is to run for an extended period.

 

  • Application Containers - used to run a specific application, or a part of an application in the case of a microservices architecture. Application containers are often described as ephemeral because each instance is short-lived. With every new application deployment, a new application image is built and a new container is started from it, while the previous container is deleted.

 

 

Figure 4 - Comparison of application container, system container, and virtual machine, source: nestybox

 

Nested virtualization

Another interesting solution is nested virtualization. Using system containers or virtual machines, we can introduce an additional layer of abstraction. On the right of Figure 5, you can see an architecture built on bare metal (without a host OS), with three system containers and a Docker Engine managing the containers.


Figure 5 - Example of containerization and nested virtualization, source: docker.com



Microservices and Cloud-Native

Today, applications can be categorized by the technologies they use or by their execution architecture. One of the best-known divisions is monolithic architecture versus microservices architecture. The latter is closely associated with containers: an application is deployed across dozens, hundreds, or even thousands of containers (Netflix has over 1000 microservices), managed by an orchestrator and by the tools of the cloud platform on which the application runs. Such applications are often referred to as cloud-native because they were designed with the assumption that they would operate in the cloud. As a result, they anticipate failures and continue to operate and scale reliably, even when their underlying infrastructure encounters disruptions or outages.

 

Good Containerization Principles from Red Hat

To support cloud-native applications, Red Hat proposed a set of principles for application containers in 2017, which can be compared to principles like SOLID in object-oriented programming. The principles are as follows:

 

  • SCP - SINGLE CONCERN PRINCIPLE - This principle states that each container should address a single concern and do it well. The concept is similar to the Single Responsibility Principle (SRP) from SOLID, but instead of a single responsibility for a class, it refers to a container that should solve one specific problem, with a single process addressing that problem. If a service needs to address several concerns, containerization patterns such as sidecar or init containers allow the work to be organized in separate containers within a single deployment unit (pod).

 

 

  • HOP - HIGH OBSERVABILITY PRINCIPLE - Because containers are managed by orchestrators like Kubernetes, they should expose appropriate APIs through which the managing platform can inspect the state of the application. For example, the application should provide liveness and readiness checks. Important events within the application should be logged to the standard error stream (STDERR) and the standard output stream (STDOUT), which enables log aggregation using tools like Fluentd and Logstash.

 

 

  • LCP - LIFE-CYCLE CONFORMANCE PRINCIPLE - This principle dictates that an application running on a platform should be able to respond to commands from that platform. The orchestrator manages the lifecycle of a given application container. Therefore, the application should handle the SIGTERM signal and shut down in a controlled manner (such a shutdown can then be recorded in logs). SIGKILL, by contrast, cannot be caught: the platform sends it to force termination when the container has not stopped in time after SIGTERM. There are also other events, like PostStart and PreStop, that might have significance for managing an application's lifecycle.

 

 

  • IIP - IMAGE IMMUTABILITY PRINCIPLE - Containerized applications should be immutable and remain unchanged between different environments. To achieve this, it's recommended to store runtime data externally and to keep configuration outside the image, so that it can be adapted to each environment. Instead of creating or modifying containers for each environment, the same images should be used across all environments. Any change to the application inside a container should result in a new container image, which is then used across all environments.

 

 

  • PDP - PROCESS DISPOSABILITY PRINCIPLE - Containers should be ephemeral and ready to be replaced at any time for various reasons, such as health check failures, scaling down (resource constraints), or migration to a different host. Container-based applications should store their state externally, be distributed, and provide redundancy. Swift startup and shutdown of applications are crucial, even in the event of sudden physical hardware failures in data centers.

 

 

  • S-CP - SELF-CONTAINMENT PRINCIPLE - This principle states that a container should contain everything it needs at build time. It should rely only on the presence of the Linux kernel on the host and include all the dependencies and libraries necessary to launch the application. The exception is configuration, which may differ between deployment environments and should be supplied at runtime. A Kubernetes ConfigMap can be used for this purpose.

 

 

 

  • RCP - RUNTIME CONFINEMENT PRINCIPLE - This principle asserts that each container should declare its resource requirements and communicate them to the platform. The container should provide a resource profile covering CPU, memory, network, and disk usage. This information is crucial for efficient scheduling, automatic scaling, and capacity management. If a container exceeds its declared resource allocation, the orchestrator can take appropriate action, such as scaling or shutting it down.
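To make the observability and life-cycle principles above more concrete, here is a minimal application-side sketch that uses only the Python standard library. It is one possible shape, not a reference implementation: the `/healthz` and `/readyz` paths, the `APP_PORT` variable, and all function names are our own assumptions.

```python
import json
import os
import signal
import sys
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical names chosen for this sketch; no platform requires them.
PORT = int(os.environ.get("APP_PORT", "8080"))  # configuration supplied at runtime
ready = threading.Event()
shutting_down = threading.Event()


def probe_status(path: str) -> int:
    """Return the HTTP status code for a probe path."""
    if path == "/healthz":  # liveness: the process is up and can answer
        return 200
    if path == "/readyz":  # readiness: accept traffic only when initialised
        return 200 if ready.is_set() and not shutting_down.is_set() else 503
    return 404


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status = probe_status(self.path)
        body = json.dumps({"path": self.path, "status": status}).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        # Write access logs to STDOUT so a log collector can pick them up.
        print(format % args, flush=True)


def main() -> None:
    server = ThreadingHTTPServer(("0.0.0.0", PORT), ProbeHandler)

    def handle_sigterm(signum, frame) -> None:
        # SIGTERM can be caught; SIGKILL cannot, so clean up here.
        print("SIGTERM received, shutting down", file=sys.stderr, flush=True)
        shutting_down.set()
        # shutdown() must run in a different thread than serve_forever().
        threading.Thread(target=server.shutdown).start()

    signal.signal(signal.SIGTERM, handle_sigterm)
    ready.set()
    server.serve_forever()
    server.server_close()
```

Calling `main()` starts the server. After SIGTERM, the readiness probe starts returning 503 so that the platform can stop routing traffic while the process exits in a controlled way.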

 

 

All images related to these good principles were extracted from a document provided by Red Hat.

https://dzone.com/whitepapers/principles-of-container-based-application-design 
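On the platform side, several of the principles listed above map onto fields of a Kubernetes Pod manifest. The sketch below builds such a manifest as a plain Python dictionary, which can be serialized to JSON. The concrete values (probe paths, port, resource sizes, the `sleep 5` hook) and the function name are illustrative assumptions, not recommendations from the Red Hat document.

```python
import json
from typing import Any, Dict


def build_pod_manifest(name: str, image: str, config_map: str) -> Dict[str, Any]:
    """Build a Kubernetes Pod manifest as a plain dictionary (illustrative values)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # LCP: time the platform waits between SIGTERM and SIGKILL.
            "terminationGracePeriodSeconds": 30,
            "containers": [
                {
                    "name": name,
                    # IIP: the same image is used in every environment.
                    "image": image,
                    # S-CP: environment-specific settings injected at runtime.
                    "envFrom": [{"configMapRef": {"name": config_map}}],
                    # HOP: probes the platform uses to inspect the application.
                    "livenessProbe": {"httpGet": {"path": "/healthz", "port": 8080}},
                    "readinessProbe": {"httpGet": {"path": "/readyz", "port": 8080}},
                    # LCP: hook executed before the container is stopped.
                    "lifecycle": {
                        "preStop": {"exec": {"command": ["sh", "-c", "sleep 5"]}}
                    },
                    # RCP: declared resource profile.
                    "resources": {
                        "requests": {"cpu": "100m", "memory": "128Mi"},
                        "limits": {"cpu": "500m", "memory": "256Mi"},
                    },
                }
            ],
        },
    }
```

For example, `print(json.dumps(build_pod_manifest("orders", "registry.example.com/orders:1.4.2", "orders-config"), indent=2))` prints a JSON document that can be piped to `kubectl apply -f -`; the image and ConfigMap names here are placeholders.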

 

Summary

We hope this article has given you a clearer picture of containerization and virtualization. We also encourage you to explore the links in the sources, which explain the concepts discussed here in more detail. If you're interested in containerization, we recommend our series of articles about Docker.

 

Sources: