After the previous articles in this series, we already know how neural networks work and what classification means in the context of computer vision. For those still unfamiliar with the topic, we encourage you to start with our earlier posts:
https://www.innokrea.com/machine-learning-part-1-is-it-worth-it/
https://www.innokrea.com/neural-networks-introduction/
Today, we will take it a step further and focus on object detection – what it is, how it differs from “regular” image classification, what it is used for, and how one of the detection algorithms (YOLO) can be applied in practice.
Object detection is a process that involves both classifying and locating objects within an image (or a series of images, such as video). The goal of detection, in addition to recognizing what is in the image (e.g., a dog, a car, a license plate), is to specify exactly where these objects are by defining a surrounding bounding box. In practice, this means that an object detection model returns the coordinates of rectangular bounding boxes and assigns the appropriate label to the object contained within each of them. With this approach, objects of multiple classes can be identified in a single image; they may also repeat or even be nested inside one another within a single sample. It is also possible that no objects are detected in the image at all.
For example: in an image of a street with cars, pedestrians, and cyclists, a model trained to recognize vehicles and faces will be able to indicate where the cars and bicycles are and where the visible faces are – even when a driver’s face is entirely enclosed within the car’s bounding box.
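To make this concrete, a detection result can be thought of as a list of records, each holding a class label, a confidence score, and the coordinates of a bounding box. A minimal illustrative sketch – the values and field names below are made up for the example and are not tied to any particular library:

```python
# Illustrative only: what a detector might return for the street scene above.
# Each entry: class label, confidence score, and a bounding box (x, y, width, height)
# in pixels, with (x, y) being the top-left corner.
detections = [
    {"label": "car",     "confidence": 0.91, "box": (120,  80, 260, 140)},
    {"label": "bicycle", "confidence": 0.78, "box": (430, 150,  90, 160)},
    {"label": "face",    "confidence": 0.66, "box": (180, 110,  40,  45)},  # nested inside the car's box
]

for det in detections:
    x, y, w, h = det["box"]
    print(f"{det['label']} ({det['confidence']:.2f}) at x={x}, y={y}, w={w}, h={h}")
```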
One of the most popular object detection algorithms is YOLO (You Only Look Once). YOLO is a model that identifies objects and their locations in a single pass over the image: it proposes and checks many bounding boxes, evaluates the likelihood of an object being inside each box (against a chosen confidence threshold), and then discards redundant boxes to avoid duplicates – a step known as non-maximum suppression [1] – so that each object ends up with exactly one bounding box that best describes it.
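The two mechanisms mentioned above – the confidence threshold and the removal of duplicate boxes via non-maximum suppression – can be illustrated with OpenCV’s cv2.dnn.NMSBoxes function. A small sketch with made-up boxes (the numbers and thresholds are assumptions chosen purely for illustration):

```python
import cv2
import numpy as np

# Hypothetical raw detections: two overlapping boxes around the same object
# plus one weak detection, all in (x, y, width, height) form.
boxes = [
    [100, 100, 200, 150],
    [105,  98, 198, 155],
    [400, 300,  80,  60],
]
confidences = [0.85, 0.80, 0.30]

CONF_THRESHOLD = 0.5   # discard boxes below this confidence
NMS_THRESHOLD = 0.4    # overlap (IoU) above this suppresses the weaker box

# NMSBoxes filters by confidence and removes overlapping duplicates,
# returning the indices of the boxes that survive.
kept = cv2.dnn.NMSBoxes(boxes, confidences, CONF_THRESHOLD, NMS_THRESHOLD)
for i in np.array(kept).flatten():
    print(boxes[i], confidences[i])
```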
YOLO first divides the image into a grid with a fixed number of cells, and then for each cell it looks for bounding boxes whose centers fall inside that cell and which have the highest probability of representing one of the classes. For each such box, the model predicts its dimensions (x, y – coordinates of the box center, plus its width and height) together with the probabilities of the object belonging to the individual classes.
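For YOLOv3-style networks, each row of the model’s output encodes exactly these quantities: the box center and size (relative to the image), an objectness score, and the per-class scores. A hedged sketch of decoding a single row – the values below are made up, and real outputs contain one score per trained class:

```python
import numpy as np

# One output row of a YOLOv3-style network (illustrative values):
# [center_x, center_y, width, height, objectness, class_score_0, class_score_1, ...]
row = np.array([0.52, 0.47, 0.20, 0.35, 0.90, 0.02, 0.95, 0.03])

img_w, img_h = 640, 480                        # assumed input image size
cx, cy, w, h = row[:4] * [img_w, img_h, img_w, img_h]
class_scores = row[5:]
class_id = int(np.argmax(class_scores))        # most probable class
confidence = row[4] * class_scores[class_id]   # objectness times class probability

# Convert from center coordinates to a top-left-corner box for drawing.
x, y = int(cx - w / 2), int(cy - h / 2)
print(class_id, round(float(confidence), 2), (x, y, int(w), int(h)))
```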
YOLO works differently from traditional object detection methods. In traditional approaches, the image is divided into regions and detection is performed separately for each of them, sometimes recursively. YOLO, however, treats the image as a whole and, as described above, evaluates both the object classes and their locations in a single pass.
Fig. 1: Example output of YOLO object detection.
The main advantage of using YOLO is its speed – it is one of the fastest algorithms for object detection, making it suitable for real-time applications. YOLO also achieves high accuracy – it can detect objects of various sizes, even in cases where objects partially obscure each other. Another advantage is that YOLO can be easily trained on new datasets and adapted to various detection problems.
While YOLO is one of the most commonly chosen object detection algorithms, it does have some limitations. One of them is the difficulty in detecting very small objects compared to other algorithms (e.g., Faster R-CNN). This limitation arises from the mechanics of the model – each grid cell in the image can define only one bounding box, so if the centers of two objects fall within the same cell, only one of them will be detected. Additionally, depending on the version of YOLO, there may be a trade-off between speed and detection accuracy – YOLO models come in several generations and various sizes (tiny, small, medium, etc.), with newer and larger models offering more precise results at the cost of slower processing.
There are many ready-to-use libraries that offer pre-implemented versions of the YOLO model together with repositories of pre-trained weights for various use cases. Regardless of the chosen class set or network version, the detection process generally follows the same steps:
- loading the network architecture and pre-trained weights (along with the list of class names),
- preprocessing the input image to the size and format expected by the network,
- performing a single forward pass to obtain candidate bounding boxes,
- filtering the candidates by the chosen confidence threshold and removing duplicates with non-maximum suppression [1],
- drawing the remaining boxes and their labels on the image.
A Python library that makes it easy to use YOLO is OpenCV. A script example can be found in the package documentation [2].
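As a rough illustration of the whole pipeline, here is a minimal sketch in the spirit of that documentation example, assuming a YOLOv3 Darknet configuration file, pre-trained weights, and a class-name file have been downloaded locally (the file names below are assumptions, not part of the article):

```python
import cv2
import numpy as np

# Assumed local files: Darknet config, pre-trained weights, and class names (e.g. the COCO set).
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
with open("coco.names") as f:
    classes = [line.strip() for line in f]

img = cv2.imread("street.jpg")
h, w = img.shape[:2]

# Preprocess: scale pixels to [0, 1], resize to the network input, swap BGR -> RGB.
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())  # single forward pass

boxes, confidences, class_ids = [], [], []
for output in outputs:
    for row in output:
        scores = row[5:]
        class_id = int(np.argmax(scores))
        confidence = float(row[4] * scores[class_id])
        if confidence > 0.5:                               # confidence threshold
            cx, cy, bw, bh = row[:4] * np.array([w, h, w, h])
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(confidence)
            class_ids.append(class_id)

# Non-maximum suppression removes overlapping duplicates of the same object.
kept = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
for i in np.array(kept).flatten():
    x, y, bw, bh = boxes[i]
    cv2.rectangle(img, (x, y), (x + bw, y + bh), (0, 255, 0), 2)
    cv2.putText(img, classes[class_ids[i]], (x, y - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

cv2.imwrite("result.jpg", img)
```

Lowering the confidence or NMS thresholds in a sketch like this trades more detections (and more duplicates) for fewer missed objects.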
Object detection is one of the key challenges in the field of computer vision, allowing systems to understand not only what is in an image but also exactly where the objects are located. Compared to image classification, object detection not only assigns labels but also localizes the objects. Ready-made algorithms are available today that can efficiently detect objects in images, and applying them in basic cases requires only a few lines of Python code. That’s all for today!
[1] https://medium.com/analytics-vidhya/non-max-suppression-nms-6623e6572536
[2] https://opencv-tutorial.readthedocs.io/en/latest/yolo/yolo.html