Training YOLOv8 on a Custom Dataset

Lama Alosaimi
Published in Dev Genius
5 min read · Feb 12, 2023

YOLO is an object detection model: it detects the objects in an image and locates each one by drawing a bounding box around it. YOLO was first developed by Joseph Redmon and Ali Farhadi, and the first version was released in 2015. In 2020, Ultralytics released YOLOv5, and it has kept improving the model since then, up to the release of YOLOv8, which we will use in this article.

YOLO stands for “You Only Look Once”, which means the model looks at an image frame one time and does all of its calculations in that single pass. We’re not going to dig into the math behind the model, as this article targets developers more than researchers.

Like any other model, YOLO first needs to be trained on a prepared dataset. By prepared I mean cleaned, labeled, and split properly, for example 75% train | 15% valid | 10% test. The right split depends on your dataset size, though, so these proportions might not work best for your case.
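To make the split concrete, here is a minimal sketch in plain Python. The function name, the fixed seed, and the frame file names are all made up for illustration; Roboflow does this step for you later in the article, so this only shows the idea.

```python
import random

def split_dataset(items, train=0.75, valid=0.15, seed=0):
    """Shuffle items and split them into train/valid/test lists.

    Whatever is left after the train and valid shares goes to test.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split repeatable
    n_train = round(len(items) * train)
    n_valid = round(len(items) * valid)
    return (
        items[:n_train],
        items[n_train:n_train + n_valid],
        items[n_train + n_valid:],
    )

# Hypothetical frame names standing in for real image files.
frames = [f"frame_{i:03d}.jpg" for i in range(100)]
train_set, valid_set, test_set = split_dataset(frames)
print(len(train_set), len(valid_set), len(test_set))
```

Shuffling before slicing matters for video frames in particular, since neighboring frames look almost identical and would otherwise cluster in one split.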

Dataset

First, we will take a video of parked cars and upload it to Roboflow for image labeling. Follow the steps below to create your own labeled dataset in Roboflow.

  • Create New Project

Select (Explore Solo).

Now that you have created your project, click the project name, select the upload button, then choose new files. Upload a video you have already taken of cars passing by.

After uploading the video, you will have the opportunity to remove the frames with no cars in them. You should remove those frames because each frame will be treated as an input image to your model.

Now that you’ve uploaded your video and removed the useless frames, you can start drawing bounding boxes around the cars. Left-click and drag across the object until the rectangle fully encloses it.

Finally, after annotating your images (labeling them by drawing bounding boxes), you can send them to your dataset by clicking next. Roboflow will ask you to set the train/valid/test split. Select what best fits your case. Because my dataset was small, I gave the training set the majority of the images.

Notice that the image size in the Generate section is 640x640; this will differ from one project to another.

To download your labeled dataset locally, click through the next buttons in the Generate section. Roboflow will automatically take you to the Versions section, where you can download the dataset as a .zip file.

Click the Export button to open the download pop-up, and set the format to YOLOv8.

Finally, the dataset is prepared, labeled and downloaded locally. Let’s explore the downloaded dataset.

cars-dataset folder

The dataset has three directories: train, test, and valid, based on our earlier split. The data.yaml file is the one we care about and will refer to during training. It holds the paths to the training, testing, and validation directories, along with the number of classes, which we need in order to override YOLO's default output classes. Each of the three directories contains the images alongside their corresponding labels in the YOLOv8 format we selected when exporting the dataset.
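For readers curious about what a label file contains: in the YOLO text format, each line describes one object as a class id followed by the box center, width, and height, all normalized to the range 0 to 1. The sketch below converts one such line back to pixel corners. The function name and the sample label are made up for illustration.

```python
def yolo_to_pixels(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x1, y1, x2, y2) in pixels."""
    class_id, xc, yc, w, h = line.split()
    # Scale the normalized center and size back to pixel units.
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    # Turn center/size into top-left and bottom-right corners.
    return int(class_id), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2

# Made-up label for a single car in a 640x640 image.
box = yolo_to_pixels("0 0.5 0.5 0.25 0.5", 640, 640)
print(box)
```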

!! Note

Change the paths in the data.yaml file to absolute paths so YOLO can easily find them.
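If you would rather not edit the file by hand, the path rewrite can be sketched in a few lines of Python. This assumes the export uses the keys `train`, `val`, and `test` and stores images under `<split>/images`; check your own data.yaml, since both the keys and the layout here are assumptions, as are the helper name and the sample file contents.

```python
from pathlib import Path

# Assumed mapping from data.yaml keys to the split directory names.
SPLIT_DIRS = {"train": "train", "val": "valid", "test": "test"}

def absolutize_paths(yaml_text, dataset_root):
    """Rewrite the train/val/test entries of a data.yaml as absolute paths."""
    root = Path(dataset_root).resolve()
    out = []
    for line in yaml_text.splitlines():
        key = line.split(":", 1)[0].strip()
        if ":" in line and key in SPLIT_DIRS:
            line = f"{key}: {root / SPLIT_DIRS[key] / 'images'}"
        out.append(line)  # every other line (nc, names, ...) is kept as-is
    return "\n".join(out) + "\n"

# Hypothetical data.yaml contents for a one-class dataset.
example = (
    "train: ../train/images\n"
    "val: ../valid/images\n"
    "test: ../test/images\n"
    "nc: 1\n"
    "names: ['car']\n"
)
print(absolutize_paths(example, "cars-dataset"))
```

The sketch works on the text line by line instead of parsing YAML, which keeps it dependency-free but means it only handles simple one-line `key: value` entries.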

YOLOv8

The YOLOv8 model lives in the Ultralytics GitHub repository. Clone the project and follow the instructions below to get started.

There are two ways to run YOLOv8. The first is through the CLI, which we will use in this article. The second is through Python, which we will not cover in this article, but you may follow the steps on how to use it from here.

CLI

By default, the CLI will not work; you first need to install the required packages. So, let’s do it.

Training Process

To start training YOLO, execute the following in the ultralytics directory you just cloned.
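As a minimal sketch of what a training run can look like from the CLI, the helper below assembles a `yolo` command in the `key=value` style the Ultralytics CLI accepts. The checkpoint `yolov8n.pt`, the data path, and the helper name are assumptions for illustration, not necessarily the exact command used for the results in this article.

```python
import shlex

def build_train_command(data_yaml, model="yolov8n.pt", epochs=5, imgsz=640):
    """Assemble the argument list for a YOLOv8 detection training run."""
    return [
        "yolo",
        "task=detect",
        "mode=train",
        f"model={model}",      # pretrained checkpoint to start from (assumed)
        f"data={data_yaml}",   # path to the dataset's data.yaml
        f"epochs={epochs}",
        f"imgsz={imgsz}",
    ]

# Placeholder path; point this at your own data.yaml.
cmd = build_train_command("/path/to/cars-dataset/data.yaml")
print(shlex.join(cmd))
# To launch it from Python instead of pasting it into a shell:
# import subprocess; subprocess.run(cmd, check=True)
```

Printing the command rather than running it keeps the sketch safe to execute anywhere; the printed line can be pasted into a terminal once the packages are installed.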

Training first prints some metadata along with the model for you to see.

Under that, you can see each epoch and how well it is performing.

I trained for 5 epochs, and the model scored 0.704 on mAP50 and 0.442 on mAP50-95.
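Both metrics are built on intersection over union (IoU), the overlap between a predicted box and the labeled box. mAP50 counts a detection as correct when its IoU is at least 0.5, while mAP50-95 averages the score over IoU thresholds from 0.5 to 0.95, which is why it is the stricter and lower number. The sketch below computes IoU for two made-up boxes; it illustrates the idea and is not the evaluation code Ultralytics runs.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Made-up boxes: the prediction covers half of the labeled car.
predicted = (0, 0, 10, 10)
ground_truth = (5, 0, 15, 10)
score = iou(predicted, ground_truth)
print(score, score >= 0.5)
```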

Finally, it will print a summary of the model.

Now let’s test our model on a video of cars passing by …
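A prediction run follows the same CLI pattern as training. The sketch below assembles one possible command, assuming the trained weights were saved to `runs/detect/train/weights/best.pt` (the usual default location, but check your own training output) and that the test video is a hypothetical file named `cars.mp4`.

```python
import shlex

def build_predict_command(weights, source):
    """Assemble the argument list for running YOLOv8 detection on a source."""
    return [
        "yolo",
        "task=detect",
        "mode=predict",
        f"model={weights}",   # weights produced by the training run
        f"source={source}",   # image, video, or directory to run on
    ]

# Both paths are assumptions; replace them with your own.
cmd = build_predict_command("runs/detect/train/weights/best.pt", "cars.mp4")
print(shlex.join(cmd))
```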

results …

And with that, the article comes to an end. We have seen how to build a new dataset from scratch, label it, export it in YOLOv8 format, feed it to the YOLOv8 model, and predict with it.

Thank you so much for reading this article. Please connect with me on LinkedIn: https://www.linkedin.com/in/lama-alosaimi/

and follow me on Twitter: https://twitter.com/LAMA_ALOSAIMI1
