
Kalliopi H. Vitlari

2021 Diploma Thesis Title: Classification of Traffic Signs Based on Object Recognition


Traffic light recognition and classification play an important role in the realization of autonomous vehicles. This automated process uses video frames to recognize and classify traffic lights along the vehicle’s path in real time. In this thesis, we adapt a proven deep learning model to recognize three classes (states) of traffic lights: green, red and yellow. The model is YOLOv3, which uses Darknet-53 as its backbone and combines object detection and classification. The deep learning algorithms are implemented in Google Colab (a cloud platform developed by Google). The resulting Convolutional Neural Network (CNN) is trained using publicly available datasets that we modify to enhance the available training data.
First, the training and validation datasets are generated. Second, ground truth bounding boxes, which mark the object and its class in the images, are created and uploaded to the Colab environment that runs the object detection algorithm. The algorithm preprocesses the images, creates bounding boxes that contain the object and adjusts the weights of some model layers. To obtain the most appropriate weights, we perform various training/validation experiments using combinations of the available datasets. The experiments indicate that adding very clear photos that contain only traffic lights to training datasets of more general photos, in which the traffic lights are part of a generic environment, significantly helps the training process.
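As an illustration of the ground truth step, the sketch below converts a pixel-coordinate bounding box into the normalized annotation line that Darknet-style YOLO training expects (class id, box center and box size, each divided by the image dimensions). The class-id ordering and the function name `to_yolo_line` are assumptions made for this example; the thesis does not specify them.

```python
# Assumed class-id ordering; the actual ordering used in the thesis is not stated.
CLASS_IDS = {"green": 0, "red": 1, "yellow": 2}


def to_yolo_line(label, x_min, y_min, x_max, y_max, img_w, img_h):
    """Return one annotation line: '<class> <x_center> <y_center> <width> <height>'.

    Box corners are given in pixels; the output values are normalized to [0, 1].
    """
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{CLASS_IDS[label]} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: a red light occupying pixels (100, 50)-(140, 150)
# in a 400x200 image.
print(to_yolo_line("red", 100, 50, 140, 150, 400, 200))
```

Each image typically has one such text file with one line per annotated traffic light.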
We use the best-performing weights to conduct a large case study whose input is video footage recorded on the streets of Thessaloniki, which contains numerous traffic lights in all three states. We divide the predictions into three categories: True, False and No predictions. Initially, the study indicated relatively low model performance, caused by a high percentage of No predictions. To address this issue, we used more than one photograph of each traffic light-state combination and combined the related predictions. As a result, the percentage of No predictions was reduced significantly, and the performance of the process improved.
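One possible way to combine the predictions from several photographs of the same traffic light-state combination is sketched below. The combination rule (summing detection confidences per class and taking the highest total) is an assumption chosen for illustration, as are the function names; the thesis abstract does not state which rule was actually used.

```python
from collections import defaultdict


def combine_predictions(predictions):
    """Combine per-photo predictions into a single label.

    `predictions` is a list whose items are either None (the model detected
    nothing in that photo) or a (label, confidence) tuple.
    Returns the label with the highest summed confidence, or None if no
    photo produced a detection.
    """
    scores = defaultdict(float)
    for prediction in predictions:
        if prediction is not None:
            label, confidence = prediction
            scores[label] += confidence
    if not scores:
        return None
    return max(scores, key=scores.get)


def categorize(predictions, true_label):
    """Map the combined prediction to one of the categories True, False or No."""
    combined = combine_predictions(predictions)
    if combined is None:
        return "No"
    return "True" if combined == true_label else "False"


# Hypothetical example: only one of three photos yields a detection,
# yet the combined result is no longer a "No" prediction.
print(categorize([None, ("red", 0.8), None], "red"))
```

Under this rule a traffic light counts as a No prediction only when every one of its photographs yields no detection, which is consistent with the reported drop in that category.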