Automatic License Plate Detection & Recognition using deep learning
- OpenCV: OpenCV is a library of programming functions mainly aimed at real-time computer vision. It's open source, fun to work with, and my personal favorite. I have used version 4.1.0 for this project.
- Python: the Swiss Army knife of programming languages. I have used version 3.6.7 here.
- IDE: I’ll be using Jupyter here.
- Keras: Easy to use and widely supported, Keras makes deep learning about as simple as deep learning can be.
- Scikit-Learn: It is a free software machine learning library for the Python programming language.
Let's import the libraries:
Let’s start simple by importing a sample image of a car with a license plate and define some functions:
The above function takes an image as input. The parameter scaleFactor specifies how much the image is scaled down at each detection pass, which helps detect license plates at different sizes. minNeighbors is a parameter for reducing false positives; if this value is too low, the algorithm is more prone to producing misrecognized output.
Segmenting the alphanumeric characters from the license plate:
In this step, we will be applying some more image processing to extract the individual characters from the license plate. The steps involved will be-
Finding all the contours in the input image. The function cv2.findContours returns all the contours it finds in the image. Contours can be explained simply as a curve joining all the continuous points (along the boundary), having the same color or intensity.
Find characters in the resulting images:
The above function takes the image as input and performs the following operations on it:
- resizes it to dimensions at which all the characters appear distinct and clear
- converts the colored image to grayscale, i.e. instead of 3 channels (BGR), the image has only a single 8-bit channel with values ranging from 0–255, where 0 corresponds to black and 255 corresponds to white. We do this to prepare the image for the next step.
- the threshold function then converts the grayscale image to a binary image, i.e. each pixel now has a value of 0 or 1, where 0 corresponds to black and 1 corresponds to white. This is done by applying a threshold value between 0 and 255; here the value is 200, which means any pixel whose grayscale value is above 200 is given a value of 1 in the binary image, and any pixel whose value is below 200 is given a value of 0.
- The image is now in binary form and ready for the next step: eroding.
Eroding is a simple process for removing unwanted pixels from an object's boundary, i.e. pixels that should have a value of 0 but have a value of 1. It works by considering each pixel in the image one by one, along with its neighbors (how many neighbors depends on the kernel size); a pixel keeps the value 1 only if all of its neighboring pixels are 1, otherwise it is set to 0.
- The image is now clean and free of boundary noise; we will now dilate it to fill in the absent pixels, i.e. pixels that should have a value of 1 but have a value of 0. Dilation works similarly to eroding, with one small catch: considering each pixel one by one along with its neighbors (again depending on the kernel size), a pixel is set to 1 if at least one of its neighboring pixels is 1.
- The next step is to make the borders of the image white. This removes any out-of-frame pixels, in case they are present.
- Next, we define a list of dimensions containing 4 values against which we'll compare each candidate's dimensions to filter out the actual characters.
- Through the above steps, we have reduced our image to a clean, processed binary image, and we are ready to pass it on for character extraction.
Let’s see the segmented characters:
The binary images of 10 extracted characters:
Create CNN Model:
- Here we use ReLU as the activation function.
- Next, we’ll be adding a max-pooling layer with a pool size of (4,4).
Max pooling is a sample-based discretization process. The objective is to down-sample an input representation (an image, a hidden-layer output matrix, etc.), reducing its dimensionality and allowing assumptions to be made about the features contained in the binned sub-regions.
- We have chosen a dropout rate of 0.4, meaning 60% of the nodes will be retained.
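A hedged sketch of such a model in Keras. The article only pins down the ReLU activations, the (4,4) max-pooling and the 0.4 dropout; the layer count, filter sizes, 28×28 input and 36-class output (10 digits + 26 letters) are assumptions for illustration:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

def build_model(num_classes=36):   # 10 digits + 26 letters (assumption)
    model = Sequential([
        Conv2D(16, (3, 3), activation='relu', padding='same',
               input_shape=(28, 28, 3)),            # assumed input size
        Conv2D(32, (3, 3), activation='relu', padding='same'),
        MaxPooling2D(pool_size=(4, 4)),             # (4,4) pooling, as in the text
        Dropout(0.4),                               # 0.4 dropout, as in the text
        Flatten(),
        Dense(128, activation='relu'),
        Dense(num_classes, activation='softmax'),
    ])
    return model
```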
Training our CNN model.
- It’s time to train our model now!
We will use 'sparse_categorical_crossentropy' as the loss function and Adam as the optimizer.
- After training for 16 epochs, the model achieved an accuracy of 99.55%.
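The compile-and-fit step might look like the sketch below. The random placeholder data and single epoch are only there to make the snippet self-contained; the actual model trains on the segmented-character dataset for the 16 epochs mentioned above:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

# Tiny stand-in model; the real one is the CNN built earlier.
model = Sequential([
    Conv2D(8, (3, 3), activation='relu', input_shape=(28, 28, 3)),
    MaxPooling2D((4, 4)),
    Dropout(0.4),
    Flatten(),
    Dense(36, activation='softmax'),
])
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Placeholder data just to show the fit call; labels are integers 0-35.
x = np.random.rand(8, 28, 28, 3).astype('float32')
y = np.random.randint(0, 36, size=(8,))
history = model.fit(x, y, epochs=1, verbose=0)
```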
Predicting the output:
Segmented characters and their predicted value:
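Mapping the per-character softmax outputs back to a plate string can be sketched as below; the label order (digits first, then letters) is an assumption about how the dataset was encoded:

```python
import numpy as np

CHARS = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'   # assumed label order

def decode_predictions(probs):
    """Take one softmax row per segmented character; return the plate string."""
    return ''.join(CHARS[int(np.argmax(p))] for p in probs)
```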
The full project is available on my GitHub: