Flower classification using CNN

Machine learning keeps evolving and offers efficient solutions to everyday problems, from building recommendation systems to predicting outcomes.

In this article, we discuss one such machine-learning classification application: flower classification using a CNN.

We come across many flowers every day, but identifying them can be hard. This is where ML techniques help. We are going to build a model that classifies flowers: we provide an image as input, and the model predicts the flower species as output.


With the help of a CNN, i.e. a convolutional neural network, we are going to build a model that classifies flowers into 5 different species (as per the dataset).


Image classification with machine learning generally falls under deep learning, which relies on neural networks. Hence, before going ahead, one must know what neural networks and convolutional neural networks are.

What are Neural Networks?

Neural Networks, or Artificial Neural Networks (ANN), form the base of deep learning, a subfield of machine learning. Their structure is inspired by the human brain, where neurons signal each other. A neural network takes in data, trains itself to recognize the patterns in that data, and then predicts the output for new, similar data.

neural network

Architecture and working of Neural Networks

Let’s understand the architecture of neural networks and how they work.

Consider a neural network that tries to identify a shape, say a triangle, a square, or a circle.
Neural networks are made up of layers of Neurons. A neuron is the core processing unit of this network. The first layer is the input layer which receives the input and the last layer is the output layer which predicts the final output. In between input and output, there exist hidden layers that perform computations to predict the output.
Neurons of one layer are connected to the neurons of the other layer through Channels. Each of these channels is assigned a numerical value, known as weight.

Architecture of Neural Networks

Let’s say we are feeding the image of a circle as our input. The image is 28×28, i.e. 784 pixels. Each of these pixels is fed as input to a neuron of the first layer. The inputs are multiplied by their corresponding weights, and their sum is sent as the input to the hidden layer. Each neuron of the hidden layer is associated with a numerical value known as the bias, which is added to the input sum. This value is then passed to an activation function (a simple threshold function is one example).

The result of this function determines if the particular neuron will get activated or not. If the neuron is activated, it transmits its data to the neurons of the next layer over the channels. In this manner, the data is propagated through this network. This propagation is known as forward propagation.
In the output layer, the neuron with the highest value (highest probability) determines the output.
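The forward pass described above can be sketched in plain Python. This is only an illustration, not the article's code: the hidden layer size (16), the sigmoid activation, and the random weights are assumptions chosen for the example, and the input is a made-up 28×28 image.

```python
import math
import random

def sigmoid(z):
    """Activation function: squash a value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """One layer: activation(sum(weight * input) + bias) for every neuron."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        weighted_sum = sum(w * x for w, x in zip(neuron_weights, inputs))
        outputs.append(sigmoid(weighted_sum + bias))
    return outputs

def random_layer(n_inputs, n_neurons, rng):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases

rng = random.Random(0)
pixels = [rng.random() for _ in range(28 * 28)]   # fake 28x28 image, flattened to 784 values
hidden_w, hidden_b = random_layer(784, 16, rng)   # hidden layer, 16 neurons (assumed size)
output_w, output_b = random_layer(16, 3, rng)     # one output neuron per shape

hidden = layer_forward(pixels, hidden_w, hidden_b)
scores = layer_forward(hidden, output_w, output_b)

shapes = ["triangle", "square", "circle"]
prediction = shapes[scores.index(max(scores))]    # the neuron with the highest value wins
print(prediction, scores)
```

Because the weights are random, the prediction here is arbitrary; training is what makes it meaningful.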

Now here, in our example neural network, we can see the neurons with weights and bias. The activated neuron is shown by the pink circle.

working of Neural Networks

Here, as we discussed, the highest probability is that of the square, which means the prediction generated by the model is wrong. So, how will the network figure this out? The network has not been trained yet. During the training process, along with the input, the network is also given the expected output. The predicted and actual outputs are compared to compute the error. The magnitude of the error indicates how wrong the model is, and its sign indicates whether the predicted values are higher or lower than the actual values. This information is then passed back through the network. This is known as backward propagation.

Based on this information, the weights are adjusted. This cycle of forward and backward propagation is iteratively performed with multiple inputs and continues till the weights are assigned such that the network can predict the output correctly.
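To make the forward/backward cycle concrete, here is a minimal sketch that trains a single sigmoid neuron with gradient descent. The toy task (learning logical OR), the learning rate, and the number of epochs are assumptions for illustration; a real network repeats the same idea across many neurons and layers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, inputs):
    """Forward propagation for one neuron."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

# Toy dataset: each entry is (inputs, expected output) for logical OR.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

weights = [0.0, 0.0]
bias = 0.0
learning_rate = 0.5   # assumed step size

for epoch in range(2000):
    for inputs, target in samples:
        predicted = predict(weights, bias, inputs)
        # Signed error: positive if the prediction is too high, negative if too low.
        error = predicted - target
        # Backward propagation: for a sigmoid neuron with cross-entropy loss,
        # the gradient with respect to each weight is error * input.
        weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
        bias -= learning_rate * error

predictions = [round(predict(weights, bias, inputs)) for inputs, _ in samples]
print(predictions)
```

After training, the rounded predictions match the expected outputs for all four samples.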

working of Neural Networks

Training a neural network can take a long time, e.g. hours, weeks, or even months, but the applications it enables make it worthwhile.

What are Convolutional Neural Networks (CNN)?

In a fully connected neural network, each neuron is connected to every neuron in the adjacent layers. Convolutional neural networks work differently. The term convolutional refers to convolution, an operation that slides a small filter over the input. The convolutional neural network simplifies complex input (images or language) by treating the data as spatial. Instead of being connected to every neuron in the previous layer, each neuron is connected only to the neurons close to it, and the same set of weights is shared across all positions. This simplification of the connections means the network preserves the spatial structure of the data.

Convolutional neural networks are built from three main types of layers: the convolutional layer, the pooling layer, and the fully connected layer. In addition, a ReLU (Rectified Linear Unit) layer serves as the activation function.

Rectified Linear Unit

The convolutional layer works by sliding a filter over the array of image pixels, producing a convolved feature map. Each value in the feature map summarizes one small region of the image. The pooling layer downsamples a feature map, i.e. reduces its spatial size. This also makes processing faster, because the following layers have fewer values to process. The output of this layer is a pooled feature map, which can be obtained by either max pooling or average pooling.
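A small sketch may help here. The code below applies a filter and then max pooling to a tiny image using plain Python lists. The 6×6 image, the 3×3 vertical-edge filter, the stride of 1, and the absence of padding are all assumptions for illustration; in a real CNN the filter values are learned.

```python
def convolve2d(image, kernel):
    """Slide a square kernel over a square image (stride 1, no padding)
    and sum the element-wise products at each position."""
    k = len(kernel)
    out_size = len(image) - k + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(out_size)]
            for r in range(out_size)]

def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, cols - size + 1, size)]
            for r in range(0, rows - size + 1, size)]

# 6x6 image: dark (0) on the left half, bright (1) on the right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Filter that responds to a dark-to-bright transition from left to right.
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

feature_map = convolve2d(image, vertical_edge)   # 4x4 convolved feature map
pooled = max_pool(feature_map, size=2)           # 2x2 pooled feature map
print(feature_map)
print(pooled)
```

The feature map is largest where the filter sits on the edge, and pooling shrinks the 4×4 map to 2×2 while keeping those strong responses.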

These steps perform feature extraction, where the network builds up a representation of the image data according to its own learned rules. The fully connected layer then performs the classification. The ReLU layer acts as the activation function: it replaces negative values with zero and introduces non-linearity, without which the stacked layers would behave like a single linear transformation.
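As a quick illustration, ReLU keeps positive values and replaces negative values with zero. The feature-map values below are made up.

```python
def relu(x):
    """Rectified Linear Unit: max(0, x)."""
    return max(0.0, x)

feature_map = [[-2.0, 3.0],
               [0.5, -1.0]]
activated = [[relu(value) for value in row] for row in feature_map]
print(activated)   # [[0.0, 3.0], [0.5, 0.0]]
```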

Code for Flower classification using CNN:

Now that you have an idea of what neural networks and CNNs are, let’s discuss the approach behind the flower classification.


Notice that our dataset has 5 folders for 5 different flower species for which we are building the model – rose, daisy, dandelion, sunflower, and tulip.

Performing Flower classification using CNN with code and dataset

First, we need to manually prepare the dataset. After you have downloaded the dataset, make sure to have a folder ‘data’ in the same location as your .ipynb notebook.
The data is split using a 60-20-20 rule: 60% of the images of each species go into the ‘train’ folder, 20% go into ‘test’, and the remaining 20% go into ‘validation’.
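The split can also be scripted instead of done by hand. The sketch below only decides which files go where; the helper name `split_60_20_20`, the seed, and the file names are hypothetical.

```python
import random

def split_60_20_20(items, seed=42):
    """Shuffle items and return (train, test, validation) in a 60/20/20 ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # fixed seed for a reproducible split
    n_train = len(shuffled) * 60 // 100
    n_test = len(shuffled) * 20 // 100
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]   # whatever is left
    return train, test, validation

# Hypothetical file names standing in for the images of one species.
rose_images = [f"rose_{i:03d}.jpg" for i in range(10)]
train, test, validation = split_60_20_20(rose_images)
print(len(train), len(test), len(validation))   # 6 2 2
```

Each returned list could then be copied into the matching folder, for example with `shutil.copy`.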

Folder structure for the dataset

You don’t need to have separate folders for species in each of these three folders (!important). We are just mixing the data together and then dividing it into three different folders for training, testing, and validating.

In the code, you can see the VGG16 model used to build the CNN.

VGG16 is a CNN architecture with 16 weight layers (13 convolutional and 3 fully connected). It has a very large number of parameters (approx. 138 million). It is a fairly large network, and here the input image size is fixed at 224×224.

Import all the libraries necessary for VGG16. Make sure you have TensorFlow and Keras installed.

Define the number of epochs and the batch size, and load VGG16. Supplying weights="imagenet" indicates that we want to use the pre-trained ImageNet weights for the model.
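A minimal sketch of this step, assuming TensorFlow's bundled Keras. The epoch count and batch size are placeholder values, since the article does not state them, and `include_top=False` (dropping VGG16's own classification layers so that a custom classifier can be trained on top) is an assumption about the setup.

```python
EPOCHS = 7                         # placeholder value, not taken from the article
BATCH_SIZE = 50                    # placeholder value, not taken from the article
IMG_WIDTH, IMG_HEIGHT = 224, 224   # input size mentioned in the article

def load_vgg16_base():
    """Return VGG16 with pre-trained ImageNet weights, without its top layers."""
    # Imported inside the function so the constants above remain usable
    # on a machine where TensorFlow is not installed.
    from tensorflow.keras.applications import VGG16
    return VGG16(include_top=False,          # assumption: custom classifier on top
                 weights="imagenet",         # pre-trained ImageNet weights
                 input_shape=(IMG_HEIGHT, IMG_WIDTH, 3))
```

The first call to `load_vgg16_base()` downloads the ImageNet weights, so it needs an internet connection.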

Now we are going to augment the data, i.e. diversify it. Keras has the ImageDataGenerator class, which allows users to perform image augmentation. This class has three methods for generating batches of images, flow(), flow_from_dataframe(), and flow_from_directory(), of which flow_from_directory() is used here.
It takes the following parameters: directory (path to the image folder), target_size (every input image is resized to the given size), batch_size (number of images the generator yields per batch), class_mode (whether the classes are binary or categorical), shuffle (True if the images should be shuffled, else False), color_mode (whether the images are RGB or grayscale), and seed (random seed for the random image augmentation and for shuffling the order of the images).

After the images are resized, they are converted to a NumPy array and saved as a ‘.npy’ file.
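The generator and the saving step might look roughly like the sketch below. It is not the article's code: the function name `extract_features`, the 1/255 rescaling, and passing the images through a pre-loaded model before saving are assumptions. Note that `flow_from_directory()` infers classes from subdirectories, so this sketch assumes the given directory contains one subfolder per species.

```python
def extract_features(model, directory, out_path, batch_size=50, img_size=(224, 224)):
    """Run every image under `directory` through `model` and save the result as .npy.

    Returns the integer class index of each image, in file order.
    """
    # Imported lazily so the function can be defined without TensorFlow installed.
    import numpy as np
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(rescale=1.0 / 255)   # assumption: scale pixels to [0, 1]
    generator = datagen.flow_from_directory(
        directory,
        target_size=img_size,     # every image is resized to this size
        batch_size=batch_size,    # images yielded per batch
        class_mode=None,          # yield images only, which suits model.predict()
        color_mode="rgb",
        shuffle=False,            # keep file order so features line up with labels
    )
    features = model.predict(generator)
    np.save(out_path, features)   # written as a NumPy .npy file
    return generator.classes
```

`ImageDataGenerator` is deprecated in recent TensorFlow releases, so newer projects may prefer `tf.keras.utils.image_dataset_from_directory`.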

Repeat the above steps for train, test, and validation data. It can take around 2 hours.

Now, load the data that is saved in the ‘.npy’ file and get their label names. Repeat these steps for train, test and validation.
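Loading a saved array and preparing labels can be sketched with NumPy alone. The array shape, the file name, and the class indices below are made up. The one-hot encoding is an assumption: it is the label format that a categorical classifier usually expects.

```python
import os
import tempfile

import numpy as np

def one_hot(class_indices, num_classes):
    """Turn integer class indices into one-hot rows, e.g. 2 -> [0, 0, 1, 0, 0]."""
    labels = np.zeros((len(class_indices), num_classes), dtype="float32")
    labels[np.arange(len(class_indices)), class_indices] = 1.0
    return labels

# Stand-in for features saved earlier: 4 images with 8 values each (made-up shape).
fake_features = np.arange(32, dtype="float32").reshape(4, 8)
path = os.path.join(tempfile.mkdtemp(), "train_features.npy")
np.save(path, fake_features)

train_data = np.load(path)                                      # load the array back
train_labels = one_hot(np.array([0, 2, 4, 1]), num_classes=5)   # 5 flower species
print(train_data.shape, train_labels.shape)   # (4, 8) (4, 5)
```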

It is time to train the model. The model used here is Sequential, which means all the layers of the model are arranged in a sequence. The code adds layers one by one as specified. First, a Flatten layer flattens the data. A Dense layer then adds a hidden layer with the ReLU activation function. The Dropout layer randomly sets input units to 0 with a frequency of rate at each step during training time, which helps prevent overfitting. Compile the model, train it, and save it as a ‘.h5’ file. Training the model may take 2 to 2.5 hours.
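A sketch of such a Sequential model is shown below. The layer width (100), the dropout rate (0.5), the softmax output layer, the Adam optimizer, and the function name `build_top_model` are assumptions, since the article does not list them.

```python
NUM_CLASSES = 5   # rose, daisy, dandelion, sunflower, tulip

def build_top_model(input_shape, num_classes=NUM_CLASSES):
    """Build and compile a small classifier: Flatten -> Dense -> Dropout -> Dense."""
    # Imported lazily so the function can be defined without TensorFlow installed.
    from tensorflow.keras import Input, Sequential
    from tensorflow.keras.layers import Dense, Dropout, Flatten

    model = Sequential([
        Input(shape=input_shape),
        Flatten(),                                  # flatten the incoming data
        Dense(100, activation="relu"),              # hidden layer (assumed width)
        Dropout(0.5),                               # assumed rate; helps prevent overfitting
        Dense(num_classes, activation="softmax"),   # one probability per species
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

If the inputs are VGG16 features for 224×224 images with the top layers removed, `input_shape` would be `(7, 7, 512)`. Training and saving would then be `model.fit(...)` followed by `model.save("flower_model.h5")`, where the file name is hypothetical.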

Using model.summary() you can view the summary of the neural network built using the model. The code also plots the graph for training and testing accuracy and loss. From the accuracy graph, for epoch 0-1, there might be some problem with the data, but overall the model behaves well.

We can now evaluate the model: the accuracy is 90.16%, which is pretty good, with a loss of about 0.28. You can view the predictions for the test data and check the precision, recall, f1-score, and support for each of the categories.

The model is ready. The code also builds a confusion matrix, which shows, for each species, how many images were classified correctly and how many were confused with each of the other species. For example, look at the first confusion matrix:

output for Flower classification using CNN

Here, there are 7 cases where the image was of a tulip but the model predicted a daisy, and so on. The diagonal elements count the cases where the predicted output matches the true output.

The last part of the code tests the model on a single image, which the model classifies correctly.
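To see how such a matrix is built, here is a small sketch in plain Python. The labels are made up, and only three species are used for brevity. Putting true classes in rows and predicted classes in columns is an assumed convention.

```python
def confusion_matrix(true_labels, predicted_labels, class_names):
    """Count predictions: rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(class_names)}
    matrix = [[0] * len(class_names) for _ in class_names]
    for truth, predicted in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[predicted]] += 1
    return matrix

classes = ["daisy", "rose", "tulip"]
y_true = ["daisy", "daisy", "rose", "tulip", "tulip", "tulip"]
y_pred = ["daisy", "rose", "rose", "daisy", "tulip", "tulip"]

matrix = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, matrix):
    print(name, row)

# Diagonal entries are the correct predictions, so accuracy is their share of the total.
correct = sum(matrix[i][i] for i in range(len(classes)))
accuracy = correct / len(y_true)
print(accuracy)
```

In this toy example the entry in the tulip row and daisy column is 1: one tulip image was predicted as a daisy.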

Note: In the code, the datetime module is used to print the time required to compute the processing in each step.
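A minimal sketch of this timing pattern is shown below. The summation is only a stand-in for a slow step such as feature extraction or training.

```python
import datetime

start = datetime.datetime.now()
total = sum(i * i for i in range(100_000))   # stand-in for a slow step
elapsed = datetime.datetime.now() - start    # a datetime.timedelta
print("Time:", elapsed)
```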


CNNs are widely used for classifying images due to their ability to simplify complex structures. The flower recognition system discussed above is based on VGG16, which is a CNN architecture.



Author: Ayush Purawr