Compressing Images using K-Means Clustering

K-means clustering is an unsupervised learning algorithm that partitions unlabelled data into clusters. One of its most interesting applications is image compression. In a colour image, each pixel is a combination of 3 bytes (RGB), where each channel can take an intensity value from 0 to 255. The total number of colours that can exist in an image is therefore 256 x 256 x 256 = 16,777,216, or about 16.7 million. In practice, an image uses only a fraction of those colours, and the number of distinct colours it contains contributes to its file size.

K-means clustering works by randomly initialising k cluster centers from among the data points. In every iteration, it computes the distance of each point from the cluster centers and assigns each point to the cluster center nearest to it. Once all points have been assigned to a cluster, it takes the mean of the points in every cluster and sets it as the new cluster center. It repeats these two steps for a specified number of iterations, or until the assignments stop changing.
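
The assign-then-update loop described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the scikit-learn implementation used later in this article; the function name `kmeans_sketch` and the six grey toy pixels are made up for the example.

```python
import numpy as np

def kmeans_sketch(points, k, n_iters=10, seed=0):
    """Plain k-means on an (n, d) float array; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Pick k distinct data points as the initial cluster centers.
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Squared distance from every point to every center: shape (n, k).
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Assignment step: each point joins its nearest center.
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its points.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels, centers

# Toy data: three dark grey and three light grey "pixels".
toy = np.array([[10, 10, 10], [12, 12, 12], [14, 14, 14],
                [200, 200, 200], [202, 202, 202], [204, 204, 204]], dtype=float)
labels, centers = kmeans_sketch(toy, k=2)
print(labels)
print(centers)
```

On this toy data the two centers settle at the means of the dark and light groups, (12, 12, 12) and (202, 202, 202); which of the two gets label 0 depends on the random initialisation.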

In an image, there are many different colors and hence many different RGB pixel combinations. If we apply K-means and assign each pixel the RGB values of its assigned cluster center, we can reduce the number of RGB combinations or in other words reduce the number of colors in the image, thereby compressing it and reducing its size.
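
To get a rough feel for the potential saving, here is a back-of-the-envelope estimate. It assumes an idealised uncompressed format that stores one palette index per pixel plus the palette itself, and it ignores the lossless compression that PNG applies on top, so the numbers will not match file sizes on disk. The helper `raw_bits` and the one-million-pixel image are hypothetical.

```python
import math

def raw_bits(n_pixels, n_colors=None):
    """Bits needed to store n_pixels without any further compression.

    n_colors=None  -> 24 bits (3 bytes of RGB) per pixel.
    n_colors=k     -> one palette index per pixel, plus k palette
                      entries of 24 bits each.
    """
    if n_colors is None:
        return n_pixels * 24
    bits_per_index = max(1, math.ceil(math.log2(n_colors)))
    return n_pixels * bits_per_index + n_colors * 24

n_pixels = 1_000_000                 # hypothetical 1000 x 1000 image
full = raw_bits(n_pixels)            # 24,000,000 bits
palette16 = raw_bits(n_pixels, 16)   # 4 bits per pixel + 384 bits of palette
print(full, palette16, round(full / palette16, 2))
```

With 16 colours each pixel needs only a 4-bit index instead of 24 bits, which is where the roughly sixfold reduction in this idealised count comes from.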

Let us start by importing all the required libraries. The sklearn library provides the KMeans implementation, and PIL is an image processing library, used here to obtain the pixel values from the image. NumPy is used to manipulate the pixel arrays, Matplotlib to plot the original and compressed images, and os to find the file size of the images in bytes.

In [4]:
from sklearn.cluster import KMeans
import numpy as np
from PIL import Image
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import os
%matplotlib inline

We open the image using PIL and get the RGB pixel values.

In [5]:
img = Image.open('dt.png')
img_np = np.asarray(img)
img_np[0:2]
Out[5]:
array([[[10, 28, 52],
        [ 4, 22, 46],
        [ 2, 17, 40],
        ..., 
        [76, 84, 86],
        [75, 85, 86],
        [77, 87, 88]],

       [[ 9, 27, 51],
        [ 4, 22, 46],
        [ 2, 17, 40],
        ..., 
        [76, 84, 86],
        [75, 85, 86],
        [77, 87, 88]]], dtype=uint8)
In [111]:
img_np.shape
Out[111]:
(641, 961, 3)

The image has 641 rows and 961 columns of pixels, with each pixel having 3 values (RGB). To feed the data into our algorithm, we reshape it into a dataset with 641 * 961 = 616001 rows and 3 columns.

In [6]:
pixels = img_np.reshape(img_np.shape[0] * img_np.shape[1],img_np.shape[2])
pixels.shape
Out[6]:
(616001, 3)
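
Before clustering, it can be instructive to check how many distinct colours an image actually uses. A minimal sketch with `np.unique` follows; `demo_pixels` is a small made-up stand-in for the `pixels` array above, so the counts shown are for the toy data only.

```python
import numpy as np

# Hypothetical stand-in for `pixels`: 6 RGB rows containing 3 distinct colours.
demo_pixels = np.array([[10, 28, 52],
                        [ 4, 22, 46],
                        [10, 28, 52],
                        [ 2, 17, 40],
                        [ 4, 22, 46],
                        [10, 28, 52]], dtype=np.uint8)

# axis=0 treats every row (one RGB triple) as a single item.
unique_colors, counts = np.unique(demo_pixels, axis=0, return_counts=True)
n_unique = len(unique_colors)
print(n_unique)   # 3
print(counts)     # [1 2 3]
```

The same call on the real array, `np.unique(pixels, axis=0)`, should give the number of colours present before compression.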

We define the KMeans model with the number of clusters set to 16, i.e. k=16. This means that we want to keep just 16 colors in our compressed image. You can play around with this value, keeping in mind that the fewer the clusters, the more compressed the image will be, but the lower its quality.

In [7]:
model = KMeans(n_clusters =16)
model.fit(pixels)
Out[7]:
KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300,
    n_clusters=16, n_init=10, n_jobs=1, precompute_distances='auto',
    random_state=None, tol=0.0001, verbose=0)
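
The trade-off between the number of clusters and image quality can be made concrete by comparing the average squared colour error for a few values of k. The sketch below uses random pixel values as a stand-in for a real image, so only the trend is meaningful; `demo_pixels`, `model_k` and `errors` are names introduced for this example.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical stand-in for `pixels`: 2000 random RGB rows.
demo_pixels = rng.integers(0, 256, size=(2000, 3)).astype(float)

errors = {}
for k in (1, 4, 16):
    model_k = KMeans(n_clusters=k, n_init=3, random_state=0).fit(demo_pixels)
    # inertia_ is the sum of squared distances from each point to its
    # nearest cluster center; dividing by the number of points gives
    # a mean squared colour error per pixel.
    errors[k] = model_k.inertia_ / len(demo_pixels)

for k, err in errors.items():
    print(k, round(err, 1))
```

The error shrinks as k grows, at the cost of a larger palette and therefore less compression.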

The remaining parameters keep their default values; for example, the maximum number of iterations is max_iter=300. n_init=10 means that the algorithm initialises the cluster centers 10 times, runs to completion each time and keeps the best result. It is useful to initialise the clusters more than once, as a bad initialisation may leave the algorithm stuck in a poor solution with badly assigned clusters.
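
What n_init does can be imitated by hand: run KMeans several times with a single initialisation each and keep the run with the lowest inertia (the sum of squared distances to the nearest center). This is a sketch of the idea on made-up data, not a replacement for simply passing n_init; `demo_pixels`, `runs` and `best` are illustrative names.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical stand-in for `pixels`: 1000 random RGB rows.
demo_pixels = rng.integers(0, 256, size=(1000, 3)).astype(float)

# Five independent runs, each with one random initialisation.
runs = [
    KMeans(n_clusters=8, init='random', n_init=1, random_state=seed).fit(demo_pixels)
    for seed in range(5)
]
inertias = [run.inertia_ for run in runs]

# Keep the run whose clusters fit the data best.
best = runs[int(np.argmin(inertias))]
print([round(i) for i in inertias])
print(round(best.inertia_))
```

Different seeds usually end at slightly different inertias, which is why keeping the best of several runs is a sensible default.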

After the model is trained we use model.labels_ to obtain the cluster number that is assigned to each data point or each pixel. model.cluster_centers_ gives us the coordinates or the RGB values of the 16 cluster centers.

In [8]:
pixel_centroids = model.labels_
cluster_centers = model.cluster_centers_
pixel_centroids
Out[8]:
array([0, 0, 0, ..., 0, 0, 0])
In [9]:
cluster_centers
Out[9]:
array([[   1.91054429,   17.80051227,   54.13144077],
       [ 189.51429273,  132.66038257,  102.7068992 ],
       [ 245.81484011,  241.66314479,  237.21950828],
       [  23.93298189,   47.88333608,  107.10594102],
       [ 134.13041864,   81.13195025,   54.34702497],
       [  84.49980999,  111.55453164,  187.23760213],
       [   2.41781489,    2.41997908,   10.4407553 ],
       [   9.75902711,   27.23656595,   73.75279795],
       [  86.26367588,   89.36545559,   93.08135816],
       [  21.92316217,   21.28230768,   32.58652536],
       [ 164.34685202,  109.21885376,   77.82688429],
       [  17.01017475,   37.01158686,   89.77580251],
       [ 217.40282302,  210.4691911 ,  203.96498371],
       [ 207.63164447,  163.28501183,  136.76378688],
       [  82.06127429,   55.57035698,   41.86760054],
       [  34.26601668,   62.34267495,  122.73340573]])

Now we want to assign to each data point the center of the cluster to which it belongs. We take a matrix of shape (616001, 3), corresponding to the pixels, and initialise it with zeros. We then iterate through all the clusters and assign each cluster's centroid (its RGB values) to every data point in that cluster, thereby reducing our image to 16 colors.

In [11]:
final = np.zeros((pixel_centroids.shape[0],3))
final.shape
Out[11]:
(616001, 3)
In [12]:
for cluster_no in range(16):
    final[pixel_centroids==cluster_no] = cluster_centers[cluster_no]
final[0:5]
Out[12]:
array([[  1.91054429,  17.80051227,  54.13144077],
       [  1.91054429,  17.80051227,  54.13144077],
       [  1.91054429,  17.80051227,  54.13144077],
       [  1.91054429,  17.80051227,  54.13144077],
       [  2.41781489,   2.41997908,  10.4407553 ]])
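
As an aside, the loop above can also be written as a single NumPy indexing operation, since indexing the array of centers with the array of labels picks one center row per pixel. The sketch below checks that both forms agree on a small made-up example; `demo_labels` and `demo_centers` stand in for `pixel_centroids` and `cluster_centers`.

```python
import numpy as np

# Hypothetical stand-ins: 5 pixels assigned to 3 clusters.
demo_labels = np.array([0, 2, 1, 0, 2])
demo_centers = np.array([[  1.9,  17.8,  54.1],
                         [189.5, 132.7, 102.7],
                         [245.8, 241.7, 237.2]])

# Loop version, as in the cell above.
looped = np.zeros((demo_labels.shape[0], 3))
for cluster_no in range(len(demo_centers)):
    looped[demo_labels == cluster_no] = demo_centers[cluster_no]

# Indexing version: one row of demo_centers per label.
indexed = demo_centers[demo_labels]

print(np.array_equal(looped, indexed))   # True
```

On the real data this would read `final = cluster_centers[pixel_centroids]`.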

After we have obtained our compressed image in pixel values we reshape it to the original dimensions, convert the pixel values into an image and save it.

In [13]:
comp_image = final.reshape(img_np.shape[0],img_np.shape[1],3)
comp_image.shape
Out[13]:
(641, 961, 3)
In [14]:
comp_image = Image.fromarray(np.uint8(comp_image))
comp_image.save('dt_compressed.png')
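
One detail worth knowing: converting floats to uint8 truncates the fractional part rather than rounding it. The difference is at most one intensity level per channel, so it is barely visible, but rounding first keeps each colour slightly closer to its cluster center. A small sketch using two of the cluster centers printed earlier (`demo_final` is an illustrative name):

```python
import numpy as np

demo_final = np.array([[  1.91054429,  17.80051227,  54.13144077],
                       [245.81484011, 241.66314479, 237.21950828]])

# Plain cast: the fractional part is dropped.
truncated = demo_final.astype(np.uint8)

# Round to the nearest integer and clip to the valid range before casting.
rounded = np.clip(np.rint(demo_final), 0, 255).astype(np.uint8)

print(truncated)   # [[  1  17  54] [245 241 237]]
print(rounded)     # [[  2  18  54] [246 242 237]]
```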

We use matplotlib to plot the original and compressed images.

In [15]:
img1 = mpimg.imread('dt.png')
img2 = mpimg.imread('dt_compressed.png')
In [16]:
fig, (ax1,ax2) = plt.subplots(1,2, figsize = (20,20))
ax1.imshow(img1)
ax1.set_title('Original image')
ax2.imshow(img2)
ax2.set_title('Compressed image')
plt.show()
 

Let us now compare the size of the original and compressed images.

In [17]:
print('size of original image:' ,int(os.stat('dt.png').st_size/1024), 'kB')
print('size of compressed image:' ,int(os.stat('dt_compressed.png').st_size/1024), 'kB')
size of original image: 412 kB
size of compressed image: 70 kB
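
The ratio between the two sizes printed above can be worked out directly:

```python
original_kb = 412     # size of dt.png reported above
compressed_kb = 70    # size of dt_compressed.png reported above

ratio = original_kb / compressed_kb        # how many times smaller
saving = 1 - compressed_kb / original_kb   # fraction of the size removed

print(f"{ratio:.2f}x smaller, {saving:.0%} saved")   # 5.89x smaller, 83% saved
```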

As you can see, we get an almost sixfold reduction in size while keeping most of the features of the image intact. Let us try this with another image, this time choosing 32 clusters.

In [20]:
# Load the second image and flatten it to one RGB row per pixel
img = Image.open('autumn.png')
img_np = np.asarray(img)
pixels = img_np.reshape(img_np.shape[0] * img_np.shape[1], img_np.shape[2])
# Cluster the pixel colors into 32 groups
model = KMeans(n_clusters=32)
model.fit(pixels)
pixel_centroids = model.labels_
cluster_centers = model.cluster_centers_
# Replace every pixel with the color of its cluster center
final = np.zeros((pixel_centroids.shape[0], 3))
for cluster_no in range(32):
    final[pixel_centroids == cluster_no] = cluster_centers[cluster_no]
# Restore the original image shape and save the result
comp_image = final.reshape(img_np.shape[0], img_np.shape[1], 3)
comp_image = Image.fromarray(np.uint8(comp_image))
comp_image.save('autumn_compressed.png')
# Plot the original and compressed images side by side
img1 = mpimg.imread('autumn.png')
img2 = mpimg.imread('autumn_compressed.png')
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 20))
ax1.imshow(img1)
ax1.set_title('Original image')
ax2.imshow(img2)
ax2.set_title('Compressed image')
plt.show()
In [21]:
print('size of original image:' ,int(os.stat('autumn.png').st_size/1024), 'kB')
print('size of compressed image:' ,int(os.stat('autumn_compressed.png').st_size/1024), 'kB')
size of original image: 839 kB
size of compressed image: 437 kB
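
Since the second example repeats every step of the first, the pipeline could be wrapped in a small helper. The sketch below is one possible way to do it and works on arrays only (no file I/O); the function name `quantize_colors` is made up, it rounds instead of truncating when converting to uint8, and it uses n_init=3 with a fixed random_state purely to keep the demo quick and repeatable.

```python
import numpy as np
from sklearn.cluster import KMeans

def quantize_colors(img_array, n_colors, random_state=0):
    """Return a uint8 copy of an (H, W, 3) RGB array with at most n_colors colors."""
    h, w, c = img_array.shape
    # One row per pixel, one column per channel.
    flat = img_array.reshape(-1, c).astype(float)
    km = KMeans(n_clusters=n_colors, n_init=3, random_state=random_state).fit(flat)
    # Replace each pixel by the center of the cluster it belongs to.
    quantized = km.cluster_centers_[km.labels_]
    # Round, clip to the valid range and restore the image shape.
    return np.clip(np.rint(quantized), 0, 255).astype(np.uint8).reshape(h, w, c)

# Hypothetical 20 x 30 random "image" in place of a real file.
rng = np.random.default_rng(0)
demo_img = rng.integers(0, 256, size=(20, 30, 3)).astype(np.uint8)
demo_out = quantize_colors(demo_img, n_colors=8)

print(demo_out.shape, demo_out.dtype)
print(len(np.unique(demo_out.reshape(-1, 3), axis=0)))   # at most 8
```

With a real image, `quantize_colors(np.asarray(Image.open('autumn.png')), 32)` followed by `Image.fromarray(...)` would correspond to the cells above, assuming the file is a 3-channel RGB image.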

This time we kept 32 colors and reduced the size by roughly half, from 839 kB to 437 kB. With more colors to work with, the compressed image is of much better quality. That is all about using clustering to compress an image. Any questions and suggestions are welcome in the comments. Happy learning 🙂

Saksham Malhotra

After learning python 2 years ago and dabbling in web development, I encountered data science and felt 'Yes, this is what I want to do.' I strongly believe that the world can be changed through the power of data. Among my other interests, I love reading fiction and owning more books than I can possibly read at once.

Linkedin - www.linkedin.com/in/saksham-malhotra-9bb69513b