K-means clustering works by randomly initialising k cluster centers chosen from the data points. In every iteration, it computes the distance of each point from the cluster centers and assigns each point to the cluster whose center is nearest to it. After all points have been assigned to a cluster, it takes the mean of the points in every cluster and sets it as the new cluster center. It repeats this for a specified number of iterations.
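The loop described above can be sketched in a few lines of NumPy. This is only a minimal illustration, not the implementation sklearn uses; the function name `kmeans_sketch`, its parameters, and the four sample points are made up for this example.

```python
import numpy as np

def kmeans_sketch(points, k, n_iters=10, seed=0):
    """Plain k-means on an (n_samples, n_features) float array."""
    rng = np.random.default_rng(seed)
    # Pick k distinct data points as the initial cluster centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Distance of every point from every center: shape (n_samples, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        # Assign each point to its nearest center.
        labels = dists.argmin(axis=1)
        # Move each center to the mean of the points assigned to it.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old center
                centers[j] = members.mean(axis=0)
    return centers, labels

# Hypothetical data: two obvious groups of two points each.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centers, labels = kmeans_sketch(points, k=2)
print(centers)
print(labels)
```

On this toy data the two centers end up at the means of the two groups, whichever points are picked first.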
In an image, there are many different colors and hence many different RGB pixel combinations. If we apply K-means and assign each pixel the RGB values of its cluster center, we reduce the number of RGB combinations, or in other words the number of colors in the image, thereby compressing it and reducing its size.
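To get a feel for how many RGB combinations an image holds, the distinct colors can be counted directly. The sketch below assumes the image is already an array of shape (height, width, 3); the helper `count_colors` and the tiny 2 x 3 image are hypothetical and only serve as an illustration.

```python
import numpy as np

# A tiny made-up 2 x 3 RGB "image" (values chosen for illustration only).
tiny = np.array([[[10, 28, 52], [10, 28, 52], [4, 22, 46]],
                 [[4, 22, 46], [76, 84, 86], [77, 87, 88]]], dtype=np.uint8)

def count_colors(image):
    """Number of distinct RGB triples in a (height, width, 3) array."""
    flat = image.reshape(-1, 3)          # one row per pixel
    return len(np.unique(flat, axis=0))  # unique rows = unique colors

print(count_colors(tiny))  # 4
```

After K-means with k clusters, the same count on the recolored image would be at most k.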
Let us start by importing all the required libraries. The sklearn library provides the KMeans class, and PIL is an image processing library, used here to obtain the pixel values from the image. Matplotlib is used to plot the original and compressed images, and finally os is used to find the size of the image files in bytes.
from sklearn.cluster import KMeans
import numpy as np
from PIL import Image
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import os
%matplotlib inline
We open the image using PIL and get the RGB pixel values.
img = Image.open('dt.png')
img_np = np.asarray(img)
img_np[0:2]
array([[[10, 28, 52], [ 4, 22, 46], [ 2, 17, 40], ..., [76, 84, 86], [75, 85, 86], [77, 87, 88]], [[ 9, 27, 51], [ 4, 22, 46], [ 2, 17, 40], ..., [76, 84, 86], [75, 85, 86], [77, 87, 88]]], dtype=uint8)
img_np.shape
(641, 961, 3)
The dimensions of the image are 641 x 961 pixels, with each pixel having 3 values (RGB). To feed the data into our algorithm, we reshape it into a dataset with 641 * 961 = 616001 rows and 3 columns.
# One row per pixel: (height * width, channels)
pixels = img_np.reshape(img_np.shape[0] * img_np.shape[1], img_np.shape[2])
pixels.shape
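The flattening step can also be written with -1, which lets NumPy work out the number of rows itself. The short sketch below uses a made-up 2 x 4 array instead of the real image and shows that the reshape loses nothing: going back to the original shape returns the same pixels.

```python
import numpy as np

# Hypothetical 2 x 4 image with 3 channels, filled with 0..23 for illustration.
demo = np.arange(24, dtype=np.uint8).reshape(2, 4, 3)

flat = demo.reshape(-1, 3)            # -1 lets NumPy infer 2 * 4 = 8 rows
restored = flat.reshape(demo.shape)   # back to (2, 4, 3)

print(flat.shape)                      # (8, 3)
print(np.array_equal(demo, restored))  # True
```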
We define the KMeans model with the number of clusters set to 16, i.e. k = 16. This means that we want to keep just 16 colors in our compressed image. You can play around with this value, keeping in mind that the fewer the clusters, the more compressed the image will be, but the lower its quality.
model = KMeans(n_clusters=16)
model.fit(pixels)
KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300, n_clusters=16, n_init=10, n_jobs=1, precompute_distances='auto', random_state=None, tol=0.0001, verbose=0)
The rest of the model's parameters are left at their defaults; for example, the maximum number of iterations is 300 (max_iter=300). n_init=10 means that the algorithm is run 10 times with different initial cluster centers and the best result (the one with the lowest inertia) is kept. It is useful to initialise the clusters more than once, as a bad random initialisation may cause the algorithm to get stuck and assign the clusters improperly.
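These parameters can also be set explicitly. The sketch below is only an illustration on made-up data (two blobs of random 3-D points), not on the image; `random_state=0` is an assumption added here so that repeated runs give the same result.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated blobs of made-up 3-D points, for illustration only.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 1, size=(50, 3)),
                  rng.normal(20, 1, size=(50, 3))])

# n_init=10: try 10 initialisations and keep the run with the lowest inertia.
# random_state fixes the seed so the clustering is reproducible.
demo_model = KMeans(n_clusters=2, n_init=10, max_iter=300, random_state=0)
demo_model.fit(data)

print(demo_model.inertia_)  # sum of squared distances to the nearest center
print(demo_model.n_iter_)   # iterations used by the best run
```

A lower `inertia_` means the points sit closer to their cluster centers, which is the criterion used to choose between the runs.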
After the model is trained we use model.labels_ to obtain the cluster number that is assigned to each data point or each pixel. model.cluster_centers_ gives us the coordinates or the RGB values of the 16 cluster centers.
pixel_centroids = model.labels_
cluster_centers = model.cluster_centers_
pixel_centroids
array([0, 0, 0, ..., 0, 0, 0])
array([[ 1.91054429, 17.80051227, 54.13144077], [ 189.51429273, 132.66038257, 102.7068992 ], [ 245.81484011, 241.66314479, 237.21950828], [ 23.93298189, 47.88333608, 107.10594102], [ 134.13041864, 81.13195025, 54.34702497], [ 84.49980999, 111.55453164, 187.23760213], [ 2.41781489, 2.41997908, 10.4407553 ], [ 9.75902711, 27.23656595, 73.75279795], [ 86.26367588, 89.36545559, 93.08135816], [ 21.92316217, 21.28230768, 32.58652536], [ 164.34685202, 109.21885376, 77.82688429], [ 17.01017475, 37.01158686, 89.77580251], [ 217.40282302, 210.4691911 , 203.96498371], [ 207.63164447, 163.28501183, 136.76378688], [ 82.06127429, 55.57035698, 41.86760054], [ 34.26601668, 62.34267495, 122.73340573]])
Now we want to give each data point the center of the cluster to which it belongs. We take a matrix of shape (616001, 3) corresponding to the pixels and initialise it with zeros. We then iterate through all the clusters and assign the cluster centroid (RGB values) to each data point in that cluster, thereby reducing our image to 16 colors.
final = np.zeros((pixel_centroids.shape[0], 3))
final.shape
for cluster_no in range(16):
    # Boolean mask selects every pixel assigned to this cluster
    final[pixel_centroids == cluster_no] = cluster_centers[cluster_no]
final[0:5]
array([[ 1.91054429, 17.80051227, 54.13144077], [ 1.91054429, 17.80051227, 54.13144077], [ 1.91054429, 17.80051227, 54.13144077], [ 1.91054429, 17.80051227, 54.13144077], [ 2.41781489, 2.41997908, 10.4407553 ]])
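The same recoloring can be done without a loop by indexing the array of centers with the label array. The sketch below uses small made-up stand-ins (`demo_labels`, `demo_centers`) for model.labels_ and model.cluster_centers_ and checks that both approaches give the same result.

```python
import numpy as np

# Made-up labels and centers standing in for model.labels_ and
# model.cluster_centers_.
demo_labels = np.array([0, 2, 1, 0])
demo_centers = np.array([[1.9, 17.8, 54.1],
                         [189.5, 132.7, 102.7],
                         [245.8, 241.7, 237.2]])

# Loop version: fill in one cluster at a time.
looped = np.zeros((demo_labels.shape[0], 3))
for cluster_no in range(len(demo_centers)):
    looped[demo_labels == cluster_no] = demo_centers[cluster_no]

# Indexing version: row i of the result is demo_centers[demo_labels[i]].
indexed = demo_centers[demo_labels]

print(np.array_equal(looped, indexed))  # True
```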
After we have obtained our compressed image in pixel values we reshape it to the original dimensions, convert the pixel values into an image and save it.
comp_image = final.reshape(img_np.shape[0], img_np.shape[1], 3)
comp_image.shape
(641, 961, 3)
comp_image = Image.fromarray(np.uint8(comp_image))
comp_image.save('dt_compressed.png')
We use matplotlib to plot the original and compressed images.
img1 = mpimg.imread('dt.png')
img2 = mpimg.imread('dt_compressed.png')
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 20))
ax1.imshow(img1)
ax1.set_title('Original image')
ax2.imshow(img2)
ax2.set_title('Compressed image')
plt.show()
Let us now compare the size of the original and compressed images.
print('size of original image:', int(os.stat('dt.png').st_size / 1024), 'kB')
print('size of compressed image:', int(os.stat('dt_compressed.png').st_size / 1024), 'kB')
size of original image: 412 kB
size of compressed image: 70 kB
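The two sizes can be turned into a compression ratio and a percentage saved. The helper `compression_summary` below is a hypothetical name used only for this illustration; the numbers are the ones printed above.

```python
def compression_summary(original_kb, compressed_kb):
    """Return (size ratio, percentage of space saved)."""
    ratio = original_kb / compressed_kb
    saved_pct = 100 * (1 - compressed_kb / original_kb)
    return ratio, saved_pct

ratio, saved_pct = compression_summary(412, 70)
print(f"{ratio:.2f}x smaller, {saved_pct:.1f}% saved")  # 5.89x smaller, 83.0% saved
```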
As you can see, we get almost a 6-fold reduction in size while keeping most of the features of the image intact. Let us try this with another image, this time choosing 32 clusters.
# Load the image and flatten it to one row per pixel
img = Image.open('autumn.png')
img_np = np.asarray(img)
pixels = img_np.reshape(img_np.shape[0] * img_np.shape[1], img_np.shape[2])
# Cluster the pixels into 32 colors
model = KMeans(n_clusters=32)
model.fit(pixels)
pixel_centroids = model.labels_
cluster_centers = model.cluster_centers_
# Replace each pixel with the center of its cluster
final = np.zeros((pixel_centroids.shape[0], 3))
for cluster_no in range(32):
    final[pixel_centroids == cluster_no] = cluster_centers[cluster_no]
# Restore the original dimensions and save
comp_image = final.reshape(img_np.shape[0], img_np.shape[1], 3)
comp_image = Image.fromarray(np.uint8(comp_image))
comp_image.save('autumn_compressed.png')
# Plot the original and compressed images side by side
img1 = mpimg.imread('autumn.png')
img2 = mpimg.imread('autumn_compressed.png')
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 20))
ax1.imshow(img1)
ax1.set_title('Original image')
ax2.imshow(img2)
ax2.set_title('Compressed image')
plt.show()
print('size of original image:', int(os.stat('autumn.png').st_size / 1024), 'kB')
print('size of compressed image:', int(os.stat('autumn_compressed.png').st_size / 1024), 'kB')
size of original image: 839 kB
size of compressed image: 437 kB
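Since the same steps were run for both images, they could be wrapped in one function. The sketch below is one possible way to do that, not part of the original walkthrough: the name `quantize_colors` and its parameters are made up, it assumes a 3-channel uint8 array, and it is tried on a random stand-in array rather than on the image files.

```python
import numpy as np
from sklearn.cluster import KMeans

def quantize_colors(img_np, n_colors, random_state=None):
    """Return a copy of a (height, width, 3) uint8 array using n_colors colors."""
    pixels = img_np.reshape(-1, 3)  # one row per pixel
    model = KMeans(n_clusters=n_colors, n_init=10, random_state=random_state)
    labels = model.fit_predict(pixels)
    # Replace every pixel with the center of its cluster.
    quantized = model.cluster_centers_[labels]
    return np.uint8(quantized).reshape(img_np.shape)

# Random stand-in "image" (made up for illustration).
rng = np.random.default_rng(0)
fake_img = rng.integers(0, 256, size=(20, 30, 3), dtype=np.uint8)
small = quantize_colors(fake_img, n_colors=4, random_state=0)
print(small.shape)
print(len(np.unique(small.reshape(-1, 3), axis=0)))  # at most 4 distinct colors
```

With a real file, the input would come from np.asarray(Image.open(...)) and the result could be saved with Image.fromarray(...).save(...), as in the steps above.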
This time we took 32 colors and reduced the size by roughly half. We can see that we get a much better quality image now. That is all about using clustering to compress an image. Any questions and suggestions are welcome in the comments. Happy learning 🙂
Linkedin - www.linkedin.com/in/saksham-malhotra-9bb69513b