Hierarchical Clustering in Python: Unraveling Data Structures

Tahera Firdose
4 min read · Dec 3, 2023

Hierarchical clustering is a technique used in data analysis and machine learning to uncover nested group structure within a dataset. In this post, we’ll perform hierarchical clustering in Python on a real dataset: calculating distances, building a dendrogram, and understanding the linkage methods involved.

In my previous post, we covered the concepts behind hierarchical clustering, including its types and how they operate. Today, we take a hands-on approach and implement it in Python.

What is Hierarchical Clustering?

Hierarchical clustering groups similar data points into clusters and records the result as a tree-like structure known as a dendrogram. The agglomerative (bottom-up) variant starts with each data point as its own cluster and repeatedly merges the two most similar clusters. The divisive (top-down) variant starts with all points in a single cluster and repeatedly splits it.
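To make the bottom-up merging concrete, here is a minimal sketch using SciPy. The toy `points` array, the choice of Euclidean distance, and the `average` linkage method are assumptions made for illustration, not the dataset or settings used later in this post.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from scipy.spatial.distance import pdist

# Toy data (hypothetical): two well-separated groups of 2-D points.
points = np.array([
    [1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
    [8.0, 8.0], [8.1, 7.9], [7.9, 8.2],
])

# Condensed vector of pairwise Euclidean distances: n * (n - 1) / 2 entries.
distances = pdist(points, metric="euclidean")

# Agglomerative merging. Each row of Z records one merge:
# (cluster i, cluster j, merge distance, size of the new cluster).
Z = linkage(distances, method="average")

# Cut the tree into at most two flat clusters.
labels = fcluster(Z, t=2, criterion="maxclust")

# Compute the dendrogram layout without drawing it;
# drop no_plot=True (with matplotlib installed) to see the tree.
tree = dendrogram(Z, no_plot=True)

print(Z)
print(labels)
```

With six points there are five merges, so `Z` has five rows. The last row is the final merge that joins the two groups, and its large merge distance is what shows up as the tall top branch in a dendrogram.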

Real-world Applications

Hierarchical clustering finds applications in various domains:

  • Biology: It is used for gene expression analysis and classifying species based on genetic similarities.
  • Customer Segmentation: Businesses use it to segment customers based on their purchasing behavior.
  • Image Processing: In computer vision, it is used for image segmentation and object recognition.
  • Social Sciences…
