Manhattan Distance, also known as L1 or taxicab distance, measures how far apart two points are by summing the absolute differences of their coordinates. Unlike straight-line (Euclidean) distance, it calculates distance along grid-like paths like a taxi navigating city streets rather than cutting through buildings.
For two points P = (x_{1}, y_{1}), Q = (x_{2}, y_{2}), the Manhattan distance is:
- For 2-dimensions: |x_{1} - x_{2}| + |y_{1} - y_{2}|
- For 3-dimensions: points |x_{1} - x_{2}| + |y_{1} - y_{2}| + |z_{1} - z_{2}|
- For n-dimensions: \sum_{i=1}^{n} |x_i - y_i|
= |x_1 - y_1| + |x_2 - y_2| + \cdots + |x_n - y_n|
Examples of Manhattan Distance
Example 1: For 2- dimensions
For 2 dimensionsPoints P=(1,2),Q=(4,0)
D(P, Q) = |x₁ - x₂| + |y₁ - y₂| = |1 - 4| + |2 - 0| = 3 + 2 = 5
Example 2: For 3-dimensions
For 3 dimensionsPoints: P=(1,2,3),Q=(4,0,1)
D(P, Q) = |x₁ - x₂| + |y₁ - y₂| + |z₁ - z₂|
=> |1 - 4| + |2 - 0| + |3 - 1|
= 3 + 2 + 2
= 7
Example 4: For 4 dimensions
For 4 dimensionsPoints: P=(1,2,3,4),Q=(4,0,1,2)
D(P, Q) = |x₁ - x₂| + |y₁ - y₂| + |z₁ - z₂| + |w₁ - w₂|
=>|1 - 4| + |2 - 0| + |3 - 1| + |4 - 2|
= 3 + 2 + 2 + 2
= 9
Comparison with Euclidean Distance
While Manhattan distance measures movement along a grid (like a taxi navigating streets), Euclidean distance represents the direct, straight-line distance between points (like a bird flying from start to end).
In general, Euclidean distance is always less than or equal to Manhattan distance, because it takes the shortest possible path rather than following the axes of the grid.
Example 1: 2D Coordinates
2D CoordinatesPoints: P=(1,2),Q=(4,0)
D_{\text{Manhattan}}(P,Q) = |x_1 - x_2| + |y_1 - y_2| \\ => |1 - 4| + |2 - 0| \\ = 3 + 2 \\ = 5
D_{\text{Euclidean}}(P,Q) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \\ => \sqrt{(1 - 4)^2 + (2 - 0)^2} \\ = \sqrt{9 + 4} \\ = \sqrt{13} \\ \approx 3.61
When to Use Manhattan vs. Euclidean Distance
Manhattan distance is ideal when:
- Movement is restricted to grid-like paths (e.g., city streets, circuit boards).
- Diagonal movement is not allowed or is costly.
- Working with high-dimensional datasets in machine learning, where computation efficiency matters.
- Analyzing discrete or ordinal data.
Euclidean distance is better when:
- Measuring physical distances in open spaces.
- Diagonal movement is relevant or allowed.
- Working with continuous data, where straight-line measurement reflects reality.
Manhattan Distance in Python
1. Using NumPy arrays
Python
import numpy as np
a = np.array([2, 3, 5])
b = np.array([7, 1, 9])
d = np.sum(np.abs(a - b))
print(d)
Explanation: np.abs(a - b) finds the absolute differences for each coordinate and np.sum() adds them to get the Manhattan distance.
2. Using SciPy’s cityblock function
Python
from scipy.spatial.distance import cityblock
a = (2, 3, 5)
b = (7, 1, 9)
d = cityblock(a,b)
print(d)
Output
11
Explanation: cityblock() directly computes the sum of absolute differences between coordinates of two points, returning the Manhattan distance.
Applications of Manhattan Distance
- Pathfinding Algorithms : Used as a heuristic in grid-based searches like A*. Guides efficient routing in city streets, mazes, or game AI where movement is horizontal/vertical.
- Clustering Techniques: Works well with high-dimensional or sparse data, e.g., K-Means clustering, text classification, or document clustering. Helps reduce outlier impact and produces balanced clusters.
- Image and Pattern Recognition: Compares pixel values or feature vectors for template matching, facial recognition, and object detection. Fast computation often outweighs minor precision loss compared to Euclidean distance.
- Outlier Detection: Less sensitive to extreme values in one dimension, making it valuable in fraud detection, network security, or financial anomaly detection.
- Geographic Information Systems (GIS): Models movement along street grids, helping urban planners optimize delivery routes, accessibility, or facility locations in cities.
Explore
Introduction to Machine Learning
Python for Machine Learning
Introduction to Statistics
Feature Engineering
Model Evaluation and Tuning
Data Science Practice