Given an array of n points, where each point contains an x coordinate and a y coordinate, we want to find the closest pair of points. To do that, we need to use a proximity measure. Here, we will use the Euclidean distance, which is given as:
Where p=(x1,y1) and q=(x2 ,y2).
Brute Force Approach
We can easily find the nearest pair of points using the brute force approach. We calculate the distance from each point to all others and return the pair with the minimum distance. However, this is a naïve approach, and it has a time complexity of O(n2).
Code
Consider its implementation given below.
import math #this function finds the euclidean distance between two points def euclidean(a, b): return math.sqrt((a[0]-b[0])**2+(a[1]-b[1])**2) def bruteForce(points, n): result = { 'p1':(None, None), 'p2':(None, None), 'min_dist': float('inf') } for i in range(0, n-1): for j in range(i+1, n): dist = euclidean(points[i], points[j]) if dist < result['min_dist']: result['p1'] = points[i] result['p2'] = points[j] result['min_dist'] = dist return result points = [(2, 3), (5, 5), (4, 10), (6, 1), (12, 10), (7, 4)] n = len(points) closest = bruteForce(points, n) print(f"The closest pair of points are {closest['p1']} and {closest['p2']}. The distance is {closest['min_dist']:.3f}")
Output
The closest pair of points are (5, 5) and (7, 4). The distance is 2.236
Divide and Conquer Approach
We can design an algorithm that performs better than the brute force approach. It solves the problem in O(nlogn) time by using the divide and conquer methodology.
Let’s see how this algorithm works.
- The first step is a preprocessing step. Here, we sort the given array by x and y coordinates, i.e., we will get two arrays. Let’s name them Px and Py, respectively.
- Then, we divide the sorted Px array into two equal (or roughly equal) parts. We find the nearest pair of points and distances in both subarrays, recursively. We will keep splitting the array until we get less than four elements. Then, we will calculate distances using the brute force approach.
So far, we are only considering those pairs of points, which are either on the left side or the right side together. We can also get unlucky, i.e., one point on the left and the other on the right side. Therefore, we also need to consider this case.
- First, we find the minimum of distances obtained from the left subarray and the right subarray. Let’s denote the minimum distance by delta. If the nearest pair of points exists across the left and right side, then the x coordinate of these points will be within the range of delta from the x coordinate of the middle point that divides the left and right subarray.
- We add those points in another array, let’s say, strip. This array gets calculated using the Py array (points sorted by the y coordinate), and it allows us to calculate the minimum distance in O(n) time. Moreover, we only consider the next seven points when calculating the minimum distance in the strip array.
- If the distance obtained from the above step is less than delta, then it means one point is on the right side and one is on the left. Otherwise, both points lie on the same side. We return the nearest pair of points and distance accordingly.
The implementation of this approach is given below.
import math #this function finds the euclidean distance between two points def euclidean(a, b): return math.sqrt((a[0]-b[0])**2+(a[1]-b[1])**2) def bruteForce(points, n): result = { 'p1':(None, None), 'p2':(None, None), 'min_dist': float('inf') } for i in range(0, n-1): for j in range(i+1, n): dist = euclidean(points[i], points[j]) if dist < result['min_dist']: result['p1'] = points[i] result['p2'] = points[j] result['min_dist'] = dist return result def closestPairStrip(strip, length, delta_points): result = delta_points for i in range(0, length-1): for j in range(i+1, min(7, length)): #only calculate distance for the next 7 points dist = euclidean(strip[i], strip[j]) if dist < result['min_dist']: result['p1'] = strip[i] result['p2'] = strip[j] result['min_dist'] = dist return result def closestPair(Px, Py, length): #base case if length < 4: result = bruteForce(Px, length) return result #find the mid and split the x-sorted array into left and right parts mid = length //2 Qx = Px[:mid] Rx = Px[mid:] Qy = [] Ry = [] # split Py into two arrays for point in Py: if point[0] in Qx: Qy.append(point) else: Ry.append(point) #recursion left_min = closestPair(Qx, Qy, mid) right_min = closestPair(Rx, Ry, length-mid) #find closest obtained from the left sub array and the right sub array if (left_min['min_dist'] < right_min['min_dist']): delta_points = left_min else: delta_points = right_min delta = delta_points['min_dist'] #find points whose x-coordinate is within the range of delta from the x coordinate of the middle point that splits the array strip = [] for point in Py: if (abs(point[0]-Px[mid][0]) < delta): strip.append(point) #find the closest pair of points in the strip array closest_pair_strip = closestPairStrip(strip, len(strip), delta_points) #return the closest pair of points in the array if (closest_pair_strip['min_dist'] < delta): return closest_pair_strip else: return delta_points def preprocessing(array, length): Px = sorted(array, key=lambda x:x[0]) #sort points by x coordinate Py = sorted(array, key=lambda x:x[1]) #sort points by y coodinate return closestPair(Px, Py, length) points = [(2, 3), (5, 5), (4, 10), (6, 1), (12, 10), (7, 4)] n = len(points) closest = preprocessing(points, n) print(f"The closest pair of points are {closest['p1']} and {closest['p2']}. The distance is {closest['min_dist']:.3f}")
Output
The closest pair of points are (7, 4) and (5, 5). The distance is 2.236