Find Closest Pair of Points in an Array - Learn Coding Online

Given an array of n points, where each point contains an x coordinate and a y coordinate, we want to find the closest pair of points. To do that, we need to use a proximity measure. Here, we will use the Euclidean distance, which is given as:

Where p=(x₁,y₁) and q=(x₂,y₂).

Brute Force Approach

We can easily find the nearest pair of points using the brute force approach. We calculate the distance from each point to all others and return the pair with the minimum distance. However, this is a naïve approach, and it has a time complexity of O(n2).

Code

Consider its implementation given below.

import math
#this function finds the euclidean distance between two points
def euclidean(a, b):
  return  math.sqrt((a[0]-b[0])**2+(a[1]-b[1])**2)

def bruteForce(points, n):
  result = {
    'p1':(None, None),
    'p2':(None, None),
    'min_dist': float('inf')
  }

  for i in range(0, n-1):
    for j in range(i+1, n): 
      dist = euclidean(points[i], points[j])
      if dist < result['min_dist']: 
        result['p1'] = points[i]
        result['p2'] = points[j]
        result['min_dist'] = dist

  return result  

  points = [(2, 3), (5, 5), (4, 10), (6, 1), (12, 10), (7, 4)]
  n = len(points)
  closest = bruteForce(points, n)
  print(f"The closest pair of points are {closest['p1']} and {closest['p2']}. The distance is {closest['min_dist']:.3f}")

Output

The closest pair of points are (5, 5) and (7, 4). The distance is 2.236

Divide and Conquer Approach

We can design an algorithm that performs better than the brute force approach. It solves the problem in O(nlogn) time by using the divide and conquer methodology.

Let’s see how this algorithm works.

The first step is a preprocessing step. Here, we sort the given array by x and y coordinates, i.e., we will get two arrays. Let’s name them Px and Py, respectively.
Then, we divide the sorted Px array into two equal (or roughly equal) parts. We find the nearest pair of points and distances in both subarrays, recursively. We will keep splitting the array until we get less than four elements. Then, we will calculate distances using the brute force approach.

So far, we are only considering those pairs of points, which are either on the left side or the right side together. We can also get unlucky, i.e., one point on the left and the other on the right side. Therefore, we also need to consider this case.

First, we find the minimum of distances obtained from the left subarray and the right subarray. Let’s denote the minimum distance by delta. If the nearest pair of points exists across the left and right side, then the x coordinate of these points will be within the range of delta from the x coordinate of the middle point that divides the left and right subarray.
We add those points in another array, let’s say, strip. This array gets calculated using the Py array (points sorted by the y coordinate), and it allows us to calculate the minimum distance in O(n) time. Moreover, we only consider the next seven points when calculating the minimum distance in the strip array.
If the distance obtained from the above step is less than delta, then it means one point is on the right side and one is on the left. Otherwise, both points lie on the same side. We return the nearest pair of points and distance accordingly.

The implementation of this approach is given below.

import math

#this function finds the euclidean distance between two points
def euclidean(a, b):
  return  math.sqrt((a[0]-b[0])**2+(a[1]-b[1])**2)

def bruteForce(points, n):
  result = {
    'p1':(None, None),
    'p2':(None, None),
    'min_dist': float('inf')
  }

  for i in range(0, n-1):
    for j in range(i+1, n): 
      dist = euclidean(points[i], points[j])
      if dist < result['min_dist']: 
        result['p1'] = points[i]
        result['p2'] = points[j]
        result['min_dist'] = dist

  return result

def closestPairStrip(strip, length, delta_points):
  result =  delta_points
  for i in range(0, length-1):
    for j in range(i+1, min(7, length)): #only calculate distance for the next 7 points
      dist = euclidean(strip[i], strip[j])
      if dist < result['min_dist']:
        result['p1'] = strip[i]
        result['p2'] = strip[j]
        result['min_dist'] = dist
  return result

def closestPair(Px, Py, length):
  #base case
  if length < 4:
    result = bruteForce(Px, length)
    return result
  #find the mid and split the x-sorted array into left and right parts
  mid = length //2
  Qx = Px[:mid]
  Rx = Px[mid:]

  Qy = []
  Ry = []
  # split Py into two arrays
  for point in Py:
    if point[0] in Qx:
      Qy.append(point)
    else:
      Ry.append(point)

  #recursion 
  left_min = closestPair(Qx, Qy, mid)
  right_min = closestPair(Rx, Ry, length-mid)

  #find closest obtained from the left sub array and  the right sub array
  if (left_min['min_dist'] < right_min['min_dist']):
    delta_points = left_min
  else:
    delta_points = right_min
  delta = delta_points['min_dist']
  
  #find points whose x-coordinate is within the range of delta from the x coordinate of the middle point that splits the array
  strip = []
  for point in Py:
    if (abs(point[0]-Px[mid][0]) < delta):
      strip.append(point)
  #find the closest pair of points in the strip array
  closest_pair_strip = closestPairStrip(strip, len(strip), delta_points)
   #return the closest pair of points in the array
  if (closest_pair_strip['min_dist'] < delta):
    return closest_pair_strip
  else:
    return delta_points



def preprocessing(array, length):
  Px = sorted(array, key=lambda x:x[0]) #sort points by x coordinate
  Py = sorted(array, key=lambda x:x[1]) #sort points by y coodinate
  return closestPair(Px, Py, length)


points = [(2, 3), (5, 5), (4, 10), (6, 1), (12, 10), (7, 4)]
n = len(points)
closest = preprocessing(points, n)
print(f"The closest pair of points are {closest['p1']} and {closest['p2']}. The distance is {closest['min_dist']:.3f}")

Output

The closest pair of points are (7, 4) and (5, 5). The distance is 2.236

Brute Force Approach

Divide and Conquer Approach

Leave a Reply Cancel reply