2 minute read

Delete duplicates from sorted array and shift - Elements of Programming Interviews (EPI)

Problem:

Write a function to delete duplicates from a sorted array and shift them such that all non duplicates come first in the array and the remaining deleted entries replaced with None come afterwards. Return the number of unique entries. Example if input is [1,1,1,2,2,3] then the shifted array looks like [1,2,3,None,None,None] and the total unique entries is 3.

Solution:

The brute force solution would be to have a separate array. Iterate over our input Array and store unique values into the separate array. Then overwrite all entries in the beggining of our array and replace all entries after that with None. A better solution would be to do it inplace by considering the fact that we have a sorted array so all duplicate entries will come together. We start off with a unique entry as current_idx and then we compare it with entries ahead of it, setting all duplicates to None, until we find a unique entry, in which case we swap that with current_idx + 1 and also remember which entry we swapped with. We can keep moving further this way until we have exhausted swappable entries. An example makes it clearer:

  • Consider input `A = [1,1,1,2,2,3] we start off with: \(current_idx = 0 \\ swap = 1\)
  • A[current_idx] = A[0] = 1
  • We compare forward indices A[swap] == A[current_idx] -> A[1]==A[0] -> 1 == 1 -> True.
  • Since they are equal we set A[swap] = A[1] = None and we continue iterating with swap += 1
  • A[swap] == A[current_idx] -> A[2] == A[0] -> 1 == 1 -> True.
  • A[swap] = A[2] = None , swap += 1.
  • A[swap] == A[current_idx] -> A[3] == A[0] -> 2 == 1 -> False.
  • Since we found another unique entry we swap:
  • A[current_idx+1], A[swap] = A[swap], A[current_idx+1]
  • Our array looks like : [1,2,None,None,2,3]
  • Now we increment the swap and current index which we are testing:
  • current_idx += 1, swap += 1
  • A[swap] == A[current_idx -> A[4] == A[1] -> 2 == 2 -> True.
  • A[swap] = A[4] = None. swap += 1.
  • A[swap] == A[current_idx] -> A[5] == A[1] -> 3 == 1 -> False.
  • Another unique.. swapping:
  • A[current+1], A[swap] = A[swap], A[current_idx+1]
  • current_idx += 1, swap += 1
  • Our array looks like : [1,2,3,None,None,None]
  • We notice that swap is now at the end of the array so we stop iterating.
  • The total number of unique entries will be the number of indexes we have tested which will simply be 1 more than the current index :
  • total = current_idx + 1 In code:
def delete_duplicates(A: List[int]) -> int:
    """Function to remove duplicates from an array"""
    current_idx, swap = 0, 1
    while swap < len(A):
        if A[current_idx] == A[swap]:
            A[swap] = None
            swap += 1
        else:
            A[current_idx + 1], A[swap] = A[swap], A[current_idx + 1]
            swap += 1
            current_idx += 1
    return current_idx + 1