Editorial for K-Distinct Subarrays


Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

Approach

It is easier to count the complement.

Let count_at_most(t) be the number of non-empty subarrays containing at most t distinct values. Then the required answer is:

\(\text{all subarrays} - \text{count\_at\_most}(k - 1)\)

To compute count_at_most(t), use a two-pointer window. Expand the right end one step at a time, track frequencies inside the current window, and while the window has too many distinct values, move the left end forward until it becomes valid again.

Once the window [l, r] is the longest valid window ending at r, there are exactly r - l + 1 valid subarrays ending at r. Add that to the total.

This runs in O(n) time.

Solution (Python)

from collections import defaultdict


n, k = map(int, input().split())
values = list(map(int, input().split()))


def count_at_most(limit: int) -> int:
    if limit <= 0:
        return 0

    freq: dict[int, int] = defaultdict(int)
    distinct = 0
    left = 0
    total = 0

    for right, value in enumerate(values):
        if freq[value] == 0:
            distinct += 1
        freq[value] += 1

        while distinct > limit:
            left_value = values[left]
            freq[left_value] -= 1
            if freq[left_value] == 0:
                distinct -= 1
            left += 1

        total += right - left + 1

    return total


all_subarrays = n * (n + 1) // 2
print(all_subarrays - count_at_most(k - 1))

Comments

There are no comments at the moment.