2-D Hypervolume Subset Selection in Python

Hypervolume Subset Selection Problem (HSSP) is a dynamic programming algorithm used to select a subset of points from a non-dominated 2D Pareto front. This subset maximizes the hypervolume (or dominated area in 2-D) covered with respect to a reference point that bounds this area from above to make it finite. Originally proposed by Auger et al. (2009), this blog post provides a Python implementation of the HSSP algorithm and demonstrates its application.

Key Features of the Implementation

  1. Dynamic Programming Approach: The algorithm uses a multi-dimensional numpy array to store intermediate results for the hypervolume calculations, ensuring efficient computation.
  2. Reference Point Handling: The reference point r must be dominated by all points in the Pareto front. This ensures the proper calculation of the hypervolume.
  3. Bit Vector Output: The function returns a bit vector indicating the selected points, which can be easily interpreted and used in further analyses.

The source code is provided at the end of this essay.

Testing the Code

The hssp_test() function demonstrates the algorithm with a sample Pareto front. For each q (number of points to select), the function computes the subset with the maximal hypervolume and prints the results. For instance, running the test with the given data outputs the selection results for q ranging from 1 to 5.

The hypervolume and the extent of the selected subset is highlighted in the picture below.
This visualization highlights:

  • A labeled 10×10 grid for context.
  • The Pareto front points (1,7), (2,6), (3,5), (5,4), (6,3), (8,2) in red.
  • The reference point (10,8) in blue.
%python3 C:/Users/memmerix/hssp.py --wdir
Results for q=1:
[0 0 1 0 0 0]
Results for q=2:
[0 0 1 0 1 0]
Results for q=3:
[1 0 1 0 1 0]
Results for q=4:
[1 0 1 0 1 1]
Results for q=5:
[1 1 1 0 1 1]

Further Reading

The HSSP algorithm by Auger et al. 2009 [1] was the first algorithm that solved the hypervolume subset selection problem in polynomial time.

The hypervolume subset selection problem is NP-hard in 3 and more dimensions. There are faster but more complicated algorithms for hypervolume subset selection in 3-D (Bringmann et al. 2014, Kuhn et al. 2016).

References

Auger, A., Bader, J., Brockhoff, D., & Zitzler, E. (2009, July). Investigating and exploiting the bias of the weighted hypervolume to articulate user preferences. In Proceedings of the 11th Annual conference on Genetic and evolutionary computation (pp. 563-570).

Bringmann, K., Cabello, S., & Emmerich, M. (2017). Maximum Volume Subset Selection for Anchored Boxes. In 33rd International Symposium on Computational Geometry (SoCG 2017). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik.

Kuhn, T., Fonseca, C. M., Paquete, L., Ruzika, S., Duarte, M. M., & Figueira, J. R. (2016). Hypervolume subset selection in two dimensions: Formulations and algorithms. Evolutionary Computation24(3), 411-425.

Bringmann, K., Friedrich, T., & Klitzke, P. (2014, July). Two-dimensional subset selection for hypervolume and epsilon-indicator. In Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation (pp. 589-596).

import numpy as np

def hssp(O, p, q, r):
    """
    Compute the \lambda individuals that exactly select the HV
    Algorithm by Auger et al. 2009, translated to Python.

    Parameters:
        O (np.ndarray): The ordered sequence of points in a 2-D non-dominated set (the 'staircase').
        p (int): Number of points in O.
        q (int): Number of points to be selected for maximal hypervolume.
        r (list): Reference point (must dominate all points in O).

    Returns:
        np.ndarray: A bit vector marking the selected points.
    """
    h = np.zeros((p, p))

    # Initialize h(i,1)
    for i in range(p):
        h[i, 0] = (r[0] - O[i, 0]) * (r[1] - O[i, 1])

    S = np.zeros((p, q, q), dtype=int)

    # Base case for the dynamic programming
    for i in range(q - 1, p):
        S[i, 0, 0] = i

    for t in range(1, q):
        for c in range(q - t - 1, p - t):
            l = np.zeros(p)

            for d in range(c + 1, p - t + 1):
                l[d] = h[d, t - 1] + (O[d, 0] - O[c, 0]) * (r[1] - O[c, 1])

            val = np.max(l)
            ind = np.argmax(l)
            h[c, t] = val
            S[c, t, 0] = c
            for i in range(t):
                S[c, t, i + 1] = S[ind, t - 1, i]

    val = np.max(h[:, q - 1])
    ind = np.argmax(h[:, q - 1])

    subset = [S[ind, q - 1, i] for i in range(q)]
    b_selection = np.zeros(p, dtype=int)
    for i in subset:
        b_selection[i] = 1

    return b_selection

def hssp_test():
    """
    Test the hssp function with a sample dataset.
    """
    PF = np.array([[1, 7], [2, 6], [3, 5], [5, 4], [6, 3], [8, 2]])
    r = [10, 8]
    p = 6

    for q in range(1, 6):
        print(f"Results for q={q}:")
        result = hssp(PF, p, q, r)
        print(result)

# Run the test function
hssp_test()

Leave a comment