Measuring multidimensional inequality: a new proposal based on the Fourier transform (2401.14012v1)
Abstract: Inequality measures are quantitative measures that take values in the unit interval, with a zero value characterizing perfect equality. Although originally proposed to measure economic inequalities, they can be applied to several other situations in which one is interested in the mutual variability between a set of observations, rather than in their deviations from the mean. While unidimensional measures of inequality, such as the Gini index, are widely known and employed, multidimensional measures, such as Lorenz Zonoids, are difficult to interpret and computationally expensive and, for these reasons, are not widely known. To overcome these problems, in this paper we propose a new scaling-invariant multidimensional inequality index, based on the Fourier transform, which exhibits a number of interesting properties and which, in the multidimensional case, remains straightforward to calculate and interpret.
Summary
- The paper introduces a novel Fourier-transform index that generalizes one-dimensional inequality measures to multivariate data.
- It derives a scaling-invariant formulation using a covariance transformation, linking the index to the Mahalanobis distance.
- The method is computationally cheaper and easier to interpret than traditional multidimensional measures such as Lorenz Zonoids.
This paper introduces a new method for measuring inequality in multivariate data, addressing the limitations of existing multidimensional measures like Lorenz Zonoids, which are noted as computationally expensive and difficult to interpret [19]. The proposed approach is based on the Fourier transform of the multivariate probability distribution, generalizing a one-dimensional index previously developed by one of the authors [24].
The foundation of the new index lies in the observation that traditional one-dimensional inequality measures, such as the Gini and Pietra indices, can be expressed in terms of the Fourier transform of the probability distribution. For a one-dimensional probability measure F with mean m, the Gini index can be related to the L2 distance between the Fourier transform of F and the Fourier transform of a step function representing a uniform distribution up to a certain point (Equation 2.6). Building on this, a one-dimensional Fourier-based index T(F) was introduced, related to the supremum distance between the Fourier transform of F and the Fourier transform of a step function centered at the mean (Equation 2.9).
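For concreteness, the one-dimensional index can be written by specializing the multivariate definition below (Equation 2.10) to $n = 1$; this form is reconstructed from the surrounding description rather than quoted verbatim from the paper:

$$T(F) = \frac{1}{2m}\,\sup_{\xi\in\mathbb{R}}\bigl|f'(0)\,f(\xi) - f'(\xi)\bigr|, \qquad f'(0) = -im.$$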
The core proposal is a multivariate extension of this Fourier-based index. For an n-dimensional probability measure F with mean vector m, the initial n-dimensional index Tn(F) is defined based on the gradient of the Fourier transform f(ξ) at ξ=0 (which equals −im) and the supremum of an expression involving f(ξ) and its gradient over Rn (Equation 2.10):
$$T_n(F) = \frac{1}{2|m|}\,\sup_{\xi\in\mathbb{R}^n}\bigl|\nabla f(0)\,f(\xi) - \nabla f(\xi)\bigr|$$
This index satisfies $0 \le T_n(F) \le 1$ for measures supported on the non-negative orthant ($F \in P_s(\mathbb{R}^n_+)$) and is zero if and only if F is a Dirac delta distribution (perfect equality). The paper shows that this index is bounded by multivariate versions of the Pietra index $P_n(F)$ and Gini index $G_n(F)$ built on the Euclidean distance $|x-m|$ (Equations 2.12 and 2.13): $0 \le T_n(F) \le P_n(F) \le G_n(F) \le 1$. A key advantage highlighted is that computing $T_n(F)$ involves taking a supremum over $\mathbb{R}^n$ and point evaluations of the Fourier transform and its gradient, which can be computationally simpler than the multiple integrations required for $P_n(F)$ and $G_n(F)$ in higher dimensions.
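For reference, the multivariate Pietra and Gini forms the text describes, with the Euclidean distance and the classical one-dimensional normalization, would read as follows (the paper's exact Equations 2.12 and 2.13 may differ in normalization):

$$P_n(F) = \frac{1}{2|m|}\int_{\mathbb{R}^n}|x-m|\,dF(x), \qquad G_n(F) = \frac{1}{2|m|}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}|x-y|\,dF(x)\,dF(y).$$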
A critical challenge for multidimensional inequality measures is satisfying the scaling property, meaning the index should remain unchanged if the units of measurement of one or more components are changed. The initial $T_n(F)$ index is invariant only under uniform scaling $X \to cX$, not under component-wise scaling $X_k \to a_k X_k$. To address this, the paper modifies the argument of the Fourier transform: instead of evaluating $f(\xi)$, the index is computed using $f^*(\xi) = f(\xi^*)$, where $\xi^* = \Sigma^{-1/2}\xi$ is a transformed version of $\xi$ that incorporates the inverse square root of the covariance matrix $\Sigma$, derived from its eigendecomposition (Equations 2.19 and 2.24). The resulting scaling-invariant index is

$$T_n(F^*) = \frac{1}{2|m^*|}\,\sup_{\xi\in\mathbb{R}^n}\bigl|\nabla f^*(0)\,f^*(\xi) - \nabla f^*(\xi)\bigr| \quad \text{(Equation 2.25)},$$

where $m^* = \Sigma^{-1/2}m$.
This transformation reveals an interesting connection to the Mahalanobis distance: the normalizing term $|m^*| = \sqrt{m^T\Sigma^{-1}m}$ is exactly the Mahalanobis distance of the mean vector from the origin. This link is significant because the Mahalanobis distance is a scale-invariant measure of distance in multivariate space.
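The identity follows in one line from the symmetry of $\Sigma^{-1/2}$:

$$|m^*|^2 = (\Sigma^{-1/2}m)^T(\Sigma^{-1/2}m) = m^T\Sigma^{-1/2}\Sigma^{-1/2}m = m^T\Sigma^{-1}m.$$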
The paper provides concrete examples for calculating the index:
- Two-point distribution: For a distribution taking values $a$ and $b$ with probabilities $1-p$ and $p$, the scaling-invariant index (evaluated using $f^*$) is explicitly derived as $T_n(F^*) = p(1-p)\,\sqrt{(b-a)^T\Sigma^{-1}(b-a)}\,\big/\,\sqrt{m^T\Sigma^{-1}m}$ (Equation 2.28). This formula involves the Mahalanobis distance between $a$ and $b$ and the Mahalanobis distance of the mean $m$ from the origin; the structure is analogous to the one-dimensional case.
- Multivariate Gaussian distribution: For a multivariate Gaussian distribution N with mean $m$ and covariance $\Sigma$, the scaling-invariant index is $T_n(N) = \frac{1}{2\sqrt{e}}\,\frac{1}{\sqrt{m^T\Sigma^{-1}m}}$ (Equation 2.32). This result is inversely proportional to the Mahalanobis distance of the mean from the origin, i.e., directly proportional to the Voinov and Nikulin multivariate coefficient of variation, which is also scale-invariant [25]. A numerical sketch of both closed forms follows.
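A minimal numerical sketch of these two closed-form expressions, using illustrative parameter values that are not from the paper (the matrix `Sigma` is an assumed positive-definite covariance used in the transformation):

```python
import numpy as np

# Illustrative (non-paper) parameters: a 2-D example
a = np.array([1.0, 2.0])
b = np.array([4.0, 1.0])
p = 0.3
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

# Two-point distribution (Equation 2.28 as reconstructed above):
# T = p(1-p) * sqrt((b-a)' Sigma^{-1} (b-a)) / sqrt(m' Sigma^{-1} m)
m = (1 - p) * a + p * b                        # mean vector
d_ab = np.sqrt((b - a) @ Sigma_inv @ (b - a))  # Mahalanobis distance a to b
d_m = np.sqrt(m @ Sigma_inv @ m)               # Mahalanobis norm of the mean
T_two_point = p * (1 - p) * d_ab / d_m
print(f"Two-point index: {T_two_point:.4f}")

# Multivariate Gaussian (Equation 2.32 as reconstructed above):
# T = 1 / (2 sqrt(e) sqrt(m' Sigma^{-1} m))
T_gaussian = 1.0 / (2.0 * np.sqrt(np.e) * d_m)
print(f"Gaussian index:  {T_gaussian:.4f}")
```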
The paper also discusses several properties of the index Tn. It shows convexity on the set of probability measures with the same mean and covariance matrix. It analyzes the behavior of the index for sums of independent random vectors. Notably, adding a constant positive vector to each observation decreases inequality (Equation 3.5), a property consistent with one-dimensional inequality measures. Adding independent copies of the same distribution also tends to decrease or maintain inequality (Equation 3.6).
Finally, the paper proposes alternative definitions of the multivariate Pietra and Gini indices by replacing the Euclidean distance with the Mahalanobis distance in their definitions (Equations 4.1 and 4.2). These Mahalanobis-based versions, denoted Pn(X) and Gn(X), also satisfy the scaling property and, for Gaussian distributions, take values proportional to the Voinov-Nikulin coefficient of variation.
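The exact Equations 4.1 and 4.2 are not reproduced in this summary; substituting the Mahalanobis distance for the Euclidean one in the classical definitions would plausibly give

$$P_n(X) = \frac{\mathbb{E}\bigl[\sqrt{(X-m)^T\Sigma^{-1}(X-m)}\bigr]}{2\sqrt{m^T\Sigma^{-1}m}}, \qquad G_n(X) = \frac{\mathbb{E}\bigl[\sqrt{(X-Y)^T\Sigma^{-1}(X-Y)}\bigr]}{2\sqrt{m^T\Sigma^{-1}m}},$$

where $Y$ is an independent copy of $X$.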
Practical Implementation:
To implement the proposed scaling-invariant multidimensional Fourier-based inequality index $T_n(F^*)$ for a dataset of $N$ observations $\{x_i\}_{i=1}^N$ with $x_i \in \mathbb{R}^n$:
- Estimate Mean and Covariance: Calculate the sample mean vector $\hat m = \frac{1}{N}\sum_{i=1}^N x_i$ and the sample covariance matrix $\hat\Sigma = \frac{1}{N-1}\sum_{i=1}^N (x_i - \hat m)(x_i - \hat m)^T$. Ensure the data lie in the non-negative orthant; if components can be negative, the bounds might not hold. The mean vector must satisfy $\|\hat m\| > 0$, and $\hat\Sigma$ must be positive definite for the inverse $\hat\Sigma^{-1}$ to exist. Handle cases where $\hat\Sigma$ is singular (e.g., using a pseudo-inverse or regularization).
- Compute $\hat\Sigma^{-1/2}$: Perform an eigendecomposition $\hat\Sigma = Z\Lambda Z^T$, where $\Lambda$ is the diagonal matrix of eigenvalues $\lambda_k > 0$ and $Z$ is the orthogonal matrix of eigenvectors. Then $\hat\Sigma^{-1/2} = Z\Lambda^{-1/2}Z^T$, where $\Lambda^{-1/2}$ has diagonal elements $1/\sqrt{\lambda_k}$ (see the short sketch after this list).
- Define the transformed Fourier argument: For any $\xi \in \mathbb{R}^n$, define $\xi^* = \hat\Sigma^{-1/2}\xi$.
- Compute the empirical Fourier Transform: For discrete empirical data, the empirical characteristic function (the Fourier transform of the empirical measure) is $\hat f(\xi) = \frac{1}{N}\sum_{j=1}^N e^{-i\xi^T x_j}$. The transformed empirical characteristic function is $\hat f^*(\xi) = \hat f(\xi^*) = \frac{1}{N}\sum_{j=1}^N e^{-i(\hat\Sigma^{-1/2}\xi)^T x_j}$.
- Compute Gradients: The gradient of $\hat f^*(\xi)$ with respect to $\xi$ is $\nabla\hat f^*(\xi) = \frac{1}{N}\sum_{j=1}^N (-i\,\hat\Sigma^{-1/2}x_j)\,e^{-i(\hat\Sigma^{-1/2}\xi)^T x_j}$. At $\xi = 0$, $\nabla\hat f^*(0) = \frac{1}{N}\sum_{j=1}^N (-i\,\hat\Sigma^{-1/2}x_j) = -i\,\hat\Sigma^{-1/2}\hat m = -i\,\hat m^*$.
- Calculate $\|\hat m^*\|$: Compute $\|\hat m^*\| = \|\hat\Sigma^{-1/2}\hat m\| = \sqrt{\hat m^T(\hat\Sigma^{-1/2})^T\hat\Sigma^{-1/2}\hat m} = \sqrt{\hat m^T\hat\Sigma^{-1}\hat m}$.
- Evaluate the term to be maximized: Define $G(\xi) = \bigl|\nabla\hat f^*(0)\,\hat f^*(\xi) - \nabla\hat f^*(\xi)\bigr| = \bigl|(-i\hat m^*)\,\frac{1}{N}\sum_{j=1}^N e^{-i(\hat\Sigma^{-1/2}\xi)^T x_j} - \frac{1}{N}\sum_{j=1}^N (-i\,\hat\Sigma^{-1/2}x_j)\,e^{-i(\hat\Sigma^{-1/2}\xi)^T x_j}\bigr|$. This simplifies to $G(\xi) = \frac{1}{N}\bigl|\sum_{j=1}^N (\hat\Sigma^{-1/2}x_j - \hat m^*)\,e^{-i\xi^T(\hat\Sigma^{-1/2}x_j)}\bigr|$.
- Find the Supremum: The index requires $\sup_{\xi\in\mathbb{R}^n} G(\xi)$. For empirical data this supremum is generally not available in closed form, so numerical optimization (e.g., multi-start local search on $G(\xi)$) or sampling over a suitable range of $\xi$ values is necessary. The optimization is over an $n$-dimensional space.
- Compute the Index: Once the supremum $S = \sup_{\xi\in\mathbb{R}^n} G(\xi)$ is found, the empirical scaling-invariant Fourier-based index is $T_n(\hat F^*) = \frac{S}{2\|\hat m^*\|}$.
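As referenced in the $\hat\Sigma^{-1/2}$ step above, here is a minimal sketch of the eigendecomposition route (the function name is illustrative, not from the paper):

```python
import numpy as np

def inv_sqrt_spd(cov_mat, eps=1e-10):
    """Inverse square root of a symmetric positive-definite matrix via
    eigendecomposition: Sigma^{-1/2} = Z diag(1/sqrt(lambda_k)) Z^T."""
    eigvals, Z = np.linalg.eigh(cov_mat)  # symmetric eigendecomposition
    if np.any(eigvals <= eps):
        raise ValueError("Covariance matrix is numerically singular.")
    return Z @ np.diag(1.0 / np.sqrt(eigvals)) @ Z.T
```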
Pseudocode for Empirical Calculation:
```python
import numpy as np
from scipy.linalg import sqrtm, inv

def compute_multivariate_fourier_inequality_index(data):
    """
    Computes the scaling-invariant multivariate Fourier-based inequality index.

    Args:
        data (np.ndarray): Array of shape (N, n), where N is the number of
            observations and n the number of dimensions. Assumes data
            components are non-negative.

    Returns:
        float: The calculated inequality index. Returns NaN if the mean
            vector is zero or the covariance matrix is singular.
    """
    N, n = data.shape

    # 1. Estimate mean and covariance
    mean_vec = np.mean(data, axis=0)
    if np.linalg.norm(mean_vec) == 0:
        print("Mean vector is zero, index is undefined.")
        return np.nan
    cov_mat = np.cov(data.T)

    # 2. Compute Sigma^{-1/2}; guard against singular covariance.
    #    sqrtm may return a complex array with negligible imaginary parts,
    #    so keep only the real part.
    try:
        cov_inv_sqrt = inv(np.real(sqrtm(cov_mat)))
    except np.linalg.LinAlgError:
        print("Covariance matrix is singular, cannot compute index.")
        return np.nan

    # 3. Transform the data points: x_j^* = Sigma^{-1/2} x_j
    transformed_data = (cov_inv_sqrt @ data.T).T  # shape (N, n)

    # 6. Compute ||m*||
    m_star = cov_inv_sqrt @ mean_vec
    norm_m_star = np.linalg.norm(m_star)
    if norm_m_star == 0:
        # Should not happen if mean_vec is non-zero and cov is non-singular
        print("Transformed mean vector is zero, index is undefined.")
        return np.nan

    # 7. Function to maximize:
    #    G(xi) = (1/N) * | sum_j (x_j^* - m^*) exp(-i xi^T x_j^*) |
    def G(xi):
        xi_vec = np.asarray(xi)
        complex_exp = np.exp(-1j * (transformed_data @ xi_vec))  # shape (N,)
        diff_terms = transformed_data - m_star                   # shape (N, n)
        sum_terms = np.sum(diff_terms * complex_exp[:, np.newaxis], axis=0)
        return np.linalg.norm(sum_terms) / N  # magnitude of the vector sum

    # 8. Approximate the supremum of G(xi) by random sampling.
    #    Note: sampling only approximates the true supremum; for real
    #    applications consider numerical optimization of -G(xi)
    #    (e.g., scipy.optimize.minimize with multiple starts).
    num_samples = 1000
    max_G = 0.0
    # Heuristic sampling scale: inverse RMS of the transformed data
    rms = np.sqrt(np.mean(transformed_data ** 2))
    xi_scale = 1.0 / rms if rms > 0 else 1.0
    for _ in range(num_samples):
        random_xi = np.random.randn(n) * xi_scale
        max_G = max(max_G, G(random_xi))

    # 9. Compute the index
    return max_G / (2.0 * norm_m_star)
```
This pseudocode outlines the steps for calculating the empirical index. The main computational challenge lies in finding the supremum in step 8, which requires optimization over an n-dimensional space; its difficulty depends on the dimension n and on the shape of the function G(ξ). For low dimensions, standard optimization algorithms should suffice, while higher dimensions may require approximation techniques or alternative methods to estimate the supremum (a multi-start refinement sketch follows below). The paper mentions that for discrete probability measures, the use of the Fourier transform enables "very fast computational procedures" [2, 3], suggesting that Fast Fourier Transform (FFT) based methods might be applicable if the data can be represented on a grid, although this empirical formulation sums over observations directly.
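Since random sampling only provides a lower bound on the supremum, one possible refinement, assuming the `G` function and dimension `n` from the pseudocode above, is to polish the best random candidates with a local optimizer; nothing in this sketch is prescribed by the paper:

```python
import numpy as np
from scipy.optimize import minimize

def refine_supremum(G, n, xi_scale=1.0, num_samples=1000, num_starts=20):
    """Estimate sup G(xi): random sampling, then Nelder-Mead polishing
    of -G starting from the best candidate points."""
    candidates = np.random.randn(num_samples, n) * xi_scale
    values = np.array([G(xi) for xi in candidates])
    best = values.max()
    # Local, derivative-free refinement from the top candidates
    for idx in np.argsort(values)[-num_starts:]:
        res = minimize(lambda xi: -G(xi), candidates[idx], method="Nelder-Mead")
        best = max(best, -res.fun)
    return best
```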
The proposed index offers a theoretically grounded alternative to existing multidimensional inequality measures, explicitly addressing the critical scaling property via its connection to the Mahalanobis distance. Its tractability for distributions with known Fourier transforms (like Gaussian) and its potential for more efficient empirical estimation compared to methods like Lorenz Zonoids make it a promising tool for applications in economics, social sciences, and other fields dealing with multivariate heterogeneity.
Related Papers
- The Divergence Index: A Decomposable Measure of Segregation and Inequality (2015)
- Equivalence of inequality indices: Three dimensions of impact revisited (2023)
- Nested Inequalities Among Divergence Measures (2011)
- A Sequence of Inequalities among Difference of Symmetric Divergence Measures (2011)
- Quantifying redundancies and synergies with measures of inequality (2024)