The Lamé class of Lorenz curves

ABSTRACT In this paper, the class of Lamé Lorenz curves is studied. This family has the advantage of modeling inequality with a single parameter. The family has a double motivation: it can be obtained from an economic model and from simple transformations of classical Lorenz curves. The underlying cumulative distribution functions have a simple closed form, and correspond to the Singh–Maddala and Dagum distributions, which are well known in the economic literature. The Lorenz order is studied and several inequality and polarization measures are obtained, including Gini, Donaldson–Weymark–Kakwani, Pietra, and Wolfson indices. Some extensions of the Lamé family are obtained. Fitting and estimation methods under two different data configurations are proposed. Empirical applications with real data are given. Finally, some relationships with other curves are included.

In a recent paper, Henle et al. (2008) introduced a family of Lorenz curves (LCs), the so-called Lamé family, which is characterized by a single inequality parameter. This family presents several practical and theoretical advantages. The authors obtain the family from an economic model (see Sec. 3). Moreover, the idea of describing inequality by a single parameter is appealing from an economic point of view, and has been supported by Slottje (2010). The family of Lamé LCs is defined by two simple functional forms, each of which depends on a single parameter. One of the main advantages of this family is that it yields estimates of the Gini index and other inequality measures very similar to those obtained with more complicated functional forms.
In this paper, the class of Lamé LCs is studied. This family has the advantage of modeling inequality with a single parameter, and many of the most important statistical and economic measures for studying the inequality can be obtained in a closed form. The family has a double motivation: it can be obtained from an economic model and from a simple transformation of the potential and classical Pareto LCs. The underlying cumulative distribution functions (CDFs) have a simple closed form, and correspond to the Singh-Maddala and Dagum distributions, which are very popular in the economic literature. The Lorenz order is studied and several inequality and polarization measures are obtained, including Gini, Donaldson-Weymark-Kakwani, Pietra, and Wolfson indices. Two direct and simple extensions of the Lamé family are obtained. Fitting and estimation methods under two different data configurations are proposed. Two empirical applications with real data are presented. Finally, some relationships with other curves are included.
The rest of this paper is organized as follows. Section 2 introduces the basic results used throughout the paper. The family of Lamé LCs is introduced in Sec. 3 and the underlying CDF is obtained in Sec. 4. The Lorenz ordering and the inequality and polarization measures are studied in Sec. 5. Some extensions of the Lamé LCs are discussed in Sec. 6. Estimation methods are proposed in Sec. 7. Section 8 includes two empirical applications. Finally, some relationships with other curves are included in Sec. 9.

Previous results
In this paper, we consider the LC as defined by Gastwirth (1971). Let $\mathcal{L}$ be the class of all non-negative random variables with positive finite expectation $\mu$. For a random variable $X$ in $\mathcal{L}$ with distribution function $F_X$ we define its inverse distribution function $F_X^{-1}$ by
$$F_X^{-1}(y) = \sup\{x : F_X(x) \le y\}, \quad 0 \le y < 1.$$
Thus, the LC associated with $X$ is defined by
$$L_X(p) = \frac{1}{\mu_X}\int_0^p F_X^{-1}(y)\,dy, \quad 0 \le p \le 1. \tag{2.1}$$
Note that $\mu_X = \int_0^1 F_X^{-1}(y)\,dy$ is the expectation of the random variable $X$. A detailed discussion of these concepts can be found in Arnold (1987), Johnson, Kotz and Balakrishnan (1994, Chapter 12), and Sarabia (2008a).
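A characterization of an LC attributed to Gaffney and Anstis by Pakes (1986) is given by the following theorem.
Theorem 2.1. Assume that $L(p)$ is defined and continuous on $[0, 1]$ with second derivative $L''(p)$. The function $L(p)$ is an LC if and only if
$$L(0) = 0, \quad L(1) = 1, \quad L'(0^+) \ge 0, \quad L''(p) \ge 0 \ \text{for } p \in (0, 1). \tag{2.2}$$
Note that $L'(0^+)$ denotes the right-hand first derivative of $L(p)$ at $p = 0$. The following lemma considers the composition of LCs.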

Lemma 2.1. Let $L_1(p)$ and $L_2(p)$ be two genuine LCs. Then the composition $L(p) = L_1(L_2(p))$ defines a new genuine LC.
Proof. The proof is direct by checking conditions (2.2) in Theorem 2.1.
Using (2.1) it is clear that $F_X^{-1}(y) = \mu_X L_X'(y)$. In relation to the probability density function (PDF) $f(x)$ associated with an LC $L(p)$ we have the following theorem (Arnold, 1987).
Theorem 2.2. If the second derivative $L''(p)$ exists and is positive everywhere in an interval $(x_1, x_2)$, the corresponding CDF $F$ has a finite positive PDF in the interval $(\mu L'(x_1^+), \mu L'(x_2^-))$, which is given by
$$f(x) = \frac{1}{\mu L''(F(x))}.$$

The family
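The Lamé class of LCs is given by the curves
$$L_1(p; a) = [1 - (1-p)^a]^{1/a}, \quad 0 \le p \le 1, \tag{3.1}$$
$$L_2(p; a) = 1 - (1 - p^a)^{1/a}, \quad 0 \le p \le 1. \tag{3.2}$$
It can be proved using Theorem 2.1 that (3.1) is a genuine LC if $0 < a \le 1$ and (3.2) is a genuine LC if $a \ge 1$. Note that for both curves the case $a = 1$ corresponds to the egalitarian line.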
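The conditions of Theorem 2.1 can be checked numerically on a grid. The following is only a minimal sketch, assuming the Lamé curves take the forms $L_1(p;a) = [1-(1-p)^a]^{1/a}$ and $L_2(p;a) = 1-(1-p^a)^{1/a}$; the function names `lame_l1`, `lame_l2`, and `looks_like_lorenz_curve` are illustrative and do not come from the paper.

```python
def lame_l1(p, a):
    """First Lamé curve (assumed form); intended for 0 < a <= 1."""
    return (1.0 - (1.0 - p) ** a) ** (1.0 / a)


def lame_l2(p, a):
    """Second Lamé curve (assumed form); intended for a >= 1."""
    return 1.0 - (1.0 - p ** a) ** (1.0 / a)


def looks_like_lorenz_curve(curve, a, n=200, tol=1e-9):
    """Grid check of L(0)=0, L(1)=1, monotonicity and convexity.

    This is a numerical sanity check, not a proof.
    """
    grid = [i / n for i in range(n + 1)]
    vals = [curve(p, a) for p in grid]
    endpoints_ok = abs(vals[0]) < tol and abs(vals[-1] - 1.0) < tol
    increasing = all(vals[i + 1] >= vals[i] - tol for i in range(n))
    # Convexity on an equally spaced grid: second differences are non-negative.
    convex = all(
        vals[i - 1] - 2.0 * vals[i] + vals[i + 1] >= -tol for i in range(1, n)
    )
    return endpoints_ok and increasing and convex
```

For instance, `looks_like_lorenz_curve(lame_l1, 0.5)` and `looks_like_lorenz_curve(lame_l2, 2.0)` pass the check, whereas `lame_l1` with `a = 2.0` fails it because the curve is then concave.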

Motivation
The family has a double motivation: it can be obtained from an economic model and from simple transformations of classical LCs.
Regarding the economic motivation, Henle et al. (2008) obtained the curves (3.1) and (3.2) from two different economic theories, the so-called trickle-up and trickle-down effects. The trickle-up approach supposes that an increase in the income of the lower-middle class is more advantageous for the economy, since its members spend their wealth faster than the upper class. In contrast, trickle-down theory assumes that an increase in the income of the upper class stimulates investment, which reduces unemployment and thereby benefits general welfare. The trickle-up effect is expressed by Equation (3.3) and the trickle-down theory by Equation (3.4), where $I$ is the income of a family at rank $r$, $A N L$ is the aggregate income of poorer citizens, $N(1 - r)$ is the number of wealthy individuals, $A N (1 - L)$ is the income of the wealthiest citizens, and $N r$ is the number of individuals of lower rank. Note that, following Equation (3.3), the increase in income of any individual is directly associated with an improvement in the economic situation of low-rank citizens. Conversely, Equation (3.4) shows that an increment in upper-class income leads to economic progress for society as a whole. Finally, the curves (3.1) and (3.2) are the solutions of Equations (3.3) and (3.4).

The underlying CDF
The underlying CDFs are obtained in a closed form in the following theorem.
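Theorem 4.1. The underlying CDFs associated with the LCs (3.1) and (3.2) are given by
$$F_1(x; a, \mu) = 1 - \left[1 + \left(\frac{x}{\mu}\right)^{a/(1-a)}\right]^{-1/a}, \quad x \ge 0, \tag{4.1}$$
and $F_1(x; a, \mu) = 0$ if $x < 0$, and
$$F_2(x; a, \mu) = \left[1 + \left(\frac{x}{\mu}\right)^{-a/(a-1)}\right]^{-1/a}, \quad x \ge 0, \tag{4.2}$$
and $F_2(x; a, \mu) = 0$ if $x < 0$, respectively, where $\mu$ represents the mathematical expectation of the population.
Proof. For the LC (3.1) we have
$$F_1^{-1}(z) = \mu L_1'(z; a) = \mu\left[(1-z)^{-a} - 1\right]^{(1-a)/a}.$$
Setting $x = F_1^{-1}(z)$ and solving this equation for $z$, we obtain (4.1). Using a similar reasoning with (3.2), we obtain (4.2).
Both distributions correspond to well-known distributions used in the income and wealth literature. The family (4.1) is a Singh-Maddala distribution (Singh and Maddala, 1976) with shape parameters $a/(1-a)$ and $1/a$ and scale parameter $\mu$. The raw moments of (4.1) are
$$E(X^k) = \mu^k\,\frac{\Gamma\!\left(1 + \frac{k(1-a)}{a}\right)\Gamma\!\left(\frac{1 - k(1-a)}{a}\right)}{\Gamma(1/a)}, \quad k < \frac{1}{1-a}.$$
The family (4.2) is a Dagum (1977) distribution with shape parameters $a/(a-1)$ and $1/a$ and scale parameter $\mu$. In this case, the raw moments are
$$E(X^k) = \mu^k\,\frac{\Gamma\!\left(\frac{1 + k(a-1)}{a}\right)\Gamma\!\left(1 - \frac{k(a-1)}{a}\right)}{\Gamma(1/a)}, \quad k < \frac{a}{a-1}.$$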
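The link between the Lamé curves and these two distributions can be illustrated by coding each CDF together with the quantile function $\mu L'(p)$ and checking that they are inverse to each other. This is a sketch under the assumption that the CDFs are $F_1(x) = 1 - [1 + (x/\mu)^{a/(1-a)}]^{-1/a}$ and $F_2(x) = [1 + (x/\mu)^{-a/(a-1)}]^{-1/a}$; all function names are hypothetical.

```python
def sm_cdf(x, a, mu):
    """Singh-Maddala CDF assumed for the first Lamé curve (0 < a < 1)."""
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + (x / mu) ** (a / (1.0 - a))) ** (-1.0 / a)


def sm_quantile(p, a, mu):
    """Quantile function mu * L1'(p) for 0 < p < 1 and 0 < a < 1."""
    return mu * ((1.0 - p) ** (-a) - 1.0) ** ((1.0 - a) / a)


def dagum_cdf(x, a, mu):
    """Dagum CDF assumed for the second Lamé curve (a > 1)."""
    if x <= 0:
        return 0.0
    return (1.0 + (x / mu) ** (-a / (a - 1.0))) ** (-1.0 / a)


def dagum_quantile(p, a, mu):
    """Quantile function mu * L2'(p) for 0 < p < 1 and a > 1."""
    return mu * (p ** (-a) - 1.0) ** (-(a - 1.0) / a)
```

Evaluating either CDF at $x = \mu$ gives the population share with below-average income, namely $1 - 2^{-1/a}$ for the first family and $2^{-1/a}$ for the second.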

Lorenz ordering
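The study of Lorenz ordering is a crucial aspect in the analysis of income and wealth distributions. Let $\mathcal{L}$ be the class of all non-negative random variables with positive finite expectation. The Lorenz partial order $\le_L$ on the class $\mathcal{L}$ is defined by
$$X \le_L Y \iff L_X(p) \ge L_Y(p) \quad \text{for all } p \in [0, 1].$$
If $X \le_L Y$, then $X$ exhibits less inequality than $Y$ in the Lorenz sense. We shall show that families (3.1) and (3.2) are ordered with respect to the parameter $a$.
The curve $L_1(p; a)$ is a differentiable function with respect to $a$ and, in consequence,
$$\frac{\partial L_1(p; a)}{\partial a} \ge 0$$
for all $p \in (0, 1)$, so inequality decreases as $a$ increases in $(0, 1]$. For the $L_2(p; a)$ curve,
$$\frac{\partial L_2(p; a)}{\partial a} \le 0$$
for all $p \in (0, 1)$, so inequality increases with $a$ in $[1, \infty)$.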

Gini index
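Lemma 5.2. The Gini indices of the curves $L_1(p; a)$ and $L_2(p; a)$ are given by
$$G_1(a) = 1 - \frac{\Gamma(1/a)^2}{a\,\Gamma(2/a)}, \tag{5.1}$$
$$G_2(a) = \frac{\Gamma(1/a)^2}{a\,\Gamma(2/a)} - 1, \tag{5.2}$$
respectively.
Some configurations of the Gini index deserve our attention. If $a = 1/n$, with $n = 1, 2, \ldots$, the Gini index (5.1) can be written in the simple form:
$$G_1(1/n) = 1 - \frac{2\,(n!)^2}{(2n)!}.$$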
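As a sanity check, the closed-form Gini indices can be compared with the defining integral $G = 1 - 2\int_0^1 L(p)\,dp$. The sketch below assumes the closed forms $G_1(a) = 1 - \Gamma(1/a)^2/(a\,\Gamma(2/a))$ and $G_2(a) = \Gamma(1/a)^2/(a\,\Gamma(2/a)) - 1$ and the Lamé curve forms coded in the block; the helper names and the midpoint rule are illustrative choices, not part of the paper.

```python
import math


def lame_l1(p, a):
    """First Lamé curve (assumed form), 0 < a <= 1."""
    return (1.0 - (1.0 - p) ** a) ** (1.0 / a)


def lame_l2(p, a):
    """Second Lamé curve (assumed form), a >= 1."""
    return 1.0 - (1.0 - p ** a) ** (1.0 / a)


def _gamma_ratio(a):
    """Gamma(1/a)^2 / (a * Gamma(2/a)), computed on the log scale."""
    return math.exp(2.0 * math.lgamma(1.0 / a) - math.lgamma(2.0 / a)) / a


def gini_l1(a):
    """Closed-form Gini index assumed for the first Lamé curve."""
    return 1.0 - _gamma_ratio(a)


def gini_l2(a):
    """Closed-form Gini index assumed for the second Lamé curve."""
    return _gamma_ratio(a) - 1.0


def gini_numeric(curve, a, n=20_000):
    """Gini index from G = 1 - 2 * integral of L(p), midpoint rule."""
    h = 1.0 / n
    area = h * sum(curve((i + 0.5) * h, a) for i in range(n))
    return 1.0 - 2.0 * area
```

Working with `math.lgamma` rather than `math.gamma` is a design choice that avoids overflow when $a$ is close to zero.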

Donaldson-Weymark-Kakwani index
An important generalization of the Gini index was proposed by Donaldson and Weymark (1980) and Kakwani (1980) and studied by Yitzhaki (1983). These authors proposed the generalized Gini index, defined as
$$G_\nu(X) = 1 - \nu(\nu - 1)\int_0^1 (1-p)^{\nu-2} L_X(p)\,dp, \tag{5.3}$$
where $\nu > 1$ and $L_X(\cdot)$ is the LC. If we set $\nu = 2$ in (5.3) we obtain the Gini index. As $\nu$ increases, higher weights are attached to small incomes. In the limit as $\nu$ goes to infinity the index depends only on the lowest income, expressing the judgement introduced by Rawls that social welfare depends only on the poorest member of society.
Lemma 5.3. The Donaldson-Weymark-Kakwani indices of the curves $L_1(p; a)$ and $L_2(p; a)$ are given by expressions (5.4) and (5.5), respectively. In the case of the curve (3.1), if $\nu$ is an integer, expression (5.4) can be written in a simple form. It can be proved (Muliere and Scarsini, 1989) that
$$G_\nu(X) = 1 - \frac{E(X_{1:\nu})}{E(X)}, \tag{5.6}$$
where $X_{1:\nu}$ represents the minimum random variable in a random sample of size $\nu$. In our case, the distribution of the minimum of a Singh-Maddala distribution is again a Singh-Maddala distribution with parameters $a/(1-a)$, $\nu/a$, and $\mu$. Using (5.6), we obtain the expression:
$$G_\nu = 1 - \frac{\Gamma(1/a)\,\Gamma\!\left(1 + \frac{\nu - 1}{a}\right)}{\Gamma(\nu/a)}.$$
Other simple expressions can also be obtained in the case of $L_2$. For example, if $\nu = 2$, Equation (5.5) reduces to the Gini index,
$$G_2 = \frac{\Gamma(1/a)^2}{a\,\Gamma(2/a)} - 1,$$
and if $\nu = 3$, Equation (5.5) becomes
$$G_3 = \frac{3\,\Gamma(1/a)^2}{a\,\Gamma(2/a)} - \frac{2\,\Gamma(1/a)\,\Gamma(2/a)}{a\,\Gamma(3/a)} - 2.$$

Pietra index
The Pietra index is defined as the maximal vertical deviation between the LC and the egalitarian line:
$$P_X = \max_{0 \le p \le 1}\{p - L_X(p)\}.$$
If we assume that $F_X$ is strictly increasing on its support, the function $p - L_X(p)$ will be differentiable everywhere on $(0, 1)$ and its maximum will be reached when $1 - F_X^{-1}(p)/\mu_X$ is zero, that is, when $p = F_X(\mu_X)$. The value of $p - L_X(p)$ at this point is given by
$$P_X = F_X(\mu_X) - L_X(F_X(\mu_X)). \tag{5.7}$$
The following lemma provides a simple expression for the Pietra indices of the curves $L_1$ and $L_2$.
Lemma 5.4. The Pietra indices of the curves $L_1(p; a)$ and $L_2(p; a)$ are given by
$$P_1(a) = 1 - 2^{1 - 1/a}, \tag{5.8}$$
$$P_2(a) = 2^{1 - 1/a} - 1, \tag{5.9}$$
respectively.
Proof. The proof is direct using formula (5.7) and the expressions (4.1) and (4.2).
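The Pietra index can also be approximated directly from its definition as the largest gap between the egalitarian line and the LC. The sketch below compares a grid search with the closed forms $1 - 2^{1-1/a}$ and $2^{1-1/a} - 1$, which are assumed here to be the expressions for the two Lamé curves; the curve forms and all function names are likewise assumptions made for illustration.

```python
def lame_l1(p, a):
    """First Lamé curve (assumed form), 0 < a <= 1."""
    return (1.0 - (1.0 - p) ** a) ** (1.0 / a)


def lame_l2(p, a):
    """Second Lamé curve (assumed form), a >= 1."""
    return 1.0 - (1.0 - p ** a) ** (1.0 / a)


def pietra_l1(a):
    """Closed-form Pietra index assumed for the first Lamé curve."""
    return 1.0 - 2.0 ** (1.0 - 1.0 / a)


def pietra_l2(a):
    """Closed-form Pietra index assumed for the second Lamé curve."""
    return 2.0 ** (1.0 - 1.0 / a) - 1.0


def pietra_grid(curve, a, n=10_000):
    """Maximal vertical gap p - L(p) over an equally spaced grid."""
    return max(i / n - curve(i / n, a) for i in range(n + 1))
```

For $a = 1/2$ in the first curve the maximum is attained at $p = 0.75$, where the gap equals $0.5$.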

Polarization index
Polarization measurement has recently been proposed as an important tool for characterizing income and wealth distributions. Polarization is widely accepted as a concept distinct from inequality: polarization refers to the concentration of the income distribution around several polar modes, whereas inequality relates to the overall dispersion of the distribution. A more bipolarized income and wealth distribution is one that is more spread out from the middle, thus implying that the middle class is made up of fewer individuals (Wolfson, 1994). Wolfson's index of bipolarization was originally proposed for a population divided into two groups by the median value. This measure is given by
$$W_X = \frac{2\mu_X}{m_X}\left[1 - 2L_X(1/2) - G_X\right], \tag{5.10}$$
where $G_X$, $m_X$, and $\mu_X$ represent the Gini index, median, and mean associated with the LC $L_X$, respectively. We have the following lemma.
Lemma 5.5. The Wolfson indices of the curves $L_1(p; a)$ and $L_2(p; a)$ are given by
$$W_1(a) = 2\,(2^a - 1)^{(a-1)/a}\left[1 - 2\,(1 - 2^{-a})^{1/a} - G_1(a)\right], \tag{5.11}$$
$$W_2(a) = 2\,(2^a - 1)^{(a-1)/a}\left[2\,(1 - 2^{-a})^{1/a} - 1 - G_2(a)\right], \tag{5.12}$$
respectively, where $G_1(a)$ and $G_2(a)$ are the Gini indices (5.1) and (5.2).
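To make the ingredients of the Wolfson index concrete, the sketch below evaluates it for the first Lamé curve from three quantities: the Gini index, the value of the curve at $p = 1/2$, and the median-to-mean ratio $L'(1/2)$. It assumes the standard form $W = (2\mu/m)[1 - 2L(1/2) - G]$ together with $L_1(p;a) = [1-(1-p)^a]^{1/a}$ and $G_1(a) = 1 - \Gamma(1/a)^2/(a\,\Gamma(2/a))$; the function names are hypothetical.

```python
import math


def gini_l1(a):
    """Closed-form Gini index assumed for the first Lamé curve."""
    return 1.0 - math.exp(2.0 * math.lgamma(1.0 / a) - math.lgamma(2.0 / a)) / a


def wolfson(lorenz_at_half, gini, median_over_mean):
    """Wolfson bipolarization index W = (2*mu/m) * (1 - 2*L(1/2) - G)."""
    return 2.0 * (1.0 - 2.0 * lorenz_at_half - gini) / median_over_mean


def wolfson_l1(a):
    """Wolfson index of the first Lamé curve, 0 < a <= 1."""
    lorenz_at_half = (1.0 - 2.0 ** (-a)) ** (1.0 / a)
    # The median over the mean equals L1'(1/2) = (2^a - 1)^((1 - a) / a).
    median_over_mean = (2.0 ** a - 1.0) ** ((1.0 - a) / a)
    return wolfson(lorenz_at_half, gini_l1(a), median_over_mean)
```

On the egalitarian line ($a = 1$) the index is zero, as expected for a distribution with no spread around the median.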

Extensions
Recent research on the LC (Basmann et al., 1990; Ryu and Slottje, 1996; Sarabia et al., 2005; Sarabia, 2008a) has shown that some families of LCs approximate only some segments of the income distribution but not others. This fact justifies the consideration of more complex models for the LC built from an initial LC. The following two direct extensions of (3.1) and (3.2) can be considered:
$$L_3(p; a, b) = [1 - (1-p)^a]^b, \tag{6.1}$$
$$L_4(p; a, b) = 1 - (1 - p^a)^b. \tag{6.2}$$
Curve (6.1) is genuine if $0 < a \le 1 \le b$ and (6.2) is genuine if $a \ge 1$ and $0 < b \le 1$. Curve (6.1) was proposed by Rasche et al. (1980) and (6.2) was considered (without being studied) by Arnold (1987). The Gini index of the LC (6.2) is
$$G_4(a, b) = \frac{2\,\Gamma(1 + 1/a)\,\Gamma(1 + b)}{\Gamma(1 + b + 1/a)} - 1,$$
which will be used in the next section. Other kinds of extensions can also be considered using the methodologies proposed by Sarabia et al. (1999, 2005).

Estimation
In this section we consider two different estimation methods for two different data configurations.

Estimation from data of the LC
Let us suppose that we wish to estimate the parameter $a$ of the curves $L_1$ and $L_2$. Let $X_1, \ldots, X_n$ be a sample of size $n$ of income data. The observations consist of $n$ pairs of points $(p_1, q_1), \ldots, (p_n, q_n)$, where $p_i = i/n$, $q_i = s_i/s_n$, and $s_i = x_{1:n} + \cdots + x_{i:n}$ for $i = 1, 2, \ldots, n$, with $x_{i:n}$ denoting the $i$th order statistic. The simplest way of estimating the parameter is to minimize the quantity
$$\sum_{i=1}^n \left[q_i - L_1(p_i; a)\right]^2 \tag{7.1}$$
for $L_1$ and
$$\sum_{i=1}^n \left[q_i - L_2(p_i; a)\right]^2 \tag{7.2}$$
for $L_2$, using a standard nonlinear optimization algorithm of the kind available in most statistical and econometric software packages, including SAS, SPSS, SHAZAM, EViews, and Mathematica. We can take $a_0 = 1$ as the initial value. An alternative robust method of estimation is given by Castillo et al. (1998).
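The least-squares step can be sketched in a few lines. The example below builds the empirical Lorenz points $(p_i, q_i)$ from a sample and minimizes the sum of squared deviations over $a$ with a golden-section search. The search routine is a stand-in chosen to keep the example dependency-free; it is not the optimizer used in the paper, it assumes the objective is unimodal on the search interval, and the interval bounds `0.01`, `1.0`, and `50.0` are arbitrary illustrative choices. The curve forms and all function names are assumptions as well.

```python
import math


def lame_l1(p, a):
    """First Lamé curve (assumed form), 0 < a <= 1."""
    return (1.0 - (1.0 - p) ** a) ** (1.0 / a)


def lame_l2(p, a):
    """Second Lamé curve (assumed form), a >= 1."""
    return 1.0 - (1.0 - p ** a) ** (1.0 / a)


def lorenz_points(incomes):
    """Empirical Lorenz points (i/n, s_i/s_n) built from the order statistics."""
    xs = sorted(incomes)
    total, n = sum(xs), len(xs)
    points, running = [], 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / n, running / total))
    return points


def sse(curve, a, points):
    """Sum of squared deviations between observed shares and the curve."""
    return sum((q - curve(p, a)) ** 2 for p, q in points)


def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - invphi * (hi - lo)
        d = lo + invphi * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)


def fit_l1(points):
    """Least-squares estimate of a for the first curve on (0.01, 1]."""
    return golden_section_min(lambda a: sse(lame_l1, a, points), 0.01, 1.0)


def fit_l2(points):
    """Least-squares estimate of a for the second curve on [1, 50]."""
    return golden_section_min(lambda a: sse(lame_l2, a, points), 1.0, 50.0)
```

A typical call would be `fit_l1(lorenz_points(sample))`, where `sample` is a list of positive incomes. In applied work a library optimizer with standard-error output would normally replace the hand-written search.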

Estimation with limited information
The practical use of the LC requires information on the income and wealth distribution or, at least, the income shares for a number of income classes. Unfortunately, this information is sometimes not available. In some economic databases the available comparable cross-country information is limited to per capita income and the Gini index (see, for example, Chotikapanich et al., 1997). Assume, then, that the only available information for the estimation of (3.1) and (3.2) (or (4.1) and (4.2)) is the mean and the Gini index. A plausible estimation method, which yields consistent estimates, consists of solving the system
$$\mu = \bar{X}, \qquad G_k(a) = g, \quad k = 1, 2,$$
where $\bar{X}$ and $g$ represent the sample mean and the sample Gini index, respectively, and $G_k(a)$, $k = 1, 2$, are the theoretical Gini indices (5.1) and (5.2), respectively. Therefore, the estimates of $\mu$ and $a$ are given by
$$\hat{\mu} = \bar{X}, \qquad \hat{a} = G_k^{-1}(g), \quad k = 1, 2. \tag{7.4}$$
Note that $G_k(a)$ is a monotonic function of $a$, so the right-hand side of (7.4) is a monotonic function of $g$ and, consequently, the system has only one solution.
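Because the Gini index is monotonic in $a$, the equation $G_k(a) = g$ can be solved by simple bisection. The following sketch assumes the closed forms $G_1(a) = 1 - \Gamma(1/a)^2/(a\,\Gamma(2/a))$ and $G_2(a) = \Gamma(1/a)^2/(a\,\Gamma(2/a)) - 1$; the bracketing intervals used in the usage note and the names `gini_l1`, `gini_l2`, and `solve_a` are illustrative assumptions.

```python
import math


def _gamma_ratio(a):
    """Gamma(1/a)^2 / (a * Gamma(2/a)), computed on the log scale."""
    return math.exp(2.0 * math.lgamma(1.0 / a) - math.lgamma(2.0 / a)) / a


def gini_l1(a):
    """Closed-form Gini index assumed for the first Lamé curve, 0 < a <= 1."""
    return 1.0 - _gamma_ratio(a)


def gini_l2(a):
    """Closed-form Gini index assumed for the second Lamé curve, a >= 1."""
    return _gamma_ratio(a) - 1.0


def solve_a(gini_fn, g, lo, hi, tol=1e-12, max_iter=200):
    """Solve gini_fn(a) = g by bisection, assuming monotonicity on [lo, hi]."""
    f_lo = gini_fn(lo) - g
    f_hi = gini_fn(hi) - g
    if f_lo * f_hi > 0.0:
        raise ValueError("the target Gini value is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = gini_fn(mid) - g
        # Keep the half-interval on which the sign change occurs.
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

For example, `solve_a(gini_l1, g, 0.01, 1.0)` returns the parameter of the first curve that matches a sample Gini value `g`, and `solve_a(gini_l2, g, 1.0, 200.0)` does the same for the second curve.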

Estimation when some points of the LC are available
Results of the estimation of the LCs studied and the corresponding Gini indices are presented in this section. The data are taken from Shorrocks (1983) and correspond to cumulated income shares for 19 countries derived from Jain (1975). This data set is relevant because the sample is heterogeneous in terms of income inequality, which allows us to draw conclusions about the fit of the LCs at different levels of inequality. Tables 1-4 present the estimation results for the two Lamé LCs (3.1) and (3.2) and for the two extensions (6.1) and (6.2), respectively. The estimators have been obtained by nonlinear least squares, according to the method established in the previous section, minimizing expression (7.1) or (7.2) and (7.3). Using a consistent estimate of the covariance matrix of the coefficients (Amemiya, 1985) we computed the corresponding standard errors. These tables also give several error measures: the mean square error (MSE),
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^n \left[L(p_i; \hat{\theta}) - q_i\right]^2,$$
the mean absolute error (MAE),
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^n \left|L(p_i; \hat{\theta}) - q_i\right|,$$
and the maximum absolute error (MAX),
$$\mathrm{MAX} = \max_{i = 1, \ldots, n} \left|L(p_i; \hat{\theta}) - q_i\right|,$$
where $L(p; \hat{\theta})$ represents the estimated LC. The empirical results indicate that the four functions, though simple and easy to estimate, fit the data very satisfactorily. The estimates of the Gini indices associated with the four LCs under study are presented in the last column of each table.
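The three error measures are straightforward to compute once the fitted and observed Lorenz ordinates are available. The sketch below assumes the usual definitions (mean of squared deviations, mean of absolute deviations, and largest absolute deviation); the function name `fit_errors` and the dictionary keys are illustrative.

```python
def fit_errors(fitted, observed):
    """Return MSE, MAE and MAX between fitted and observed Lorenz ordinates."""
    if len(fitted) != len(observed) or not fitted:
        raise ValueError("fitted and observed must be non-empty and of equal length")
    deviations = [abs(f - q) for f, q in zip(fitted, observed)]
    n = len(deviations)
    return {
        "MSE": sum(d * d for d in deviations) / n,
        "MAE": sum(deviations) / n,
        "MAX": max(deviations),
    }
```

Here `fitted` would hold the values of the estimated LC at the points $p_i$ and `observed` the corresponding cumulated income shares $q_i$.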
From Tables 1 and 2 we observe that almost 50% of the countries considered are better fitted by the curve $L_1$, whereas the curve $L_2$ gives the better fit in the rest of the countries. As a consequence, we cannot conclude that one curve is superior to the other when the previous three error criteria are considered.
If we consider the two extensions $L_3$ and $L_4$, the curve $L_4$ is superior, since it fits the income distribution more adequately in 13 out of the 19 countries included in the sample (Tables 3 and 4). It is important to note that all of these countries are characterized by low levels of inequality (Gini values lower than 0.5). On the contrary, the $L_3$ LC gives a better fit for the income distributions of countries with high inequality (Gini values greater than 0.5).
Regarding the comparison of the general models $L_3$ and $L_4$ (Equations (6.1) and (6.2)) with their special cases $L_1$ and $L_2$ (Equations (3.1) and (3.2)), some comments are in order. As might be expected, the extensions outperform the single-parameter curves, since the three error measures are substantially lower for the curves $L_3$ and $L_4$. However, the Gini index estimates based on the Lamé curves and their extensions are practically identical, differing by no more than 0.035; in the majority of cases the pairwise difference is lower than 0.01. In line with other studies that compare the performance of more complex functional forms with one-parameter curves (Ogwang and Rao, 1996), we observe that the greatest differences correspond to the countries that present higher levels of inequality. Thus, following the parsimony principle, the use of more complicated expressions would not be relevant for the adjustment of the Lamé LCs and the subsequent estimation of the corresponding Gini index, at least for the data set considered.

Estimation with limited information
In this section we focus on international inequality at the global level, using the methodology proposed in Sec. 7. We use the information provided by Bourguignon and Morrisson (2002), which includes estimates of several world inequality measures. Using the empirical Gini index and Equation (7.4) with $L_1$ and $L_2$, we have obtained the estimates of the inequality parameter $a$. The standard errors were computed using parametric bootstrap with $B = 999$ bootstrap replications. Using these estimators, we have obtained the Pietra and polarization indices given in Equations (5.8)-(5.9) and (5.11)-(5.12), respectively. All these estimators are recorded in Tables 5 and 6. The results indicate that inequality has increased in the last two centuries, a conclusion consistent with the evolution of the Gini index. Moreover, polarization has intensified over the study period, indicating that the differences between the wealthiest and the poorest countries have increased.

Table . Estimated inequality and polarization measures with limited information using the curve (.). Standard errors based on parametric bootstrap in parentheses.

Relationships with other curves
Bonferroni (1930) considered a curve for studying inequality, which is more suitable for the analysis of low income groups. The curve, in terms of the quantile function, is given by
$$B_X(p) = \frac{1}{p\,\mu_X}\int_0^p F_X^{-1}(t)\,dt = \frac{L_X(p)}{p}, \quad 0 < p \le 1. \tag{9.1}$$