Efficient algorithms for the inversion of the cumulative central beta distribution

Accurate and efficient algorithms for the inversion of the cumulative central beta distribution are described. The algorithms are based on the combination of a fourth-order fixed point method with good non-local convergence properties (the Schwarzian-Newton method), asymptotic inversion methods and sharp bounds in the tails of the distribution function.


Introduction
The cumulative central beta distribution (also known as the incomplete beta function) is defined by

I_x(p, q) = (1/B(p, q)) ∫_0^x t^{p−1}(1 − t)^{q−1} dt, (1.1)

where we assume that p and q are real positive parameters and 0 ≤ x ≤ 1; B(p, q) is the Beta function

B(p, q) = Γ(p)Γ(q)/Γ(p + q). (1.2)

From the integral representation in (1.1) it is easy to check the following relation:

I_x(p, q) = 1 − I_{1−x}(q, p). (1.3)

In this paper we describe algorithms for solving the equation

I_x(p, q) = α, (1.4)

with p, q given fixed real positive values and 0 < α < 1. In statistical terms, we are computing the quantile function for I_x(p, q).
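As a quick numerical illustration of (1.1) and (1.3), the following sketch (ours, not the paper's code; the names `betainc` and `log_beta` are our own) evaluates I_x(p, q) by composite Simpson quadrature of the beta density, which is adequate for moderate parameter values only, and checks the reflection relation.

```python
# Sketch: evaluate I_x(p,q) by composite Simpson quadrature of the beta
# density (adequate for moderate p, q only) and check the reflection
# relation (1.3): I_x(p,q) = 1 - I_{1-x}(q,p).
from math import lgamma, exp, log

def log_beta(p, q):
    # log B(p,q) = log Gamma(p) + log Gamma(q) - log Gamma(p+q)
    return lgamma(p) + lgamma(q) - lgamma(p + q)

def betainc(x, p, q, n=2000):
    # composite Simpson rule on [0, x]; n must be even
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    lb = log_beta(p, q)
    def density(t):
        if t <= 0.0 or t >= 1.0:
            return 0.0
        return exp((p - 1.0) * log(t) + (q - 1.0) * log(1.0 - t) - lb)
    h = x / n
    s = density(0.0) + density(x)
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * density(i * h)
    return s * h / 3.0

p, q, x = 3.5, 2.0, 0.3
print(betainc(x, p, q) + betainc(1.0 - x, q, p))  # close to 1
```

Production algorithms use series or continued fraction representations of I_x(p, q) instead of quadrature; the sketch only serves to make the definitions concrete.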
The beta distribution is a standard and widely used statistical distribution which has as particular cases other important distributions such as Student's t-distribution, the F-distribution and the binomial distribution. Therefore, the computational schemes for inverting the central beta distribution can also be used to compute percentiles of other distributions related to the beta; see [1] for an example involving the F-distribution.
The quantile function is useful, for instance, for generating random variables that follow the beta distribution density. Some Monte Carlo simulations require such random variables, and a massive number of inversions of the beta cumulative distribution is then needed. It is therefore important to construct methods that are as reliable and efficient as possible.
Existing algorithms use some simple initial approximations which are improved by iterating with the Newton method. In particular, this is the approach used in the inversion method of the statistical software package R, which is based on the algorithm of [9] and the successive improvements and corrections [4,2,3]. In [9], a simple approximation is used in terms of the error function, together with two additional starting value approximations for the tails; these initial values are refined by the Newton iteration. As discussed in [4,2], the Newton method needs some modification to ensure convergence inside the interval [0, 1], and further tuning of the Newton method has been considered in recent versions of this algorithm for R (but some convergence problems still remain in the present version, as we later discuss).
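The kind of modification just mentioned can be illustrated with a short sketch (ours, not the actual R code): a Newton iteration for a monotone CDF F that maintains a bracket inside [0, 1] and falls back to bisection whenever a Newton step leaves it.

```python
# Illustrative safeguard (not the R implementation): Newton iteration for
# F(x) = alpha on [0,1] with a bracket; if a Newton step escapes the
# bracket, a bisection step is taken instead.
def safeguarded_newton(F, dF, alpha, lo=0.0, hi=1.0, tol=1e-14, maxit=60):
    x = 0.5 * (lo + hi)
    for _ in range(maxit):
        err = F(x) - alpha
        if err > 0.0:          # F increasing: root is to the left
            hi = x
        else:
            lo = x
        x_new = x - err / dF(x)
        if not lo < x_new < hi:    # Newton left the bracket: bisect
            x_new = 0.5 * (lo + hi)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# toy monotone CDF on [0,1]: F(x) = x^2, so the quantile of alpha is sqrt(alpha)
root = safeguarded_newton(lambda x: x * x, lambda x: 2.0 * x, 0.49)
print(root)  # 0.7
```

The same safeguard applies verbatim with F(x) = I_x(p, q); the bracket guarantees the iterates never abandon [0, 1].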
In this paper the methods for the computation of the inversion of the cumulative beta distribution are improved in several directions. In the first place, we introduce the Schwarzian-Newton method (SNM) as an alternative to Newton's method (NM). Compared with Newton's method, the SNM has order of convergence four instead of two. In addition, as explained in [10], the SNM has good non-local properties for this type of functions, and it is possible to build an algorithm with certified convergence. In the second place, we analyze initial value approximations (much sharper than those given in [9]) in terms of asymptotic approximations for large values of p and/or q, which also give accurate values for moderate values; these approximations are given in terms of inverse error functions or the inverse gamma distribution ([12], [6, §10.5.2], [13, §42.3]). We also provide improved approximations for the tails obtained from the sharp bounds described in [11].
An additional direction of improvement of the algorithms is in the selection of the methods of computation of the beta distribution, which are needed in the application of iterative methods (with Newton, SNM or any other choice). This is not discussed in this paper, and we leave this topic for future research. A relatively recent algorithm was given in [5].

Methods of computation
We next describe the methods of computation used in the algorithms. First we describe the SNM and discuss how a standalone algorithm with certified convergence can be built with it, provided that an accurate method of computation of the beta distribution is available. Secondly, we describe the methods for estimating the quantile function based on asymptotics for large p and/or q. Finally, we describe sharp upper and lower bounds in the tails that can be used for solving the problem (1.4) for α close to 0 or 1.

Schwarzian-Newton method
The Schwarzian-Newton method (SNM) is a fourth order fixed point method with good non-local convergence properties for solving nonlinear equations f (x) = 0 [10]. The SNM has Halley's method as limit when the Schwarzian derivative of the function f (x) tends to zero.
Given a function f(x) with positive derivative (in our case f(x) = I_x(p, q) − α), it is easy to prove that Φ(x) = f(x)/√(f′(x)) satisfies the differential equation Φ′′(x) + Ω(x)Φ(x) = 0, where Ω(x) = (1/2){f, x} and {f, x} is the Schwarzian derivative of f(x) with respect to x:

{f, x} = f′′′(x)/f′(x) − (3/2)(f′′(x)/f′(x))². (2.2)

The SNM is obtained from the approximate integration of the associated Riccati equation under the assumption that Ω(x) is approximately constant. In the case of negative Schwarzian derivative (which will be the case for the beta distribution) the iteration function can be written as

g(x) = x − (1/√(−Ω(x))) arctanh(√(−Ω(x)) Φ(x)/Φ′(x)), Φ(x)/Φ′(x) = (f(x)/f′(x))/(1 − f(x)f′′(x)/(2f′(x)²)). (2.3)

We discuss two implementations: a direct implementation, which gives a convergent algorithm for p, q > 1, and an implementation with an exponential change of variables, which is easier to handle in the remaining cases.
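As an illustration, here is a minimal sketch of an SNM step in the arctanh form for Ω < 0 (our reconstruction from the description above, with the Halley-type ratio Φ/Φ′ as the basic step), tried on f(x) = x³ − 2, whose Schwarzian derivative {f, x} = −4/x² is negative on x > 0.

```python
# Sketch of one SNM step for Omega < 0:
#   x_new = x - arctanh(gamma * hstar) / gamma,   gamma = sqrt(-Omega(x)),
# where hstar = (f/f') / (1 - f f'' / (2 f'^2)) is the Halley-type ratio
# Phi/Phi'.  Demonstrated on f(x) = x^3 - 2, with {f, x} = -4/x^2 < 0.
from math import sqrt, atanh

def snm_step(x, f, fp, fpp, omega):
    g = sqrt(-omega(x))
    hstar = (f(x) / fp(x)) / (1.0 - f(x) * fpp(x) / (2.0 * fp(x) ** 2))
    return x - atanh(g * hstar) / g

f   = lambda x: x ** 3 - 2.0
fp  = lambda x: 3.0 * x ** 2
fpp = lambda x: 6.0 * x
# Omega = {f, x}/2 = (f'''/f' - 1.5 (f''/f')^2)/2 = -2/x^2 for this f
omega = lambda x: -2.0 / x ** 2

x = 1.0
for _ in range(5):
    x = snm_step(x, f, fp, fpp, omega)
print(x)  # converges rapidly to 2**(1/3) ~= 1.2599
```

Note the Halley limit: as Ω → 0, arctanh(γ h*)/γ → h*, and the step reduces to Halley's method.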

The direct implementation
It is proved in [10] that if Ω(x) < 0 has one and only one extremum at x_e ∈ I, and it is a maximum, then the SNM converges monotonically to the root of f(x) in I starting from x_0 = x_e. We will use this result for the cumulative central beta distribution when the parameters p and q are both larger than 1. In this case, the function Ω(x) (the Schwarzian derivative of f(x) with a factor 1/2) is given by

Ω(x) = −(p² − 1)/(4x²) − (q² − 1)/(4(1 − x)²) + (p − 1)(q − 1)/(2x(1 − x)). (2.4)

It is possible to show that for p > 1 and q > 1 the function Ω(x) in (2.4) is negative in (0, 1) and has only one extremum (which is a maximum) in that interval; the extremum x_e is the unique zero of Ω′(x) in (0, 1) and can be written explicitly in terms of p and q.
Then, the fixed point method is (2.3) with Ω(x) given by (2.4) and the derivatives of f computed from f′(x) = x^{p−1}(1 − x)^{q−1}/B(p, q).
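The stated properties of Ω(x) can be checked numerically. The sketch below evaluates Ω(x) = v′/2 − v²/4 with v = f′′/f′ = (p − 1)/x − (q − 1)/(1 − x) (our derivation from f′(x) = x^{p−1}(1 − x)^{q−1}/B(p, q)) and verifies negativity and a single interior maximum for sample values p, q > 1.

```python
# Numerical check (ours) of the claims above: for p, q > 1 the function
# Omega(x) = {f, x}/2 is negative on (0, 1) and has a single interior
# maximum.  Here Omega = v'/2 - v^2/4 with v = (p-1)/x - (q-1)/(1-x).
def omega_beta(x, p, q):
    v  = (p - 1.0) / x - (q - 1.0) / (1.0 - x)
    vp = -(p - 1.0) / x ** 2 - (q - 1.0) / (1.0 - x) ** 2
    return vp / 2.0 - v * v / 4.0

p, q = 3.0, 5.0
xs = [i / 1000.0 for i in range(1, 1000)]
vals = [omega_beta(x, p, q) for x in xs]
# count sign changes of the discrete slope: one change means a single maximum
slopes = [b - a for a, b in zip(vals, vals[1:])]
changes = sum(1 for a, b in zip(slopes, slopes[1:]) if a > 0.0 >= b)
print(all(v < 0.0 for v in vals), changes)
```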

The exponential implementation
When p and/or q are smaller than 1, it is possible to make a change of variables in order to obtain a negative Schwarzian derivative and simpler monotonicity properties. In particular, with the change of variables z = log(x/(1 − x)), we obtain that Φ(z) = f(z)/√(ḟ(z)) (where the dot represents the derivative with respect to z) satisfies Φ̈(z) + Ω(z)Φ(z) = 0, where

Ω(z) = −(1/4)(p − (p + q)x)² − (1/2)(p + q)x(1 − x), x = e^z/(1 + e^z).

The function Ω(z) has its extremum at z_e = log(x_e/(1 − x_e)), x_e = (p − 1)/(p + q − 2). When p and/or q are smaller than 1, Ω(z) can either be monotonic or have a minimum. Convergence of the SNM can be guaranteed in this case using the following results [10]: a) if Ω(z) is negative and decreasing in the interval I = [a, α], then the SNM converges monotonically to α for any starting value z_0 ∈ [a, α]; b) if Ω(z) is negative and increasing in the interval I = [α, b], then the SNM converges monotonically to α for any starting value z_0 ∈ [α, b].
The case p = q = 1 is of course trivial. Apart from this, there are three different cases to be considered:
a) p ≤ 1, q > 1: the function Ω(z) is decreasing. In this case, the SNM uses a large negative value (in the z variable) as starting point.
b) p > 1, q ≤ 1: the function Ω(z) is increasing. In this case, the SNM uses a large positive value (in the z variable) as starting point.
c) p < 1, q < 1: the extremum of Ω(z) at z_e is attained and it is a minimum. In this case, we use the sign of the function h(z) of [10] at z_e to select a subinterval for the application of the SNM, according to the previous results: when this sign is negative, the SNM uses a large positive value (in the z variable) as starting point; otherwise, it uses a large negative value.
Once the SNM is applied to find the root z_r in the z variable, the corresponding x value is given by x_r = e^{z_r}/(1 + e^{z_r}).
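The z-variable quantities are easy to verify numerically; the Ω(z) formula below is our own derivation (consistent with the extremum x_e stated above), and we check negativity and the location of the minimum for sample p, q < 1.

```python
# Check (from our derivation, not an equation reproduced in the text):
# with z = log(x/(1-x)),
#   Omega(z) = -(p - (p+q) x)^2 / 4 - (p+q) x (1-x) / 2,  x = e^z/(1+e^z),
# which is negative for all p, q > 0; for p, q < 1 its extremum is a
# minimum located at x_e = (p-1)/(p+q-2).
from math import exp, log

def x_of_z(z):
    return 1.0 / (1.0 + exp(-z))   # stable logistic form of e^z/(1+e^z)

def omega_z(z, p, q):
    x = x_of_z(z)
    return -0.25 * (p - (p + q) * x) ** 2 - 0.5 * (p + q) * x * (1.0 - x)

p, q = 0.3, 0.5
x_e = (p - 1.0) / (p + q - 2.0)
z_e = log(x_e / (1.0 - x_e))
zs = [z_e + 0.01 * (i - 500) for i in range(1001)]
vals = [omega_z(z, p, q) for z in zs]
print(all(v < 0.0 for v in vals), round(x_e, 4))
```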

Discussion
We have constructed two methods of order four which are proven to converge with certainty from the prescribed initial values. The methods have, in addition, good non-local properties, which means that few iterations are needed for a good estimation of the inverse (typically from 2 to 4 for 20 digits), even without accurate starting values. The exceptions are the tails (α very close to 0 or 1); we discuss later how to deal with these cases.
Because the convergence is guaranteed, no special precaution is needed to ensure that the interval [0, 1] in the original variable is not abandoned, as happened with earlier versions of the algorithm given in [9] (see [4]) and as still happens for some parameter values in the latest R version of this algorithm. For instance, the R command qbeta(alpha,600,1.1) does not converge properly when alpha ∈ (6.9·10⁻³⁵, 1.4·10⁻²⁰). Our method avoids this type of problem.
The performance of the method can be improved by considering initial approximations, which we discuss next.

Asymptotic inversion methods
The algorithm considered in [9], which is the basis of the R implementation, uses an approximation in terms of the inverse error function which works reasonably well away from the tails. However, this simple approximation does not give more than two accurate digits, except by accident.
Much more accurate initial approximations (some of them also in terms of error functions) can be obtained from asymptotics for large p and/or q. These are accurate approximations for large and not so large p and/or q, as we later discuss.
This section is based on the results given in [12] and [13, §42.3].
Using this representation of I_x(p, q), we solve the equation (1.4) first in terms of η. When r = p + q is a large parameter, the asymptotic method provides an approximation to the requested value of η in the form of the expansion

η ∼ η_0 + η_1/r + η_2/r² + ⋯ . (2.11)

The algorithm for computing the coefficients η_i, i = 0, 1, 2, …, can be summarized as follows:
1. The value η_0 is obtained from the equation (1/2) erfc(−η_0 √(r/2)) = α.
2. Equation (2.10) is inverted to obtain a first approximation to the value of x; for inverting this equation, it is convenient to rewrite it in a suitable form.
3. With these values of η_0 and x, the coefficient η_1 is computed.
4. Higher-order coefficients η_j, j = 2, 3, …, can be obtained in terms of x, η_0, η_1, sin θ and cos θ; this is the case, for example, of the coefficient η_2 (with s = sin θ, c = cos θ).

5. With these coefficients in the expansion (2.11), a value for η is obtained. Then, the inversion of (2.10) provides the final asymptotic estimation of the x-value.
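The first two steps can be sketched as follows. The explicit equations are not reproduced in the text above, so the forms used here are assumptions taken from the asymptotic-inversion literature ([12]); the helper names `bisect`, `eta0_from_alpha` and `x_from_eta` are ours.

```python
# Sketch of steps 1-2 of the asymptotic inversion, under the assumed forms
#   step 1:  (1/2) erfc(-eta0 sqrt(r/2)) = alpha,            r = p + q,
#   (2.10):  -eta^2/2 = sin^2(t) log(x/sin^2(t))
#                       + cos^2(t) log((1-x)/cos^2(t)),      sin^2(t) = p/r,
# with sign(eta) = sign(x - p/r).  Both equations are solved by bisection.
from math import erfc, log, sqrt

def bisect(g, a, b, n=200):
    # assumes g changes sign on [a, b]
    for _ in range(n):
        m = 0.5 * (a + b)
        if (g(a) > 0.0) == (g(m) > 0.0):
            a = m
        else:
            b = m
    return 0.5 * (a + b)

def eta0_from_alpha(alpha, r):
    return bisect(lambda e: 0.5 * erfc(-e * sqrt(0.5 * r)) - alpha, -20.0, 20.0)

def x_from_eta(eta, p, q):
    s2 = p / (p + q)              # sin^2(theta)
    c2 = 1.0 - s2                 # cos^2(theta)
    g = lambda x: s2 * log(x / s2) + c2 * log((1.0 - x) / c2) + 0.5 * eta * eta
    if eta == 0.0:
        return s2
    return bisect(g, s2, 1.0 - 1e-15) if eta > 0.0 else bisect(g, 1e-15, s2)

p, q = 3.0, 3.0
eta0 = eta0_from_alpha(0.3, p + q)
x0 = x_from_eta(eta0, p, q)
print(round(x0, 6))  # zeroth-order approximation to the alpha = 0.3 quantile
```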
Using (2.10) we can derive an expansion of x valid for small values of |η|, where s = sin θ, c = cos θ. For larger values of |η|, with η < 0, we rewrite (2.10) in terms of a variable u and, for small values of u, we use the expansion (2.18).
A similar approach is possible for positive values of η, giving an expansion for x near unity; in that case we have an analogous equation and expansion. The approximations to x obtained in this way will be used as starting values for the SNM in order to obtain more accurate values of x.

Inversion using the incomplete gamma function
In this case, we start from a representation of I_x(p, q) in terms of Q(a, x), the incomplete gamma function ratio (2.21). The parameter η is given by (2.22), where µ = q/p; the variables η and x have corresponding points such that, for x ∈ (0, 1), sign(η − µ) = −sign(x − 1/(1 + µ)).
For the representation in (2.21) we assume that p is the large parameter, and we obtain approximations to the value of η in the form of an expansion in inverse powers of p. We follow ideas similar to those in §2.2.1. The value of η_0 can be obtained by solving Q(q, η_0 p) = α. (2.25) The inversion of Q(a, x) can be done by using our inversion algorithm described in [7].
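Equation (2.25) can be solved as sketched below. The paper uses the dedicated inverse-gamma algorithm of [7]; here a plain bisection on a standard series/continued-fraction evaluation of Q(a, y) (a textbook scheme, not the authors' code) stands in for it.

```python
# Sketch of step 1 of the gamma-based inversion: solve Q(q, eta0 * p) = alpha
# (equation (2.25)) for eta0, with Q evaluated by the classical series /
# continued-fraction pair and the root located by bisection.
from math import lgamma, exp, log

def gammainc_q(a, y, eps=1e-14):
    # regularized upper incomplete gamma Q(a, y), a > 0, y >= 0
    if y <= 0.0:
        return 1.0
    if y < a + 1.0:                       # power series for P(a, y)
        term = s = 1.0 / a
        n = a
        while abs(term) > eps * abs(s):
            n += 1.0
            term *= y / n
            s += term
        return 1.0 - s * exp(-y + a * log(y) - lgamma(a))
    b = y + 1.0 - a                       # Lentz continued fraction for Q
    c, d = 1e300, 1.0 / b
    h = d
    i = 0
    while True:
        i += 1
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = 1.0 / d if abs(d) > 1e-300 else 1e300
        c = b + an / c
        c = c if abs(c) > 1e-300 else 1e-300
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * exp(-y + a * log(y) - lgamma(a))

def eta0_gamma(alpha, p, q, lo=1e-12, hi=100.0):
    # Q(q, eta * p) decreases in eta; bisect for Q = alpha
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gammainc_q(q, mid * p) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p, q, alpha = 50.0, 2.0, 0.1
e0 = eta0_gamma(alpha, p, q)
print(round(gammainc_q(q, e0 * p), 12))  # recovers alpha = 0.1
```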
To compute x from equation (2.22) we can use the algorithm for computing x from (2.10) when η is given. This follows from µ = q/p = cot² θ and from writing (2.22) in a form analogous to (2.10).

Interval estimation for the tails

For small α, lower and upper bounds x_l and x_u for the solution of (1.4) can be computed as the limits of the fixed point iterations x_{n+1} = g_l(x_n) and x_{n+1} = g_u(x_n), respectively, where the iteration functions g_l(x) and g_u(x) for the lower tail are given in [11]. The starting value of the fixed point iterations is x = 0. The solution x of equation (1.4) satisfies x_l < x < x_u. These bounds on x can be used as starting values for the SNM. Notice that, because both a lower and an upper bound are obtained, we have an estimation of the error of these approximations, which can be used to decide whether they are accurate enough.
We notice that the approximation for the lower tail used in [9] (and also in the R implementation) is just g l (0) = g u (0). Our approximation is more accurate and provides upper and lower bounds.
For the upper tail, the iteration functions are the same as before, but with p and q interchanged (by using (1.3)). The bounds are then given for 1 − x.
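The sharp iteration functions g_l, g_u of [11] are not reproduced in the text above, so the following sketch illustrates the mechanism with elementary lower-tail bounds valid for q > 1: from x^p(1 − x)^{q−1}/p ≤ αB(p, q) ≤ x^p/p one gets the lower bound x_lo = (αpB(p, q))^{1/p}, which is precisely the classical starting value g_l(0) = g_u(0) of [9], and, iterating x ← x_lo (1 − x)^{(1−q)/p} from x = 0, an upper bound x_up.

```python
# Illustrative tail bracket from elementary bounds (q > 1), not the sharper
# bounds of [11]: x_lo = (alpha p B(p,q))^(1/p) and the fixed point of
# x = x_lo * (1-x)^((1-q)/p), started at x = 0.
from math import lgamma, exp, log

def log_beta(p, q):
    return lgamma(p) + lgamma(q) - lgamma(p + q)

def lower_tail_bracket(alpha, p, q, n_iter=50):
    # returns (x_lo, x_up) with x_lo <= x <= x_up for I_x(p,q) = alpha
    c = exp((log(alpha) + log(p) + log_beta(p, q)) / p)  # (alpha p B)^(1/p)
    x = 0.0
    for _ in range(n_iter):
        x = c * (1.0 - x) ** ((1.0 - q) / p)
    return c, x

p, q, alpha = 2.5, 4.0, 1e-4
x_lo, x_up = lower_tail_bracket(alpha, p, q)
print(x_lo, x_up)  # a tight bracket for the small-alpha quantile
```

Note that the width x_up − x_lo gives the built-in error estimate mentioned above.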

Numerical testing
In this section we illustrate the use of the different methods with numerical examples which will help in deciding how to combine the methods in order to obtain fast reliable algorithms (which are described in section 4).

Initial values obtained with the asymptotic approximations
In Tables 1 and 2 we show examples of the accuracy obtained in the computation of |I_x(p, q) − α|/α, with x the value provided by the asymptotic approximations (before iterating the SNM). The asymptotic methods provide relatively good initial values even for quite small values of p and q: using 10⁷ random points in the region (p, q, α) ∈ (0.5, 1.5) × (0.7, 1.5) × (0, 1), we have tested that a relative accuracy better than 0.06 is obtained when computing |I_x(p, q) − α|/α with x the asymptotic approximation obtained using the error function. With these initial values, no more than two iterations of the SNM are needed to obtain an accuracy better than 5.0·10⁻¹³. The function I_x(p, q) is computed in the iterations of the SNM by using a continued fraction representation.

Table 1. Relative errors |I_x(p, q) − α|/α for r = p + q = 6 using the estimates provided by the asymptotic inversion method with the error function.
For the computation of the Beta function B(p, q) it is convenient, in particular when p and q are large, to use the following expression in terms of the scaled gamma function Γ*(x):

B(p, q) = √(2π) (Γ*(p)Γ*(q)/Γ*(p + q)) p^{p−1/2} q^{q−1/2}/(p + q)^{p+q−1/2},

where Γ*(x) is defined as

Γ*(x) = Γ(x)/(√(2π) x^{x−1/2} e^{−x}).

The function Γ*(x) is computed using the function gamstar included in a previous package developed by the authors [8].
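A minimal check of the scaled-gamma expression (our code, using the standard library lgamma rather than the gamstar routine of [8]):

```python
# Check of the scaled-gamma expression for B(p, q):
#   Gamma*(x) = Gamma(x) / (sqrt(2 pi) x^(x - 1/2) e^(-x)),
#   B(p, q)   = sqrt(2 pi) Gamma*(p) Gamma*(q) / Gamma*(p + q)
#               * p^(p-1/2) q^(q-1/2) / (p + q)^(p+q-1/2),
# which separates the dominant exponential factor from the slowly varying
# scaled gammas (Gamma*(x) -> 1 as x -> infinity).
from math import lgamma, exp, log, pi, sqrt

def gamstar(x):
    # scaled gamma function Gamma*(x), computed via lgamma
    return exp(lgamma(x) - (x - 0.5) * log(x) + x - 0.5 * log(2.0 * pi))

def beta_scaled(p, q):
    r = p + q
    factor = exp((p - 0.5) * log(p) + (q - 0.5) * log(q) - (r - 0.5) * log(r))
    return sqrt(2.0 * pi) * gamstar(p) * gamstar(q) / gamstar(r) * factor

print(beta_scaled(2.5, 4.0))  # matches Gamma(2.5)Gamma(4)/Gamma(6.5)
```

In practice the exponential factor is kept in logarithmic form whenever B(p, q) appears inside larger expressions, to avoid underflow for large p, q.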

Initial values obtained in the tails of the distribution function
The interval estimations in the tails of the central beta distribution function obtained with the fixed point iterations of §2.3 can also be used to provide starting values for the SNM. This is particularly useful for quite small values of the parameters p and q, where the asymptotic methods cannot be applied. It is important to note that when the parameter p or q is close to 0, the inversion of I_x(p, q) becomes problematic in the lower (p → 0) or upper (q → 0) tail of the cumulative distribution function, because of the particular shape of the function.

Table 2. Relative errors |I_x(p, q) − α|/α for p = 7 and several values of µ = q/p using the estimates provided by the asymptotic inversion method with the incomplete gamma function.

In Table 3 we show examples of these interval estimations. We have also tested that for small values of p and q the bounds provide in all cases reasonable approximations for starting the SNM, regardless of the value of α. Moreover, even for not so small values of p and q, the bounds provide very accurate estimations when α is very small; in some cases these estimations can even be better than those of the asymptotic method.

Performance of the SNM for small values of the parameters
We have tested that the scheme for the SNM described in §2.1 also provides good uniform accuracy when the parameters p and q are both small: using 10⁷ random points in the region (p, q, α) ∈ (0.1, 0.5) × (0.1, 0.7) × (0, 1), we obtained a relative accuracy better than 4.8·10⁻¹³ when computing |I_x(p, q) − α|/α. The maximum number of iterations of the SNM was 3.

Efficiency testing
As we have shown in §3.1, the asymptotic method provides very accurate initial values for starting the SNM, even for small values of the parameters p and q. But apart from accuracy, an important feature of any computational scheme is efficiency. We have therefore compared whether the combined use of the asymptotic approximations plus iterations of the SNM is more efficient than the sole use of the SNM. The results in Table 4, together with additional testing for other parameter values, indicate that the sole use of the SNM is efficient in all cases for the inversion of the cumulative central beta distribution, especially when the value of the parameter α is neither very small nor very close to 1.
A different scenario arises when the solution of equation (1.4) is computed with a target accuracy of 10⁻⁸ (single precision). In this case, the approximations provided by the asymptotic expansions alone are enough to obtain such an accuracy over a wide range of parameters.

Proposed algorithms
Based on the previous numerical experiments of §3, we conclude that if the precision required is not very high, the initial approximations given by asymptotics or by the tail estimations can be sufficient over a large range of the parameters. However, for higher precisions the use of the SNM must be prevalent. For full accuracy, this leads us to suggest the following scheme for computing the solution of (1.4).
When 0.01 < α ≤ 0.5:
a) for 1 < q < 5 and p > 50, use the SNM, taking as starting values the approximations provided by the uniform asymptotic expansion in terms of the incomplete gamma function;
b) for p > 30 and q > 30, use the SNM, taking as starting values the approximations provided by the uniform asymptotic expansion in terms of the error function;
c) in other cases, use the SNM as described in §2.1.
When α ≤ 0.01, the interval estimations for the tails of §2.3 provide accurate starting values (cf. §3.2); in other cases, use the SNM as described in §2.1.
For 0.5 < α < 1, use the relation (1.3) and apply the previous steps to solve for 1 − x in I_{1−x}(q, p) = 1 − α.
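The region logic above can be sketched as a dispatch skeleton (hypothetical code, ours; the returned labels name the methods of §2.1-§2.3, and the α ≤ 0.01 branch reflects our reading of the tail discussion).

```python
# Hypothetical dispatch skeleton for the proposed scheme; the labels refer
# to the methods described in sections 2.1-2.3.
def choose_method(alpha, p, q):
    if not (0.0 < alpha < 1.0):
        raise ValueError("alpha must lie in (0, 1)")
    if alpha > 0.5:
        # relation (1.3): solve I_{1-x}(q, p) = 1 - alpha instead
        return choose_method(1.0 - alpha, q, p) + " (after reflection)"
    if alpha <= 0.01:
        return "tail bounds (section 2.3) as SNM starting values"
    if 1.0 < q < 5.0 and p > 50.0:
        return "SNM + incomplete-gamma asymptotic starting value"
    if p > 30.0 and q > 30.0:
        return "SNM + error-function asymptotic starting value"
    return "SNM as in section 2.1"

print(choose_method(0.3, 100.0, 2.0))  # incomplete-gamma branch
```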

Conclusions
We have presented methods for the inversion of cumulative beta distributions which improve previously existing methods. We have described how the Schwarzian-Newton method provides a standalone method with certified convergence which is in itself an efficient method, even without accurate initial estimations (except at the tails). In addition, we have discussed how to improve the efficiency by estimating the quantile function using asymptotics for large p and/or q and by considering sharp upper and lower bounds for the tails. These initial estimations are considerably more accurate than the simple approximations used in some standard mathematical software packages (like R) and, combined with the fourth order SNM, provide efficient and reliable algorithms for the inversion of cumulative beta distributions.