Differential Equations for Gaussian Statistical Models with Rational Maximum Likelihood Estimator
ABSTRACT: We study multivariate Gaussian statistical models whose maximum likelihood estimator (MLE) is a rational function of the observed data. We establish a one-to-one correspondence between such models and the solutions to a nonlinear first-order partial differential equation (PDE). Using our correspondence, we reinterpret familiar classes of models with rational MLE, such as directed (and decomposable undirected) Gaussian graphical models. We also find new models with rational MLE. For linear concentration models with rational MLE, we show that homaloidal polynomials from birational geometry lead to solutions to the PDE. We thus shed light on the problem of classifying Gaussian models with rational MLE by relating it to the open problem in birational geometry of classifying homaloidal polynomials.
In the classical setting of \(m\)-dimensional Gaussian statistical models with mean zero and concentration matrix \(K\), the log-likelihood is, up to an affine transformation, \(\ell_S (K) = \log \det (K) - \mathrm{tr}(K S)\), where \(S\) is the \(m \times m\) sample covariance matrix. The maximum likelihood degree (ML degree) of a model is the number of complex critical points to the likelihood function \(\ell_S\) for generic data \(S\).
The main result of the paper can be applied in this setting, establishing a bijection between projective varieties \(X \subseteq \mathbb{P}(\mathrm{Sym_2}(\mathbb{C}^m))\) of ML degree 1 and rational homogeneous solutions \(\Phi:\mathrm{Sym_2}(\mathbb{C}^m) \longrightarrow \mathbb{C}\) to the homaloidal PDE \(\Phi = \det( -\nabla_S \log \Phi)\).
The gradient with respect to \(S\) should be understood in the variables \(s_{ij}\), with \(i\leq j\), such that \((\nabla_S \varphi)_{ij} = \dfrac{1}{2 -\delta_{ij}}\dfrac{\partial \varphi}{\partial s_{ij}}\) for a scalar-valued function \(\varphi\) on \(\mathrm{Sym}_2(\mathbb{C}^m)\).
We illustrate this theorem for several classes of graphical models, with code implementations in \(\verb|Macaulay2|\). The main functions are verifyHomaloidal, which verifies whether a candidate function \(\Phi\) solves the homaloidal PDE, varietyAssociatedToPhi that computes the ideal of the variety \(X\) associated to a solution \(\Phi\), and printMLE that computes the maximum likelihood estimator for \(X\) from \(Phi\). The code for these and other auxiliary functions can be found here: homaloidal.m2
.
Undirected Gaussian graphical models
For the undirected path \(1 - 2 - 3\), the Gaussian graphical model consists of all \(3 \times 3\) concentration matrices with \(k_{13}=0\). A solution to the homaloidal PDE is given as a rational function in minors of \(S\), \(\Phi(S)=\dfrac{\det(S_2)}{\det(S_{12})\det(S_{23})}\), as verified below. Using the solution \(\Phi\), we also compute the maximum likelihood estimator.
-- Example 5.10
CreateSym2Ring(3)
S= toMatrix(vars(R))
Phi = s_(2,2)*(1/(detS({1,2})*detS({2,3})))
verifyHomaloidal(Phi)
sym2VarietyAssociatedTo(Phi)
printMLE(Phi)
If we consider now the path \(1 - 2 - 3 - 4 - 5\) with an additional edge \(2 - 4\), one can follow cliques and separators to obtain \(\Phi(S)=\dfrac{\det(S_2)\det(S_4)}{\det(S_{12})\det(S_{234})\det(S_{45})}\).
-- Example 5.11
CreateSym2Ring(5)
S = toMatrix(vars(R))
Phi = s_(2,2)*s_(4,4)/(detS({1,2})*detS({2,3,4})*detS({4,5}))
verifyHomaloidal(Phi)
sym2VarietyAssociatedTo(Phi)
printMLE(Phi)
However, a similar guess for the undirected 4-cycle (well-known for being the smallest non-decomposable graph) would give as candidate \(\Phi(S)=\dfrac{\det(S_1)\det(S_2)\det(S_3)\det(S_4)}{\det(S_{12})\det(S_{23})\det(S_{34})\det(S_{14})}\). This is actually not a solution to the homaloidal PDE as we verify below.
-- Examples 5.12 and 5.18 the un- and directed m-cycle, should NOT solve the PDE for m>=4
m = 4
CreateSym2Ring(m)
S = toMatrix(vars(R))
Phi = s_(m,m)/(detS({1,m}))*product(apply(m-1, i -> s_(i+1,i+1)/detS({i+1,i+2})))
verifyHomaloidal(Phi)
Directed Gaussian graphical models
All Gaussian models defined by a directed acyclic graph (DAG) have ML degree one. Consider the collider triple \(1\rightarrow 3 \leftarrow 2\). Then \(\Phi(S)= \dfrac{s_{11}s_{22}-s_{12}^2}{s_{11}s_{22}\det(S)}\), and the model is defined by the equation \(k_{13} k_{23} - k_{12} k_{33} = 0\).
-- Example 5.16
CreateSym2Ring(3)
S = toMatrix(vars(R))
Phi = detS({1,2})/(s_(1,1)*s_(2,2)*det(S))
verifyHomaloidal (Phi)
sym2VarietyAssociatedTo(Phi)
printMLE(Phi)
For a more complicated example, consider the DAG with edges \(1\rightarrow 2, 1 \rightarrow 3, 2 \rightarrow 4, 3 \rightarrow 4, 3 \rightarrow 5, 4 \rightarrow 5\). Then the solution of the homaloidal PDE is \(\Phi(S) = \dfrac{\det(S_{1})\det(S_{23})\det(S_{34})} {\det(S_{12})\det(S_{13})\det(S_{234})\det(S_{345})}\).
-- Example 5.17
CreateSym2Ring(5)
S= toMatrix(vars(R))
Phi = s_(1,1)*detS({2,3})*detS({3,4})/(detS({1,2})*detS({1,3})*detS({2,3,4})*detS({3,4,5}) )
verifyHomaloidal(Phi)
printMLE(Phi)
Colored Gaussian graphical models
Consider the reciprocal variety of the linear space of \(3 \times 3\) matrices of the form \(\begin{pmatrix}x&y&0\\y&z&0\\0&0&z\end{pmatrix}\). This is the colored covariance graphical model corresponding to the undirected graph on 3 nodes with \(1 - 2\) as its only edge, but where the nodes 2 and 3 have the same color. The code below shows that this model is of ML degree 1, by verifying that \(\Phi(S) = \dfrac{4 s_{22}}{(s_{22} + s_{33} )^2( s_{11}s_{22}-s_{12}^2) }\) solves the homaloidal PDE, and that it corresponds to the variety \(X = V(k_{23},k_{13},k_{12}^2-k_{11}k_{22}+k_{11}k_{33})\).
-- Example 7.4
CreateSym2Ring(3)
S = toMatrix(vars(R))
Phi = 4*s_(2,2)/(( s_(2,2)+s_(3,3))^2*detS({1,2}))
verifyHomaloidal( Phi )
sym2VarietyAssociatedTo(Phi)
printMLE(Phi)
Finally, consider the undirected 3-cycle where the first two nodes have the same color. The associated colored Gaussian graphical model is defined by the hyperplane \(k_{11}-k_{22}=0\). It has ML degree one with \(\Phi(S) = \dfrac{4s_{33}}{\det((gSg^{\top})_{23})\det ((gSg^{\top})_{13})}\) , where \(g = \begin{pmatrix} 1&1&0\\1&-1&0\\0&0&1/2\\ \end{pmatrix}\), as verified below.
-- Example 7.5
CreateSym2Ring(3)
S = toMatrix(vars(R))
g = matrix {{1,1,0}, {1,-1,0},{0,0,1/2}}; gtS = (transpose g)*S*g;
Phi = det(submatrix'(gtS, {0,1},{0,1}))/(det(submatrix'(gtS, {0},{0}))*det(submatrix'(gtS, {1},{1})))
verifyHomaloidal(Phi)
sym2VarietyAssociatedTo(Phi)
printMLE(Phi)
Non-embedded ML degree 1 models
In our paper, we find it useful to study ML degree one projective varieties in a general setting (see Definition 2.1) of a linear space \(\mathcal{L}\) and a polynomial \(F\) on \(\mathcal{L}\). The main result of the paper is the characterization of ML degree 1 models via the homaloidal PDE in Theorem 3.5. One recovers the classical setting by letting \(\mathcal{L}=\mathrm{Sym}_2(\mathbb{C}^m)\) and \(F=\det\). The previously used functions in this coordinate-free setting take the following form:
gradient= (func) -> (
-- Input: A rational function func
-- Output: The gradient of func
f := sub(func, frac R);
p := numerator f; gradp := diff(vars R, p);
q := denominator f; gradq := diff(vars R, q);
sub((1/q^2), frac R)*sub((q*gradp - p*gradq), frac R)
)
verifyHomaloidal=(Phi, F) -> (
-- Input: An element of frac R, a candidate solution to the homaloidal PDE w.r.t. the homogeneous polynomial F
-- Output: Boolean specifying if the input Phi is a solution to the homaloidal PDE
Psi := -1/(Phi)*(gradient Phi);
F(Psi) == Phi
)
varietyAssociatedTo=(Phi) -> (
-- Input: An element of frac R, a solution to the homaloidal PDE
-- Output: The ideal of the variety X associated to Phi
ker (map(R,Rx, sub(-denominator(Phi)^2*(gradient Phi), R)))
)
To compute with specific examples, we set a value for the dimension of \(\mathcal{L}\), say \(n=5\).
--Specify the dimension of the ambient projective space
n=5
R = QQ[u_0..u_n]
Rx = QQ[x_0..x_n]
uvec = vars R
We prove in Section 7 that if \(Q\) is a smooth quadric, then \(\mathrm{MLD}_Q(\mathbb{P}^n)=1\), with \(\Phi(\mathbf{u})=\dfrac{4}{Q^\vee(\mathbf{u})}\). We verify this in the following code for a randomly generated quadric.
--Example 7.2
--Create a generic quadric
A = matrix apply(n+1, i-> apply(n+1, j -> random(QQ))); A = A + transpose(A)
Q = (v) -> (v*A*transpose(v))_0_0
--Verify the homaloidal PDE w.r.t. Q and compute the associated variety
Qdual =(uvec*inverse(A)*transpose(uvec))_0_0
Phi = 4/Qdual
verifyHomaloidal(Phi, Q)
varietyAssociatedTo(Phi)
To work explicitly with this quadric \(Q\), we choose a basis such that \(Q\) can be written as \(Q(x_0,x_1,x_2,\dots,x_n) = x_0x_1+x_2^2+\dots+x_n^2\). As in Section 7, we consider a particular \(\ell\) tangent to \(Q\) at a point \(p\).
-- After a change of basis, any quadric can be written using a matrix A of the form below
-- We also fix a point ell on the dual quadric and the point p on Q where ell is tangent
A = matrix apply(n+1, i-> apply(n+1, j-> if (i<2 and j<2) then (if i != j then 1/2 else 0) else (if i==j then 1 else 0) ))
ell = matrix {apply(n+1, i -> if i==0 then 1 else 0)}
Qdual = (uvec*inverse(A)*transpose(uvec))_0_0
p = transpose sub( diff(vars R, Qdual), apply(n+1, i-> u_i => ell_i_0))
up = (uvec*p)_0_0
The following example illustrates Theorem 6.2, where we consider the quadric \(Q\) and \(\ell\) the equation of the tangent hyperplane to \(Q\) at a point \(p=\nabla_\ell Q^\vee\). Then we show \(\mathrm{MLD}_\ell(V(Q))=1\) and \(\Phi(\mathbf{u})=\dfrac{\mathbf{u}(p)}{Q^\vee(\mathbf{u})}\). We see that the variety associated to the solution \(\Phi\) coincides with the dual variety of the denominator of \(\Phi\): \((V(Q^\vee))^\vee = V(Q)\).
-- Example 7.1
ellFunc = (v) ->(
-- Input: A point v in the primal vector space, as a row vector
-- Output: The value of the quadric Q at v
(ell*transpose(v))_0_0
)
Phi = up/Qdual
verifyHomaloidal(Phi, ellFunc)
xvec = vars(Rx)
varietyAssociatedTo(Phi) == ideal(xvec*A*transpose(xvec))
In our final example, with the same notation, we consider \(F=Q\cdot \ell\), \(\mathcal{L} \cong \mathbb{C}^{n+1}\) and \(\Phi(\mathbf{u}) = 16\dfrac{\mathbf{u}(p)}{Q^\vee(\mathbf{u})^2}\).
-- Example 7.3
Qell = (v) ->(
-- Input: A point v in the primal vector space, as a row vector
-- Output: The value of the polynomial Q*ell at v
((v*A*transpose(v))_0_0)*(ell*transpose(v))_0_0
)
Phi = 16*up/(Qdual^2)
verifyHomaloidal(Phi, Qell)
varietyAssociatedTo(Phi)
Credits
Project page created: 18/04/2023
Project contributors: Carlos Améndola, Lukas Gustafsson, Kathlén Kohn, Orlando Marigliano, Anna Seigal
Software used: Macaulay2 (Version 1.20)
Corresponding author of this page: Carlos Améndola, amendola@math.tu-berlin.de