Sign patterns of principal minors of real symmetric matrices

This page contains the source code and explanations related to the results in the paper:
Tobias Boege, Jesse Selover and Maksym Zubkov: Sign patterns of principal minors of real symmetric matrices

ABSTRACT: We analyze a combinatorial rule satisfied by the signs of principal minors of a real symmetric matrix. The sign patterns satisfying this rule are equivalent to uniform oriented Lagrangian matroids. We first discuss their structure and symmetries and then study their asymptotics, proving that almost all of them are not representable by real symmetric matrices. We offer several conjectures and experimental results concerning representable sign patterns and the topology of their representation spaces.

Installation of software

We require a recent Perl, Mathematica and sagemath installation which are fairly standard. In addition, we need the SAT solver tools from the CInet project. You can also follow the instructions on the SelfadhesiveGaussianCI page.

Enumeration and symmetry reduction of admissible sign patterns

The Perl script signs.pl enumerates admissible sign patterns using a SAT solver and then reduces them to canonical representatives modulo the hyperoctahedral group.

$ perl signs.pl 3 >signs3.log
38 compatible sign vectors
Performing symmetry reduction .....
5 representatives

It prints some progress information while it is working, including an estimate (obtained via the probabilistic model counter GANAK) for the number of sign vectors to be processed.

Its output is a list of records, one per line, with a sign pattern and its orbit size:

$ cat signs3.log
++++++++: 8
+++++++-: 8
++++++--: 12
+++++---: 8
++++----: 2

The orbit sizes add up to the number 38 reported earlier by the program.

The encoding of sign patterns is briefly mentioned in the paper: we use strings of + and - characters for the principal minors in “grouped-lexicographic” order, i.e., the subsets of \(N\) are first grouped by cardinality and within each group arranged lexicographically based on a fixed ordering of \(N\).

The canonical representative of an orbit is that sign pattern which has the largest number of leading + signs in this encoding. This is the representative chosen by signs.pl.

This procedure works decently also for \(n = 4\) and \(n = 5\):

$ perl signs.pl 4 >signs4.log
990 compatible sign vectors
Performing symmetry reduction ........................
24 representatives

$ perl signs.pl 5 >signs5.log
395094 compatible sign vectors
Performing symmetry reduction ........................ [...]
434 representatives

The script understands a few options: --count-only to stop after obtaining the estimate of the number of solutions to the boolean formula, and --positive-singletons which adds axioms to orient all singletons positively. Combining both, we get the reported upper bound on the number of hyperoctahedral orbits for \(n = 6\):

$ perl signs.pl --count-only --positive-singletons 6
7109686748 compatible sign vectors

The space \(\mathrm{PR}_3\)

With the list of 5 canonical representatives of admissible sign patterns, the analysis of their representation spaces is swiftly done using computational real algebraic geometry tools in \(\verb|Mathematica|\). We use cylindrical algebraic decomposition to count their connected components and produce the figures from the paper. The whole notebook with all code and computations is available for download (compressed):

../_images/pr3-all.png ../_images/pr3-hypersurfaces.png

Representability for \(n \le 5\)

To find representations for admissible sign patterns, we implemented a simple sampling methodology in \(\verb|sagemath|\) which generates a random rational matrix and computes its sign pattern. This has a chance of working well because if a sign pattern is representable, then its space of representations has a positive Lebesgue measure in the space of all symmetric matrices with normalized diagonal.

The sampling script sample.sage has two modes. In the first mode, it is given options -n NUMER -d DENOM -t TRIES which give bounds on the numerators and denominators of matrix entries (by default both bounds are \(35\) which appears good enough) and the number of random matrices to generate. The matrices always have their diagonal elements fixed to \(1\) which is justified in the paper.

It then outputs a list of records containing the witnessed sign patterns, how often they were encountered and a simple representation.

In the second mode, a sign pattern is explicitly given using the -p PATTERN option. In this case, a little more information is used during the sampling of matrices to target the representation space of the given pattern. Namely, the diagonals are fixed to \(\pm 1\) depending on the \(1 \times 1\) minor signs and each off-diagonal entry is ensured to be strictly smaller or larger than \(1\) in absolute value, depending on the \(2 \times 2\) minor signs. The switch -b can be used to stop sampling after the first representation was found.

This second mode works well for \(n = 3\) and \(n = 4\):

$ cut -f1 -d: signs4.log | while read S
> do sage sample.sage -t 10000 -p "$S" 4
> done >sample4.log

This loops through all hyperoctahedral representatives obtained in the previous sections and samples 10,000 rational matrices for each of them. In our test run, this is sufficient to find representations for all orbits.

The case \(n = 5\) is tougher. We first generate 4,000,000 random matrices and collect their sign patterns. Since we do not target a specific pattern with the -p option in this case, the script also reports representable sign patterns which are not canonical. The reduce.pl script acts on matrices via the symmetric group trying to produce a representation for their canonical hyperoctahedral representative. In our test run, this leaves only 18 of the sign patterns without a representation.

$ sage sample.sage -t 4000000 5 >sample5.log
$ perl reduce.pl 5 sample5.log >reduce5.log
$ cut -f1 -d: signs5.log | while read S
> do grep -Gq "$S" reduce5.log || echo "$S"
> done | tee leftover.log
++++++++++++++++++++++++++-----+
++++++++++++++++++++++---------+
+++++++++++++++++++++-++-------+
+++++++++++++++++++----------+++
++++++++++++++++++-+--+--------+
+++++++++++++++++-----------++++
+++++++++++++++++--------------+
++++++++++++++++----------++++++
++++++++++++++++----------++++-+
++++++++++++++++----------+++--+
++++++++++++++++----------++---+
++++++++++++++++----------+----+
++++++++++++++++---------------+
+++++++++++++++-----------++++++
+++++++++++++++-----------++++-+
+++++++++++++++-----------+++--+
+++++++++++++++-----------+-++++
+++++++++-+-+--+----------+++--+

On these “leftover” sign patterns, we can run the targeted sampling procedure as shown before for \(n = 4\). With enough iterations, representations of all sign patterns except the elusive \(s_*\) are generated with the default bounds on numerator and denominator.

The representations are accumulated into the following files:

The correctness can be independently verified using verify.pl:

$ prove verify.pl :: 4 sample4.log
verify.pl .. ok
All tests successful.
Files=1, Tests=24,  0 wallclock secs ( 0.01 usr  0.00 sys +  0.42 cusr  0.01 csys =  0.44 CPU)
Result: PASS

$ prove verify.pl :: 5 sample5.log
verify.pl .. ok
All tests successful.
Files=1, Tests=433, 15 wallclock secs ( 0.07 usr  0.00 sys + 15.52 cusr  0.00 csys = 15.59 CPU)
Result: PASS

Notice that sample5.log contains 433 representations as it is missing \(s_*\). The canonical representative of \(s_*\) is +++++++++-+-+--+----------+++--+ and \(s_*\) as given in the paper arises from it by duality. We are unable to decide whether it is representable or not.

The file last5.pip encodes a trivial polynomial optimization problem over the representation space of \(s_*\) intersected with a large bounding box for the solver SCIP. Only the \(3 \times 3\) minors and the whole determinant are used in the formulation (the script writepip5.pl ignores principal minors marked with * in the input pattern). During the solving, it appears that certain matrices become too close to singular and eventually SCIP aborts.

$ perl writepip5.pl -- +--+++**********--+++-+-++*****+ >last5.pip
$ scip
SCIP> read last5.pip
SCIP> optimize
[...]
LU pivot element is almost zero (< 1e-10) - Basis is numerically singular
LU pivot element is almost zero (< 1e-10) - Basis is numerically singular
54.9s|  6300 |  4457 |  3616k| 574.1 |    83M | 125 | 102 |  23 | 583 | 165k|  4 |   3 |   0 | 1.000000e+00 |      --      |    Inf |   3.20%
55.5s|  6400 |  4511 |  3665k| 572.8 |    83M | 125 | 102 |  23 |   0 | 167k|  0 |   3 |   0 | 1.000000e+00 |      --      |    Inf |   3.22%
56.6s|  6500 |  4587 |  3734k| 574.7 |    85M | 125 | 102 |  23 | 196 | 171k|  3 |   3 |   0 | 1.000000e+00 |      --      |    Inf |   3.22%
58.3s|  6600 |  4635 |  3831k| 580.5 |    87M | 125 | 102 |  23 | 202 | 175k|  4 |   3 |   0 | 1.000000e+00 |      --      |    Inf |   3.24%
59.5s|  6700 |  4695 |  3911k| 583.9 |    89M | 125 | 102 |  23 | 229 | 178k|  7 |   3 |   0 | 1.000000e+00 |      --      |    Inf |   3.26%
63.4s|  6800 |  4751 |  4163k| 612.4 |    91M | 125 | 102 |  23 | 595 | 185k|  4 |   3 |   0 | 1.000000e+00 |      --      |    Inf |   3.27%
64.1s|  6900 |  4817 |  4194k| 608.0 |    92M | 125 | 102 |  23 | 291 | 187k|  6 |   3 |   0 | 1.000000e+00 |      --      |    Inf |   3.29%
scip: /usr/src/debug/scip/scip-804/src/scip/lp.c:12131: lpSolve: Assertion `!set->lp_checkstability || SCIPsetIsRelGE(set, lp->lpobjval, lp->lpiobjlim)' failed.
Aborted (core dumped)

Finally, we also provide a \(\verb|Mathematica|\) notebook which contains the setup for testing representability of \(s_*\) via cylindrical algebraic decomposition. We also determine certain symmetries of the polynomial system which allow us to fix the signs or even an ordering of some entries. However, the CAD computation still runs out of memory eventually.

A lower bound for \(\mathrm{PR}_4\)

We show that if a sign pattern \(s\) is representable, then all its minors are representable and that the number of connected components of the representation space of \(s\) is at least as high as that of any of its minors. Using that all sign patterns on \(n = 4\) are representable and we have the exact numbers of connected components of all sign patterns on \(n = 3\), we can obtain a lower bound on the number of connected components of \(\mathrm{PR}_4\) by summing over the lower bounds for the 24 canonical representatives times their orbit sizes. The script dimh0-pr4.pl does this using a hardcoded table generated by dimh0-pr3.pl:

$ perl dimh0-pr4.pl
+++++++++------+: at least 4 x 96 = 384
+++++++--++----+: at least 2 x 12 = 24
+++++++++-------: at least 16 x 64 = 1024
+++++++-+-------: at least 4 x 16 = 64
++++++++++++++--: at least 4 x 64 = 256
++++++++++++++++: at least 1 x 16 = 16
+++++++++++++++-: at least 4 x 16 = 64
+++++++--+-----+: at least 4 x 48 = 192
+++++----------+: at least 16 x 2 = 32
++++++---------+: at least 16 x 16 = 256
++++++++++-----+: at least 4 x 48 = 192
++++++++++------: at least 16 x 48 = 768
+++++++++++-----: at least 16 x 16 = 256
+++++++--------+: at least 16 x 48 = 768
+++++++++++----+: at least 4 x 8 = 32
++++++++-------+: at least 16 x 64 = 1024
++++++----+----+: at least 4 x 8 = 32
++++++++--------: at least 16 x 16 = 256
+++++++-+--+----: at least 4 x 16 = 64
+++++++++--+----: at least 4 x 64 = 256
++++++++++-++---: at least 2 x 48 = 96
+++++++++++++---: at least 4 x 96 = 384
++++++++++-+----: at least 4 x 96 = 384
++++++++++++----: at least 16 x 64 = 1024
Total = 7848

Numerical estimate for \(\mathrm{PR}_4\)

Using the Julia package \(\verb|HypersurfaceRegions|\) of Breiding et al. we can get an estimate on the number of connected components of \(\mathrm{PR}_4\) using numerical algebraic geometry. The program linked below invokes the package with the principal minors of a symmetric \(4 \times 4\) matrix with diagonals fixed to \(1\).

It finds the correct number of sign patterns 228 as verified by the SAT solver computation

$ perl signs.pl --positive-singletons --count-only 4
228 compatible sign vectors

In addition to the sign patterns, \(\verb|HypersurfaceRegions|\) computes the number of connected components of each sign-invariant region. Fixing the diagonal entries greatly reduces the computation time and, as shown in the paper, does not hide any hyperoctahedral orbits from our analysis. From this data, we can recover the following estimated number \(24\,352\) for the connected components of \(\mathrm{PR}_4\):

$ perl -MList::Util=sum0 -MPath::Tiny -E '
> sub make_hash {
>     map { split /: / } path(shift)->lines_utf8({ chomp => 1 })
> };
> my %orbs  = make_hash "signs4.log";
> my %comps = make_hash "pr4-regions.log";
> say sum0 map { $orbs{$_} * $comps{$_} } keys %orbs'
24352

Colophon

Project page created: 22/07/2024

Project contributors: Tobias Boege, Jesse Selover and Maksym Zubkov

Software used: Perl (v5.38.0), GANAK (v1.0.1), nbc_minisat_all (v1.0.2.b), CInet::ManySAT (v1.1.1), CInet::Base (v0.10.2), SageMath (v10.4), Mathematica (v13.3), SCIP (v8.0.4) and Julia (v1.10.5) with HypersurfaceRegions (v0.1.0).

System setup used: All computations were performed on a customary Thinkpad T14 with 32 GiB of RAM.

Corresponding author of this page: Tobias Boege, boege@kth.se

License for code of this project page: MIT License (https://spdx.org/licenses/MIT.html)

License for all other content of this project page (text, images, …): CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

Last updated 04/11/2024