Geometry of Polynomial Neural Networks

This page contains auxiliary files to the paper:
Kaie Kubjas, Jiayi Li, and Maximilian Wiesmann:  Geometry of polynomial neural networks
In: Algebraic statistics, 15 (2024) 2, p. 295-328
DOI: 10.2140/astat.2024.15.295
ARXIV: https://arxiv.org/abs/2402.00949
CODE: https://mathrepo.mis.mpg.de/PolynomialNeuralNetworks

ABSTRACT: We study the expressivity and learning process for polynomial neural networks (PNNs) with monomial activation functions. The weights of the network parametrize the neuromanifold. In this paper, we study neuromanifolds using tools from algebraic geometry: we give explicit descriptions as semialgebraic sets and characterize their Zariski closures, called neurovarieties. We study their dimension and associate an algebraic degree, the learning degree, to the neurovariety. The dimension serves as a geometric measure for the expressivity of the network, the learning degree is a measure for the complexity of training the network and provides upper bounds on the number of learnable functions. These theoretical results are accompanied with experiments.

The paper features several computations and experiments; the source code for them is provided below.

Neurovarieties

In Example 4.4, we describe a procedure to obtain defining equations of the neurovariety \(\mathcal{V}_{(3,3,3),2}\subseteq \mathbb{P}^5\times \mathbb{P}^5\times \mathbb{P}^5\). To this end, we use the symbolic algebra package Oscar, which is available in the Julia programming language. The routine is implemented in Example_4_4.jl.

General-purpose code for computing defining ideals of (low-dimensional) neurovarieties via elimination is provided in neurovariety.jl.
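As a toy illustration of what elimination produces (this is not the routine in neurovariety.jl, and the architecture is chosen here only for illustration), take \(d=(2,1,1)\), \(r=2\). The network computes \(a(w_1x_1+w_2x_2)^2 = c_0x_1^2+c_1x_1x_2+c_2x_2^2\), and eliminating the weights leaves the single equation \(c_1^2-4c_0c_2=0\). The Python sketch below, with hypothetical function names, checks in exact rational arithmetic that this equation vanishes on the parametrization.

```python
from fractions import Fraction
from itertools import product


def coefficients(w1, w2, a):
    """Coefficients (c0, c1, c2) of a*(w1*x1 + w2*x2)**2 in the basis x1^2, x1*x2, x2^2."""
    return a * w1**2, 2 * a * w1 * w2, a * w2**2


def discriminant(c0, c1, c2):
    """Candidate defining equation of the toy neurovariety: c1^2 - 4*c0*c2."""
    return c1**2 - 4 * c0 * c2


# Evaluate the equation on a grid of rational weights (exact arithmetic, no rounding).
values = [Fraction(n, 3) for n in range(-3, 4)]
on_variety = all(
    discriminant(*coefficients(w1, w2, a)) == 0
    for w1, w2, a in product(values, repeat=3)
)
print(on_variety)             # True
print(discriminant(1, 0, 1))  # -4: the quadric x1^2 + x2^2 is not on the variety
```

A check of this kind only confirms that an equation holds on the image of the parametrization; showing that it generates the whole ideal still requires an elimination computation as in neurovariety.jl.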

Backpropagation & Dimension

As described in Section 6.3, the backpropagation routine (Algorithm 1) can be used to compute the dimension of neurovarieties efficiently. We provide a corrected version of the implementation by Trager et al., which in turn is based on an implementation by Nielsen. The code runs in SageMath.

dim_backprop.py
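The idea behind such a dimension computation is that the dimension of the image of a polynomial parametrization equals the rank of its Jacobian at generic parameters. The following is a minimal sketch of that idea only, not the code in dim_backprop.py: it covers shallow networks with a single output, \(f(x)=\sum_j a_j (w_j\cdot x)^r\), writes the Jacobian down by hand instead of using backpropagation, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod


def exponents(n_vars, degree):
    """All exponent vectors of monomials of the given degree in n_vars variables."""
    return [e for e in product(range(degree + 1), repeat=n_vars) if sum(e) == degree]


def jacobian(W, a, r):
    """Jacobian of (a, W) -> coefficients of f(x) = sum_j a[j] * (W[j] . x)**r.

    Rows are indexed by monomials; columns by a[0], a[1], ... followed by
    the entries of W, row by row.
    """
    d0 = len(W[0])
    rows = []
    for e in exponents(d0, r):
        multinom = factorial(r) // prod(factorial(k) for k in e)
        # derivatives with respect to the output weights a[j]
        row = [multinom * prod(w[i] ** e[i] for i in range(d0)) for w in W]
        # derivatives with respect to the first-layer weights W[j][k]
        for j, w in enumerate(W):
            for k in range(d0):
                if e[k] == 0:
                    row.append(Fraction(0))
                else:
                    rest = prod(w[i] ** e[i] for i in range(d0) if i != k)
                    row.append(a[j] * multinom * e[k] * w[k] ** (e[k] - 1) * rest)
        rows.append(row)
    return rows


def rank(matrix):
    """Rank via Gaussian elimination over the rationals (exact)."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk


def dimension_at(W, a, r):
    """Jacobian rank at the sample weights (W, a), converted to exact fractions."""
    W = [[Fraction(x) for x in row] for row in W]
    a = [Fraction(x) for x in a]
    return rank(jacobian(W, a, r))


print(dimension_at([[2, 3]], [5], 2))             # 2  (widths (2,1,1), r = 2)
print(dimension_at([[1, 2], [3, 5]], [1, 1], 2))  # 3  (widths (2,2,1), r = 2)
```

The rank at a single sample point is only a lower bound for the generic rank; in practice one evaluates at randomly chosen weights, where the generic rank is attained with high probability.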

To run experiments over a large number of architectures, it is convenient to abandon a dimension computation that has not finished within a given time. The following code stops waiting for a computation once a given timeout has elapsed and saves the results in a CSV file.

import csv
import time
import threading

# compute_dimension is assumed to be defined already, e.g. by loading dim_backprop.py

exit_event = threading.Event()

# Auxiliary function: run compute_dimension in a worker thread and wait at most
# `timeout` seconds for it. Python threads cannot be killed, so after a timeout
# the worker keeps running in the background; only the waiting stops.
def run_with_timeout(network_widths, network_exponent, timeout):
    result = [None] * 6  # placeholder for the result
    elapsed_time = None

    def target():
        nonlocal result, elapsed_time
        start_time = time.time()
        result = compute_dimension(network_widths, network_exponent)
        elapsed_time = time.time() - start_time

    # daemon=True so that an unfinished worker does not keep the interpreter alive
    thread = threading.Thread(target=target, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        # signal the timeout to the caller via the module-level exit_event
        exit_event.set()
        return ["Timeout"] * 6, timeout
    return result, elapsed_time


def run_experiment(run_arguments, timeout, filename_result, filename_timeout):
    with open(filename_result, 'w', newline='') as csvfile:
        csvwriter = csv.writer(csvfile)
        # Write header
        header = ['Widths', 'Activation', 'Ambient Dimension', 'Expected Dimension',
                'Dimension', 'Defect', 'Time Taken (seconds)']
        csvwriter.writerow(header)

        with open(filename_timeout, 'w', newline='') as timeoutfile:
            timeoutwriter = csv.writer(timeoutfile)
            timeoutwriter.writerow(['Widths', 'Activation'])

            for args in run_arguments:
                result, elapsed_time = run_with_timeout(args[0], args[1], timeout)
                if exit_event.is_set():
                    # timed out: log the architecture and reset the flag
                    exit_event.clear()
                    timeoutwriter.writerow([args[0], args[1]])
                    continue
                # entries 2..5 of the result fill the four dimension columns of the header
                row = [args[0], args[1]] + list(result[2:6]) + [elapsed_time]
                csvwriter.writerow(row)

Then, for example, to compute the dimensions of the neurovarieties \(\mathcal{V}_{(3,3,3),2}\) and \(\mathcal{V}_{(2,2,2,2),3}\) with a timeout of 60 s, run the following commands:

run_args = [([3,3,3],2), ([2,2,2,2],3)]
run_experiment(run_args, 60, 'results.csv', 'timeout.csv')

The results of the dimension computations are stored in the file results.csv; any architecture whose computation exceeds the 60 s timeout is written to timeout.csv instead. Some results obtained in this way can be found here:

results_4_layer.csv

results_5_layer.csv
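Because a Python thread cannot be terminated from outside, a thread-based timeout stops waiting for a computation but leaves it running. If long computations need to be stopped for good, one option is to run each of them in a separate process. The following is a minimal sketch of that alternative using only the standard library; the function name is hypothetical, and passing the computation as a source string to a child interpreter is an assumption made to keep the example self-contained (inside SageMath the child would have to be an interpreter that can load the dimension code).

```python
import subprocess
import sys


def run_snippet_with_hard_timeout(code, timeout):
    """Run Python source `code` in a child interpreter.

    Returns the child's standard output, or None if the child was killed
    because it did not finish within `timeout` seconds.
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,  # on expiry the child is killed, then TimeoutExpired is raised
            check=True,       # raise CalledProcessError if the child exits with an error
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.stdout.strip()


print(run_snippet_with_hard_timeout("print(6 * 7)", timeout=30))                # 42
print(run_snippet_with_hard_timeout("import time; time.sleep(60)", timeout=1))  # None
```

The price of this design is that results have to be passed back through standard output (or a file) rather than as Python objects.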

An evaluation of these results is most conveniently done using the Python Pandas package as in the following two Jupyter notebooks, where we verify Conjecture 5.7 for the dimensions computed above:

dim_4_layer.ipynb

dim_5_layer.ipynb

Learning Experiments

In Section 6.1 we describe a machine learning experiment counting the number of functions a polynomial neural network with architecture \(d=(2,2,3),~r=2\) can learn. The code for this experiment is available as a Google Colab Notebook.

Project page created: 05/01/2024.

Project contributors: Kaie Kubjas, Jiayi Li, Maximilian Wiesmann.

Corresponding author of this page: Maximilian Wiesmann, wiesmann@mis.mpg.de.

Code written by: Jiayi Li, Maximilian Wiesmann.

Software used: Julia (Version 1.9.1), Oscar (Version 0.12.1), SageMath (Version 9.7, using Python 3.10.5).

System setup used: MacBook Pro with macOS Monterey 12.5, Processor Apple M1 Pro, Memory 16 GB LPDDR5.

License for code of this project page: MIT License (https://spdx.org/licenses/MIT.html).

License for all other content of this project page (text, images, …): CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/).

Last updated 08/01/2024.