Package Documentation

Magma

class neural_semigroups.Magma(cayley_table: Optional[torch.Tensor] = None, cardinality: Optional[int] = None)

an implementation of a magma (a set with a binary operation)

__init__(cayley_table: Optional[torch.Tensor] = None, cardinality: Optional[int] = None)

constructs a new magma

>>> import torch
>>> from neural_semigroups import Magma
>>> seed = torch.manual_seed(11)
>>> Magma(cardinality=2)
tensor([[1, 1],
        [0, 1]])
>>> Magma(torch.tensor([[0, 1], [1, 0]]))
tensor([[0, 1],
        [1, 0]])
>>> Magma()
Traceback (most recent call last):
    ...
ValueError: at least one argument must be given
>>> Magma([[0, 1]])
Traceback (most recent call last):
    ...
ValueError: cayley_table must be a `torch.Tensor` of shape (cardinality, cardinality)
>>> Magma([[0]], cardinality=2)
Traceback (most recent call last):
    ...
ValueError: cayley_table must be a `torch.Tensor` of shape (cardinality, cardinality)
Parameters
  • cayley_table – a Cayley table for a magma. If not provided, a random table is generated.

  • cardinality – the number of elements in a magma; used to generate a random one

property cardinality: int

number of elements in a magma

property has_inverses: bool

check whether the equations \(ax=b\) and \(xa=b\) are solvable

property identity: int

find an identity element in a Cayley table

Returns

the index of the identity element or -1 if there is no identity

property is_associative: bool

check associativity of a Cayley table

Returns

whether the input table is associative or not

property is_commutative: bool

check commutativity of a Cayley table

Returns

whether the input table is commutative or not
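
A minimal usage sketch of these properties; the concrete outputs assume the two-element group table and the return types stated above:

>>> group_of_two = Magma(torch.tensor([[0, 1], [1, 0]]))
>>> group_of_two.identity  # element 0 acts as the identity in this table
0
>>> group_of_two.is_associative and group_of_two.is_commutative
True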

property next_magma: neural_semigroups.magma.Magma

goes to the magma with the next Cayley table in lexicographical order

>>> Magma(torch.tensor([[0, 1], [1, 0]])).next_magma
tensor([[0, 1],
        [1, 1]])
>>> Magma(torch.tensor([[0, 1], [1, 1]])).next_magma
tensor([[1, 0],
        [0, 0]])
>>> Magma(torch.tensor([[1, 1], [1, 1]])).next_magma
Traceback (most recent call last):
    ...
ValueError: there is no next magma!
Returns

another magma

property probabilistic_cube: torch.Tensor

a 3d array \(a\) where \(a_{ijk}=P\left\{e_ie_j=e_k\right\}\)

Returns

a probabilistic cube representation of a Cayley table
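
A short sketch for a concrete table; the shape \(n\times n\times n\) follows from the definition of \(a_{ijk}\) above, and for a deterministic table each distribution over \(k\) is one-hot:

>>> cube = Magma(torch.tensor([[0, 1], [1, 0]])).probabilistic_cube
>>> cube.shape
torch.Size([2, 2, 2])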

Cyclic Group

class neural_semigroups.CyclicGroup(cardinality: int)

finite cyclic group

__init__(cardinality: int)
Parameters

cardinality – number of elements in a cyclic group
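
A brief sketch, assuming the group elements are residues under addition modulo the cardinality and that the class inherits Magma's tensor representation:

>>> CyclicGroup(3)
tensor([[0, 1, 2],
        [1, 2, 0],
        [2, 0, 1]])
>>> CyclicGroup(3).is_associative
True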

Cayley Database

class neural_semigroups.CayleyDatabase(cardinality: int, database_filename: Optional[str] = None, data_path: str = '/home/docs/neural-semigroups-data')

a database of Cayley tables with different utility functions

__init__(cardinality: int, database_filename: Optional[str] = None, data_path: str = '/home/docs/neural-semigroups-data')
Parameters
  • cardinality – the number of elements in underlying magmas

  • database_filename – a full path to a pre-generated Cayley database. If None, data from the smallsemi package is used.

  • data_path – a valid path to use as a permanent data storage

augment_by_equivalent_tables() None

for every Cayley table in the previously loaded database, adds all of its equivalent tables to the database

fill_in_with_model(cayley_table: List[List[int]]) Tuple[torch.Tensor, torch.Tensor]

get a list of possible completions of a partially filled Cayley table (unknown entries are filled by -1) using a machine learning model

Parameters

cayley_table – a partially filled Cayley table (unknown entries are filled by -1)

Returns

a tuple: (most probable completion, probabilistic cube)
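
A hedged usage sketch; the model file name below is purely illustrative, and a model must be loaded before completion (see load_model below):

>>> cayley_db = CayleyDatabase(cardinality=2)
>>> cayley_db.load_model("/tmp/pretrained.model")  # hypothetical path to a pre-trained model
>>> completion, cube = cayley_db.fill_in_with_model([[0, -1], [-1, 1]])  # -1 marks unknown cells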

load_model(filename: str) None

load pre-trained PyTorch model

Parameters

filename – where to load the model from

property model: torch.nn.Module
Returns

pre-trained Torch model

search_database(cayley_table: List[List[int]]) List[torch.Tensor]

get a list of possible completions of a partially filled Cayley table (unknown entries are filled by -1)

Parameters

cayley_table – a partially filled Cayley table (unknown entries are filled by -1)

Returns

a list of Cayley tables
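
A minimal sketch of a search call; the partial table is illustrative and uses -1 for the unknown cells as described above:

>>> cayley_db = CayleyDatabase(cardinality=2)
>>> completions = cayley_db.search_database([[0, -1], [-1, 1]])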

testing_report(max_level: int = -1) torch.Tensor

this function:

  • takes 1000 random Cayley tables from the database (if there are fewer tables, it simply takes all of them)

  • for each Cayley table generates max_level puzzles

  • each puzzle is created from a table by omitting several cell values

  • for each table the function omits 1, 2, and so on up to max_level cells

  • each puzzle is given to a pre-trained model of that database

  • if the model returns an associative table (not necessarily the original one), it is considered a successful solution

Parameters

max_level – up to how many cells to omit when creating a puzzle; when not provided or explicitly set to -1 it defaults to the total number of cells in a table

Returns

statistics of solved puzzles split by level of difficulty (the number of cells omitted)

train_test_split(train_size: int, validation_size: int) Tuple[neural_semigroups.cayley_database.CayleyDatabase, neural_semigroups.cayley_database.CayleyDatabase, neural_semigroups.cayley_database.CayleyDatabase]

split a database of Cayley tables into three: train, validation, and test

Parameters
  • train_size – number of tables in a train set

  • validation_size – number of tables in a validation set

Returns

a triple of distinct Cayley tables databases: (train, validation, test)
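
A short sketch of a split; the sizes are illustrative and must not exceed the number of tables in the loaded database:

>>> cayley_db = CayleyDatabase(cardinality=2)
>>> train, validation, test = cayley_db.train_test_split(train_size=2, validation_size=1)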

Denoising Autoencoder for Magmas

class neural_semigroups.MagmaDAE(*args: Any, **kwargs: Any)

Denoising Autoencoder for probabilistic Cayley cubes of magmas

__init__(cardinality: int, hidden_dims: List[int], dropout_rate: float, do_reparametrization: bool = False)
Parameters
  • cardinality – the number of elements in a magma

  • hidden_dims – a list of sizes of hidden layers of the encoder and the decoder

  • dropout_rate – the fraction of cells of the Cayley table to replace with uniform random variables

  • do_reparametrization – if True, adds a reparameterization trick
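
A construction sketch; the hidden layer sizes and dropout rate below are illustrative, not recommended defaults:

>>> dae = MagmaDAE(
...     cardinality=4,
...     hidden_dims=[64, 32],
...     dropout_rate=0.5,
...     do_reparametrization=False,
... )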

decode(encoded_input: torch.Tensor) torch.Tensor

decode an embedding vector into a tensor whose size matches the input

Parameters

encoded_input – an embedding vector

Returns

a vector of values between 0 and 1 (interpretable as probabilities)

encode(corrupted_input: torch.Tensor) torch.Tensor

represent the input cube as an embedding vector

Parameters

corrupted_input – a tensor with two indices

Returns

a tensor with two indices and non-negative values

forward(cayley_cubes: torch.Tensor) torch.Tensor

forward pass inherited from Module

Parameters

cayley_cubes – a batch of probabilistic representations of magmas

Returns

autoencoded probabilistic representations of magmas

reparameterize(mu_and_sigma: torch.Tensor) torch.Tensor

apply the reparameterization trick

Parameters

mu_and_sigma – a vector of expectations and standard deviations

Returns

sample from a distribution
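
The trick itself in a standalone sketch; how mu_and_sigma is split inside the model is an assumption made here for illustration:

>>> mu_and_sigma = torch.tensor([[0.0, 0.0, 1.0, 1.0]])  # two means followed by two standard deviations
>>> mu, sigma = mu_and_sigma.chunk(2, dim=-1)
>>> sample = mu + sigma * torch.randn_like(sigma)  # differentiable with respect to mu and sigma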

Associator Loss

class neural_semigroups.AssociatorLoss(*args: Any, **kwargs: Any)

probabilistic associator loss

__init__(discrete: bool = False)
Parameters

discrete – when False, the KL divergence is used to measure associativity in a continuous way; when True, returns 1 for associative and 0 for non-associative samples.

forward(cayley_cubes: torch.Tensor) torch.Tensor

finds a probabilistic associator of a given probabilistic Cayley cube

First, we build two 4-index tensors representing probability distributions of products \(a\left(bc\right)\) and \(\left(ab\right)c\), respectively:

\(T_{ijkl}=P\left\{e_i\left(e_je_k\right)=e_l\right\}= \sum\limits_{m=1}^nP\left\{e_ie_m=e_l\vert e_je_k=e_m\right\} P\left\{e_je_k=e_m\right\}=\sum\limits_{m=1}^na_{iml}a_{jkm}\)

and

\(T^\prime_{ijkl}=P\left\{\left(e_ie_j\right)e_k=e_l\right\}= \sum\limits_{m=1}^nP\left\{e_me_k=e_l\vert e_ie_j=e_m\right\} P\left\{e_ie_j=e_m\right\}=\sum\limits_{m=1}^na_{mkl}a_{ijm}\)

Then we calculate the Kullback-Leibler divergence between \(T_{ijkl}\) and \(T^\prime_{ijkl}\) to find a continuous measure of associativity of the input table.
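
In PyTorch terms both tensors are single contractions of the cube with itself; the einsum below is a sketch of the formulas above, not necessarily the library's implementation:

>>> a = Magma(torch.tensor([[0, 1], [1, 0]])).probabilistic_cube.unsqueeze(0)  # a batch of one cube
>>> t_right = torch.einsum("biml,bjkm->bijkl", a, a)  # T: products a(bc)
>>> t_left = torch.einsum("bmkl,bijm->bijkl", a, a)   # T': products (ab)c
>>> bool(torch.allclose(t_left, t_right))  # an associative table gives zero divergence
True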

Parameters

cayley_cubes – a batch of probabilistic Cayley cubes

Returns

the probabilistic associator

utils

A collection of different functions used by other modules.

neural_semigroups.utils.random_semigroup(dim: int, maximal_tries: int) Tuple[bool, torch.Tensor]

randomly search for a semigroup Cayley table. Not recommended for dim > 4

Parameters
  • dim – number of elements in a semigroup

  • maximal_tries – how many times to try at most

Returns

a pair (whether the Cayley table is associative, a Cayley table of a magma)
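
A usage sketch; the number of tries is illustrative:

>>> from neural_semigroups.utils import random_semigroup
>>> found, cayley_table = random_semigroup(dim=3, maximal_tries=1000)  # `found` tells whether the table is associative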

neural_semigroups.utils.check_filename(filename: str) int

checks a filename and raises an exception if it is incorrect

Parameters

filename – filename to check

Returns

magma cardinality extracted from the filename

neural_semigroups.utils.check_smallsemi_filename(filename: str) int

checks a filename from the smallsemi package and raises an exception if it is incorrect

Parameters

filename – filename from a smallsemi package to check

Returns

magma cardinality extracted from the filename

neural_semigroups.utils.get_magma_by_index(cardinality: int, index: int) neural_semigroups.magma.Magma

find a magma by its index in the lexicographical order of Cayley tables

Parameters
  • cardinality – the number of elements in a magma

  • index – an index of magma in a lexicographical order

Returns

a magma with a given index
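
A small sketch, assuming indices are zero-based and the order starts from the all-zero table (consistent with next_magma above):

>>> from neural_semigroups.utils import get_magma_by_index
>>> get_magma_by_index(cardinality=2, index=0)
tensor([[0, 0],
        [0, 0]])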

neural_semigroups.utils.import_smallsemi_format(lines: List[bytes]) torch.Tensor

imports lines in the format used by the smallsemi GAP package. Format description:

  • filename is of the form data[n].gl.gz, \(1\leq n\leq 7\)

  • lines are separated by a pair of symbols \r\n

  • there are exactly \(n^2\) lines in a file

  • the first line is a header starting with ‘#’ symbol

  • each line is a string of \(N\) digits from \(0\) to \(n-1\)

  • \(N\) is the number of semigroups in the database

  • each column represents a serialised Cayley table

  • the database contains only cells starting from the second

  • the first cell of each Cayley table is assumed to be filled with 0
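
A toy illustration of this layout (the byte content below is hypothetical, not taken from a real data file, and whether line terminators are included is an assumption): for \(n=2\) there is one header line plus \(n^2-1=3\) data lines, and each column, prefixed with the implicit leading 0, encodes one Cayley table:

>>> from neural_semigroups.utils import import_smallsemi_format
>>> lines = [b"# header\r\n", b"01\r\n", b"01\r\n", b"11\r\n"]  # hypothetical two-table file
>>> tables = import_smallsemi_format(lines)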

Parameters

lines – lines read from a file of smallsemi format

Returns

a tensor of Cayley tables

neural_semigroups.utils.get_equivalent_magmas(cayley_tables: torch.Tensor) torch.Tensor

given a batch of Cayley tables, generates the Cayley tables of isomorphic and anti-isomorphic magmas

Parameters

cayley_tables – a batch of Cayley tables

Returns

a batch of Cayley tables of isomorphic and anti-isomorphic magmas

neural_semigroups.utils.download_file_from_url(url: str, filename: str, buffer_size: int = 1024) None

downloads a file from the Web to a specified destination

>>> download_file_from_url("https://python.org/", "/tmp/test.html")
>>> import subprocess
>>> subprocess.run("ls /tmp/test.html", shell=True).returncode
0
Parameters
  • url – a valid HTTP URL

  • filename – a valid filename

  • buffer_size – a number of bytes to read from URL at once

neural_semigroups.utils.download_smallsemi_data(data_path: str) None

downloads, unzips and moves smallsemi data

Parameters

data_path – data storage path

neural_semigroups.utils.print_report(totals: torch.Tensor) pandas.DataFrame

print report in a pretty format

>>> totals = torch.tensor([[4, 4], [0, 1]])
>>> print_report(totals)
       puzzles  solved  (%)
level
1            4       0    0
2            4       1   25
Parameters

totals – a table with two rows:

  • a row with the total number of puzzles per level

  • a row with the numbers of correctly solved puzzles

Returns

the report in a form of pandas.DataFrame

neural_semigroups.utils.get_newest_file(dir_path: str) str

get the last modified file from a directory

>>> from pathlib import Path
>>> from shutil import rmtree
>>> from os import makedirs
>>> rmtree("/tmp/tmp/", ignore_errors=True)
>>> makedirs("/tmp/tmp/")
>>> Path("/tmp/tmp/one").touch()
>>> from time import sleep
>>> sleep(0.01)
>>> Path("/tmp/tmp/two").touch()
>>> get_newest_file("/tmp/tmp/")
'/tmp/tmp/two'
Parameters

dir_path – a directory path

Returns

the last modified file’s name

neural_semigroups.utils.get_two_indices_per_sample(batch_size: int, cardinality: int) Tuple[torch.Tensor, torch.Tensor, torch.Tensor]

generates all possible combinations of two indices for each sample in a batch

>>> get_two_indices_per_sample(1, 2)
(tensor([0, 0, 0, 0]), tensor([0, 0, 1, 1]), tensor([0, 1, 0, 1]))
Parameters
  • batch_size – number of samples in a batch

  • cardinality – number of possible values of an index

neural_semigroups.utils.make_discrete(cayley_cubes: torch.Tensor) torch.Tensor

transforms a batch of probabilistic Cayley cubes in the following way:

  • maximal probabilities in the last dimension become ones

  • all other probabilities become zeros

>>> make_discrete(torch.tensor([
...    [[[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.2, 0.8]]],
...    [[[0.7, 0.3], [0.3, 0.7]], [[0.7, 0.3], [0.3, 0.7]]],
... ]))
tensor([[[[1., 0.],
          [0., 1.]],

         [[1., 0.],
          [0., 1.]]],


        [[[1., 0.],
          [0., 1.]],

         [[1., 0.],
          [0., 1.]]]])
Parameters

cayley_cubes – a batch of probabilistic cubes representing Cayley tables

Returns

a batch of probabilistic cubes filled in with 0 or 1

neural_semigroups.utils.load_data_and_labels_from_file(database_filename: str) Tuple[torch.Tensor, torch.Tensor]

reads data from a special file format

Parameters

database_filename – a special file to read data from

Returns

(a tensor with Cayley tables, a tensor of their labels)

neural_semigroups.utils.load_data_and_labels_from_smallsemi(cardinality: int, data_path: str) Tuple[torch.Tensor, torch.Tensor]

loads data from the smallsemi package

Parameters
  • cardinality – which smallsemi file to use

  • data_path – where to search for smallsemi data

Returns

(a tensor with Cayley tables, a tensor of their labels)
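
A usage sketch; the data path is illustrative, and the smallsemi data is assumed to have been fetched there beforehand (for example with download_smallsemi_data):

>>> from neural_semigroups.utils import load_data_and_labels_from_smallsemi
>>> tables, labels = load_data_and_labels_from_smallsemi(
...     cardinality=2, data_path="/tmp/smallsemi-data"
... )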