Package Documentation¶

Magma¶

class neural_semigroups.Magma(cayley_table: Optional[torch.Tensor] = None, cardinality: Optional[int] = None)¶

an implementation of a magma (a set with a binary operation)

__init__(cayley_table: Optional[torch.Tensor] = None, cardinality: Optional[int] = None)¶

constructs a new magma

>>> seed = torch.manual_seed(11)
>>> Magma(cardinality=2)
tensor([[1, 1],
        [0, 1]])
>>> Magma(torch.tensor([[0, 1], [1, 0]]))
tensor([[0, 1],
        [1, 0]])
>>> Magma()
Traceback (most recent call last):
    ...
ValueError: at least one argument must be given
>>> Magma([[0, 1]])
Traceback (most recent call last):
    ...
ValueError: cayley_table must be a `torch.Tensor` of shape (cardinality, cardinality)
>>> Magma([[0]], cardinality=2)
Traceback (most recent call last):
    ...
ValueError: cayley_table must be a `torch.Tensor` of shape (cardinality, cardinality)

Parameters

cayley_table – a Cayley table for a magma. If not provided, a random table is generated.
cardinality – the number of elements in the magma; used to generate a random Cayley table when cayley_table is not provided

property cardinality: int¶: number of elements in a magma

property has_inverses: bool¶: check whether there are solutions of equations \(ax=b\) and \(xa=b\)
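
One way to read the condition above: the equations \(ax=b\) and \(xa=b\) are solvable for all \(a\) and \(b\) exactly when every row and every column of the Cayley table contains every element. The following is a minimal plain-Python sketch of that reading on a nested list; the helper name `has_inverses` and the list-based table are assumptions made for illustration, not the library's implementation.

```python
from typing import List


def has_inverses(table: List[List[int]]) -> bool:
    """Illustrative check: are ax = b and xa = b solvable for all a, b?"""
    n = len(table)
    elements = set(range(n))
    # ax = b has a solution for every b iff row a contains every element
    rows_ok = all(set(row) == elements for row in table)
    # xa = b has a solution for every b iff column a contains every element
    columns_ok = all(
        {table[i][j] for i in range(n)} == elements for j in range(n)
    )
    return rows_ok and columns_ok
```

For example, the table `[[0, 1], [1, 0]]` passes this check, while the constant table `[[0, 0], [0, 0]]` does not.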

property identity: int¶

find an identity element in a Cayley table

Returns: the index of the identity element or -1 if there is no identity

property is_associative: bool¶

check associativity of a Cayley table

Returns: whether the input table is associative or not

property is_commutative: bool¶

check commutativity of a Cayley table

Returns: whether the input table is commutative or not
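
The identity, associativity and commutativity checks described above can be sketched directly from their definitions. This is a brute-force illustration on nested lists, assuming a two-sided identity is meant; the function names are hypothetical and the library itself works with `torch.Tensor` tables.

```python
from typing import List

Table = List[List[int]]


def find_identity(table: Table) -> int:
    """Return the index of a two-sided identity, or -1 if there is none."""
    n = len(table)
    for e in range(n):
        if all(table[e][x] == x and table[x][e] == x for x in range(n)):
            return e
    return -1


def is_associative(table: Table) -> bool:
    """Check (ab)c == a(bc) for every triple of elements."""
    n = len(table)
    return all(
        table[table[a][b]][c] == table[a][table[b][c]]
        for a in range(n)
        for b in range(n)
        for c in range(n)
    )


def is_commutative(table: Table) -> bool:
    """Check ab == ba for every pair, i.e. the table is symmetric."""
    n = len(table)
    return all(
        table[a][b] == table[b][a] for a in range(n) for b in range(n)
    )
```

The associativity check costs \(n^3\) table lookups, which is fine for the small cardinalities discussed here.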

property next_magma: neural_semigroups.magma.Magma¶

goes to the next magma Cayley table in their lexicographical order

>>> Magma(torch.tensor([[0, 1], [1, 0]])).next_magma
tensor([[0, 1],
        [1, 1]])
>>> Magma(torch.tensor([[0, 1], [1, 1]])).next_magma
tensor([[1, 0],
        [0, 0]])
>>> Magma(torch.tensor([[1, 1], [1, 1]])).next_magma
Traceback (most recent call last):
    ...
ValueError: there is no next magma!

Returns: another magma
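
The doctest outputs above are consistent with reading the flattened table as a base-\(n\) number and adding one to it. A minimal sketch under that assumption (the name `next_table` and the nested-list representation are illustrative only):

```python
from typing import List


def next_table(table: List[List[int]]) -> List[List[int]]:
    """Return the lexicographically next Cayley table of the same size."""
    n = len(table)
    # flatten row by row; the first cell is the most significant digit
    digits = [cell for row in table for cell in row]
    position = len(digits) - 1
    # propagate the carry from the last cell towards the first one
    while position >= 0 and digits[position] == n - 1:
        digits[position] = 0
        position -= 1
    if position < 0:
        raise ValueError("there is no next magma!")
    digits[position] += 1
    return [digits[i * n:(i + 1) * n] for i in range(n)]
```

The input table is not modified, because the digits are copied into a new list first.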

property probabilistic_cube: torch.Tensor¶

a 3d array \(a\) where \(a_{ijk}=P\left\{e_ie_j=e_k\right\}\)

Returns: a probabilistic cube representation of a Cayley table
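
For a fully known Cayley table every product is certain, so the cube defined above reduces to a one-hot encoding along the last index. A small sketch with nested lists (the helper name `to_cube` is an assumption for illustration):

```python
from typing import List


def to_cube(table: List[List[int]]) -> List[List[List[float]]]:
    """Build a[i][j][k] = 1.0 if e_i e_j = e_k, and 0.0 otherwise."""
    n = len(table)
    return [
        [
            [1.0 if table[i][j] == k else 0.0 for k in range(n)]
            for j in range(n)
        ]
        for i in range(n)
    ]
```

Each innermost list sums to one, so it can be read as a probability distribution over the possible values of the product.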

Cyclic Group¶

class neural_semigroups.CyclicGroup(cardinality: int)¶

finite cyclic group

__init__(cardinality: int)¶

Parameters: cardinality – number of elements in a cyclic group

Cayley Database¶

class neural_semigroups.CayleyDatabase(cardinality: int, database_filename: Optional[str] = None, data_path: str = '/home/docs/neural-semigroups-data')¶

a database of Cayley tables with different utility functions

__init__(cardinality: int, database_filename: Optional[str] = None, data_path: str = '/home/docs/neural-semigroups-data')¶

Parameters

cardinality – the number of elements in underlying magmas
database_filename – a full path to a pre-generated Cayley database. If None, smallsemi data is used.
data_path – a valid path to use as a permanent data storage

augment_by_equivalent_tables() → None¶: for every Cayley table in a previously loaded database adds all of its equivalent tables to the database

fill_in_with_model(cayley_table: List[List[int]]) → Tuple[torch.Tensor, torch.Tensor]¶

get a list of possible completions of a partially filled Cayley table (unknown entries are filled by -1) using a machine learning model

Parameters: cayley_table – a partially filled Cayley table (unknown entries are filled by -1)
Returns: a tuple: (most probable completion, probabilistic cube)

load_model(filename: str) → None¶

load pre-trained PyTorch model

Parameters: filename – where to load the model from

property model: torch.nn.Module¶

Returns: pre-trained Torch model

search_database(cayley_table: List[List[int]]) → List[torch.Tensor]¶

get a list of possible completions of a partially filled Cayley table (unknown entries are filled by -1)

Parameters: cayley_table – a partially filled Cayley table (unknown entries are filled by -1)
Returns: a list of Cayley tables
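
Conceptually, a completion is any stored table that agrees with the partial table on every known cell. The sketch below shows that matching rule on nested lists; the names `matches` and `search` and the in-memory list used as a database are assumptions for illustration, not the library's code.

```python
from typing import List

Table = List[List[int]]


def matches(partial: Table, table: Table) -> bool:
    """True if `table` agrees with `partial` on all cells that are not -1."""
    return all(
        p == -1 or p == t
        for partial_row, table_row in zip(partial, table)
        for p, t in zip(partial_row, table_row)
    )


def search(partial: Table, database: List[Table]) -> List[Table]:
    """Return every table in `database` that completes `partial`."""
    return [table for table in database if matches(partial, table)]
```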

testing_report(max_level: int = -1) → torch.Tensor¶

this function:

takes 1000 random Cayley tables from the database (if there are fewer tables, it simply takes all of them)
for each Cayley table generates max_level puzzles
each puzzle is created from a table by omitting several cell values
for each table the function omits 1, 2, and up to max_level of all cells
each puzzle is given to a pre-trained model of that database
if the model returns an associative table (not necessarily the original one), it is considered a successful solution

Parameters: max_level – up to how many cells to omit when creating a puzzle; when not provided or explicitly set to -1 it defaults to the total number of cells in a table
Returns: statistics of solved puzzles split by level of difficulty (number of cells omitted)
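
The puzzle-generation step can be sketched as follows: for each level, that many randomly chosen cells are replaced by -1. This is an illustration only; the helper names, the seeded `random.Random` generator and the choice to sample cells without replacement are assumptions, not details confirmed by the library.

```python
import random
from typing import List

Table = List[List[int]]


def make_puzzle(table: Table, level: int, rng: random.Random) -> Table:
    """Copy `table` and replace `level` distinct random cells with -1."""
    n = len(table)
    cells = [(i, j) for i in range(n) for j in range(n)]
    puzzle = [list(row) for row in table]
    for i, j in rng.sample(cells, level):
        puzzle[i][j] = -1
    return puzzle


def puzzles_for_table(
    table: Table, max_level: int = -1, seed: int = 0
) -> List[Table]:
    """Create one puzzle per level 1..max_level (-1 means all cells)."""
    if max_level == -1:
        max_level = len(table) ** 2
    rng = random.Random(seed)
    return [
        make_puzzle(table, level, rng) for level in range(1, max_level + 1)
    ]
```

A solver's answer would then be counted as correct whenever the completed table is associative, as described in the list above.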

train_test_split(train_size: int, validation_size: int) → Tuple[neural_semigroups.cayley_database.CayleyDatabase, neural_semigroups.cayley_database.CayleyDatabase, neural_semigroups.cayley_database.CayleyDatabase]¶

split a database of Cayley tables into three: train, validation, and test

Parameters

cayley_db – a database of Cayley tables
train_size – number of tables in a train set
validation_size – number of tables in a validation set

Returns

a triple of distinct Cayley table databases: (train, validation, test)
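
The idea of a three-way split can be illustrated on a plain list of tables: the first `train_size` items form the train set, the next `validation_size` items form the validation set, and the remainder is the test set. Shuffling before slicing and the `seed` argument are assumptions of this sketch; the documentation above does not say how the library picks the tables.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_tables(
    tables: Sequence[T], train_size: int, validation_size: int, seed: int = 0
) -> Tuple[List[T], List[T], List[T]]:
    """Split `tables` into disjoint train, validation and test lists."""
    if train_size + validation_size > len(tables):
        raise ValueError("train and validation sets do not fit")
    shuffled = list(tables)
    # assumption: a seeded shuffle, so that the split is reproducible
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:train_size]
    validation = shuffled[train_size:train_size + validation_size]
    test = shuffled[train_size + validation_size:]
    return train, validation, test
```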

Denoising Autoencoder for Magmas¶

class neural_semigroups.MagmaDAE(*args: Any, **kwargs: Any)¶

Denoising Autoencoder for probability Cayley cubes of magmas

__init__(cardinality: int, hidden_dims: List[int], dropout_rate: float, do_reparametrization: bool = False)¶

Parameters

cardinality – the number of elements in a magma
hidden_dims – a list of sizes of hidden layers of the encoder and the decoder
dropout_rate – what percentage of cells from the Cayley table to substitute with uniform random variables
do_reparametrization – if True, adds a reparameterization trick

decode(encoded_input: torch.Tensor) → torch.Tensor¶

represent an embedding vector as something with size aligned with the input

Parameters: encoded_input – an embedding vector
Returns: a vector of values from 0 to 1 (kind of probabilities)

encode(corrupted_input: torch.Tensor) → torch.Tensor¶

represent input cube as an embedding vector

Parameters: corrupted_input – a tensor with two indices
Returns: some tensor with two indices and non-negative values

forward(cayley_cubes: torch.Tensor) → torch.Tensor¶

forward pass inherited from Module

Parameters: cayley_cubes – a batch of probabilistic representations of magmas
Returns: autoencoded probabilistic representations of magmas

reparameterize(mu_and_sigma: torch.Tensor) → torch.Tensor¶

do a reparametrization trick

Parameters: mu_and_sigma – vector of expectation and standard deviation
Returns: sample from a distribution
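
The reparameterization trick draws a sample as \(z=\mu+\sigma\varepsilon\) with \(\varepsilon\sim N(0,1)\), which keeps the sample differentiable with respect to \(\mu\) and \(\sigma\). The sketch below shows only the arithmetic on plain lists; the layout of `mu_and_sigma` (first half expectations, second half standard deviations) is an assumption made for this example and is not stated in the documentation.

```python
import random
from typing import List


def reparameterize(mu_and_sigma: List[float], rng: random.Random) -> List[float]:
    """Sample z = mu + sigma * eps with eps drawn from N(0, 1)."""
    half = len(mu_and_sigma) // 2
    # assumed layout: [mu_1, ..., mu_k, sigma_1, ..., sigma_k]
    mu, sigma = mu_and_sigma[:half], mu_and_sigma[half:]
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
```

With all standard deviations set to zero the sample collapses to the expectation vector.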

Associator Loss¶

class neural_semigroups.AssociatorLoss(*args: Any, **kwargs: Any)¶

probabilistic associator loss

__init__(discrete: bool = False)¶

Parameters: discrete – when False, the KL divergence is used to measure associativity in a continuous way; when True, returns 1 for associative and 0 for non-associative samples.

forward(cayley_cubes: torch.Tensor) → torch.Tensor¶

finds a probabilistic associator of a given probabilistic Cayley cube

First, we build two 4-index tensors representing probability distributions of products \(a\left(bc\right)\) and \(\left(ab\right)c\), respectively:

\(T_{ijkl}=P\left\{e_i\left(e_je_k\right)=e_l\right\}= \sum\limits_{m=1}^nP\left\{e_ie_m=e_l\vert e_je_k=e_m\right\} P\left\{e_je_k=e_m\right\}=\sum\limits_{m=1}^na_{iml}a_{jkm}\)

and

\(T\prime_{ijkl}=P\left\{\left(e_ie_j\right)e_k=e_l\right\}= \sum\limits_{m=1}^nP\left\{e_me_k=e_l\vert e_ie_j=e_m\right\} P\left\{e_ie_j=e_m\right\}=\sum\limits_{m=1}^na_{mkl}a_{ijm}\)

Then we calculate Kullback-Leibler divergence between \(T_{ijkl}\) and \(T\prime_{ijkl}\) to find a continuous measure of associativity of the input table.

Parameters: cayley_cubes – a batch of probabilistic Cayley cubes
Returns: the probabilistic associator
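
The two sums above can be evaluated directly for a single cube. The following plain-Python sketch does so and compares the results with a Kullback-Leibler sum; the `eps` smoothing term, the direction of the divergence and the summation over all entries are assumptions of this example, and the helper names are hypothetical.

```python
import math
from itertools import product
from typing import List

Cube = List[List[List[float]]]


def one_hot_cube(table: List[List[int]]) -> Cube:
    """Deterministic cube: a[i][j][k] = 1.0 iff e_i e_j = e_k."""
    n = len(table)
    return [
        [[1.0 if table[i][j] == k else 0.0 for k in range(n)] for j in range(n)]
        for i in range(n)
    ]


def associator_tensors(a: Cube):
    """Return (T, T') with T for a(bc) and T' for (ab)c."""
    r = range(len(a))
    # T[i][j][k][l] = sum_m a[i][m][l] * a[j][k][m]
    t = [[[[sum(a[i][m][l] * a[j][k][m] for m in r) for l in r]
           for k in r] for j in r] for i in r]
    # T'[i][j][k][l] = sum_m a[m][k][l] * a[i][j][m]
    t_prime = [[[[sum(a[m][k][l] * a[i][j][m] for m in r) for l in r]
                 for k in r] for j in r] for i in r]
    return t, t_prime


def kl_divergence(p, q, eps: float = 1e-12) -> float:
    """Sum of p * log(p / q) over all entries, smoothed by eps."""
    n = len(p)
    return sum(
        p[i][j][k][l] * math.log((p[i][j][k][l] + eps) / (q[i][j][k][l] + eps))
        for i, j, k, l in product(range(n), repeat=4)
    )
```

For the cube of an associative table the two tensors coincide and the divergence is zero; for a non-associative table at least one distribution differs and the divergence is positive.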

utils¶

A collection of different functions used by other modules.

neural_semigroups.utils.random_semigroup(dim: int, maximal_tries: int) → Tuple[bool, torch.Tensor]¶

randomly search for a semigroup Cayley table. Not recommended to use with dim > 4

Parameters

dim – number of elements in a semigroup
maximal_tries – how many times to try at most

Returns

a pair (whether the Cayley table is associative, a Cayley table of a magma)
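
A random search of this kind can be sketched as: draw a uniformly random table, test associativity, and stop at the first success or after `maximal_tries` attempts. The version below works on nested lists and takes an explicit random generator; both choices, and the requirement of at least one try, are assumptions of the sketch.

```python
import random
from typing import List, Tuple

Table = List[List[int]]


def is_associative(table: Table) -> bool:
    """Check (ab)c == a(bc) for every triple of elements."""
    n = len(table)
    return all(
        table[table[a][b]][c] == table[a][table[b][c]]
        for a in range(n) for b in range(n) for c in range(n)
    )


def random_semigroup(
    dim: int, maximal_tries: int, rng: random.Random
) -> Tuple[bool, Table]:
    """Return (success flag, last table tried)."""
    if maximal_tries < 1:
        raise ValueError("maximal_tries must be at least 1")
    for _ in range(maximal_tries):
        table = [[rng.randrange(dim) for _ in range(dim)] for _ in range(dim)]
        if is_associative(table):
            return True, table
    return False, table
```

The share of associative tables among all \(n^{n^2}\) tables shrinks quickly as \(n\) grows, which is why such a search is only practical for very small `dim`.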

neural_semigroups.utils.check_filename(filename: str) → int¶

checks filename, raises if it’s incorrect

Parameters: filename – filename to check
Returns: magma cardinality extracted from the filename

neural_semigroups.utils.check_smallsemi_filename(filename: str) → int¶

checks a filename from a smallsemi package, raises if it’s incorrect

Parameters: filename – filename from a smallsemi package to check
Returns: magma cardinality extracted from the filename

neural_semigroups.utils.get_magma_by_index(cardinality: int, index: int) → neural_semigroups.magma.Magma¶

find a magma from a lexicographical order by its index

Parameters

cardinality – the number of elements in a magma
index – an index of magma in a lexicographical order

Returns

a magma with a given index
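
If the lexicographical order is read as counting in base \(n\) over the flattened table, the table with a given index is simply the base-\(n\) expansion of that index, padded to \(n^2\) digits. The sketch below assumes that index 0 corresponds to the all-zero table; the name `table_by_index` is illustrative.

```python
from typing import List


def table_by_index(cardinality: int, index: int) -> List[List[int]]:
    """Return the Cayley table whose flattened digits encode `index`."""
    cells = cardinality ** 2
    if not 0 <= index < cardinality ** cells:
        raise ValueError("index is out of range")
    digits = []
    for _ in range(cells):
        # peel off base-n digits, least significant first
        index, digit = divmod(index, cardinality)
        digits.append(digit)
    digits.reverse()
    return [
        digits[i * cardinality:(i + 1) * cardinality]
        for i in range(cardinality)
    ]
```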

neural_semigroups.utils.import_smallsemi_format(lines: List[bytes]) → torch.Tensor¶

imports lines in a format used by smallsemi GAP package. Format description:

filename is of a form data[n].gl.gz, \(1 \le n \le 7\)
lines are separated by a pair of symbols \r\n
there are exactly \(n^2\) lines in a file
the first line is a header starting with ‘#’ symbol
each line is a string of \(N\) digits from \(0\) to \(n-1\)
\(N\) is the number of semigroups in the database
each column represents a serialised Cayley table
the database contains only cells starting from the second
the first cell of each Cayley table is assumed to be filled with 0

Parameters: lines – lines read from a file of smallsemi format
Returns: a list of Cayley tables
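
Following the format description above, a parser can skip the header, read the remaining \(n^2-1\) lines as cells 2 to \(n^2\), and rebuild one table per column with a leading 0. The sketch below returns nested lists rather than a tensor and uses a hypothetical name, `parse_smallsemi`; it is an illustration of the format, not the library's parser.

```python
from math import isqrt
from typing import List


def parse_smallsemi(lines: List[bytes]) -> List[List[List[int]]]:
    """Rebuild Cayley tables from the column-wise smallsemi layout."""
    rows = [line.decode("ascii").strip() for line in lines]
    # drop the '#' header and any empty lines left over from splitting
    data = [row for row in rows if row and not row.startswith("#")]
    if not data:
        raise ValueError("no data lines found")
    cells = len(data) + 1  # the first cell is not stored in the file
    n = isqrt(cells)
    if n * n != cells:
        raise ValueError("unexpected number of lines")
    tables = []
    for column in range(len(data[0])):
        # the first cell of every table is assumed to be 0
        flat = [0] + [int(row[column]) for row in data]
        tables.append([flat[i * n:(i + 1) * n] for i in range(n)])
    return tables
```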

neural_semigroups.utils.get_equivalent_magmas(cayley_tables: torch.Tensor) → torch.Tensor¶

given a batch of Cayley tables, generates Cayley tables of isomorphic and anti-isomorphic magmas

Parameters: cayley_tables – a batch of Cayley tables
Returns: a batch of Cayley tables of isomorphic and anti-isomorphic magmas
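
For a single table, isomorphic copies come from relabelling the elements with a permutation \(p\), so that the new product of \(p(i)\) and \(p(j)\) is \(p(ij)\), and anti-isomorphic copies are the transposes of those. The sketch below enumerates all permutations and keeps duplicates; whether the library removes duplicates is not stated, and the name `equivalent_tables` is an assumption.

```python
from itertools import permutations
from typing import List

Table = List[List[int]]


def equivalent_tables(table: Table) -> List[Table]:
    """List isomorphic and anti-isomorphic copies of `table`."""
    n = len(table)
    result = []
    for p in permutations(range(n)):
        isomorphic = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                # relabel both arguments and the value of the product
                isomorphic[p[i]][p[j]] = p[table[i][j]]
        # swapping the order of arguments transposes the table
        anti = [[isomorphic[j][i] for j in range(n)] for i in range(n)]
        result.extend([isomorphic, anti])
    return result
```

The result has \(2\,n!\) entries, so the approach is only practical for small cardinalities.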

neural_semigroups.utils.download_file_from_url(url: str, filename: str, buffer_size: int = 1024) → None¶

downloads some file from the Web to a specified destination

>>> download_file_from_url("https://python.org/", "/tmp/test.html")
>>> import subprocess
>>> subprocess.run("ls /tmp/test.html", shell=True).returncode
0

Parameters

url – a valid HTTP URL
filename – a valid filename
buffer_size – a number of bytes to read from URL at once
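
The role of `buffer_size` can be illustrated with a chunked copy loop: the response is read at most `buffer_size` bytes at a time and each chunk is appended to the target file. This sketch uses only `urllib.request` from the standard library, which is an assumption; the library may rely on a different HTTP client.

```python
from urllib.request import urlopen


def download(url: str, filename: str, buffer_size: int = 1024) -> None:
    """Copy the content behind `url` to `filename` in fixed-size chunks."""
    with urlopen(url) as response, open(filename, "wb") as target:
        while True:
            chunk = response.read(buffer_size)
            if not chunk:
                # an empty chunk signals the end of the stream
                break
            target.write(chunk)
```

Reading in chunks keeps memory use bounded by `buffer_size` regardless of the size of the downloaded file.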

neural_semigroups.utils.download_smallsemi_data(data_path: str) → None¶

downloads, unzips and moves smallsemi data

Parameters: data_path – data storage path

neural_semigroups.utils.print_report(totals: torch.Tensor) → pandas.DataFrame¶

print report in a pretty format

>>> totals = torch.tensor([[4, 4], [0, 1]])
>>> print_report(totals)
       puzzles  solved  (%)
level
1      4             0    0
2      4             1   25

Parameters: totals – a table with two rows:

a row with total number of puzzles per level
a row with numbers of correctly solved puzzles

Returns: the report in a form of pandas.DataFrame

neural_semigroups.utils.get_newest_file(dir_path: str) → str¶

get the last modified file from a directory

>>> from pathlib import Path
>>> from shutil import rmtree
>>> from os import makedirs
>>> rmtree("/tmp/tmp/", ignore_errors=True)
>>> makedirs("/tmp/tmp/")
>>> Path("/tmp/tmp/one").touch()
>>> from time import sleep
>>> sleep(0.01)
>>> Path("/tmp/tmp/two").touch()
>>> get_newest_file("/tmp/tmp/")
'/tmp/tmp/two'

Parameters: dir_path – a directory path
Returns: the last modified file’s name
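
One straightforward way to find the last modified file is to compare modification times with `os.path.getmtime`. The sketch below assumes that only regular files directly inside the directory are considered and that an empty directory is an error; neither detail is confirmed by the documentation above.

```python
import os


def newest_file(dir_path: str) -> str:
    """Return the path of the most recently modified file in `dir_path`."""
    paths = [os.path.join(dir_path, name) for name in os.listdir(dir_path)]
    # assumption: sub-directories are ignored
    files = [path for path in paths if os.path.isfile(path)]
    if not files:
        raise ValueError("no files in " + dir_path)
    return max(files, key=os.path.getmtime)
```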

neural_semigroups.utils.get_two_indices_per_sample(batch_size: int, cardinality: int) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor]¶

generates all possible combinations of two indices for each sample in a batch

>>> get_two_indices_per_sample(1, 2)
(tensor([0, 0, 0, 0]), tensor([0, 0, 1, 1]), tensor([0, 1, 0, 1]))

Parameters

batch_size – number of samples in a batch
cardinality – number of possible values of an index
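
The output in the example above is the Cartesian product of the sample index with two element indices, unzipped into three parallel sequences. A plain-Python sketch that reproduces it with lists instead of tensors (the function name is illustrative):

```python
from itertools import product
from typing import List, Tuple


def two_indices_per_sample(
    batch_size: int, cardinality: int
) -> Tuple[List[int], List[int], List[int]]:
    """Return (sample, first index, second index) for every combination."""
    triples = list(
        product(range(batch_size), range(cardinality), range(cardinality))
    )
    samples = [sample for sample, _, _ in triples]
    first = [i for _, i, _ in triples]
    second = [j for _, _, j in triples]
    return samples, first, second
```

Index triples of this shape are convenient for addressing every cell of every table in a batch at once.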

neural_semigroups.utils.make_discrete(cayley_cubes: torch.Tensor) → torch.Tensor¶

transforms a batch of probabilistic Cayley cubes in the following way:

maximal probabilities in the last dimension become ones
all other probabilities become zeros

>>> make_discrete(torch.tensor([
...    [[[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.2, 0.8]]],
...    [[[0.7, 0.3], [0.3, 0.7]], [[0.7, 0.3], [0.3, 0.7]]],
... ]))
tensor([[[[1., 0.],
          [0., 1.]],

         [[1., 0.],
          [0., 1.]]],


        [[[1., 0.],
          [0., 1.]],

         [[1., 0.],
          [0., 1.]]]])

Parameters: cayley_cubes – a batch of probabilistic cubes representing Cayley tables
Returns: a batch of probabilistic cubes filled in with 0 or 1

neural_semigroups.utils.load_data_and_labels_from_file(database_filename: str) → Tuple[torch.Tensor, torch.Tensor]¶

reads data from a special file format

Parameters: database_filename – a special file to read data from
Returns: (a tensor with Cayley tables, a tensor of their labels)

neural_semigroups.utils.load_data_and_labels_from_smallsemi(cardinality: int, data_path: str) → Tuple[torch.Tensor, torch.Tensor]¶

Loads data from smallsemi package

Parameters

cardinality – which smallsemi file to use
data_path – where to search for smallsemi data

Returns

(a tensor with Cayley tables, a tensor of their labels)