Package Documentation¶
Magma¶
- class neural_semigroups.Magma(cayley_table: Optional[torch.Tensor] = None, cardinality: Optional[int] = None)¶
an implementation of a magma (a set with a binary operation)
- __init__(cayley_table: Optional[torch.Tensor] = None, cardinality: Optional[int] = None)¶
constructs a new magma
>>> seed = torch.manual_seed(11)
>>> Magma(cardinality=2)
tensor([[1, 1], [0, 1]])
>>> Magma(torch.tensor([[0, 1], [1, 0]]))
tensor([[0, 1], [1, 0]])
>>> Magma()
Traceback (most recent call last):
    ...
ValueError: at least one argument must be given
>>> Magma([[0, 1]])
Traceback (most recent call last):
    ...
ValueError: cayley_table must be a `torch.Tensor` of shape (cardinality, cardinality)
>>> Magma([[0]], cardinality=2)
Traceback (most recent call last):
    ...
ValueError: cayley_table must be a `torch.Tensor` of shape (cardinality, cardinality)
- Parameters
cayley_table – a Cayley table for a magma. If not provided, a random table is generated.
cardinality – the number of elements in a magma, used to generate a random one
- property cardinality: int¶
number of elements in a magma
- property has_inverses: bool¶
check whether there are solutions of equations \(ax=b\) and \(xa=b\)
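One way to read this check, assuming the equations are required to be solvable for all \(a\) and \(b\): every row and every column of the Cayley table must contain each element. The plain-Python sketch below illustrates that reading on nested lists; the helper name has_inverses_sketch is hypothetical and not part of the package, whose own implementation may differ.

```python
from typing import List


def has_inverses_sketch(table: List[List[int]]) -> bool:
    """Return True if ax=b and xa=b are solvable for all a and b."""
    n = len(table)
    elements = set(range(n))
    # ax=b is solvable for every b iff row a contains every element
    rows_ok = all(set(row) == elements for row in table)
    # xa=b is solvable for every b iff column a contains every element
    columns_ok = all(
        {table[x][a] for x in range(n)} == elements for a in range(n)
    )
    return rows_ok and columns_ok
```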
- property identity: int¶
find an identity element in a Cayley table
- Returns
the index of the identity element or -1 if there is no identity
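A minimal sketch of the lookup described above, written for a table given as nested lists rather than a tensor. The name identity_sketch is hypothetical, and the sketch assumes a two-sided identity is meant.

```python
from typing import List


def identity_sketch(table: List[List[int]]) -> int:
    """Return the index of a two-sided identity, or -1 if there is none."""
    n = len(table)
    for e in range(n):
        # e is an identity when e*x == x and x*e == x for every x
        if all(table[e][x] == x and table[x][e] == x for x in range(n)):
            return e
    return -1
```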
- property is_associative: bool¶
check associativity of a Cayley table
- Returns
whether the input table is associative or not
- property is_commutative: bool¶
check commutativity of a Cayley table
- Returns
whether the input table is commutative or not
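The associativity and commutativity checks can be illustrated by brute force over all triples and pairs of elements. This is only a sketch on nested lists with hypothetical helper names; the package works with tensors and may compute these properties differently.

```python
from itertools import product
from typing import List


def is_associative_sketch(table: List[List[int]]) -> bool:
    """Check (ab)c == a(bc) for every triple of elements."""
    n = len(table)
    return all(
        table[table[a][b]][c] == table[a][table[b][c]]
        for a, b, c in product(range(n), repeat=3)
    )


def is_commutative_sketch(table: List[List[int]]) -> bool:
    """Check ab == ba, i.e. the table equals its transpose."""
    n = len(table)
    return all(
        table[a][b] == table[b][a] for a, b in product(range(n), repeat=2)
    )
```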
- property next_magma: neural_semigroups.magma.Magma¶
goes to the next magma Cayley table in their lexicographical order
>>> Magma(torch.tensor([[0, 1], [1, 0]])).next_magma
tensor([[0, 1], [1, 1]])
>>> Magma(torch.tensor([[0, 1], [1, 1]])).next_magma
tensor([[1, 0], [0, 0]])
>>> Magma(torch.tensor([[1, 1], [1, 1]])).next_magma
Traceback (most recent call last):
    ...
ValueError: there is no next magma!
- Returns
another magma
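The examples above are consistent with reading the table row by row as the digits of a base-n number and adding one with carry. The sketch below reproduces them under that assumption; next_magma_sketch is a hypothetical helper on nested lists, not the property itself.

```python
from typing import List


def next_magma_sketch(table: List[List[int]]) -> List[List[int]]:
    """Return the next Cayley table in lexicographical order."""
    n = len(table)
    # read the table row by row as the digits of a base-n number
    digits = [cell for row in table for cell in row]
    position = len(digits) - 1
    # add one with carry, starting from the last cell
    while position >= 0 and digits[position] == n - 1:
        digits[position] = 0
        position -= 1
    if position < 0:
        raise ValueError("there is no next magma!")
    digits[position] += 1
    return [digits[i * n:(i + 1) * n] for i in range(n)]
```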
- property probabilistic_cube: torch.Tensor¶
a 3d array \(a\) where \(a_{ijk}=P\left\{e_ie_j=e_k\right\}\)
- Returns
a probabilistic cube representation of a Cayley table
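For a fully known Cayley table every product is certain, so the cube is a one-hot encoding of the table along the last index. A minimal sketch under that assumption, using nested lists and a hypothetical helper name:

```python
from typing import List


def probabilistic_cube_sketch(table: List[List[int]]) -> List[List[List[float]]]:
    """Build a[i][j][k] = 1.0 if e_i e_j = e_k, else 0.0."""
    n = len(table)
    return [
        [[1.0 if table[i][j] == k else 0.0 for k in range(n)] for j in range(n)]
        for i in range(n)
    ]
```

Each innermost list is a probability distribution over the possible values of one cell, so it sums to one.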
Cyclic Group¶
Cayley Database¶
- class neural_semigroups.CayleyDatabase(cardinality: int, database_filename: Optional[str] = None, data_path: str = '/home/docs/neural-semigroups-data')¶
a database of Cayley tables with different utility functions
- __init__(cardinality: int, database_filename: Optional[str] = None, data_path: str = '/home/docs/neural-semigroups-data')¶
- Parameters
cardinality – the number of elements in underlying magmas
database_filename – a full path to a pre-generated Cayley database. If None, smallsemi data is used.
data_path – a valid path to use as a permanent data storage
- augment_by_equivalent_tables() None ¶
for every Cayley table in a previously loaded database adds all of its equivalent tables to the database
- fill_in_with_model(cayley_table: List[List[int]]) Tuple[torch.Tensor, torch.Tensor] ¶
get a list of possible completions of a partially filled Cayley table (unknown entries are filled by -1) using a machine learning model
- Parameters
cayley_table – a partially filled Cayley table (unknown entries are filled by -1)
- Returns
a tuple: (most probable completion, probabilistic cube)
- load_model(filename: str) None ¶
load a pre-trained PyTorch model
- Parameters
filename – where to load the model from
- Returns
- property model: torch.nn.Module¶
- Returns
pre-trained Torch model
- search_database(cayley_table: List[List[int]]) List[torch.Tensor] ¶
get a list of possible completions of a partially filled Cayley table (unknown entries are filled by -1)
- Parameters
cayley_table – a partially filled Cayley table (unknown entries are filled by -1)
- Returns
a list of Cayley tables
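Conceptually, a completion is any stored table that agrees with the partial one on every known cell. The sketch below shows that filter on plain lists; the names matches and search_sketch are hypothetical, and the actual method works on the tensors held by the database.

```python
from typing import List

Table = List[List[int]]


def matches(partial: Table, full: Table) -> bool:
    """True if `full` agrees with `partial` wherever `partial` is not -1."""
    return all(
        p == -1 or p == f
        for partial_row, full_row in zip(partial, full)
        for p, f in zip(partial_row, full_row)
    )


def search_sketch(partial: Table, database: List[Table]) -> List[Table]:
    """Return every table in `database` that completes `partial`."""
    return [table for table in database if matches(partial, table)]
```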
- testing_report(max_level: int = - 1) torch.Tensor ¶
this function:
takes 1000 random Cayley tables from the database (if there are fewer tables, it simply takes all of them)
for each Cayley table generates max_level puzzles
each puzzle is created from a table by omitting several cell values
for each table the function omits 1, 2, and up to max_level of all cells
each puzzle is given to a pre-trained model of that database
if the model returns an associative table (not necessarily the original one) it is considered to be a successful solution
- Parameters
max_level – up to how many cells to omit when creating a puzzle; when not provided or explicitly set to -1 it defaults to the total number of cells in a table
- Returns
statistics of solved puzzles split by the levels of difficulty (number of cells omitted)
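The puzzle-generation step can be sketched as replacing a given number of randomly chosen cells with -1. The helper make_puzzle_sketch and its rng argument are assumptions made for illustration; how the library actually samples the omitted cells is not specified here.

```python
import random
from typing import List


def make_puzzle_sketch(
    table: List[List[int]], level: int, rng: random.Random
) -> List[List[int]]:
    """Return a copy of `table` with `level` random cells replaced by -1."""
    n = len(table)
    cells = [(i, j) for i in range(n) for j in range(n)]
    # choose `level` distinct cells to hide
    omitted = rng.sample(cells, level)
    puzzle = [row[:] for row in table]
    for i, j in omitted:
        puzzle[i][j] = -1
    return puzzle
```

A puzzle of level equal to the total number of cells hides the whole table, which matches the default of max_level described above.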
- train_test_split(train_size: int, validation_size: int) Tuple[neural_semigroups.cayley_database.CayleyDatabase, neural_semigroups.cayley_database.CayleyDatabase, neural_semigroups.cayley_database.CayleyDatabase] ¶
split a database of Cayley tables into three: train, validation, and test
- Parameters
cayley_db – a database of Cayley tables
train_size – number of tables in a train set
validation_size – number of tables in a validation set
- Returns
a triple of distinct Cayley table databases: (train, validation, test)
Denoising Autoencoder for Magmas¶
- class neural_semigroups.MagmaDAE(*args: Any, **kwargs: Any)¶
Denoising Autoencoder for probability Cayley cubes of magmas
- __init__(cardinality: int, hidden_dims: List[int], dropout_rate: float, do_reparametrization: bool = False)¶
- Parameters
cardinality – the number of elements in a magma
hidden_dims – a list of sizes of hidden layers of the encoder and the decoder
dropout_rate – what percentage of cells from the Cayley table to substitute with uniform random variables
do_reparametrization – if True, adds a reparameterization trick
- decode(encoded_input: torch.Tensor) torch.Tensor ¶
represent an embedding vector as something with size aligned with the input
- Parameters
encoded_input – an embedding vector
- Returns
a vector of values from 0 to 1 (kind of probabilities)
- encode(corrupted_input: torch.Tensor) torch.Tensor ¶
represent input cube as an embedding vector
- Parameters
corrupted_input – a tensor with two indices
- Returns
some tensor with two indices and non-negative values
- forward(cayley_cubes: torch.Tensor) torch.Tensor ¶
forward pass inherited from Module
- Parameters
cayley_cubes – a batch of probabilistic representations of magmas
- Returns
autoencoded probabilistic representations of magmas
- reparameterize(mu_and_sigma: torch.Tensor) torch.Tensor ¶
do a reparametrization trick
- Parameters
mu_and_sigma – vector of expectation and standard deviation
- Returns
sample from a distribution
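The reparameterization trick draws a sample as \(z=\mu+\sigma\varepsilon\) with \(\varepsilon\sim N(0,1)\), so that the randomness sits in \(\varepsilon\) only. The sketch below assumes, purely for illustration, that the first half of the input holds the means and the second half the standard deviations; that layout, the helper name, and the rng argument are not taken from the library.

```python
import random
from typing import List, Optional


def reparameterize_sketch(
    mu_and_sigma: List[float], rng: Optional[random.Random] = None
) -> List[float]:
    """Sample z = mu + sigma * eps with eps drawn from N(0, 1)."""
    rng = rng or random.Random()
    half = len(mu_and_sigma) // 2
    # assumed layout: means first, standard deviations second
    mu, sigma = mu_and_sigma[:half], mu_and_sigma[half:]
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
```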
Associator Loss¶
- class neural_semigroups.AssociatorLoss(*args: Any, **kwargs: Any)¶
probabilistic associator loss
- __init__(discrete: bool = False)¶
- Parameters
discrete – when False, the KL divergence is used for measuring associativity in a continuous way. When True, returns 1 for associative and 0 for non-associative samples.
- forward(cayley_cubes: torch.Tensor) torch.Tensor ¶
finds a probabilistic associator of a given probabilistic Cayley cube
First, we build two 4-index tensors representing probability distributions of products \(a\left(bc\right)\) and \(\left(ab\right)c\), respectively:
\(T_{ijkl}=P\left\{e_i\left(e_je_k\right)=e_l\right\}= \sum\limits_{m=1}^nP\left\{e_ie_m=e_l\vert e_je_k=e_m\right\} P\left\{e_je_k=e_m\right\}=\sum\limits_{m=1}^na_{iml}a_{jkm}\)
and
\(T\prime_{ijkl}=P\left\{\left(e_ie_j\right)e_k=e_l\right\}= \sum\limits_{m=1}^nP\left\{e_me_k=e_l\vert e_ie_j=e_m\right\} P\left\{e_ie_j=e_m\right\}=\sum\limits_{m=1}^na_{mkl}a_{ijm}\)
Then we calculate Kullback-Leibler divergence between \(T_{ijkl}\) and \(T\prime_{ijkl}\) to find a continuous measure of associativity of the input table.
- Parameters
cayley_cubes – a batch of probabilistic Cayley cubes
- Returns
the probabilistic associator
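The two sums above can be checked by hand on a small cube. The sketch below computes \(T\) and \(T\prime\) with explicit loops for a single cube and sums \(T\log(T/T\prime)\) over all indices. The direction of the divergence, the summation over all indices, the eps guard against division by zero, and both helper names are assumptions of this sketch, not a description of the library's loss.

```python
import math
from itertools import product
from typing import List

Cube = List[List[List[float]]]


def one_hot_cube(table: List[List[int]]) -> Cube:
    """Cube with a[i][j][k] = 1.0 if e_i e_j = e_k, else 0.0."""
    n = len(table)
    return [
        [[1.0 if table[i][j] == k else 0.0 for k in range(n)] for j in range(n)]
        for i in range(n)
    ]


def associator_sketch(cube: Cube, eps: float = 1e-12) -> float:
    """Sum of T * log(T / T') over all index quadruples (i, j, k, l)."""
    n = len(cube)
    divergence = 0.0
    for i, j, k, l in product(range(n), repeat=4):
        # T_{ijkl} = sum_m a_{iml} a_{jkm}: distribution of e_i (e_j e_k)
        t = sum(cube[i][m][l] * cube[j][k][m] for m in range(n))
        # T'_{ijkl} = sum_m a_{mkl} a_{ijm}: distribution of (e_i e_j) e_k
        t_prime = sum(cube[m][k][l] * cube[i][j][m] for m in range(n))
        if t > 0:
            divergence += t * math.log(t / max(t_prime, eps))
    return divergence
```

For the one-hot cube of an associative table the two tensors coincide and the sum is zero; for a non-associative table at least one term is strictly positive.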
utils¶
A collection of different functions used by other modules.
- neural_semigroups.utils.random_semigroup(dim: int, maximal_tries: int) Tuple[bool, torch.Tensor] ¶
randomly search for a semigroup Cayley table. Not recommended to use with dim > 4
- Parameters
dim – number of elements in a semigroup
maximal_tries – how many times to try at most
- Returns
a pair (whether the Cayley table is associative, a Cayley table of a magma)
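A rejection-sampling reading of this function: draw random tables until one is associative or the budget of tries runs out. The sketch below follows that reading on nested lists; the helper name and the rng argument are hypothetical, and the library's sampling details may differ.

```python
import random
from itertools import product
from typing import List, Optional, Tuple


def random_semigroup_sketch(
    dim: int, maximal_tries: int, rng: Optional[random.Random] = None
) -> Tuple[bool, List[List[int]]]:
    """Draw random tables until one is associative or tries run out."""
    rng = rng or random.Random()

    def is_associative(table: List[List[int]]) -> bool:
        return all(
            table[table[a][b]][c] == table[a][table[b][c]]
            for a, b, c in product(range(dim), repeat=3)
        )

    def random_table() -> List[List[int]]:
        return [[rng.randrange(dim) for _ in range(dim)] for _ in range(dim)]

    table = random_table()
    tries = 1
    while not is_associative(table) and tries < maximal_tries:
        table = random_table()
        tries += 1
    return is_associative(table), table
```

The share of associative tables among all \(n^{n^2}\) tables shrinks quickly as \(n\) grows, which is one plausible reason for the warning about dim > 4.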
- neural_semigroups.utils.check_filename(filename: str) int ¶
checks filename, raises if it’s incorrect
- Parameters
filename – filename to check
- Returns
magma cardinality extracted from the filename
- neural_semigroups.utils.check_smallsemi_filename(filename: str) int ¶
checks a filename from a smallsemi package, raises if it’s incorrect
- Parameters
filename – filename from a smallsemi package to check
- Returns
magma cardinality extracted from the filename
- neural_semigroups.utils.get_magma_by_index(cardinality: int, index: int) neural_semigroups.magma.Magma ¶
find a magma from a lexicographical order by its index
- Parameters
cardinality – the number of elements in a magma
index – an index of magma in a lexicographical order
- Returns
a magma with a given index
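If the lexicographical order is the one produced by counting in base n over the cells read row by row, the table for a given index is simply the base-n expansion of that index. The sketch below assumes zero-based indexing with the all-zeros table first; that convention and the helper name are assumptions, not documented behaviour.

```python
from typing import List


def magma_by_index_sketch(cardinality: int, index: int) -> List[List[int]]:
    """Return the Cayley table whose cells are the base-n digits of `index`."""
    n_cells = cardinality ** 2
    if not 0 <= index < cardinality ** n_cells:
        raise ValueError("index is out of range")
    digits = []
    for _ in range(n_cells):
        # peel off the least significant base-n digit
        index, digit = divmod(index, cardinality)
        digits.append(digit)
    digits.reverse()
    return [
        digits[i * cardinality:(i + 1) * cardinality]
        for i in range(cardinality)
    ]
```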
- neural_semigroups.utils.import_smallsemi_format(lines: List[bytes]) torch.Tensor ¶
imports lines in a format used by the smallsemi GAP package. Format description:
filename is of a form data[n].gl.gz, \(1\le n\le 7\)
lines are separated by a pair of symbols \r\n
there are exactly \(n^2\) lines in a file
the first line is a header starting with ‘#’ symbol
each line is a string of \(N\) digits from \(0\) to \(n-1\)
\(N\) is the number of semigroups in the database
each column represents a serialised Cayley table
the database contains only cells starting from the second
the first cell of each Cayley table is assumed to be filled with 0
- Parameters
lines – lines read from a file of smallsemi format
- Returns
a list of Cayley tables
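The format description above can be turned into a small parser: drop the header, read each column of digits as cells 2 to \(n^2\) of one table, and prepend the implicit 0. The sketch returns nested lists instead of a tensor, and the helper name is hypothetical.

```python
import math
from typing import List


def import_smallsemi_sketch(lines: List[bytes]) -> List[List[List[int]]]:
    """Parse smallsemi-style lines into a list of n-by-n Cayley tables."""
    # the first line is a header; strip the line separators from the rest
    rows = [line.decode("ascii").strip() for line in lines[1:]]
    rows = [row for row in rows if row]
    # the header plus n^2 - 1 data lines make n^2 lines in total
    n_cells = len(rows) + 1
    n = math.isqrt(n_cells)
    if n * n != n_cells:
        raise ValueError("the number of lines must be a perfect square")
    n_tables = len(rows[0]) if rows else 0
    tables = []
    for column in range(n_tables):
        # the first cell is not stored and is assumed to be 0
        cells = [0] + [int(row[column]) for row in rows]
        tables.append([cells[i * n:(i + 1) * n] for i in range(n)])
    return tables
```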
- neural_semigroups.utils.get_equivalent_magmas(cayley_tables: torch.Tensor) torch.Tensor ¶
given a batch of Cayley tables, generates Cayley tables of isomorphic and anti-isomorphic magmas
- Parameters
cayley_tables – a batch of Cayley tables
- Returns
a batch of Cayley tables of isomorphic and anti-isomorphic magmas
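For a single table, isomorphic copies arise from relabelling the elements with a permutation \(p\), so that the new product of \(p(i)\) and \(p(j)\) is \(p(ij)\); anti-isomorphic copies are the transposes of those. The sketch below enumerates them on nested lists and removes duplicates, which is a choice of this sketch; the library function works on batches of tensors and may keep repeated tables.

```python
from itertools import permutations
from typing import List, Tuple

FrozenTable = Tuple[Tuple[int, ...], ...]


def equivalent_tables_sketch(table: List[List[int]]) -> List[FrozenTable]:
    """Return all distinct isomorphic and anti-isomorphic tables, sorted."""
    n = len(table)
    found = set()
    for p in permutations(range(n)):
        iso = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                # relabel both the operands and the result
                iso[p[i]][p[j]] = p[table[i][j]]
        # the anti-isomorphic table swaps the order of the operands
        anti = [[iso[j][i] for j in range(n)] for i in range(n)]
        found.add(tuple(map(tuple, iso)))
        found.add(tuple(map(tuple, anti)))
    return sorted(found)
```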
- neural_semigroups.utils.download_file_from_url(url: str, filename: str, buffer_size: int = 1024) None ¶
downloads some file from the Web to a specified destination
>>> download_file_from_url("https://python.org/", "/tmp/test.html")
>>> import subprocess
>>> subprocess.run("ls /tmp/test.html", shell=True).returncode
0
- Parameters
url – a valid HTTP URL
filename – a valid filename
buffer_size – a number of bytes to read from URL at once
- neural_semigroups.utils.download_smallsemi_data(data_path: str) None ¶
downloads, unzips and moves smallsemi data
- Parameters
data_path – data storage path
- Returns
- neural_semigroups.utils.print_report(totals: torch.Tensor) pandas.DataFrame ¶
print report in a pretty format
>>> totals = torch.tensor([[4, 4], [0, 1]])
>>> print_report(totals)
       puzzles  solved  (%)
level
1            4       0    0
2            4       1   25
- Parameters
totals – a table with two rows (as in the example above):
a row with total number of puzzles per level
a row with numbers of correctly solved puzzles
- Returns
the report in a form of pandas.DataFrame
- neural_semigroups.utils.get_newest_file(dir_path: str) str ¶
get the last modified file from a directory
>>> from os import makedirs
>>> from shutil import rmtree
>>> from pathlib import Path
>>> rmtree("/tmp/tmp/", ignore_errors=True)
>>> makedirs("/tmp/tmp/")
>>> Path("/tmp/tmp/one").touch()
>>> from time import sleep
>>> sleep(0.01)
>>> Path("/tmp/tmp/two").touch()
>>> get_newest_file("/tmp/tmp/")
'/tmp/tmp/two'
- Parameters
dir_path – a directory path
- Returns
the last modified file’s name
- neural_semigroups.utils.get_two_indices_per_sample(batch_size: int, cardinality: int) Tuple[torch.Tensor, torch.Tensor, torch.Tensor] ¶
generates all possible combinations of two indices for each sample in a batch
>>> get_two_indices_per_sample(1, 2)
(tensor([0, 0, 0, 0]), tensor([0, 0, 1, 1]), tensor([0, 1, 0, 1]))
- Parameters
batch_size – number of samples in a batch
cardinality – number of possible values of an index
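The example output above matches a Cartesian product of the sample index with two cell indices, flattened into three parallel sequences. A plain-list sketch of that reading follows; the helper name is hypothetical, and the sample-major ordering for batches larger than one is an assumption.

```python
from itertools import product
from typing import List, Tuple


def two_indices_sketch(
    batch_size: int, cardinality: int
) -> Tuple[List[int], List[int], List[int]]:
    """Return (sample, row, column) index lists for every cell of a batch."""
    triples = list(
        product(range(batch_size), range(cardinality), range(cardinality))
    )
    samples = [sample for sample, _, _ in triples]
    rows = [row for _, row, _ in triples]
    columns = [column for _, _, column in triples]
    return samples, rows, columns
```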
- neural_semigroups.utils.make_discrete(cayley_cubes: torch.Tensor) torch.Tensor ¶
transforms a batch of probabilistic Cayley cubes in the following way:
maximal probabilities in the last dimension become ones
all other probabilities become zeros
>>> make_discrete(torch.tensor([
... [[[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.2, 0.8]]],
... [[[0.7, 0.3], [0.3, 0.7]], [[0.7, 0.3], [0.3, 0.7]]],
... ]))
tensor([[[[1., 0.], [0., 1.]], [[1., 0.], [0., 1.]]], [[[1., 0.], [0., 1.]], [[1., 0.], [0., 1.]]]])
- Parameters
cayley_cubes – a batch of probabilistic cubes representing Cayley tables
- Returns
a batch of probabilistic cubes filled in with 0 or 1
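For one cube, the transformation amounts to an argmax over the last index followed by one-hot encoding. The sketch below does this on nested lists; the helper name is hypothetical, and breaking ties in favour of the first maximum is an assumption of the sketch.

```python
from typing import List

Cube = List[List[List[float]]]


def make_discrete_sketch(cube: Cube) -> Cube:
    """Replace each distribution in the last dimension by a one-hot vector."""
    result = []
    for matrix in cube:
        new_matrix = []
        for distribution in matrix:
            # index of the (first) maximal probability
            best = max(range(len(distribution)), key=distribution.__getitem__)
            new_matrix.append(
                [1.0 if k == best else 0.0 for k in range(len(distribution))]
            )
        result.append(new_matrix)
    return result
```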
- neural_semigroups.utils.load_data_and_labels_from_file(database_filename: str) Tuple[torch.Tensor, torch.Tensor] ¶
reads data from a special file format
- Parameters
database_filename – a special file to read data from
- Returns
(a tensor with Cayley tables, a tensor of their labels)
- neural_semigroups.utils.load_data_and_labels_from_smallsemi(cardinality: int, data_path: str) Tuple[torch.Tensor, torch.Tensor] ¶
Loads data from smallsemi package
- Parameters
cardinality – which smallsemi file to use
data_path – where to search for smallsemi data
- Returns
(a tensor with Cayley tables, a tensor of their labels)