Package Documentation¶

Associator Loss¶

class neural_semigroups.AssociatorLoss(*args: Any, **kwargs: Any)¶

probabilistic associator loss

__init__(discrete: bool = False)¶

Parameters: discrete – when False, the KL divergence is used to measure associativity in a continuous way. When True, returns 1 for associative and 0 for non-associative samples.

forward(cayley_cubes: torch.Tensor) → torch.Tensor¶

finds a probabilistic associator of a given probabilistic Cayley cube

First, we build two 4-index tensors representing probability distributions of products \(a\left(bc\right)\) and \(\left(ab\right)c\), respectively:

\(T_{ijkl}=P\left\{e_i\left(e_je_k\right)=e_l\right\}= \sum\limits_{m=1}^nP\left\{e_ie_m=e_l\vert e_je_k=e_m\right\} P\left\{e_je_k=e_m\right\}=\sum\limits_{m=1}^na_{iml}a_{jkm}\)

and

\(T\prime_{ijkl}=P\left\{\left(e_ie_j\right)e_k=e_l\right\}= \sum\limits_{m=1}^nP\left\{e_me_k=e_l\vert e_ie_j=e_m\right\} P\left\{e_ie_j=e_m\right\}=\sum\limits_{m=1}^na_{mkl}a_{ijm}\)

Then we calculate Kullback-Leibler divergence between \(T_{ijkl}\) and \(T\prime_{ijkl}\) to find a continuous measure of associativity of the input table.

Parameters: cayley_cubes – a batch of probabilistic Cayley cubes
Returns: the probabilistic associator
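
The two sums above are tensor contractions, so they can be sketched with torch.einsum. This is a minimal illustration for a single cube, not the library's implementation: the helper names associator_tensors and kl_associator, the eps clamp, and the reduction (sum over the last index, mean over the others) are assumptions made for the example.

```python
import torch
from torch.nn.functional import one_hot


def associator_tensors(cube: torch.Tensor):
    """cube[i, j, k] = P{e_i e_j = e_k} for one magma (no batch axis)."""
    # T[i, j, k, l] = sum_m a[i, m, l] * a[j, k, m], the distribution of a(bc)
    t = torch.einsum("iml,jkm->ijkl", cube, cube)
    # T'[i, j, k, l] = sum_m a[m, k, l] * a[i, j, m], the distribution of (ab)c
    t_prime = torch.einsum("mkl,ijm->ijkl", cube, cube)
    return t, t_prime


def kl_associator(cube: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """KL(T || T') summed over l and averaged over (i, j, k).

    The eps clamp is an assumed guard against log(0) for one-hot cubes.
    """
    t, t_prime = associator_tensors(cube)
    t, t_prime = t.clamp_min(eps), t_prime.clamp_min(eps)
    return (t * (t / t_prime).log()).sum(dim=-1).mean()


# one-hot cubes of an associative table (Z_2) and of a non-associative one
z2 = one_hot(torch.tensor([[0, 1], [1, 0]]), 2).float()
non_assoc = one_hot(torch.tensor([[1, 1], [0, 1]]), 2).float()
```

For the associative table the two tensors coincide and the divergence vanishes, while the second table (where \(\left(e_0e_0\right)e_0\ne e_0\left(e_0e_0\right)\)) yields a strictly positive value.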

Constant Baseline¶

class neural_semigroups.ConstantBaseline(*args: Any, **kwargs: Any)¶

A model that always fills in the same number

__init__(cardinality: int, fill_in_with: int = 0)¶

initializes a constant model

>>> ConstantBaseline(2, 3)
Traceback (most recent call last):
    ...
ValueError: `fill_in_with` should be non-negative and less than `cardinality`, got 3 >= 2
>>> ConstantBaseline(2, 1).constant_distribution.cpu()
tensor([0., 1.])

Parameters

cardinality – the number of elements in a magma
fill_in_with – an item which will be suggested as a correct answer

forward(cayley_cube: torch.Tensor) → torch.Tensor¶

forward pass inherited from Module

>>> ConstantBaseline(2, 1)(torch.tensor([
...     [[[0., 1.], [0.5, 0.5]], [[1., 0.], [0., 1.]]],
...     [[[0., 1.], [1.0, 0.0]], [[0.5, 0.5], [0., 1.]]]
... ]).to(CURRENT_DEVICE)).cpu()
tensor([[[[0., 1.],
          [0., 1.]],

         [[1., 0.],
          [0., 1.]]],


        [[[0., 1.],
          [1., 0.]],

         [[0., 1.],
          [0., 1.]]]])

Parameters: cayley_cube – probabilistic representation of a magma
Returns: a batch of constant values (set in the constructor)

Cyclic Group¶

class neural_semigroups.CyclicGroup(cardinality: int)¶

finite cyclic group

__init__(cardinality: int)¶

Parameters: cardinality – number of elements in a cyclic group

Denoising Autoencoder for Magmas¶

class neural_semigroups.MagmaDAE(*args: Any, **kwargs: Any)¶

Denoising Autoencoder for probability Cayley cubes of magmas

__init__(cardinality: int, hidden_dims: List[int], do_reparametrization: bool = False)¶

Parameters

cardinality – the number of elements in a magma
hidden_dims – a list of sizes of hidden layers of the encoder and the decoder
do_reparametrization – if True, adds a reparametrization trick

decode(encoded_input: torch.Tensor) → torch.Tensor¶

maps an embedding vector back to a tensor whose size is aligned with the input

Parameters: encoded_input – an embedding vector
Returns: a vector of values from 0 to 1 (kind of probabilities)

encode(input_with_noise: torch.Tensor) → torch.Tensor¶

represent input cube as an embedding vector

Parameters: input_with_noise – a tensor with two indices
Returns: some tensor with two indices and non-negative values

forward(cayley_cubes: torch.Tensor) → torch.Tensor¶

forward pass inherited from Module

Parameters: cayley_cubes – a batch of probabilistic representations of magmas
Returns: auto-encoded probabilistic representations of magmas

reparametrize(mu_and_log_sigma: torch.Tensor) → torch.Tensor¶

do a reparametrization trick

Parameters: mu_and_log_sigma – a vector of expectations and (log) standard deviations
Returns: sample from a distribution
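
The reparametrization trick can be sketched as follows. This is an illustration under stated assumptions rather than the method's actual code: the name reparametrize_sketch is hypothetical, and so is the layout in which the first half of the last axis holds the expectations and the second half holds the logarithms of the standard deviations.

```python
import torch


def reparametrize_sketch(mu_and_log_sigma: torch.Tensor) -> torch.Tensor:
    # assumption: first half of the last axis is mu, second half is log(sigma)
    mu, log_sigma = mu_and_log_sigma.chunk(2, dim=-1)
    # the noise does not depend on the parameters, so gradients can flow
    # through mu and log_sigma
    epsilon = torch.randn_like(mu)
    return mu + torch.exp(log_sigma) * epsilon


# with a tiny standard deviation the sample collapses to the expectation
params = torch.cat([torch.ones(3, 4), torch.full((3, 4), -50.0)], dim=-1)
sample = reparametrize_sketch(params)
```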

Magma¶

class neural_semigroups.Magma(cayley_table: Optional[torch.Tensor] = None, cardinality: Optional[int] = None)¶

an implementation of a magma (a set with a binary operation)

__init__(cayley_table: Optional[torch.Tensor] = None, cardinality: Optional[int] = None)¶

constructs a new magma

>>> seed = torch.manual_seed(11)
>>> Magma(cardinality=2)
tensor([[1, 1],
        [0, 1]])
>>> Magma(torch.tensor([[0, 1], [1, 0]]))
tensor([[0, 1],
        [1, 0]])
>>> Magma()
Traceback (most recent call last):
    ...
ValueError: at least one argument must be given
>>> Magma(torch.tensor([[0]]), cardinality=2)
Traceback (most recent call last):
    ...
ValueError: cayley_table must be a `torch.Tensor` of shape (cardinality, cardinality)

Parameters

cayley_table – a Cayley table for a magma. If not provided, a random table is generated.
cardinality – a number of elements in a magma to generate a random one

property cardinality: int¶: number of elements in a magma

property has_inverses: bool¶: check whether there are solutions of equations \(ax=b\) and \(xa=b\)

property identity: int¶

find an identity element in a Cayley table

Returns: the index of the identity element or -1 if there is no identity

property is_associative: bool¶

check associativity of a Cayley table

Returns: whether the input table is associative or not

property is_commutative: bool¶

check commutativity of a Cayley table

Returns: whether the input table is commutative or not
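
The property checks above can be expressed directly on a Cayley table stored as an integer tensor. The free functions below are a hypothetical sketch (they are not the class's own code) and rely only on PyTorch advanced indexing.

```python
import torch


def is_associative(table: torch.Tensor) -> bool:
    # table[table][a, b, c] is (ab)c and table[:, table][a, b, c] is a(bc)
    return torch.equal(table[table], table[:, table])


def is_commutative(table: torch.Tensor) -> bool:
    return torch.equal(table, table.T)


def identity(table: torch.Tensor) -> int:
    indices = torch.arange(table.shape[0])
    for e in range(table.shape[0]):
        # e is an identity if both its row and its column are 0, 1, ..., n - 1
        if torch.equal(table[e], indices) and torch.equal(table[:, e], indices):
            return e
    return -1


def has_inverses(table: torch.Tensor) -> bool:
    # ax = b and xa = b are solvable for all a, b exactly when every row
    # and every column of the table is a permutation of 0, 1, ..., n - 1
    indices = torch.arange(table.shape[0])
    rows = torch.equal(table.sort(dim=1).values, indices.expand_as(table))
    columns = torch.equal(
        table.sort(dim=0).values, indices.unsqueeze(1).expand_as(table)
    )
    return rows and columns


z2 = torch.tensor([[0, 1], [1, 0]])
other = torch.tensor([[1, 1], [0, 1]])
```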

property next_magma: Magma¶

goes to the next magma Cayley table in their lexicographical order

>>> Magma(torch.tensor([[0, 1], [1, 0]])).next_magma
tensor([[0, 1],
        [1, 1]])
>>> Magma(torch.tensor([[0, 1], [1, 1]])).next_magma
tensor([[1, 0],
        [0, 0]])
>>> Magma(torch.tensor([[1, 1], [1, 1]])).next_magma
Traceback (most recent call last):
    ...
ValueError: there is no next magma!

Returns: another magma
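
One way to read the doctests above is that a flattened Cayley table is treated as a number written in base \(n\), and the next magma is that number plus one. The pure-Python sketch below follows this reading; the function name next_table is hypothetical and nested lists stand in for tensors.

```python
from typing import List


def next_table(table: List[List[int]]) -> List[List[int]]:
    n = len(table)
    # work on a flattened copy so the input is left untouched
    flat = [cell for row in table for cell in row]
    position = len(flat) - 1
    # carry: trailing cells equal to n - 1 wrap around to 0
    while position >= 0 and flat[position] == n - 1:
        flat[position] = 0
        position -= 1
    if position < 0:
        raise ValueError("there is no next magma!")
    flat[position] += 1
    return [flat[row * n:(row + 1) * n] for row in range(n)]
```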

property probabilistic_cube: torch.Tensor¶

a 3d array \(a\) where \(a_{ijk}=P\left\{e_ie_j=e_k\right\}\)

Returns: a probabilistic cube representation of a Cayley table

random_isomorphism() → torch.Tensor¶

gets a random Cayley table isomorphic to self.cayley_table, for example

>>> Magma(torch.tensor([[0, 0], [0, 0]])).random_isomorphism()
tensor([[1, 1],
        [1, 1]])

Precise Guess Loss¶

class neural_semigroups.PreciseGuessLoss(*args: Any, **kwargs: Any)¶

loss for comparing probabilistic Cayley cubes precisely

forward(predicted_cubes: torch.Tensor, target_cubes: torch.Tensor) → torch.Tensor¶: finds a percentage of predicted Cayley tables, identical to the target ones

Datasets¶

Random Dataset¶

class neural_semigroups.datasets.RandomDataset(*args: Any, **kwargs: Any)¶

an iterable dataset having fixed length and returning random tensors of pre-defined shape

>>> data = RandomDataset(2, ([5, 2], [1]))
>>> print([item.shape for item in data[1]])
[torch.Size([5, 2]), torch.Size([1])]
>>> for row in data:
...     print([item.shape for item in row])
...     break
[torch.Size([5, 2]), torch.Size([1])]
>>> data = RandomDataset(3, [4, 4])
>>> print(data[1].shape)
torch.Size([4, 4])
>>> for row in data:
...     print(row.shape)
...     break
torch.Size([4, 4])
>>> print(len(data))
3

__init__(data_size: int, data_dim: Union[torch.Size, Tuple[torch.Size, ...]])¶

Semigroups Dataset¶

class neural_semigroups.datasets.SemigroupsDataset(*args: Any, **kwargs: Any)¶

an extension of torch.utils.data.TensorDataset similar to a private class torchvision.datasets.vision.VisionDataset

__init__(root: str, cardinality: int, transform: Optional[Callable] = None)¶

Parameters

root – root directory of dataset
cardinality – a semigroup cardinality to use.
transform – a function/transform that takes in a Cayley table and returns a transformed version.

Smallsemi Dataset¶

class neural_semigroups.datasets.Smallsemi(*args: Any, **kwargs: Any)¶

a torch.utils.data.Dataset wrapper for the data from https://www.gap-system.org/Packages/smallsemi.html

>>> import os
>>> import shutil
>>> from neural_semigroups.constants import TEST_TEMP_DATA
>>> shutil.rmtree(TEST_TEMP_DATA, ignore_errors=True)
>>> os.mkdir(TEST_TEMP_DATA)
>>> smallsemi = Smallsemi(root=TEST_TEMP_DATA, cardinality=2)
Traceback (most recent call last):
   ...
ValueError: test_temp_data must have exactly one version of smallsemi
>>> smallsemi = Smallsemi(
...     root=TEST_TEMP_DATA,
...     cardinality=2,
...     download=True,
...     transform=lambda x: x
... )
>>> smallsemi[0][0]
tensor([[0, 0],
        [0, 0]])

__init__(root: str, cardinality: int, download: bool = False, transform: Optional[Callable] = None)¶

Parameters

root – root directory of dataset where smallsemi-*/data/data2to7/ exists.
cardinality – a semigroup cardinality to use. Corresponds to data{cardinality}.gl.gz.
download – if true, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again.
transform – a function/transform that takes in a Cayley table and returns a transformed version.

download() → None¶: downloads, unzips and moves smallsemi data

load_data_and_labels_from_smallsemi() → None¶: loads data from smallsemi package

Mace4 Semigroups Dataset¶

class neural_semigroups.datasets.Mace4Semigroups(*args: Any, **kwargs: Any)¶

a torch.utils.data.Dataset wrapper for the data of mace4 output stored in a sqlite database

>>> import shutil
>>> from neural_semigroups.constants import TEST_TEMP_DATA
>>> import os
>>> from neural_semigroups.generate_data_with_mace4 import (
... generate_data_with_mace4)
>>> shutil.rmtree(TEST_TEMP_DATA, ignore_errors=True)
>>> os.mkdir(TEST_TEMP_DATA)
>>> database = os.path.join(TEST_TEMP_DATA,"test.db")
>>> torch.manual_seed(42) 
<torch...
>>> generate_data_with_mace4([
... "--max_dim", "2",
... "--min_dim", "2",
... "--number_of_tasks", "1",
... "--database_name", database])
>>> mace4_semigroups = Mace4Semigroups(
...     root=database,
...     cardinality=2,
...     transform=lambda x: x
... )
>>> mace4_semigroups[0][0]
tensor([[0, 0],
        [0, 0]])
>>> mace4_semigroups.get_table_from_output("not a mace4 output file")
Traceback (most recent call last):
    ...
ValueError: wrong mace4 output file format!

__init__(cardinality: int, root: str, transform: Optional[Callable] = None)¶

Parameters

root – a full path to an sqlite database file which has a table mace_output with a string column output
cardinality – the cardinality of semigroups
transform – a function/transform that takes a Cayley table and returns a transformed version.

get_additional_info(cursor: Cursor) → int¶

gets some info from an SQLite database with mace4 outputs

Parameters: cursor – an SQLite database cursor
Returns: a total number of rows in a table, a magma dimension

get_table_from_output(output: str) → torch.Tensor¶

gets a Cayley table of a magma from the output of mace4

Parameters: output – output of mace4
Returns: a Cayley table

load_data_from_mace_output() → None¶: loads data generated by mace4 from an sqlite database

utils¶

A collection of different functions used by other modules.

neural_semigroups.utils.random_semigroup(dim: int, maximal_tries: int) → Tuple[bool, torch.Tensor]¶

randomly search for a semigroup Cayley table. Not recommended to use with dim > 4

Parameters

dim – number of elements in a semigroup
maximal_tries – how many times to try at most

Returns

a pair (whether the Cayley table is associative, a Cayley table of a magma)
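
A random search of this kind can be sketched as repeated sampling followed by an associativity check. The function name random_semigroup_sketch is hypothetical, uniform sampling with torch.randint is an assumption, and the sketch expects maximal_tries to be at least 1.

```python
from typing import Tuple

import torch


def random_semigroup_sketch(
    dim: int, maximal_tries: int
) -> Tuple[bool, torch.Tensor]:
    for _ in range(maximal_tries):
        # a uniformly random Cayley table with entries in 0, ..., dim - 1
        table = torch.randint(dim, (dim, dim))
        # (ab)c == a(bc) for all triples
        if torch.equal(table[table], table[:, table]):
            return True, table
    # no associative table found: report failure with the last candidate
    return False, table


found, table = random_semigroup_sketch(dim=2, maximal_tries=100)
```

The share of associative tables among all \(n^{n^2}\) tables shrinks quickly as \(n\) grows, which is consistent with the advice not to use the search for dim > 4.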

neural_semigroups.utils.get_magma_by_index(cardinality: int, index: int) → Magma¶

find a magma from a lexicographical order by its index

Parameters

cardinality – the number of elements in a magma
index – an index of magma in a lexicographical order

Returns

a magma with a given index
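
If the lexicographical order starts from the all-zeros table at index 0 (an assumption consistent with the next_magma doctests, but not stated explicitly here), the table at a given index is the base-\(n\) expansion of that index. A pure-Python sketch with the hypothetical name table_by_index:

```python
from typing import List


def table_by_index(cardinality: int, index: int) -> List[List[int]]:
    cells = cardinality ** 2
    if not 0 <= index < cardinality ** cells:
        raise ValueError("index is out of range")
    digits = []
    for _ in range(cells):
        # peel off base-n digits, least significant first
        index, digit = divmod(index, cardinality)
        digits.append(digit)
    digits.reverse()
    return [
        digits[row * cardinality:(row + 1) * cardinality]
        for row in range(cardinality)
    ]
```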

neural_semigroups.utils.import_smallsemi_format(lines: List[str]) → torch.Tensor¶

imports lines in a format used by smallsemi GAP package. Format description:

filename is of a form data[n].gl, \(1\le n\le7\)
lines are separated by a pair of symbols \r\n
there are exactly \(n^2\) lines in a file
the first line is a header starting with ‘#’ symbol
each line is a string of \(N\) digits from \(0\) to \(n-1\)
\(N\) is the number of semigroups in the database
each column represents a serialised Cayley table
the database contains only cells starting from the second
the first cell of each Cayley table is assumed to be filled with 0

Parameters: lines – lines read from a file of smallsemi format
Returns: a list of Cayley tables
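
The format description above can be turned into a small parser. The sketch below is one reading of that description, not the library's code: the name parse_smallsemi_lines is hypothetical, nested lists stand in for tensors, and line separators are assumed to be already split off or strippable.

```python
from math import isqrt
from typing import List


def parse_smallsemi_lines(lines: List[str]) -> List[List[List[int]]]:
    # a header plus n^2 - 1 data lines gives n^2 lines in total
    n = isqrt(len(lines))
    rows = [line.strip() for line in lines[1:]]  # drop the '#' header
    tables = []
    for column in range(len(rows[0])):
        # the first cell is not stored and is assumed to be 0
        cells = [0] + [int(row[column]) for row in rows]
        tables.append([cells[i * n:(i + 1) * n] for i in range(n)])
    return tables


# two semigroups on two elements, one per column
example = ["# header", "00", "01", "01"]
tables = parse_smallsemi_lines(example)
```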

neural_semigroups.utils.download_file_from_url(url: str, filename: str, buffer_size: int = 1024) → None¶

downloads some file from the Web to a specified destination

>>> import os
>>> TEMP_FILE = "test.html"
>>> if os.path.exists(TEMP_FILE):
...     os.remove(TEMP_FILE)
>>> download_file_from_url("https://python.org/", TEMP_FILE)
>>> os.path.exists(TEMP_FILE)
True

Parameters

url – a valid HTTP URL
filename – a valid filename
buffer_size – a number of bytes to read from URL at once

neural_semigroups.utils.find_substring_by_pattern(strings: List[str], starts_with: str, ends_before: str) → str¶

searches for the first occurrence of a given pattern in a list of strings

>>> some_strings = ["one", "two", "three"]
>>> find_substring_by_pattern(some_strings, "t", "o")
'tw'
>>> find_substring_by_pattern(some_strings, "four", "five")
Traceback (most recent call last):
   ...
ValueError: pattern four.*five not found

Parameters

strings – a list of strings where the pattern is searched for
starts_with – the first letters of a pattern
ends_before – a substring which marks the beginning of something different

Returns

a pattern which starts with starts_with and ends before ends_before

neural_semigroups.utils.get_newest_file(dir_path: str) → str¶

get the last modified file from a directory

>>> import shutil
>>> from pathlib import Path
>>> from os import makedirs, path
>>> from neural_semigroups.constants import TEST_TEMP_DATA
>>> shutil.rmtree(path.join(TEST_TEMP_DATA, "tmp"), ignore_errors=True)
>>> makedirs(path.join(TEST_TEMP_DATA, "tmp"))
>>> Path(path.join(TEST_TEMP_DATA, "tmp", "one")).touch()
>>> from time import sleep
>>> sleep(0.01)
>>> Path(path.join(TEST_TEMP_DATA, "tmp", "two")).touch()
>>> get_newest_file(path.join(TEST_TEMP_DATA, "tmp"))
'test_temp_data/tmp/two'

Parameters: dir_path – a directory path
Returns: the last modified file’s name
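
A standard-library sketch of the same idea picks the directory entry with the largest modification time. The name newest_file is hypothetical, and the demonstration sets modification times explicitly with os.utime instead of sleeping between writes.

```python
import os
import tempfile


def newest_file(dir_path: str) -> str:
    """Returns the path of the most recently modified entry of dir_path."""
    paths = [os.path.join(dir_path, name) for name in os.listdir(dir_path)]
    return max(paths, key=os.path.getmtime)


with tempfile.TemporaryDirectory() as tmp:
    for age, name in enumerate(["one", "two"]):
        path = os.path.join(tmp, name)
        open(path, "w").close()
        # "two" gets the later modification time
        os.utime(path, (age, age))
    newest = os.path.basename(newest_file(tmp))
```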

neural_semigroups.utils.make_discrete(cayley_cubes: torch.Tensor) → torch.Tensor¶

transforms a batch of probabilistic Cayley cubes in the following way:

maximal probabilities in the last dimension become ones
all other probabilities become zeros

>>> make_discrete(torch.tensor([
...    [[[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.2, 0.8]]],
...    [[[0.7, 0.3], [0.3, 0.7]], [[0.7, 0.3], [0.3, 0.7]]],
... ]))
tensor([[[[1., 0.],
          [0., 1.]],

         [[1., 0.],
          [0., 1.]]],


        [[[1., 0.],
          [0., 1.]],

         [[1., 0.],
          [0., 1.]]]])

Parameters: cayley_cubes – a batch of probabilistic cubes representing Cayley tables
Returns: a batch of probabilistic cubes filled in with 0 or 1
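
The transformation amounts to an argmax over the last dimension followed by one-hot encoding. The sketch below (with the hypothetical name make_discrete_sketch) shows this; how the library breaks ties between equal probabilities is not specified here, and the sketch simply keeps whichever index argmax returns.

```python
import torch
from torch.nn.functional import one_hot


def make_discrete_sketch(cayley_cubes: torch.Tensor) -> torch.Tensor:
    # index of the most probable product for every pair (i, j)
    winners = cayley_cubes.argmax(dim=-1)
    # back to a cube of zeros and ones with the same dtype as the input
    return one_hot(winners, cayley_cubes.shape[-1]).to(cayley_cubes.dtype)


discrete = make_discrete_sketch(torch.tensor([
    [[[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.2, 0.8]]],
    [[[0.7, 0.3], [0.3, 0.7]], [[0.7, 0.3], [0.3, 0.7]]],
]))
```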

neural_semigroups.utils.count_different(one: torch.Tensor, two: torch.Tensor) → torch.Tensor¶

given two batches of the same size, counts the positions at which the tensor from the first batch differs from the corresponding tensor in the second

Parameters

one – one batch of tensors
two – another batch of tensors

Returns

the number of different tensors
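
A possible reading of this function, sketched under the assumption that two tensors count as different when at least one of their entries differs (the name count_different_sketch is hypothetical):

```python
import torch


def count_different_sketch(one: torch.Tensor, two: torch.Tensor) -> torch.Tensor:
    # compare element-wise, then collapse everything but the batch axis
    mismatch = (one != two).flatten(start_dim=1).any(dim=1)
    return mismatch.sum()


one = torch.tensor([[[0, 1], [1, 0]], [[0, 0], [0, 0]]])
two = torch.tensor([[[0, 1], [1, 0]], [[0, 0], [0, 1]]])
different = count_different_sketch(one, two)
```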

neural_semigroups.utils.hide_cells(cayley_table: torch.Tensor, number_of_cells: int) → torch.Tensor¶

set several cells in a Cayley table to \(-1\)

>>> torch.manual_seed(42) 
<torch...
>>> hide_cells(torch.tensor([[0, 1], [2, 3]]), 2).cpu()
tensor([[ 0,  1],
        [-1, -1]])

Parameters

cayley_table – a Cayley table
number_of_cells – a number of cells to hide

Returns

a Cayley table with hidden cells
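
Hiding cells can be sketched by choosing distinct random positions in the flattened table. The name hide_cells_sketch, the use of torch.randperm, and the decision to work on a copy rather than in place are all assumptions of this example, so its output for a given seed need not match the doctest above.

```python
import torch


def hide_cells_sketch(cayley_table: torch.Tensor, number_of_cells: int) -> torch.Tensor:
    # copy first so that the caller's table is left untouched
    flat = cayley_table.clone().flatten()
    # randperm yields distinct positions, so exactly number_of_cells are hidden
    positions = torch.randperm(flat.numel())[:number_of_cells]
    flat[positions] = -1
    return flat.view_as(cayley_table)


table = torch.tensor([[0, 1], [2, 3]])
hidden = hide_cells_sketch(table, 2)
```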

neural_semigroups.utils.read_whole_file(filename: str) → str¶

reads the whole file into a string, for example

>>> read_whole_file("README.rst").split("\n")[3]
'Neural Semigroups'

Parameters: filename – a name of the file to read
Returns: whole contents of the file

neural_semigroups.utils.partial_table_to_cube(table: torch.Tensor) → torch.Tensor¶

creates a probabilistic cube from a partially filled Cayley table; \(-1\) is translated to \(\frac1n\) where \(n\) is the table’s cardinality, for example

>>> partial_table_to_cube(torch.tensor([[0, -1], [0, 0]])).cpu()
tensor([[[1.0000, 0.0000],
         [0.5000, 0.5000]],

        [[1.0000, 0.0000],
         [1.0000, 0.0000]]])

Parameters: table – a Cayley table, partially filled by -1’s
Returns: a probabilistic cube
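
The conversion can be sketched as one-hot encoding of the known cells plus a uniform distribution for the hidden ones. The name partial_table_to_cube_sketch is hypothetical; the sketch only illustrates the rule stated above.

```python
import torch
from torch.nn.functional import one_hot


def partial_table_to_cube_sketch(table: torch.Tensor) -> torch.Tensor:
    n = table.shape[0]
    # hidden cells (-1) are temporarily encoded as 0 to keep one_hot valid
    cube = one_hot(table.clamp_min(0), n).float()
    # every hidden cell becomes the uniform distribution over n elements
    cube[table == -1] = 1.0 / n
    return cube


cube = partial_table_to_cube_sketch(torch.tensor([[0, -1], [0, 0]]))
```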

neural_semigroups.utils.connect_to_db(database_name: str) → Cursor¶

open a connection to an SQLite database

Parameters: database_name – filename of a database
Returns: a cursor to the database

neural_semigroups.utils.create_table_if_not_exists(cursor: Cursor, table_name: str, columns: List[str]) → None¶

create a table if it does not exist

Parameters

cursor – a cursor to the database where to create a table
table_name – what table to create
columns – a list of strings of format “COLUMN_NAME COLUMN_TYPE”

neural_semigroups.utils.insert_values_into_table(cursor: Cursor, table_name: str, values: Tuple[str, ...]) → None¶

inserts values into a table

Parameters

cursor – a cursor to database where the target table is located
table_name – the target table
values – values to insert into the target table
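
The three database helpers map naturally onto the standard sqlite3 module. The sketch below uses hypothetical function names and an in-memory database; it illustrates the pattern and is not the package's own code. Table and column names cannot be bound as SQL parameters, so they are interpolated into the statement and should come from trusted input only.

```python
import sqlite3
from typing import List, Tuple


def create_table_sketch(
    cursor: sqlite3.Cursor, table_name: str, columns: List[str]
) -> None:
    # columns are strings of the form "COLUMN_NAME COLUMN_TYPE"
    cursor.execute(
        f"CREATE TABLE IF NOT EXISTS {table_name} ({', '.join(columns)})"
    )


def insert_values_sketch(
    cursor: sqlite3.Cursor, table_name: str, values: Tuple[str, ...]
) -> None:
    # values are passed as bound parameters, one placeholder per value
    placeholders = ", ".join("?" for _ in values)
    cursor.execute(f"INSERT INTO {table_name} VALUES ({placeholders})", values)


connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
create_table_sketch(cursor, "mace_output", ["output STRING"])
# a second call is a no-op because the table already exists
create_table_sketch(cursor, "mace_output", ["output STRING"])
insert_values_sketch(cursor, "mace_output", ("some mace4 output",))
rows = cursor.execute("SELECT output FROM mace_output").fetchall()
```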

neural_semigroups.utils.gunzip(archive_path: str) → None¶

extracts a GZIP file in the same folder

Parameters: archive_path – a path ending with .gz
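
Extraction next to the archive can be sketched with the standard gzip and shutil modules. The name gunzip_sketch is hypothetical, and the target name is assumed to be the archive path without its .gz suffix.

```python
import gzip
import os
import shutil
import tempfile


def gunzip_sketch(archive_path: str) -> None:
    # "data2.gl.gz" is extracted to "data2.gl" in the same folder
    target_path = archive_path[: -len(".gz")]
    with gzip.open(archive_path, "rb") as archive:
        with open(target_path, "wb") as target:
            # stream the contents instead of loading them into memory
            shutil.copyfileobj(archive, target)


with tempfile.TemporaryDirectory() as tmp:
    archive_path = os.path.join(tmp, "data2.gl.gz")
    with gzip.open(archive_path, "wb") as archive:
        archive.write(b"# header\r\n")
    gunzip_sketch(archive_path)
    with open(os.path.join(tmp, "data2.gl"), "rb") as extracted:
        contents = extracted.read()
```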

generate_data_with_mace4¶

A script which generates semigroups with mace4 and saves them in an sqlite database.

neural_semigroups.generate_data_with_mace4.generate_data_with_mace4(input_args: Optional[List[str]] = None) → None¶

Parameters: input_args – a list of arguments (if None then ones from the command line are used)