Package Documentation

Associator Loss

class neural_semigroups.AssociatorLoss(*args: Any, **kwargs: Any)

probabilistic associator loss

__init__(discrete: bool = False)
Parameters

discrete – when False, the KL divergence is used to measure associativity in a continuous way. When True, returns 1 for associative and 0 for non-associative samples.

forward(cayley_cubes: torch.Tensor) → torch.Tensor

finds a probabilistic associator of a given probabilistic Cayley cube

First, we build two 4-index tensors representing probability distributions of products \(a\left(bc\right)\) and \(\left(ab\right)c\), respectively:

\(T_{ijkl}=P\left\{e_i\left(e_je_k\right)=e_l\right\}= \sum\limits_{m=1}^nP\left\{e_ie_m=e_l\vert e_je_k=e_m\right\} P\left\{e_je_k=e_m\right\}=\sum\limits_{m=1}^na_{iml}a_{jkm}\)

and

\(T\prime_{ijkl}=P\left\{\left(e_ie_j\right)e_k=e_l\right\}= \sum\limits_{m=1}^nP\left\{e_me_k=e_l\vert e_ie_j=e_m\right\} P\left\{e_ie_j=e_m\right\}=\sum\limits_{m=1}^na_{mkl}a_{ijm}\)

Then we calculate Kullback-Leibler divergence between \(T_{ijkl}\) and \(T\prime_{ijkl}\) to find a continuous measure of associativity of the input table.

Parameters

cayley_cubes – a batch of probabilistic Cayley cubes

Returns

the probabilistic associator
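
The construction above can be traced in plain Python for a single cube. This is only an illustrative sketch, not the library's implementation: the helper names (`one_hot_cube`, `associator_tensors`, `associator_kl`), the direction of the divergence, the clamping constant `eps` and the averaging over all \(n^3\) triples are assumptions.

```python
import math


def one_hot_cube(table):
    """Turn a Cayley table into a cube with a[i][j][k] = P{e_i e_j = e_k}."""
    n = len(table)
    return [
        [[1.0 if table[i][j] == k else 0.0 for k in range(n)] for j in range(n)]
        for i in range(n)
    ]


def associator_tensors(cube):
    """Build T (for a(bc)) and T' (for (ab)c) from one probabilistic cube."""
    r = range(len(cube))
    # T[i][j][k][l] = sum_m a[i][m][l] * a[j][k][m]
    t = [[[[sum(cube[i][m][l] * cube[j][k][m] for m in r) for l in r]
           for k in r] for j in r] for i in r]
    # T'[i][j][k][l] = sum_m a[m][k][l] * a[i][j][m]
    t_prime = [[[[sum(cube[m][k][l] * cube[i][j][m] for m in r) for l in r]
                 for k in r] for j in r] for i in r]
    return t, t_prime


def associator_kl(cube, eps=1e-12):
    """Mean over (i, j, k) of KL(T[i][j][k] || T'[i][j][k]) (assumed reduction)."""
    n = len(cube)
    t, t_prime = associator_tensors(cube)
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for p, q in zip(t[i][j][k], t_prime[i][j][k]):
                    if p > 0.0:
                        # clamp q so that a zero in T' gives a large finite penalty
                        total += p * math.log(p / max(q, eps))
    return total / n ** 3
```

For the table `[[0, 1], [1, 0]]` of the cyclic group of order two the two tensors coincide and the sketch returns zero; for the non-associative table `[[1, 1], [0, 1]]` it returns a positive number.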

Constant Baseline

class neural_semigroups.ConstantBaseline(*args: Any, **kwargs: Any)

A model that always fills in the same number

__init__(cardinality: int, fill_in_with: int = 0)

initializes a constant model

>>> ConstantBaseline(2, 3)
Traceback (most recent call last):
    ...
ValueError: `fill_in_with` should be non-negative and less than `cardinality`, got 3 >= 2
>>> ConstantBaseline(2, 1).constant_distribution.cpu()
tensor([0., 1.])
Parameters
  • cardinality – the number of elements in a magma

  • fill_in_with – an item which will be suggested as a correct answer

forward(cayley_cube: torch.Tensor) → torch.Tensor

forward pass inherited from Module

>>> ConstantBaseline(2, 1)(torch.tensor([
...     [[[0., 1.], [0.5, 0.5]], [[1., 0.], [0., 1.]]],
...     [[[0., 1.], [1.0, 0.0]], [[0.5, 0.5], [0., 1.]]]
... ]).to(CURRENT_DEVICE)).cpu()
tensor([[[[0., 1.],
          [0., 1.]],

         [[1., 0.],
          [0., 1.]]],


        [[[0., 1.],
          [1., 0.]],

         [[0., 1.],
          [0., 1.]]]])
Parameters

cayley_cube – probabilistic representation of a magma

Returns

a batch of constant values (set in the constructor)

Cyclic Group

class neural_semigroups.CyclicGroup(cardinality: int)

finite cyclic group

__init__(cardinality: int)
Parameters

cardinality – number of elements in a cyclic group
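
As an illustration of what such a group looks like, the Cayley table of a finite cyclic group can be written as addition modulo its cardinality. The sketch below assumes that elements are labelled 0 to n-1 with 0 as the identity; the helper name `cyclic_group_table` is hypothetical and the labelling used by the class itself is not confirmed here.

```python
def cyclic_group_table(cardinality):
    """Cayley table of addition modulo `cardinality` (assumed labelling)."""
    return [
        [(i + j) % cardinality for j in range(cardinality)]
        for i in range(cardinality)
    ]
```

With `cardinality=3` this gives `[[0, 1, 2], [1, 2, 0], [2, 0, 1]]`.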

Denoising Autoencoder for Magmas

class neural_semigroups.MagmaDAE(*args: Any, **kwargs: Any)

Denoising Autoencoder for probability Cayley cubes of magmas

__init__(cardinality: int, hidden_dims: List[int], do_reparametrization: bool = False)
Parameters
  • cardinality – the number of elements in a magma

  • hidden_dims – a list of sizes of hidden layers of the encoder and the decoder

  • do_reparametrization – if True, adds a reparametrization trick

decode(encoded_input: torch.Tensor) → torch.Tensor

represent an embedding vector as something with size aligned with the input

Parameters

encoded_input – an embedding vector

Returns

a vector of values between 0 and 1 that can be read as probabilities

encode(input_with_noise: torch.Tensor) → torch.Tensor

represent input cube as an embedding vector

Parameters

input_with_noise – a tensor with two indices

Returns

some tensor with two indices and non-negative values

forward(cayley_cubes: torch.Tensor) → torch.Tensor

forward pass inherited from Module

Parameters

cayley_cubes – a batch of probabilistic representations of magmas

Returns

auto-encoded probabilistic representations of magmas

reparametrize(mu_and_log_sigma: torch.Tensor) → torch.Tensor

do a reparametrization trick

Parameters

mu_and_log_sigma – vector of expectation and standard deviation

Returns

sample from a distribution
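
For orientation, the usual reparametrization trick draws standard normal noise and shifts and scales it, \(z = \mu + \sigma\varepsilon\). The sketch below is a plain-Python illustration under assumptions: it takes the mean and the logarithm of the standard deviation as two separate lists (how the library packs them into one tensor is not shown in this document), and the name `reparametrize_sketch` is hypothetical.

```python
import math
import random


def reparametrize_sketch(mu, log_sigma, rng=random):
    """Sample z = mu + exp(log_sigma) * eps elementwise, with eps ~ N(0, 1)."""
    return [
        m + math.exp(s) * rng.gauss(0.0, 1.0)
        for m, s in zip(mu, log_sigma)
    ]
```

Because the randomness enters only through `eps`, the sample stays a differentiable function of `mu` and `log_sigma` when the same formula is written with tensors.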

Magma

class neural_semigroups.Magma(cayley_table: Optional[torch.Tensor] = None, cardinality: Optional[int] = None)

an implementation of a magma (a set with a binary operation)

__init__(cayley_table: Optional[torch.Tensor] = None, cardinality: Optional[int] = None)

constructs a new magma

>>> seed = torch.manual_seed(11)
>>> Magma(cardinality=2)
tensor([[1, 1],
        [0, 1]])
>>> Magma(torch.tensor([[0, 1], [1, 0]]))
tensor([[0, 1],
        [1, 0]])
>>> Magma()
Traceback (most recent call last):
    ...
ValueError: at least one argument must be given
>>> Magma(torch.tensor([[0]]), cardinality=2)
Traceback (most recent call last):
    ...
ValueError: cayley_table must be a `torch.Tensor` of shape (cardinality, cardinality)
Parameters
  • cayley_table – a Cayley table for a magma. If not provided, a random table is generated.

  • cardinality – the number of elements in the magma; used to generate a random table when cayley_table is not provided

property cardinality: int

number of elements in a magma

property has_inverses: bool

check whether there are solutions of equations \(ax=b\) and \(xa=b\)
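
On a finite table, \(ax=b\) is solvable for all \(a\) and \(b\) exactly when every row contains every element, and \(xa=b\) exactly when every column does. A minimal sketch of that check on a table given as nested lists (the name `has_inverses_sketch` is hypothetical and this is not necessarily how the property is implemented):

```python
def has_inverses_sketch(table):
    """True if each row and each column of the table contains every element."""
    n = len(table)
    elements = set(range(n))
    rows_ok = all(set(row) == elements for row in table)
    columns_ok = all(
        {table[i][j] for i in range(n)} == elements for j in range(n)
    )
    return rows_ok and columns_ok
```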

property identity: int

find an identity element in a Cayley table

Returns

the index of the identity element or -1 if there is no identity

property is_associative: bool

check associativity of a Cayley table

Returns

whether the input table is associative or not

property is_commutative: bool

check commutativity of a Cayley table

Returns

whether the input table is commutative or not
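
The identity, associativity and commutativity checks can be stated directly on a Cayley table. The following sketch works on nested lists rather than tensors; the function names are hypothetical and the code only illustrates the definitions, not the library's implementation.

```python
def find_identity(table):
    """Index e with e*x == x and x*e == x for all x, or -1 if there is none."""
    n = len(table)
    for e in range(n):
        if all(table[e][x] == x and table[x][e] == x for x in range(n)):
            return e
    return -1


def is_associative(table):
    """Check (a*b)*c == a*(b*c) for every triple of elements."""
    r = range(len(table))
    return all(
        table[table[a][b]][c] == table[a][table[b][c]]
        for a in r for b in r for c in r
    )


def is_commutative(table):
    """Check a*b == b*a for every pair of elements."""
    r = range(len(table))
    return all(table[a][b] == table[b][a] for a in r for b in r)
```

The associativity check visits all \(n^3\) triples, so its cost grows cubically with the cardinality.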

property next_magma: Magma

goes to the next magma Cayley table in their lexicographical order

>>> Magma(torch.tensor([[0, 1], [1, 0]])).next_magma
tensor([[0, 1],
        [1, 1]])
>>> Magma(torch.tensor([[0, 1], [1, 1]])).next_magma
tensor([[1, 0],
        [0, 0]])
>>> Magma(torch.tensor([[1, 1], [1, 1]])).next_magma
Traceback (most recent call last):
    ...
ValueError: there is no next magma!
Returns

another magma
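
One way to read the examples above: flatten the table row by row, treat the cells as digits of a base-\(n\) number, and add one. The sketch below follows that reading on nested lists; the name `next_table` is hypothetical and the row-by-row ordering is an assumption that happens to reproduce the three examples.

```python
def next_table(table):
    """Lexicographic successor of a Cayley table given as nested lists."""
    n = len(table)
    cells = [cell for row in table for cell in row]
    # add one to the base-n number whose last digit is the last cell
    position = len(cells) - 1
    while position >= 0 and cells[position] == n - 1:
        cells[position] = 0
        position -= 1
    if position < 0:
        raise ValueError("there is no next magma!")
    cells[position] += 1
    return [cells[row * n:(row + 1) * n] for row in range(n)]
```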

property probabilistic_cube: torch.Tensor

a 3d array \(a\) where \(a_{ijk}=P\left\{e_ie_j=e_k\right\}\)

Returns

a probabilistic cube representation of a Cayley table

random_isomorphism() → torch.Tensor

get some Cayley table isomorphic to self.cayley_table, for example

>>> Magma(torch.tensor([[0, 0], [0, 0]])).random_isomorphism()
tensor([[1, 1],
        [1, 1]])
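
An isomorphic table is obtained by relabelling the elements with a permutation \(p\), so that \(p(e_i)\,p(e_j) = p(e_ie_j)\). A minimal sketch on nested lists, with hypothetical helper names and no claim about how the method draws its permutation:

```python
import random


def relabel(table, permutation):
    """Apply a permutation of the elements to a Cayley table."""
    n = len(table)
    new_table = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            new_table[permutation[i]][permutation[j]] = permutation[table[i][j]]
    return new_table


def random_isomorphism_sketch(table, rng=random):
    """Relabel the table with a uniformly shuffled permutation."""
    permutation = list(range(len(table)))
    rng.shuffle(permutation)
    return relabel(table, permutation)
```

With the permutation `[1, 0]`, the all-zero table `[[0, 0], [0, 0]]` becomes `[[1, 1], [1, 1]]`, the table shown in the example above.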

Precise Guess Loss

class neural_semigroups.PreciseGuessLoss(*args: Any, **kwargs: Any)

loss for comparing probabilistic Cayley cubes precisely

forward(predicted_cubes: torch.Tensor, target_cubes: torch.Tensor) → torch.Tensor

finds the percentage of predicted Cayley tables that are identical to the target ones

Datasets

Random Dataset

class neural_semigroups.datasets.RandomDataset(*args: Any, **kwargs: Any)

an iterable dataset having fixed length and returning random tensors of pre-defined shape

>>> data = RandomDataset(2, ([5, 2], [1]))
>>> print([item.shape for item in data[1]])
[torch.Size([5, 2]), torch.Size([1])]
>>> for row in data:
...     print([item.shape for item in row])
...     break
[torch.Size([5, 2]), torch.Size([1])]
>>> data = RandomDataset(3, [4, 4])
>>> print(data[1].shape)
torch.Size([4, 4])
>>> for row in data:
...     print(row.shape)
...     break
torch.Size([4, 4])
>>> print(len(data))
3
__init__(data_size: int, data_dim: Union[torch.Size, Tuple[torch.Size, ...]])

Semigroups Dataset

class neural_semigroups.datasets.SemigroupsDataset(*args: Any, **kwargs: Any)

an extension of torch.utils.data.TensorDataset similar to a private class torchvision.datasets.vision.VisionDataset

__init__(root: str, cardinality: int, transform: Optional[Callable] = None)
Parameters
  • root – root directory of dataset

  • cardinality – a semigroup cardinality to use.

  • transform – a function/transform that takes in a Cayley table and returns a transformed version.

Smallsemi Dataset

class neural_semigroups.datasets.Smallsemi(*args: Any, **kwargs: Any)

a torch.utils.data.Dataset wrapper for the data from https://www.gap-system.org/Packages/smallsemi.html

>>> import os
>>> import shutil
>>> from neural_semigroups.constants import TEST_TEMP_DATA
>>> shutil.rmtree(TEST_TEMP_DATA, ignore_errors=True)
>>> os.mkdir(TEST_TEMP_DATA)
>>> smallsemi = Smallsemi(root=TEST_TEMP_DATA, cardinality=2)
Traceback (most recent call last):
   ...
ValueError: test_temp_data must have exactly one version of smallsemi
>>> smallsemi = Smallsemi(
...     root=TEST_TEMP_DATA,
...     cardinality=2,
...     download=True,
...     transform=lambda x: x
... )
>>> smallsemi[0][0]
tensor([[0, 0],
        [0, 0]])
__init__(root: str, cardinality: int, download: bool = False, transform: Optional[Callable] = None)
Parameters
  • root – root directory of dataset where smallsemi-*/data/data2to7/ exist.

  • cardinality – a semigroup cardinality to use. Corresponds to data{cardinality}.gl.gz.

  • download – if true, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again.

  • transform – a function/transform that takes in a Cayley table and returns a transformed version.

download() → None

downloads, unzips and moves smallsemi data

load_data_and_labels_from_smallsemi() → None

loads data from smallsemi package

Mace4 Semigroups Dataset

class neural_semigroups.datasets.Mace4Semigroups(*args: Any, **kwargs: Any)

a torch.utils.data.Dataset wrapper for the data of mace4 output stored in a sqlite database

>>> import shutil
>>> from neural_semigroups.constants import TEST_TEMP_DATA
>>> import os
>>> from neural_semigroups.generate_data_with_mace4 import (
... generate_data_with_mace4)
>>> shutil.rmtree(TEST_TEMP_DATA, ignore_errors=True)
>>> os.mkdir(TEST_TEMP_DATA)
>>> database = os.path.join(TEST_TEMP_DATA,"test.db")
>>> torch.manual_seed(42) 
<torch...
>>> generate_data_with_mace4([
... "--max_dim", "2",
... "--min_dim", "2",
... "--number_of_tasks", "1",
... "--database_name", database])
>>> mace4_semigroups = Mace4Semigroups(
...     root=database,
...     cardinality=2,
...     transform=lambda x: x
... )
>>> mace4_semigroups[0][0]
tensor([[0, 0],
        [0, 0]])
>>> mace4_semigroups.get_table_from_output("not a mace4 output file")
Traceback (most recent call last):
    ...
ValueError: wrong mace4 output file format!
__init__(cardinality: int, root: str, transform: Optional[Callable] = None)
Parameters
  • root – a full path to an sqlite database file which has a table mace_output with a string column output

  • cardinality – the cardinality of semigroups

  • transform – a function/transform that takes a Cayley table and returns a transformed version.

get_additional_info(cursor: Cursor) → int

gets some info from an SQLite database with mace4 outputs

Parameters

cursor – an SQLite database cursor

Returns

a total number of rows in a table, a magma dimension

get_table_from_output(output: str) → torch.Tensor

gets a Cayley table of a magma from the output of mace4

Parameters

output – output of mace4

Returns

a Cayley table

load_data_from_mace_output() → None

loads data generated by mace4 from an sqlite database

utils

A collection of different functions used by other modules.

neural_semigroups.utils.random_semigroup(dim: int, maximal_tries: int) → Tuple[bool, torch.Tensor]

randomly search for a semigroup Cayley table. Not recommended to use with dim > 4

Parameters
  • dim – number of elements in a semigroup

  • maximal_tries – how many times to try at most

Returns

a pair (whether the Cayley table is associative, a Cayley table of a magma)
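
The random search described here amounts to drawing tables uniformly and testing each for associativity until one passes or the budget runs out. A plain-Python sketch under that reading (the names are hypothetical, and the library may sample or stop differently):

```python
import random


def is_associative(table):
    """Check (a*b)*c == a*(b*c) for every triple of elements."""
    r = range(len(table))
    return all(
        table[table[a][b]][c] == table[a][table[b][c]]
        for a in r for b in r for c in r
    )


def random_semigroup_sketch(dim, maximal_tries, rng=random):
    """Return (found, table), where table is the last random table drawn."""
    if maximal_tries < 1:
        raise ValueError("maximal_tries must be positive")
    for _ in range(maximal_tries):
        table = [[rng.randrange(dim) for _ in range(dim)] for _ in range(dim)]
        if is_associative(table):
            return True, table
    return False, table
```

The share of associative tables among all \(n^{n^2}\) tables shrinks quickly as \(n\) grows, which is consistent with the advice not to use the function for dim > 4.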

neural_semigroups.utils.get_magma_by_index(cardinality: int, index: int) → Magma

find a magma from a lexicographical order by its index

Parameters
  • cardinality – the number of elements in a magma

  • index – an index of magma in a lexicographical order

Returns

a magma with a given index
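
If the lexicographical order is read as counting in base \(n\) over the flattened table, the index can be converted to a table by extracting its base-\(n\) digits. The sketch below assumes a zero-based index with the all-zero table at index 0 and a row-by-row layout; both are assumptions, and `table_by_index` is a hypothetical name.

```python
def table_by_index(cardinality, index):
    """Cayley table whose row-major cells are the base-n digits of `index`."""
    number_of_cells = cardinality ** 2
    if not 0 <= index < cardinality ** number_of_cells:
        raise ValueError("index out of range")
    digits = []
    for _ in range(number_of_cells):
        index, digit = divmod(index, cardinality)
        digits.append(digit)
    digits.reverse()  # most significant digit first
    return [
        digits[row * cardinality:(row + 1) * cardinality]
        for row in range(cardinality)
    ]
```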

neural_semigroups.utils.import_smallsemi_format(lines: List[str]) → torch.Tensor

imports lines in a format used by smallsemi GAP package. Format description:

  • filename is of the form data[n].gl, \(1 \le n \le 7\)

  • lines are separated by a pair of symbols \r\n

  • there are exactly \(n^2\) lines in a file

  • the first line is a header starting with ‘#’ symbol

  • each line is a string of \(N\) digits from \(0\) to \(n-1\)

  • \(N\) is the number of semigroups in the database

  • each column represents a serialised Cayley table

  • the database contains only cells starting from the second

  • the first cell of each Cayley table is assumed to be filled with 0

Parameters

lines – lines read from a file of smallsemi format

Returns

a list of Cayley tables
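
The format description above can be turned into a small parser. The sketch below returns nested lists instead of a tensor and assumes that each column lists the cells of a table row by row; that layout and the name `parse_smallsemi_lines` are assumptions.

```python
import math


def parse_smallsemi_lines(lines):
    """Parse smallsemi-style lines into a list of Cayley tables (nested lists)."""
    rows = [line.strip() for line in lines[1:]]  # skip the '#' header
    n = math.isqrt(len(rows) + 1)  # header + (n^2 - 1) data lines = n^2 lines
    if n * n != len(rows) + 1:
        raise ValueError("expected exactly n^2 lines including the header")
    tables = []
    for column in zip(*rows):  # one column per semigroup
        cells = [0] + [int(digit) for digit in column]  # first cell is always 0
        tables.append([cells[row * n:(row + 1) * n] for row in range(n)])
    return tables
```

For the four lines `"# header"`, `"01"`, `"01"`, `"01"` the sketch yields the two tables `[[0, 0], [0, 0]]` and `[[0, 1], [1, 1]]`.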

neural_semigroups.utils.download_file_from_url(url: str, filename: str, buffer_size: int = 1024) → None

downloads some file from the Web to a specified destination

>>> import os
>>> TEMP_FILE = "test.html"
>>> if os.path.exists(TEMP_FILE):
...     os.remove(TEMP_FILE)
>>> download_file_from_url("https://python.org/", TEMP_FILE)
>>> os.path.exists(TEMP_FILE)
True
Parameters
  • url – a valid HTTP URL

  • filename – a valid filename

  • buffer_size – a number of bytes to read from URL at once

neural_semigroups.utils.find_substring_by_pattern(strings: List[str], starts_with: str, ends_before: str) → str

search for a first occurrence of a given pattern in a string list

>>> some_strings = ["one", "two", "three"]
>>> find_substring_by_pattern(some_strings, "t", "o")
'tw'
>>> find_substring_by_pattern(some_strings, "four", "five")
Traceback (most recent call last):
   ...
ValueError: pattern four.*five not found
Parameters
  • strings – a list of strings where the pattern is searched for

  • starts_with – the first letters of a pattern

  • ends_before – a substring which marks the beginning of something different

Returns

a pattern which starts with starts_with and ends before ends_before
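
The behaviour in the examples can be reproduced with a regular expression that matches from `starts_with` up to, but not including, `ends_before`. The sketch below treats both arguments as regular expression fragments, which is an assumption suggested by the error message; the name `find_substring_sketch` is hypothetical.

```python
import re


def find_substring_sketch(strings, starts_with, ends_before):
    """Return the first match of starts_with.* that is followed by ends_before."""
    # the lookahead keeps `ends_before` out of the returned text
    regex = re.compile(f"{starts_with}.*(?={ends_before})")
    for string in strings:
        match = regex.search(string)
        if match is not None:
            return match.group(0)
    raise ValueError(f"pattern {starts_with}.*{ends_before} not found")
```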

neural_semigroups.utils.get_newest_file(dir_path: str) → str

get the last modified file from a directory

>>> import shutil
>>> from pathlib import Path
>>> from os import makedirs, path
>>> from neural_semigroups.constants import TEST_TEMP_DATA
>>> shutil.rmtree(path.join(TEST_TEMP_DATA, "tmp"), ignore_errors=True)
>>> makedirs(path.join(TEST_TEMP_DATA, "tmp"))
>>> Path(path.join(TEST_TEMP_DATA, "tmp", "one")).touch()
>>> from time import sleep
>>> sleep(0.01)
>>> Path(path.join(TEST_TEMP_DATA, "tmp", "two")).touch()
>>> get_newest_file(path.join(TEST_TEMP_DATA, "tmp"))
'test_temp_data/tmp/two'
Parameters

dir_path – a directory path

Returns

the last modified file’s name
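
Picking the most recently modified file can be done by comparing modification times. A standard-library sketch (the name `newest_file` is hypothetical, and ignoring sub-directories is an assumption):

```python
import os


def newest_file(dir_path):
    """Path of the regular file in `dir_path` with the latest modification time."""
    paths = [os.path.join(dir_path, name) for name in os.listdir(dir_path)]
    files = [path for path in paths if os.path.isfile(path)]
    return max(files, key=os.path.getmtime)
```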

neural_semigroups.utils.make_discrete(cayley_cubes: torch.Tensor) → torch.Tensor

transforms a batch of probabilistic Cayley cubes in the following way:

  • maximal probabilities in the last dimension become ones

  • all other probabilities become zeros

>>> make_discrete(torch.tensor([
...    [[[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.2, 0.8]]],
...    [[[0.7, 0.3], [0.3, 0.7]], [[0.7, 0.3], [0.3, 0.7]]],
... ]))
tensor([[[[1., 0.],
          [0., 1.]],

         [[1., 0.],
          [0., 1.]]],


        [[[1., 0.],
          [0., 1.]],

         [[1., 0.],
          [0., 1.]]]])
Parameters

cayley_cubes – a batch of probabilistic cubes representing Cayley tables

Returns

a batch of probabilistic cubes filled in with 0 or 1
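
The same transformation can be sketched on nested lists by replacing each innermost list with a one-hot list at the position of its maximum. The name `make_discrete_sketch` is hypothetical, and sending ties to the first maximum is an assumption.

```python
def make_discrete_sketch(values):
    """Recursively replace each innermost list by a one-hot list at its maximum."""
    if not isinstance(values[0], list):
        best = values.index(max(values))  # ties go to the first maximum
        return [1.0 if i == best else 0.0 for i in range(len(values))]
    return [make_discrete_sketch(item) for item in values]
```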

neural_semigroups.utils.count_different(one: torch.Tensor, two: torch.Tensor) → torch.Tensor

given two batches of the same size, counts the positions at which the tensor from the first batch differs from the tensor at the same position in the second batch

Parameters
  • one – one batch of tensors

  • two – another batch of tensors

Returns

the number of different tensors

neural_semigroups.utils.hide_cells(cayley_table: torch.Tensor, number_of_cells: int) → torch.Tensor

set several cells in a Cayley table to \(-1\)

>>> torch.manual_seed(42) 
<torch...
>>> hide_cells(torch.tensor([[0, 1], [2, 3]]), 2).cpu()
tensor([[ 0,  1],
        [-1, -1]])
Parameters
  • cayley_table – a Cayley table

  • number_of_cells – a number of cells to hide

Returns

a Cayley table with hidden cells
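
A minimal way to hide cells is to sample distinct positions of the flattened table and overwrite them with -1. The sketch below does this on nested lists; `hide_cells_sketch` is a hypothetical name, and the positions it picks for a given seed will not match the torch-based example above.

```python
import random


def hide_cells_sketch(table, number_of_cells, rng=random):
    """Copy of the table with `number_of_cells` distinct cells replaced by -1."""
    n = len(table)
    hidden = set(rng.sample(range(n * n), number_of_cells))
    return [
        [-1 if row * n + column in hidden else table[row][column]
         for column in range(n)]
        for row in range(n)
    ]
```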

neural_semigroups.utils.read_whole_file(filename: str) → str

reads the whole file into a string, for example

>>> read_whole_file("README.rst").split("\n")[3]
'Neural Semigroups'
Parameters

filename – a name of the file to read

Returns

whole contents of the file

neural_semigroups.utils.partial_table_to_cube(table: torch.Tensor) → torch.Tensor

create a probabilistic cube from a partially filled Cayley table; -1 is translated to \(\frac1n\), where \(n\) is the table’s cardinality, for example

>>> partial_table_to_cube(torch.tensor([[0, -1], [0, 0]])).cpu()
tensor([[[1.0000, 0.0000],
         [0.5000, 0.5000]],

        [[1.0000, 0.0000],
         [1.0000, 0.0000]]])
Parameters

table – a Cayley table, partially filled by -1’s

Returns

a probabilistic cube
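
On nested lists the conversion can be sketched as follows: a known cell becomes a one-hot distribution and a hidden cell (-1) becomes the uniform distribution over the \(n\) elements. The name `partial_table_to_cube_sketch` is hypothetical.

```python
def partial_table_to_cube_sketch(table):
    """Cube a[i][j][k] = P{e_i e_j = e_k}; cells equal to -1 become uniform."""
    n = len(table)
    cube = []
    for i in range(n):
        row = []
        for j in range(n):
            if table[i][j] == -1:
                row.append([1.0 / n] * n)  # unknown product: uniform
            else:
                row.append([1.0 if k == table[i][j] else 0.0 for k in range(n)])
        cube.append(row)
    return cube
```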

neural_semigroups.utils.connect_to_db(database_name: str) → Cursor

open a connection to an SQLite database

Parameters

database_name – filename of a database

Returns

a cursor to the database

neural_semigroups.utils.create_table_if_not_exists(cursor: Cursor, table_name: str, columns: List[str]) → None

create a table if it does not exist

Parameters
  • cursor – a cursor to the database where to create a table

  • table_name – what table to create

  • columns – a list of strings of format “COLUMN_NAME COLUMN_TYPE”

neural_semigroups.utils.insert_values_into_table(cursor: Cursor, table_name: str, values: Tuple[str, ...]) → None

inserts values into a table

Parameters
  • cursor – a cursor to database where the target table is located

  • table_name – the target table

  • values – values to insert into the target table
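
The three SQLite helpers (connect, create a table if missing, insert a row) can be sketched with the standard `sqlite3` module. The function names below are hypothetical stand-ins, the `TEXT` column type is an assumption, and committing after every insert is a simplification rather than a description of the library's behaviour.

```python
import sqlite3


def connect(database_name):
    """Open (or create) a database file and return a cursor to it."""
    return sqlite3.connect(database_name).cursor()


def create_table(cursor, table_name, columns):
    """Create the table unless it already exists."""
    cursor.execute(
        f"CREATE TABLE IF NOT EXISTS {table_name} ({', '.join(columns)})"
    )


def insert_values(cursor, table_name, values):
    """Insert one row, passing the values as bound parameters."""
    placeholders = ", ".join("?" for _ in values)
    cursor.execute(f"INSERT INTO {table_name} VALUES ({placeholders})", values)
    cursor.connection.commit()
```

Binding the values with `?` placeholders avoids quoting problems when a stored string, such as raw mace4 output, contains quotes.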

neural_semigroups.utils.gunzip(archive_path: str) → None

extracts a GZIP file in the same folder

Parameters

archive_path – a path ending with .gz
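
Extracting a GZIP file next to the archive can be sketched with the standard `gzip` and `shutil` modules. The name `gunzip_sketch` is hypothetical, and naming the output by dropping the `.gz` suffix is an assumption about what "in the same folder" means.

```python
import gzip
import shutil


def gunzip_sketch(archive_path):
    """Decompress `name.gz` into `name` in the same folder."""
    if not archive_path.endswith(".gz"):
        raise ValueError("expected a path ending with .gz")
    target_path = archive_path[:-len(".gz")]
    with gzip.open(archive_path, "rb") as source:
        with open(target_path, "wb") as target:
            # stream the contents instead of loading the whole file at once
            shutil.copyfileobj(source, target)
```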

generate_data_with_mace4

A script which generates semigroups with mace4 and saves them in an sqlite database.

neural_semigroups.generate_data_with_mace4.generate_data_with_mace4(input_args: Optional[List[str]] = None) → None
Parameters

input_args – a list of arguments (if None then ones from the command line are used)