PINK

Parallelized rotation and flipping INvariant Kohonen maps

HITS gGmbH
ADASS 2019

Self-organizing Kohonen Map

Low dimensional representation of the data
Unsupervised learning: No labeling

Assignment

Spatial transformations
- Rotation
- Flipping
Similarity by Euclidean distance
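
The assignment step can be sketched in a few lines of NumPy. This is a minimal illustration, not the PINK implementation: the function names `transformations` and `assign` are hypothetical, and rotations are restricted to 90-degree steps so that no interpolation is needed.

```python
import numpy as np


def transformations(image):
    """Yield the 8 rotated/flipped variants of a square 2-D image.

    Assumption: only 90-degree rotation steps are used.
    """
    for flipped in (image, np.fliplr(image)):
        for k in range(4):
            yield np.rot90(flipped, k)


def assign(image, neurons):
    """Return (index, Euclidean distance) of the best-matching neuron.

    neurons has shape (n_neurons, height, width). For every neuron the
    minimum distance over all transformed copies of the image is taken,
    which makes the assignment invariant to those transformations.
    """
    variants = np.stack(list(transformations(image)))        # (8, h, w)
    diff = neurons[:, None, :, :] - variants[None, :, :, :]  # (n, 8, h, w)
    dist = np.sqrt((diff ** 2).sum(axis=(2, 3)))             # (n, 8)
    best_per_neuron = dist.min(axis=1)
    winner = int(best_per_neuron.argmin())
    return winner, float(best_per_neuron[winner])


# Usage: a flipped and rotated copy of neuron 2 is still assigned to neuron 2.
rng = np.random.default_rng(0)
neurons = rng.random((4, 5, 5)).astype(np.float32)
image = np.rot90(np.fliplr(neurons[2]))
winner, distance = assign(image, neurons)
```

Taking the minimum over all variants before comparing neurons is what gives the invariance: the orientation of the input no longer influences which neuron wins.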

Adaptation

Update assigned neuron with image
weighted by a distribution function
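
A possible form of the adaptation step is sketched below. It assumes a Gaussian as the distribution function and also updates the neighbours of the assigned neuron, as in the standard Kohonen update. The names `adapt`, `sigma` and `damping` are hypothetical and not taken from PINK.

```python
import numpy as np


def adapt(neurons, grid, winner, image, sigma=1.0, damping=0.2):
    """Move the neurons towards the image, weighted by a Gaussian.

    neurons: (n_neurons, height, width) array, updated in place
    grid:    (n_neurons, 2) positions of the neurons on the map
    winner:  index of the neuron the image was assigned to
    sigma, damping: assumed parameters of the distribution function
    """
    # Distance of each neuron to the winner, measured on the map.
    map_dist = np.linalg.norm(grid - grid[winner], axis=1)
    # Gaussian weight: `damping` at the winner, decaying with map distance.
    weight = damping * np.exp(-0.5 * (map_dist / sigma) ** 2)
    neurons += weight[:, None, None] * (image - neurons)
    return weight


# Usage: a 3x3 map of empty neurons is updated with a constant image.
grid = np.array([[x, y] for x in range(3) for y in range(3)], dtype=np.float32)
neurons = np.zeros((9, 4, 4), dtype=np.float32)
image = np.ones((4, 4), dtype=np.float32)
weight = adapt(neurons, grid, winner=4, image=image)
```

In a rotation-invariant map, `image` would be the transformed copy that matched the assigned neuron best, so that the update is applied in the orientation of the neuron.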

Neuron and Euclidean distance dimension

Software Design

Generic C++17 design for static type safety

Data<DataLayout, T> data;
SOM<SOMLayout, NeuronLayout, T> som;
Trainer<SOMLayout, NeuronLayout, T, UseGPU> trainer(som);

trainer(data); // Execution of training

Data- and NeuronLayout: Cartesian<N>
with N = 1, 2, 3
SOMLayout: Cartesian<N>, Hexagonal
T: float32

Python Interface

Dynamic python interface using PyBind11
combined with C++ inheritance

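import math
import numpy as np
import pink
import tools  # helper module that provides DataIterator

# image_dim (edge length of the input images in pixels) must be set beforehand
som_dim = 8
neuron_dim = int(image_dim / math.sqrt(2.0) * 2.0)

# Initialize the SOM with random float32 values
np_som = np.random.rand(som_dim, som_dim,
    neuron_dim, neuron_dim).astype(np.float32)
som = pink.SOM(np_som, som_layout="cartesian-2d")

trainer = pink.Trainer(som)

# Train on one image at a time
images = iter(tools.DataIterator("data.bin"))
for image in images:
    trainer(pink.Data(image))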

Code Quality

The quality of the PINK source code was assessed with softwipe and scored 9.4 out of 10 points.

Mixed precision

The precision for the Euclidean distance can be reduced without losing accuracy

	Value range
float 32 bit	-3.4 x 10^38 … 3.4 x 10^38
int 32 bit	-2147483648 … 2147483647
int 16 bit	-32768 … 32767
int 8 bit	-128 … 127
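
The effect of reduced precision on the assignment can be tried out with a small NumPy experiment. This is only a sketch under stated assumptions: pixel values lie in [0, 1], the helper names `quantize_int8` and `squared_distances` are hypothetical, and the int8 values are accumulated in 32-bit integers. How PINK carries this out on the GPU is not shown here.

```python
import numpy as np


def quantize_int8(x):
    """Map float values in [0, 1] onto the int8 range -128 ... 127."""
    return (np.round(x * 255.0) - 128.0).astype(np.int8)


def squared_distances(image, neurons, dtype):
    """Squared Euclidean distance of the image to every neuron.

    Subtraction and multiplication are carried out in `dtype`.
    """
    diff = neurons.astype(dtype) - image.astype(dtype)
    return (diff * diff).sum(axis=(1, 2))


rng = np.random.default_rng(42)
neurons = rng.random((16, 8, 8)).astype(np.float32)
# The image is a noisy copy of neuron 5, so neuron 5 should win.
noise = rng.normal(0.0, 0.02, (8, 8))
image = np.clip(neurons[5] + noise, 0.0, 1.0).astype(np.float32)

best_float = int(squared_distances(image, neurons, np.float32).argmin())
best_int8 = int(
    squared_distances(quantize_int8(image), quantize_int8(neurons), np.int32).argmin()
)
```

Only the ranking of the distances matters for the assignment, which is why the coarser value range can still select the same neuron. The square root is omitted because it does not change that ranking.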

Benchmark: CPU vs GPU

	Time / s
Intel Gold 5118, 24 cores	35373
NVIDIA RTX 2080, int8	673

Radio Galaxy Zoo, hexagonal SOM 21x21, neurons 64x64

Benchmark: Mixed Precision

	Time / s
NVIDIA RTX 2080, float	1867
NVIDIA RTX 2080, int16	1062
NVIDIA RTX 2080, int8	673

Radio Galaxy Zoo, hexagonal SOM 21x21, neurons 64x64
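
The speedups implied by the two benchmarks follow directly from the listed run times. The short script below only divides the published numbers; the variable names are illustrative.

```python
# Run times in seconds, copied from the benchmark tables.
cpu_time = 35373  # Intel Gold 5118, 24 cores
gpu_times = {"float": 1867, "int16": 1062, "int8": 673}  # NVIDIA RTX 2080

# Speedup of each GPU run relative to the CPU run and to the GPU float run.
speedup_vs_cpu = {p: cpu_time / s for p, s in gpu_times.items()}
speedup_vs_float = {p: gpu_times["float"] / s for p, s in gpu_times.items()}

for precision in gpu_times:
    print(
        f"{precision:>5}: {speedup_vs_cpu[precision]:5.1f}x vs CPU, "
        f"{speedup_vs_float[precision]:4.2f}x vs GPU float"
    )
```

With int8 the GPU run is roughly 53 times faster than the CPU run and roughly 2.8 times faster than the GPU run in float.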