Special Orthogonal Group Example

In this notebook we will use SymDet to extract generators of the Lie algebra from SO(2) and SO(3) data sets.

[1]:
import symdet

Generating Data

As we are looking at SO(2) and SO(3), the obvious systems to use are circles and spheres. Conveniently, we have written some example scripts to generate and look at this data. We also want to include some noise in the data to simulate a non-ideal system.

[2]:
circle = symdet.SO2(noise=True,
                    variance=0.05)
sphere = symdet.SO3(noise=True,
                    variance=0.05)

Now let’s use these classes to generate some data for study.

[3]:
circle.load_data(500)
sphere.load_data(1000)

We can plot this data to investigate it.

[4]:
circle.plot_data()
sphere.plot_data()
[Figure: output of circle.plot_data()]
[Figure: output of sphere.plot_data()]

Notice the noise around the circle and the sphere. It is adjustable through the noise and variance arguments at class instantiation.
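
For intuition about what such a data set looks like, the following is a minimal NumPy sketch of noisy points on a unit circle. It is not SymDet's implementation: the function name noisy_circle, the Gaussian noise model, and reading variance as the noise variance are all assumptions made for illustration.

```python
import numpy as np


def noisy_circle(n_points, variance=0.05, seed=0):
    """Hypothetical helper: sample points on the unit circle with Gaussian noise."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n_points)
    points = np.column_stack((np.cos(theta), np.sin(theta)))
    # Assumption: isotropic Gaussian noise whose variance is `variance`
    noise = rng.normal(0.0, np.sqrt(variance), size=points.shape)
    return points + noise


data = noisy_circle(500)
radii = np.linalg.norm(data, axis=1)
print(data.shape)     # (500, 2)
print(radii.mean())   # mean radius, near the unit radius of the clean circle
```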

Extracting Generators

Now that we have some data, we can begin extracting generators from it.

SO(2)

We will start by defining a generator_extractor, which has a few parameters to help us fit the data.

[5]:
generator_extractor = symdet.GeneratorExtraction(
                          circle.domain,    # clustered data
                          delta=0.5,        # distance of points to hyperplane
                          epsilon=0.3,      # distance between points connected by a generator
                          candidate_runs=5  # Number of times to run the extraction loop
                        )

We can go through this briefly with specific attention to the arguments.

  • circle.domain is the (x, y) data generated by the circle class.

  • delta is the distance between points in the point cloud and the hyperplane formed by the randomly chosen vectors. This should be tuned such that enough points are used in the fitting. If it is too big, all points will be considered to be within the generator application range and the fitting may become inaccurate.

  • epsilon is the allowed distance between points in the point cloud such that we consider them accessible by generator action. Again, if this is too big, the fitting will not be accurate.

  • candidate_runs is the number of times the fitting process should be run. In each run, a new basis set is constructed and therefore a new set of generators is found. If you have some idea of the number of generators required to describe the group, set this value to about 5 times that number. If you have no idea, set it high. The only downside to running the loop many times is the computational cost.
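
To make the roles of delta and epsilon concrete, here is a rough NumPy sketch of the two selection steps described above. It is only an illustration under stated assumptions, not SymDet's internals: the hyperplane is taken through the origin with a random unit normal, and the helper names points_near_hyperplane and pairs_within are hypothetical.

```python
import numpy as np


def points_near_hyperplane(points, normal, delta):
    """Keep points closer than delta to the hyperplane {x : normal . x = 0}."""
    normal = normal / np.linalg.norm(normal)
    distance = np.abs(points @ normal)
    return points[distance < delta]


def pairs_within(points, epsilon):
    """Return index pairs (i, j) with i < j whose points are closer than epsilon."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    i, j = np.nonzero(np.triu(dist < epsilon, k=1))
    return np.column_stack((i, j))


rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=200)
cloud = np.column_stack((np.cos(theta), np.sin(theta)))  # noiseless circle for clarity

normal = rng.normal(size=2)  # random direction defining the hyperplane (assumption)
selected = points_near_hyperplane(cloud, normal, delta=0.5)
pairs = pairs_within(selected, epsilon=0.3)

print(len(selected), "of", len(cloud), "points lie within delta of the hyperplane")
print(len(pairs), "pairs lie within epsilon of each other")
```

Raising delta to 2.0 in this sketch selects every point on the unit circle, which mirrors the warning above about choosing it too large.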

Now we can get the generators by running the following.

[6]:
generators, variance_list = generator_extractor.perform_generator_extraction(pca_components=4,
                                                                             plot=True)
Producing generator candidates: 100%|█████████████████████████████████| 5/5 [05:01<00:00, 60.35s/it]
Principle Component 1: Explained Variance: 1.0
[[-0.00394716 -1.00563269]
 [ 0.99432246  0.00318606]]


Principle Component 2: Explained Variance: 1.3757697081439634e-36
[[-0.6030852  -0.85736421]
 [-0.87072257  0.37822883]]


Principle Component 3: Explained Variance: 3.6551465466088217e-38
[[0.16740142 0.24032105]
 [0.23935278 1.36269322]]


Principle Component 4: Explained Variance: 1.2289433030919476e-70
[[-1.26816774e+00  4.42578002e-01]
 [ 4.42578002e-01  1.00352521e-14]]



[Figure: plot produced by perform_generator_extraction for the SO(2) data]

We will first discuss the parameters before turning to the output.

  • pca_components is the number of principal components the PCA should look for. This should be no greater than d^{2}, where d is the dimension of the system (d = 2 for the circle, hence pca_components=4 above). It is in fact a good idea to set it to d^{2} to make sure you do not miss anything.

  • plot just asks the method to plot the results. The method also returns the generated data, as seen in the generators, variance_list unpacking.
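
The bound of d^{2} follows from each candidate generator being a d × d matrix, which has d^{2} entries once flattened. The sketch below illustrates this with synthetic candidates and a plain NumPy SVD. Whether SymDet flattens and decomposes its candidates in exactly this way is an assumption, and the candidate data here is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 2
J = np.array([[0.0, -1.0],
              [1.0,  0.0]])  # canonical generator of SO(2)

# Synthetic candidates: random multiples of J plus small noise (illustrative only)
scales = rng.normal(size=(50, 1, 1))
candidates = scales * J + 0.01 * rng.normal(size=(50, d, d))

# Flatten each d x d matrix into a vector of length d**2 and centre the data
flat = candidates.reshape(len(candidates), d * d)
flat = flat - flat.mean(axis=0)

# PCA via SVD: rows of vt are principal directions, s**2 is proportional to variance
_, s, vt = np.linalg.svd(flat, full_matrices=False)
explained = s**2 / np.sum(s**2)

print(explained)             # d**2 = 4 entries, the first one dominant
print(vt[0].reshape(d, d))   # proportional to J, up to sign and noise
```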

So now we can discuss the output. The first thing printed after the loading bar is the set of generators extracted from the data. Each is accompanied by its explained variance, which tells you which of them are most important. In this case, we have only SO(2), so we expect a single generator, and this is reproduced in the example: the first principal component has an explained variance of 1.0 and is close to the antisymmetric matrix [[0, -1], [1, 0]], while the remaining components have negligible explained variance.
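
As a quick sanity check, the first principal component can be compared with the canonical SO(2) generator and exponentiated to confirm that it produces a rotation. The sketch below copies the matrix from the output above and uses a truncated Taylor series for the matrix exponential. The helper expm_taylor and the choice of angle are illustrative and not part of SymDet.

```python
import numpy as np

# First principal component, copied from the output above
G = np.array([[-0.00394716, -1.00563269],
              [ 0.99432246,  0.00318606]])
J = np.array([[0.0, -1.0],
              [1.0,  0.0]])  # canonical generator of SO(2)


def expm_taylor(A, terms=30):
    """Matrix exponential via a truncated Taylor series (fine for small matrices)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result


theta = 0.5
R = expm_taylor(theta * G)
R_exact = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])

print(np.abs(G - J).max())        # largest deviation from the canonical generator
print(np.abs(R - R_exact).max())  # deviation from an exact rotation by theta
print(np.linalg.det(R))           # close to 1 for a rotation
```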

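SO(3)

We repeat the procedure for the sphere data, using more candidate runs and pca_components=9 since d = 3.

[7]:
generator_extractor = symdet.GeneratorExtraction(
                          sphere.domain,
                          delta=0.5,
                          epsilon=0.3,
                          candidate_runs=20
                        )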
generators, variance_list = generator_extractor.perform_generator_extraction(pca_components=9,
                                                                             plot=True)
Producing generator candidates: 100%|██████████████████████████████| 20/20 [39:25<00:00, 118.29s/it]
Principle Component 1: Explained Variance: 0.46671539974762427
[[ 0.05844208 -0.32381881  0.62057826]
 [ 0.28774448 -0.02250414 -1.03720561]
 [-0.58151916  1.00458252  0.01261541]]


Principle Component 2: Explained Variance: 0.3661267379481377
[[ 0.01377112 -0.21760786 -1.04643506]
 [ 0.24085093 -0.02704169 -0.54790929]
 [ 1.08442963  0.5678619   0.00557409]]


Principle Component 3: Explained Variance: 0.16367819938563596
[[-0.00900815  1.14664787 -0.07380354]
 [-1.17862591  0.01075561 -0.35990301]
 [ 0.03008574  0.39905439  0.02675986]]


Principle Component 4: Explained Variance: 0.0019987866414263628
[[-0.17531942  0.95499605 -0.72271103]
 [ 0.97772672  0.02657505  0.05269328]
 [-0.74880255  0.10521146  0.06105195]]


Principle Component 5: Explained Variance: 0.0008879535857574221
[[ 0.73828142  0.51982124  0.51438996]
 [ 0.4450838  -1.17974628 -0.15465756]
 [ 0.49909809 -0.21904251  0.09607828]]


Principle Component 6: Explained Variance: 0.0002806185888669158
[[-0.03988915  0.21619255  0.15018774]
 [ 0.18116085  0.6416449  -1.10590239]
 [ 0.17993401 -1.09788315  0.15440886]]


Principle Component 7: Explained Variance: 0.00023051641457079604
[[ 0.54214099 -0.54510292 -0.82754125]
 [-0.52557848 -0.64373052 -0.37601348]
 [-0.79104147 -0.4099838   0.31349681]]


Principle Component 8: Explained Variance: 7.545114250391106e-05
[[-1.45655983 -0.08268353  0.04563956]
 [-0.07316153 -0.86646382 -0.24725111]
 [ 0.02532708 -0.2102651   0.08611501]]


Principle Component 9: Explained Variance: 6.3365454765386406e-06
[[-0.05867582  0.00650376  0.13430265]
 [ 0.03976397  0.17112102  0.20551123]
 [ 0.12796177  0.18002171  1.69014709]]



[Figure: plot produced by perform_generator_extraction for the SO(3) data]
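
The SO(3) output can be read in the same way as the SO(2) output. The sketch below copies the explained variances and two of the matrices from the output above, sums the three leading variances, and measures how far each matrix is from being antisymmetric, since generators of SO(3) are antisymmetric 3 × 3 matrices. The helper symmetric_fraction is a hypothetical name introduced here for illustration.

```python
import numpy as np

# Explained variances of the nine principal components, copied from the output above
explained = np.array([
    0.46671539974762427, 0.3661267379481377, 0.16367819938563596,
    0.0019987866414263628, 0.0008879535857574221, 0.0002806185888669158,
    0.00023051641457079604, 7.545114250391106e-05, 6.3365454765386406e-06,
])

# Principal components 1 and 4, copied from the output above
G1 = np.array([[ 0.05844208, -0.32381881,  0.62057826],
               [ 0.28774448, -0.02250414, -1.03720561],
               [-0.58151916,  1.00458252,  0.01261541]])
G4 = np.array([[-0.17531942,  0.95499605, -0.72271103],
               [ 0.97772672,  0.02657505,  0.05269328],
               [-0.74880255,  0.10521146,  0.06105195]])


def symmetric_fraction(G):
    """Norm of the symmetric part of G relative to the norm of G (0 = antisymmetric)."""
    sym = 0.5 * (G + G.T)
    return np.linalg.norm(sym) / np.linalg.norm(G)


top_three = explained[:3].sum()
print(top_three)                # about 0.9965
print(symmetric_fraction(G1))   # small: nearly antisymmetric
print(symmetric_fraction(G4))   # close to 1: nearly symmetric
```

The three leading components carry almost all of the explained variance and are close to antisymmetric, which is consistent with the three generators expected for SO(3).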