# Review — UNI-SNE: Visualizing Similarity Data with a Mixture of Maps

## Introducing A Background Map to Solve the Crowding Problem


In this story, **Visualizing Similarity Data with a Mixture of Maps (UNI-SNE)**, by University of Toronto, is briefly reviewed, since UNI-SNE is mentioned in the t-SNE paper. This is a paper by Prof. Hinton. In this paper:

- **Aspect Maps** are introduced for data with a mixture of maps.
- **A Background Map** is used to solve the crowding problem.

This is a paper in **2007 ICAIS** with over **100 citations**. (Sik-Ho Tsang @ Medium)

# Outline

1. **Brief Review of SNE & Symmetric SNE**
2. **Aspect Maps**
3. **UNI-SNE: A Background Map**

# 1. Brief Review of SNE & Symmetric SNE

## 1.1. SNE

- To visualize high-dimensional data, we need to map the data to a low-dimensional space such as 2D or 3D.
- In addition, the structure of the high-dimensional data should be preserved after mapping to the low-dimensional space, so that the visualization is meaningful.

- A spherical Gaussian distribution centered at *xi* defines a probability density at each of the other points.
- When these densities are normalized, we get a probability distribution, *Pi*, over all of the other points that represents their similarity to *i*.

- A circular Gaussian distribution centered at *yi* defines a probability density at each of the other points.
- When these densities are normalized, we get a probability distribution over all of the other points that is our low-dimensional model, *Qi*, of the high-dimensional *Pi*.
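The two normalizations described above are usually written as follows. This is a sketch in standard SNE notation rather than a copy of the paper's equations: the per-point bandwidth $\sigma_i$ is a symbol not defined in the text above, and the low-dimensional Gaussian is assumed to have a fixed variance.

$$
p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j\rVert^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\lVert x_i - x_k\rVert^2 / 2\sigma_i^2\right)},
\qquad
q_{j|i} = \frac{\exp\left(-\lVert y_i - y_j\rVert^2\right)}{\sum_{k \neq i} \exp\left(-\lVert y_i - y_k\rVert^2\right)}
$$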

- For each object, *i*, we can associate a cost with a set of low-dimensional *y* locations by using the Kullback-Leibler divergence to measure how well the distribution *Qi* models the distribution *Pi*:

- The above cost *C* can be differentiated and minimized by gradient descent.
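To make the cost concrete, here is a minimal pure-Python sketch of the sum of Kullback-Leibler divergences, C = Σi KL(Pi || Qi). The function names, the toy points and the single shared `sigma` are assumptions for illustration only; real SNE typically chooses a separate bandwidth for each point.

```python
import math


def conditional_probs(points, i, sigma=1.0):
    """Distribution over j != i from a Gaussian centered on points[i]."""
    weights = {}
    for j, other in enumerate(points):
        if j == i:
            continue
        sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], other))
        weights[j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def sne_cost(high, low, sigma=1.0):
    """Sum over i of KL(P_i || Q_i) for matching high- and low-dim points."""
    cost = 0.0
    for i in range(len(high)):
        p = conditional_probs(high, i, sigma)
        # Assumed fixed low-dimensional variance of 1/2, so that the
        # exponent is simply -||y_i - y_j||^2.
        q = conditional_probs(low, i, sigma=math.sqrt(0.5))
        # Terms with p == 0 contribute nothing to the KL divergence.
        cost += sum(p[j] * math.log(p[j] / q[j]) for j in p if p[j] > 0.0)
    return cost


# Hypothetical toy data: four 3-D points and a candidate 2-D layout.
high = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
low = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
print(sne_cost(high, low))
```

Gradient descent would then move the `low` points to reduce this value; the gradient itself is not shown here.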

## 1.2. Symmetric SNE

- An alternative is to define a single joint distribution over all non-identical ordered pairs:

- This leads to simpler derivatives and a cost that is easier to optimize.
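One common way to write such a joint distribution and its cost is sketched below. The single shared bandwidth $\sigma$ and the name $C_{sym}$ are assumptions made for this sketch, and the sums run over ordered pairs with $k \neq l$, following the description above.

$$
p_{ij} = \frac{\exp\left(-\lVert x_i - x_j\rVert^2 / 2\sigma^2\right)}{\sum_{k \neq l} \exp\left(-\lVert x_k - x_l\rVert^2 / 2\sigma^2\right)},
\qquad
q_{ij} = \frac{\exp\left(-\lVert y_i - y_j\rVert^2\right)}{\sum_{k \neq l} \exp\left(-\lVert y_k - y_l\rVert^2\right)}
$$

$$
C_{sym} = KL(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
$$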

# 2. Aspect Maps

- Different senses of a word occur in different maps.
- e.g.: river and loan can both be close to bank without being at all close to each other.
- Each object, *i*, has a mixing proportion *πmi* in each map, *m*, and the mixing proportions are constrained to add to 1.

- (Symmetric SNE is not used here.)
- (There is a long passage on minimizing the cost using the above aspect-map version of *qj*|*i*. Please read the paper if interested.)
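The bank/river/loan example can be sketched in a few lines of Python. The mixture form used here, in which each map contributes the product of the two objects' mixing proportions times a Gaussian weight and the result is normalized over *j*, is my reading of the aspect-map model and should be treated as an assumption; the function name, coordinates and proportions are hypothetical.

```python
import math


def aspect_q(maps, mix, i):
    """q_{j|i} under a mixture of maps (assumed form, see lead-in).

    maps[m][k] is the 2-D location of object k in map m.
    mix[m][k] is the mixing proportion of object k in map m;
    for each object the proportions sum to 1 over the maps.
    """
    n = len(maps[0])
    scores = {}
    for j in range(n):
        if j == i:
            continue
        total = 0.0
        for locs, pi in zip(maps, mix):
            sq_dist = sum((a - b) ** 2 for a, b in zip(locs[i], locs[j]))
            total += pi[i] * pi[j] * math.exp(-sq_dist)
        scores[j] = total
    z = sum(scores.values())
    return {j: s / z for j, s in scores.items()}


# Objects: 0 = "bank", 1 = "river", 2 = "loan".
maps = [
    [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0)],  # map 0: "bank" sits next to "river"
    [(0.0, 0.0), (5.0, 5.0), (0.5, 0.0)],  # map 1: "bank" sits next to "loan"
]
mix = [
    [0.5, 1.0, 0.0],  # proportions in map 0
    [0.5, 0.0, 1.0],  # proportions in map 1
]
q_bank = aspect_q(maps, mix, 0)
q_river = aspect_q(maps, mix, 1)
print(q_bank, q_river)
```

In this toy setup "bank" splits its probability evenly between "river" and "loan", while "river" assigns no probability to "loan", because the two words never share a map with non-zero proportions.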

- The above figure shows 2 of the 50 aspect maps, for the “can” and “field” examples.

# 3. UNI-SNE: A Background Map

- One of the aspect maps would keep all of the objects very close together, while the other aspect map would create widely separated clusters of objects.
- **The objects in the middle will be crushed together too closely**, causing the **crowding problem**.
- A background map in which all of the objects are very close together gives all of the *qj*|*i* a small positive contribution.
- Here, for UNI-SNE, symmetric SNE is used, and *qij* is:
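A minimal sketch of this idea: mix the symmetric-SNE *qij* with a uniform distribution over all ordered pairs, so that no *qij* can fall below a small floor. The function name and the parameter name `lam` are assumptions; the default of 0.2 is the uniform probability mass quoted for the MNIST experiment in this review.

```python
import math


def uni_sne_q(low, lam=0.2):
    """Joint q_ij over ordered pairs i != j with a uniform background.

    q_ij = (1 - lam) * exp(-d_ij^2) / sum_{k != l} exp(-d_kl^2)
           + lam / (n * (n - 1))
    """
    n = len(low)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    weights = {
        (i, j): math.exp(-sum((a - b) ** 2 for a, b in zip(low[i], low[j])))
        for i, j in pairs
    }
    z = sum(weights.values())
    uniform = 1.0 / len(pairs)  # background map: every pair equally likely
    return {pair: (1.0 - lam) * weights[pair] / z + lam * uniform for pair in pairs}


# Hypothetical layout: two nearby points and one placed very far away.
layout = [(0.0, 0.0), (1.0, 0.0), (30.0, 30.0)]
q = uni_sne_q(layout)
print(q[(0, 1)], q[(0, 2)])
```

Even for the distant pair, *qij* stays at the background floor instead of collapsing to zero, so moving clusters far apart costs much less than it would in plain symmetric SNE.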

- Principal components analysis (PCA) is first applied to all 60,000 MNIST training images to reduce each 28×28 pixel image to a 30-dimensional vector.
- Then, **Symmetric SNE** is applied to 5000 of these 30-dimensional vectors, with an equal number from each class.
- The above figure shows that **the 10 digit classes are not well separated.**
- The above figure shows that Symmetric SNE is also unable to separate the clusters 4, 7, 9 and 3, 5, 8, and it does not cleanly separate the clusters for 0, 1, 2, and 6 from the rest of the data. (The numbers are shown below for reference.)

- **Using UNI-SNE**, with 0.2 of the total probability mass uniformly distributed between all pairs, **the 10 digit classes are much better separated than with Symmetric SNE.**
- Of course, t-SNE was proposed later, and it is better and more popular than UNI-SNE.

## Reference

[2007 ICAIS] [UNI-SNE]

Visualizing Similarity Data with a Mixture of Maps

## Data Visualization

**2002** [SNE] **2007** [UNI-SNE] **2008** [t-SNE]