Tuesday, May 7, 2013

Sorting high dimensional data with RPM maps

I have recently published a dataset that contains the daily prices of the S&P 500 component companies for a period of 5 years. The mountain view of VisuMap provides a good overview of the whole dataset in 3D. The following short video clip shows the normalized values of the dataset (each price history is scaled so that it starts at 1.0):
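As a rough illustration of the normalization mentioned above, here is a minimal Python sketch. It assumes each price history is a plain list of daily prices; the ticker names and numbers are made up for the example and are not taken from the dataset.

```python
def normalize_series(prices):
    """Scale a price history so that its first value becomes 1.0."""
    first = prices[0]
    if first == 0:
        raise ValueError("cannot normalize a series that starts at 0")
    return [p / first for p in prices]

# Hypothetical sample data: ticker symbol -> daily prices.
histories = {
    "AAA": [20.0, 22.0, 19.0],
    "BBB": [150.0, 165.0, 180.0],
}

# Every normalized curve starts at 1.0, so curves of cheap and
# expensive stocks become directly comparable.
normalized = {ticker: normalize_series(p) for ticker, p in histories.items()}
```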



Notice that the curves are colored using the k-means algorithm, so that stocks with similar development share a color. However, the curves are not ordered by group but alphabetically by ticker symbol, so curves of different colors are mixed together.

A simple way to enhance the 3D view is to re-order the curves so that curves of the same group are located together. We can do this easily with the heat-map view of VisuMap. However, this method does not order the groups themselves in a meaningful way, so neighboring groups are not necessarily similar groups.
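The idea of group-wise re-ordering can be sketched in a few lines of Python. This is only an illustration, not what the heat-map view does internally; the tickers and cluster labels are hypothetical.

```python
def order_by_group(tickers, labels):
    """Return the tickers reordered so that members of a cluster are adjacent.

    Clusters are ordered by their label value, which is arbitrary:
    neighboring clusters are not necessarily similar to each other.
    """
    pairs = sorted(zip(labels, tickers))
    return [ticker for _, ticker in pairs]

# Hypothetical tickers with their k-means cluster labels.
tickers = ["AAA", "BBB", "CCC", "DDD"]
labels = [2, 0, 2, 0]

grouped = order_by_group(tickers, labels)
```

Here the two members of cluster 0 come first and the two members of cluster 2 follow, but nothing says that cluster 0 and cluster 2 belong next to each other.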

The RPM map provides a better way to sort high dimensional data according to similarity. To do this with VisuMap, we first create a 3D RPM map for the dataset with a very large width (e.g. 1200 pixels) but small height and depth (e.g. 50 pixels), so that the RPM map geometrically resembles a ring. Then we sort the data points according to their x-coordinates in the 3D RPM map, which determine each data point's position on the ring. The nature of the RPM algorithm ensures that data points located close together on the RPM map are similar to each other.
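The sorting step itself is simple once the map coordinates are available. The following Python sketch assumes the 3D RPM coordinates have already been exported as (x, y, z) tuples; the coordinate values are invented for the example.

```python
def order_by_x(coords):
    """Return the row indices sorted by the first (x) coordinate."""
    return sorted(range(len(coords)), key=lambda i: coords[i][0])

# Hypothetical coordinates from a ring-like map: wide in x, narrow in y and z.
coords = [
    (830.0, 12.0, 40.0),
    (15.0, 33.0, 8.0),
    (412.0, 5.0, 21.0),
]

# order[k] is the index of the data point that takes position k on the ring.
order = order_by_x(coords)
```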

The following short video shows how to do this with VisuMap. Notice that the 3D RPM map has already been created. To re-order the data points, we open a table of the XYZ coordinates and sort it on the X-coordinate column. The mountain view has been configured to re-order its content automatically whenever a re-ordering event occurs.




Friday, May 3, 2013

Daily Price of S&P Companies in last 5 years.

I have just released a new VisuMap dataset that contains the daily prices of S&P 500 companies for the years 2008 to 2012. This dataset, available at our web site, provides convenient access to the stock prices for that period. The dataset also contains scripts that automatically download the historical data from the Google server.

The following is a short video that displays these data in the 3D mountain view:

Saturday, April 20, 2013

Visual interface for kNN data clustering services in VisuMap

I have just uploaded a short video tutorial (ca. 10 minutes) that demonstrates the visual interface for the kNN (k-nearest neighbors) data clustering service in VisuMap version 4.0.892. The tutorial shows, with two scenarios, how to combine kNN with supervised and unsupervised classification methods.

Although kNN is one of the simplest and most widely used classification algorithms, there is no software on the market that provides kNN services with a decent visual interface. VisuMap implements a unique visual interface for kNN with the help of MDS mapping and linked data views.
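For readers unfamiliar with kNN, here is a minimal Python sketch of the basic classification rule (majority vote among the k nearest labeled points). It is a generic textbook version with made-up data, not the implementation used in VisuMap.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance with its label.
    neighbors = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2D training data with two classes.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["a", "a", "a", "b", "b"]

near_origin = knn_classify(points, labels, (0.2, 0.2), k=3)
near_far_group = knn_classify(points, labels, (5.5, 5.0), k=3)
```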


Thursday, April 18, 2013

Azimuthal projection for RPM on projective spaces

Among the manifolds used by the relational perspective map (RPM), the 2-dimensional real projective plane (P2) has been one of my favorite spaces. As a visualization space for high dimensional data, P2 works similarly to the 2-dimensional flat torus (T2) used initially by RPM. Both T2 and P2 provide a boundaryless view of the data by connecting the opposite sides of the space. Both spaces are completely isometric and symmetric under shifts and rotations.
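To make "connecting the opposite sides" concrete, the following Python sketch computes distances on both spaces. It assumes T2 is modeled as a width-by-height rectangle with wrap-around and P2 as the unit sphere with antipodal points identified; these are the standard models, not necessarily the exact formulas VisuMap uses internally.

```python
import math

def torus_distance(p, q, width, height):
    """Distance on a flat torus: each axis wraps around its period."""
    dx = abs(p[0] - q[0]) % width
    dy = abs(p[1] - q[1]) % height
    dx = min(dx, width - dx)    # going around the edge may be shorter
    dy = min(dy, height - dy)
    return math.hypot(dx, dy)

def projective_distance(u, v):
    """Distance on P2 for unit vectors u, v, where v and -v are the same point."""
    dot = abs(sum(a * b for a, b in zip(u, v)))
    return math.acos(min(1.0, dot))   # clamp guards against rounding above 1

# Two points near opposite edges of a 100 x 50 torus are actually close.
d_torus = torus_distance((1.0, 1.0), (99.0, 1.0), 100.0, 50.0)

# Antipodal points on the sphere are the same point of P2.
d_antipodal = projective_distance((0.0, 0.0, 1.0), (0.0, 0.0, -1.0))
d_orthogonal = projective_distance((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

The largest possible distance on P2 in this model is pi/2, reached by orthogonal directions.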

P2 has, however, one advantage over T2: P2 is isotropic, whereas T2 is not. This means that P2 is equivalent in all directions, while T2 treats different directions differently. For instance, the diagonal directions of T2 usually align with the larger portion of the data, as I blogged about before. P2 doesn't have such space-specific artifacts: P2 maps are usually symmetric in all directions and can be rotated freely.

On the other hand, T2 does have a significant advantage over P2: T2 is flat, whereas P2 is not. T2 maps are represented as rectangular maps, whereas P2 maps are maps on the semi-sphere. This makes P2 maps harder to explore on conventional flat media like paper or a computer screen. VisuMap has implemented a special sphere view to facilitate the exploration of spherical maps, but rectangular T2 maps are still far easier to explore than the sphere view.

The problem of representing a spherical surface through a flat map is, of course, quite an old one. A plethora of methods have been invented in the past to create flat maps of the spherical surface of the earth. Looking through the list of methods on the page Map Projection, I have picked the azimuthal projection to project the semi-spherical P2 maps onto the flat plane. This additional projection has been implemented in the latest version of VisuMap, 4.0.892, with which you can make a disk-like snapshot of a spherical P2 map in the sphere view. The following picture illustrates how VisuMap maps data from the high dimensional space to P2 and then to the flat plane with the azimuthal projection:
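There are several azimuthal projections, and the post does not say which variant is used. As an illustration only, the following Python sketch assumes the azimuthal equidistant variant: a point on the upper unit semi-sphere is mapped to the disk so that its distance from the disk center equals its angular distance from the pole. The function name is made up for this example.

```python
import math

def azimuthal_equidistant(x, y, z):
    """Project a point of the unit semi-sphere (pole at z = 1) onto a flat disk.

    Assumption: equidistant variant; the radius on the disk is the polar angle.
    """
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("the zero vector does not define a point on the sphere")
    x, y, z = x / norm, y / norm, z / norm
    if z < 0:
        # On P2 a point and its antipode coincide; use the upper representative.
        x, y, z = -x, -y, -z
    theta = math.acos(min(1.0, z))   # angular distance from the pole
    s = math.hypot(x, y)
    if s == 0.0:
        return (0.0, 0.0)            # the pole maps to the disk center
    return (theta * x / s, theta * y / s)

pole = azimuthal_equidistant(0.0, 0.0, 1.0)
equator = azimuthal_equidistant(1.0, 0.0, 0.0)
antipode = azimuthal_equidistant(0.0, 0.0, -1.0)
```

With this variant the whole semi-sphere lands in a disk of radius pi/2, and the equator becomes the boundary circle of the disk.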


Notice that the azimuthal projection is a non-linear projection: it stretches the area at the boundary of the semi-sphere somewhat. This kind of non-linear stretching mathematically induces non-zero curvature on the flat disk, so that the total curvature of the stretched disk equals that of the semi-sphere (as the Gauss-Bonnet theorem dictates). The following short screen-cast shows how VisuMap maps the full sphere onto the semi-spherical P2 space. The RPM algorithm splits the sphere into 4 pieces. Notice how these fragments stretch when they are rotated from the center to the boundary.
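For reference, the total curvature in question can be worked out directly. Assuming a unit semi-sphere H (Gaussian curvature K = 1, area 2π) whose boundary is the equator, which is a geodesic and therefore has geodesic curvature zero, the Gauss-Bonnet theorem reads:

```latex
\int_H K \, dA + \oint_{\partial H} k_g \, ds = 2\pi \, \chi(H),
\qquad
\int_H K \, dA = 1 \cdot 2\pi = 2\pi,
\qquad
\oint_{\partial H} k_g \, ds = 0,
\qquad
\chi(H) = 1 .
```

Both sides equal 2π, so a disk carrying the metric induced by the projection must also have total curvature 2π.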



Going one step further, we have extended the azimuthal projection to project the 3-dimensional projective space (P3) onto the flat 3D space, where P3 is realized as a semi-sphere in 4-dimensional space. In VisuMap, such a 3D map is called a projective ball (where opposite points on the surface are considered stuck together). The following screen-cast shows how VisuMap creates an RPM-3P map for the S&P 500 dataset:
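The same construction carries over to one more dimension. The Python sketch below again assumes the equidistant variant of the azimuthal projection (the post does not state which variant VisuMap uses) and maps a point of the unit semi-sphere in 4D to a ball in 3D; the function name is hypothetical.

```python
import math

def project_semisphere_to_ball(point):
    """Map a 4D point (pole at w = 1) to 3D; the radius is the angle from the pole.

    Assumption: equidistant variant of the azimuthal projection.
    """
    norm = math.sqrt(sum(c * c for c in point))
    if norm == 0.0:
        raise ValueError("the zero vector does not define a point on the sphere")
    p = [c / norm for c in point]
    if p[-1] < 0:
        # Antipodal points are identified in P3; use the upper representative.
        p = [-c for c in p]
    theta = math.acos(min(1.0, p[-1]))          # angular distance from the pole
    s = math.sqrt(sum(c * c for c in p[:-1]))
    if s == 0.0:
        return (0.0, 0.0, 0.0)                  # the pole maps to the ball center
    return tuple(theta * c / s for c in p[:-1])

center = project_semisphere_to_ball((0.0, 0.0, 0.0, 1.0))
surface = project_semisphere_to_ball((0.0, 1.0, 0.0, 0.0))
a = project_semisphere_to_ball((0.5, 0.5, 0.5, 0.5))
b = project_semisphere_to_ball((-0.5, -0.5, -0.5, -0.5))
```

Under this assumption the ball has radius pi/2, and the last two calls return the same 3D point because their inputs are antipodal.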


Notice that the RPM-3P map actually resides in 4D space. We have developed a special view that, in addition to rotation among the first three dimensions, also allows rotation between the first three dimensions and the fourth dimension (active while pressing and holding the control key).