<< Chapter < Page Chapter >> Page >

As a simple example of dimensionality reduction, consider the case of a bending string of beads, as depicted in figure 2. The input data has 3x7=21 dimensions (if given as the (x,y,z) coordinates of each bead) but the beads always move collectively from the "bent" arrangement (left) to the "straight" arrangement (right). Under this simplified view, the process can be considered as one-dimensional, and a meaningful axis for it would represent the "degree of straightness" of the system. Using this axis, each string of beads can be substituted by one single number, its "coordinate" along the proposed axis. Thus, the location of a shape along this axis can quickly indicate in what stage of the bending process it is.

Sampled data from a chain of 7 beads can be given by the (x,y,z) positions of each bead, for a total of 21 "apparent" degrees of freedom. However, the process is inherently one-dimensional.

When dimensionality reduction methods are applied to molecular motion data, the goal is to find the main "directions" or "axes" collectively followed by the atoms, and the placement of each input conformation along these axes. The meaning of such axes can be intuitive or abstract, depending on the technique used and how complex the system is. We can reword the definition of dimensionality reduction when working with molecular motion samples as:

    Dimensionality reduction of molecular motion data

  • INPUT: A set of molecular conformations sampled from some physical process, given as the (x,y,z) coordinates for each atom. These are 3N-dimensional points for a molecule with N atoms.
  • OUTPUT: A set of d coordinates for each input conformation, such that d<<3N . These d coordinates should help classify the conformations throughout the main stages of the studied process.
Dimensionality reduction methods can be either linear or non-linear . Linear methods typically compute the low-dimensional representation of each input point by a series of mathematical operations involving linear combinations and/or linear matrix operations. Non-linear methods use either non-linear mathematics or modify linear methods with algorithmic techniques that encode the data's "curvature" (such as Isomap, explained later). Both categories of methods have advantages and disadvantages, which will become clear through the rest of this module. More information on different linear and non-linear techniques for dimensionality reduction are given in [7] . Also, a more extensive classification of dimensionality reduction techniques into several categories, and examples of different algorithms can be found navigating this hypertext article .

The remainder of this module describes two dimensionality reduction techniques and their application to molecular motion data. These are Principal Components Analysis (PCA), a linear method, and ISOmetric feature MAPping (Isomap), a non-linear method.

Principal components analysis

We will start our discussion of Principal Components Analysis, or PCA, considering a very general data set of points as depicted in figure 3. Each point in this simple data set is given as a 3-dimensional vector (x,y,z) (the discussion will later be turned to the molecular motion domain, and the interpretation of such data). Even though this data set is given as 3-dimensional points, it is obvious from the figure that the data points are distributed mostly on a two-dimensional surface. Our objective is then to find the inherent, 2-dimensional parameterization of this data set. (For a full discussion on PCA, see Principal Components Analysis by I. T. Jolliffe).

Questions & Answers

An investment account was opened with an initial deposit of $9,600 and earns 7.4% interest, compounded continuously. How much will the account be worth after 15 years?
Kala Reply
lim x to infinity e^1-e^-1/log(1+x)
given eccentricity and a point find the equiation
Moses Reply
12, 17, 22.... 25th term
Alexandra Reply
12, 17, 22.... 25th term
College algebra is really hard?
Shirleen Reply
Absolutely, for me. My problems with math started in First grade...involving a nun Sister Anastasia, bad vision, talking & getting expelled from Catholic school. When it comes to math I just can't focus and all I can hear is our family silverware banging and clanging on the pink Formica table.
find the 15th term of the geometric sequince whose first is 18 and last term of 387
Jerwin Reply
I know this work
The given of f(x=x-2. then what is the value of this f(3) 5f(x+1)
virgelyn Reply
hmm well what is the answer
how do they get the third part x = (32)5/4
kinnecy Reply
can someone help me with some logarithmic and exponential equations.
Jeffrey Reply
sure. what is your question?
okay, so you have 6 raised to the power of 2. what is that part of your answer
I don't understand what the A with approx sign and the boxed x mean
it think it's written 20/(X-6)^2 so it's 20 divided by X-6 squared
I'm not sure why it wrote it the other way
I got X =-6
ok. so take the square root of both sides, now you have plus or minus the square root of 20= x-6
oops. ignore that.
so you not have an equal sign anywhere in the original equation?
is it a question of log
I rally confuse this number And equations too I need exactly help
But this is not salma it's Faiza live in lousvile Ky I garbage this so I am going collage with JCTC that the of the collage thank you my friends
Commplementary angles
Idrissa Reply
im all ears I need to learn
right! what he said ⤴⤴⤴
what is a good calculator for all algebra; would a Casio fx 260 work with all algebra equations? please name the cheapest, thanks.
Kevin Reply
a perfect square v²+2v+_
Dearan Reply
kkk nice
Abdirahman Reply
algebra 2 Inequalities:If equation 2 = 0 it is an open set?
Kim Reply
or infinite solutions?
The answer is neither. The function, 2 = 0 cannot exist. Hence, the function is undefined.
Embra Reply
Jeannette has $5 and $10 bills in her wallet. The number of fives is three more than six times the number of tens. Let t represent the number of tens. Write an expression for the number of fives.
August Reply
What is the expressiin for seven less than four times the number of nickels
Leonardo Reply
How do i figure this problem out.
how do you translate this in Algebraic Expressions
linda Reply
why surface tension is zero at critical temperature
I think if critical temperature denote high temperature then a liquid stats boils that time the water stats to evaporate so some moles of h2o to up and due to high temp the bonding break they have low density so it can be a reason
Need to simplify the expresin. 3/7 (x+y)-1/7 (x-1)=
Crystal Reply
. After 3 months on a diet, Lisa had lost 12% of her original weight. She lost 21 pounds. What was Lisa's original weight?
Chris Reply
Got questions? Join the online conversation and get instant answers!
QuizOver.com Reply

Get the best Algebra and trigonometry course in your pocket!

Source:  OpenStax, Geometric methods in structural computational biology. OpenStax CNX. Jun 11, 2007 Download for free at http://cnx.org/content/col10344/1.6
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'Geometric methods in structural computational biology' conversation and receive update notifications?