The SVD is intimately related to the familiar theory of
diagonalizing a symmetric matrix. Recall that if A is a symmetric
real n × n matrix, there is an orthogonal matrix V and a diagonal D
such that A = VDV^T. Here the columns of V are eigenvectors for A
and form an orthonormal basis for R^n; the diagonal entries of D are
the eigenvalues of A. To emphasize the connection with the SVD, we
will refer to VDV^T as the eigenvalue decomposition, or EVD, for A.
For the SVD we begin with an arbitrary real m×n matrix A. As we
shall see, there are orthogonal matrices U and V and a diagonal
matrix, this time denoted Σ, such that A = UΣV^T. In this case, U
is m×m and V is n×n, so that Σ is rectangular with the same
dimensions as A. The diagonal entries of Σ, that is, the Σ_ii = σ_i,
can be arranged to be nonnegative and in order of decreasing
magnitude. The positive ones are called the singular values of A.
The columns of U and V are called left and right singular vectors,
respectively, for A.
The analogy between the EVD for a symmetric matrix and SVD for an
arbitrary matrix can be extended a little by thinking of matrices
as linear transformations. For a symmetric matrix A, the
transformation takes R^n to itself, and the columns of V define an
especially nice basis. When vectors are expressed relative to this
basis, we see that the transformation simply dilates some
components and contracts others, according to the magnitudes of the
eigenvalues (with a reflection through the origin tossed in for
negative eigenvalues). Moreover, the basis is orthonormal, which is
the best kind of basis to have. Now let’s look at the SVD for an
m×n matrix A. Here the transformation takes R^n to a different space,
R^m, so it is reasonable to ask for a natural basis for each of
domain and range. The columns of V and U provide these bases. When
they are used to represent vectors in the domain and range of the
transformation, the nature of the transformation again becomes
transparent: it simply dilates some components and contracts
others, according to the magnitudes of the singular values, and
possibly discards components or appends zeros as needed to account
for a change in dimension. From this perspective, the SVD tells us
how to choose orthonormal bases so that the transformation is
represented by a matrix with the simplest possible form, that is,
diagonal.
How do we choose the bases {v_1, v_2, ..., v_n} and {u_1, u_2, ..., u_m}
for the domain and range? There is no difficulty in obtaining
a diagonal representation. For that, we need only Av_i = σ_i u_i, which
is easily arranged. Select an orthonormal basis {v_1, v_2, ..., v_n} for
R^n so that the first k elements span the row space of A and the
remaining n−k elements span the null space of A, where k is the
rank of A. Then for 1 ≤ i ≤ k define u_i to be a unit vector parallel
to Av_i, and extend this to a basis for R^m. Relative to these bases,
A will have a diagonal representation. But in general, although the
v's are orthogonal, there is no reason to expect the u's to be. The
possibility of choosing the v-basis so that its orthogonality is
preserved under A is the key point. We show next that the EVD of
the n×n symmetric matrix A^T A provides just such a basis, namely,
the eigenvectors of A^T A.
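As a small numerical illustration of this point (a sketch only, not part of the original argument), the following Python fragment takes a hypothetical 2×2 matrix A, chosen purely as an example, and compares two orthonormal bases for the domain: the standard basis, whose images under A are not orthogonal, and the eigenvectors of A^T A, whose images are. The helper names dot, matvec and sym2x2_eig are assumptions introduced for this example, the closed-form eigenvector angle applies only to symmetric 2×2 matrices, and A is assumed to have full rank so that both singular values are positive.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(M, v):
    return [dot(row, v) for row in M]

def sym2x2_eig(S):
    """Eigenvalues (nonincreasing) and orthonormal eigenvectors of a symmetric 2x2 matrix."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    # Rotation angle that makes the off-diagonal entry vanish: tan(2*phi) = 2b/(a - d).
    phi = 0.5 * math.atan2(2.0 * b, a - d)
    v1 = [math.cos(phi), math.sin(phi)]
    v2 = [-math.sin(phi), math.cos(phi)]
    lam1, lam2 = dot(v1, matvec(S, v1)), dot(v2, matvec(S, v2))
    if lam1 < lam2:
        lam1, lam2, v1, v2 = lam2, lam1, v2, v1
    return [lam1, lam2], [v1, v2]

# Hypothetical example matrix (not taken from the text).
A = [[3.0, 0.0], [4.0, 5.0]]

# A^T A: entry (i, j) is the dot product of columns i and j of A.
cols = list(zip(*A))
AtA = [[dot(ci, cj) for cj in cols] for ci in cols]

# Standard basis: the images A e1 and A e2 are not orthogonal.
naive = dot(matvec(A, [1.0, 0.0]), matvec(A, [0.0, 1.0]))

# Eigenvectors of A^T A: the images A v1 and A v2 are orthogonal.
lams, V = sym2x2_eig(AtA)
sigma = [math.sqrt(lam) for lam in lams]            # singular values
U = [[c / s for c in matvec(A, v)] for v, s in zip(V, sigma)]  # u_i = A v_i / sigma_i

# Rebuild A as the sum of sigma_i * u_i * v_i^T.
A_rec = [[sum(sigma[k] * U[k][i] * V[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]

print("dot of images, standard basis:", naive)
print("dot of images, eigenvector basis:", dot(U[0], U[1]))
print("singular values:", sigma)
```

For this particular A the eigenvalues of A^T A are 45 and 5, so the singular values are their square roots.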
Let A^T A = VDV^T, with the diagonal entries λ_i of D arranged in
nonincreasing order, and let the
These equations can be used to obtain equation (6), and also to
show that the unit vectors of the x y z coordinate system are
related by u_z = u_x × u_y, u_x = u_y × u_z, and u_y = −u_x × u_z. The
foregoing is based on the assumption that the x y z coordinate
system was defined as a right-handed triad.
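The cross-product relations above can be checked numerically. The sketch below assumes, purely for illustration, that the body axes coincide with the standard basis vectors; the helper names cross and neg are hypothetical and not part of the original text.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def neg(v):
    return tuple(-c for c in v)

# Assumed right-handed triad aligned with the standard basis.
ux, uy, uz = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

print(cross(ux, uy) == uz)        # u_z = u_x × u_y
print(cross(uy, uz) == ux)        # u_x = u_y × u_z
print(neg(cross(ux, uz)) == uy)   # u_y = −u_x × u_z
```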
--------------------------------------------------------------------------------------------------------------------
Since t = u_z, it is easy to see that if ω_x = 0, the Frenet-Serret
equations (4) are a special case of equations (6), with κ n = ω_y u_x,
b = u_y, and −τ n = −ω_z u_x. Since κ = ω_y, we can consider the
curvature to be a vector κ = ω_y u_y = κ b, and similarly the torsion
can be considered to be a vector τ = ω_z u_z = τ t. With ω_x = 0, the
total angular velocity is
ω = ω_y u_y + ω_z u_z = κ + τ = κ b + τ t = D. (9)
When written in the form κ b + τ t, the total angular velocity is
referred to as the Darboux vector, represented here as D [Goetz,
1970]. In the more general case, when ω_x is not 0, the curvature
vector, κ, which must be perpendicular to t, is in the x,y plane
and κ = κ_x + κ_y = κ_x u_x + κ_y u_y = ω_x u_x + ω_y u_y. Let θ be the
angle between κ and u_x. The case where θ is constant along s is of
little interest; it is simpler to redefine the x y z system so that
ω_x = 0. It is more useful to interpret the x y z coordinate system as a
system that is fixed in the material of a structure that is bending
to form the space curve, and refer to it as the body coordinate
system. In this case, the torsion is interpreted as rotation of the
curvature vector in the x,y plane of the body coordinate
system:
dθ/ds = τ. (10)
Specification of κ(s) as a vector function of arc length determines
θ(s), and therefore τ(s), and is sufficient to define the
curve.
Figure 1. Three-dimensional view of the coordinate systems for a space
curve, which is the axis of the bent cylinder shown in this figure. The
x, y plane is perpendicular to the curve, as is the n, b plane. The
tangent to the curve is in the t and +z directions. The curvature
vector is in the b direction, and the angle θ is represented by the
blue arc drawn from the +x axis to the b vector.
The Circular Helix
A particularly simple curve defined by κ and τ is the circular
helix, which results when the values of κ and τ are constant and
non-zero. The axis of the helix is parallel to the Darboux vector.
The standard parameters of the helix, the radius, r, and the pitch,
h, are given by
r = κ/(κ² + τ²) and h = 2πτ/(κ² + τ²). (11)
The pitch angle φ_H is the angle between the tangent and the Darboux
vector, and is given by
tan(φ_H) = 2πr/h = κ/τ. (12)
The arc length in one turn of the helix is S:
S = sqrt((2πr)² + h²). (13)
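Equations (11) to (13) can be evaluated directly. The following Python sketch is illustrative only: the function name helix_parameters and the sample values κ = τ = 1 are assumptions made for this example, and the use of atan2 for the pitch angle presumes that κ and τ are both positive.

```python
import math

def helix_parameters(kappa, tau):
    """Radius, pitch, pitch angle and arc length per turn from equations (11)-(13)."""
    denom = kappa ** 2 + tau ** 2
    r = kappa / denom                      # equation (11), radius
    h = 2.0 * math.pi * tau / denom        # equation (11), pitch
    phi_H = math.atan2(kappa, tau)         # equation (12): tan(phi_H) = kappa / tau
    S = math.hypot(2.0 * math.pi * r, h)   # equation (13)
    return r, h, phi_H, S

# Hypothetical sample values.
r, h, phi_H, S = helix_parameters(kappa=1.0, tau=1.0)
print(r, h, phi_H, S)
```

Substituting (11) into (13) gives S = 2π/sqrt(κ² + τ²), which provides a quick consistency check on the computed values.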
The position components of points on the curve, in a global X Y Z
coordinate system, are given by
X = –r sin(2πs/S), Y = r cos(2πs/S), Z = hs/S. (14)
This representation produces a right-handed helix starting at {0,
r, 0} and aligns the Darboux vector and the axis of the helix with
the Z coordinate axis. When the curvature is specified as a vector
function of s, a circular helix results when the components of the
curvature vector are {κ cos(τs), −κ sin(τs), 0}. These equations give
a right-handed helix, when κ, τ, r, and h are all positive.
Negative values of torsion produce a left-handed helix, and suggest
that the pitch should also be negative. If the pitch is negative, S
should also be negative, and the effect is then just to change the
sign of the X term in equation (14).
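As a rough numerical check of equation (14), the sketch below evaluates points on the helix and estimates the speed and curvature by finite differences. The function name helix_point, the step size, and the sample values r = 0.5 and h = π (which correspond to κ = τ = 1 under equation (11)) are assumptions chosen for illustration. If s is indeed arc length, the speed should come out close to 1 and the magnitude of the second derivative close to κ.

```python
import math

def helix_point(s, r, h):
    """Global X, Y, Z coordinates from equation (14), with S from equation (13)."""
    S = math.hypot(2.0 * math.pi * r, h)
    angle = 2.0 * math.pi * s / S
    return (-r * math.sin(angle), r * math.cos(angle), h * s / S)

def norm(v):
    return math.sqrt(sum(c * c for c in v))

# Hypothetical sample values: r = 0.5, h = pi correspond to kappa = tau = 1.
r, h = 0.5, math.pi
eps, s0 = 1e-3, 0.7

p_minus, p0, p_plus = (helix_point(s0 + d, r, h) for d in (-eps, 0.0, eps))

# Central differences for the first and second derivatives with respect to s.
speed = norm([(a - b) / (2.0 * eps) for a, b in zip(p_plus, p_minus)])
curvature = norm([(a - 2.0 * b + c) / eps ** 2 for a, b, c in zip(p_plus, p0, p_minus)])

start = helix_point(0.0, r, h)
print("start point:", start)
print("speed:", speed, "curvature:", curvature)
```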
1.3 Twist
In order to discuss twist, it is necessary to go beyond
the idea of a space curve and describe a structure that is
following a space curve. The simplest structure, a ribbon, is often
used by mathematicians to illustrate such discussions, but here we
will use a cylinder, with one or more elements specified on the
surface of the cylinder. This structure resembles the internal
cytoskeleton of cilia and flagella, known as the axoneme, which
usually is a cylinder with 9 outer microtubular doublets. Of
course, the axis of the cylinder and any one of the elements define
a ribbon, of constant width. The axis of the cylinder is a space
curve, and when we refer to the curvature or torsion of the
cylinder we are referring to the curvature or torsion of this space
curve. Now consider that we have a straight cylinder, and apply
equal and opposite torques, or moments, to each end, so that one
end of the cylinder rotates relative to the other end, while the
cylinder remains straight (no bending). If this rotation were to
occur while the two end surfaces of the cylinder remain planar and
parallel, the elements on the surface must become longer than the
axis of the cylinder. If the elements on the surface retain
constant length, the ends of the cylinder must be deformed. So
twisting requires strain within the cylinder, and will be resisted
by the elastic resistances of the material of the cylinder. This
resistance will be referred to as the elastic twist resistance,
without determining whether it results from stretch resistance or
shear resistance of the material, or a combination of both. If the
moments are removed, the torque produced by the elastic twist
resistance will reverse the original rotation and the cylinder will
return to its original configuration. What is the relation between
twist and torsion? To examine this, the body coordinate system of
the cylinder, described in Section 1.1.2, is used. The z axis of
the body system lies along the axis of the cylinder, and a
particular element is chosen that is always on the +y axis. So, if
the cylinder is
twisted, the body coordinate system rotates locally around the z
axis. To finish the definition of this coordinate system, a “basal”
end of the cylinder is defined, at which arc length s along the
cylinder = 0. Then z and s increase in the same direction, from
base to “tip” of the cylinder. The direction of the x coordinate is
then specified for a right-handed x y z coordinate system, as shown
in Figure 1. Since the Frenet-Serret system is a right-handed
system with z increasing in the +s direction, n and b must always
lie in the x,y plane, and when n is in the y direction, b must be
in the –x direction. In an untwisted cylinder, rotation of the
curvature vector by the torsion must cause rotation of the
curvature vector in the x,y plane. The position of the curvature
vector is specified by an angle θ measured from the x axis towards
the y axis (Figure 1), and the torsion is then dθ/ds, as in
equation (10). However, if the cylinder is twisted, as described in
the preceding paragraph, this twist will cause an additional
rotation of the curvature vector, resulting from rotation of the x
y z coordinate system. In general then
torsion τ = dθ/ds + κ_z, (15)
where the curvature κ_z measures the rate of change of twist angle
along the length [Gueron and Liron, 1993]. Twist is torsion, but
not all torsion is twist. In many contexts, the terms twist and
torsion are used interchangeably. In the absence of bending, this
causes no problems. However, with flagella and cilia, where bending
is important, it is preferable to use the term twist to refer to
twist, and not complicate the interpretation by referring to twist
as torsion. Twist of an axoneme could be generated by mechanisms
other than application of external torque. The internal motors,
known as dyneins, that generate flagellar bending generate sliding
forces between the outer doublets. If these forces are parallel to
the doublets, they generate negligible twisting moments [Hines and
Blum, 1984]. However, observations of in vitro movement of
microtubules by dyneins show microtubule rotation, indicating that
dyneins can, under some circumstances, produce twisting moments.
Twist resulting from such internal twisting moments will be
resisted by the elastic twist resistance of the axoneme. If
generation of moments by the dyneins is terminated, the elastic
twist resistance of the axoneme will restore its untwisted
conformation. Alternatively, twist of the axonemal components might
be a built-in, permanent, feature of the structure, which is
independent of moments generated by the dyneins or applied
externally. Most cilia and flagella do not show evidence of
built-in twist, and the doublets of an inactive flagellum appear to
run straight along the axoneme, as elements of a cylinder. However,
exceptions are found in the flagella of a few types of insect sperm
[Phillips, 1969, 1971].
1.4 Transformation and rotation
Given two coordinate systems U1 and U2, and a vector v specified by
its coordinates in one of these systems, e.g. v1 = {v1x u1x, v1y u1y,
v1z u1z}, the specification of v in system U2 can be found by
multiplying v1 by a transformation matrix, A12:
v2 = A12 v1. (16)
--------------------------------------------------------------------------------------------------------------------
Insert 2. Matrix multiplication
Multiplication of a vector by a matrix: v* = A v
If the matrix A is a 3 x 3 matrix with terms
A11 A12 A13
A21 A22 A23
A31 A32 A33
the components of the result vector are
v*x = A11 vx + A12 vy + A13 vz
v*y = A21 vx + A22 vy + A23 vz
v*z = A31 vx + A32 vy + A33 vz
Multiplication of a matrix B by a matrix A, written as C = AB, must
be defined so that v* = Cv = (AB)v = A(Bv). To satisfy this
requirement, matrix B is interpreted as a set of column vectors
{v1, v2, v3} and matrix C is interpreted as a set of column vectors
{v*1, v*2, v*3}, where v*1 = Av1, etc. Note that this
multiplication operation is not commutative: AB is not, in general,
equal to BA.
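The column-by-column definition of the matrix product can be sketched in a few lines of Python. The function names matvec and matmul and the two sample matrices (quarter-turn rotations about the z and x axes) are assumptions chosen for illustration; they are not taken from the text.

```python
def matvec(A, v):
    """v* = A v for a 3 x 3 matrix A stored as a list of rows."""
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def matmul(A, B):
    """C = AB, built by applying A to each column of B in turn."""
    cols_B = [[B[i][j] for i in range(3)] for j in range(3)]
    cols_C = [matvec(A, col) for col in cols_B]
    # Convert the list of result columns back to a list of rows.
    return [[cols_C[j][i] for j in range(3)] for i in range(3)]

# Hypothetical sample matrices.
A = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
B = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]
v = [1, 2, 3]

print(matvec(matmul(A, B), v))    # (AB)v
print(matvec(A, matvec(B, v)))    # A(Bv), the same vector
print(matmul(A, B) == matmul(B, A))
```

For these two matrices the products AB and BA differ, which illustrates the remark about commutativity.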
--------------------------------------------------------------------------------------------------------------------
A
transformation matrix can be used to provide information about
translation of the origin of the coordinate systems, but for
changes in specification of a vector that possibility is not used,
and the transformation matrix is a rotation matrix. To find the
components of the transformation matrix, consider the case where v1
is the unit vector u1x, represented by {1, 0, 0}. Then we see that
{A11,A21,A31} represents the components of the unit vector u1x, in
the U2 coordinate system. These quantities are also referred to as
the direction cosines of the u1x vector in the U2 coordinate
system, and they can be represented as dot products {u2x · u1x, u2y
· u1x, u2z · u1x}. The remainder of the matrix can be established
in the same manner. The same procedure can be used to obtain A^(-1),
the inverse of A, such as the matrix A21 that converts a vector
expressed as coordinates in system U2 to coordinates in system U1.
We then discover that the inverse of a rotation matrix, such as A,
is simply its transpose A^T, the matrix
formed by converting rows into columns [Hines and Blum, 1983]. This
greatly simplifies computation of the inverse of A. Another
important property of a rotation matrix is that the sum of the
squared components of each row or column equals 1.0 (the
normalization condition). Any transformation by a rotation matrix
is equivalent to rotation of the coordinate system by an angle θ
around some axis represented by a unit vector a [see Goldstein,
1980]. The rotation formula, for rotation of a vector in a fixed
coordinate system [Goldstein, 1980], can be written in matrix form,
with the components of a represented by {a_x, a_y, a_z}:
\[
A_{12} = \begin{pmatrix}
a_x a_x(1-\cos\theta) + \cos\theta & a_x a_y(1-\cos\theta) + a_z\sin\theta & a_x a_z(1-\cos\theta) - a_y\sin\theta \\
a_x a_y(1-\cos\theta) - a_z\sin\theta & a_y a_y(1-\cos\theta) + \cos\theta & a_y a_z(1-\cos\theta) + a_x\sin\theta \\
a_x a_z(1-\cos\theta) + a_y\sin\theta & a_y a_z(1-\cos\theta) - a_x\sin\theta & a_z a_z(1-\cos\theta) + \cos\theta
\end{pmatrix} \tag{17}
\]
This matrix can be found in many computer graphics texts. It was
derived in Brokaw [2002] from the rotation formula in Goldstein
[1980], and a more complete derivation is given in Rogers and Adams
[1976]. If the coordinate system U2 is rotated to obtain U1, A21 is
formed by using the rotation formula with −θ (Goldstein, 1980) and
{a_x, a_y, a_z} expressed in the U2 system. Then A12 is the transpose
of A21, which is equivalent to using +θ, as written in equation
(16). For very small θ, A12 becomes the infinitesimal
transformation matrix:
\[
\begin{pmatrix} 1 & a_z\theta & -a_y\theta \\ -a_z\theta & 1 & a_x\theta \\ a_y\theta & -a_x\theta & 1 \end{pmatrix}
= I + ds \begin{pmatrix} 0 & \omega_z & -\omega_y \\ -\omega_z & 0 & \omega_x \\ \omega_y & -\omega_x & 0 \end{pmatrix}
= I + ds \begin{pmatrix} 0 & \tau & -\kappa \\ -\tau & 0 & 0 \\ \kappa & 0 & 0 \end{pmatrix} \tag{18}
\]
I represents the identity matrix, with Aij = 0 except for A11 = A22
= A33 = 1, and the ω or τ, κ terms in the matrices are recognizable
as the matrix form of equations (6). Note that the infinitesimal
transformation matrix is only an approximation, which does not
satisfy the normalization condition. Also, apart from the identity
term, the infinitesimal matrix is antisymmetric, while the finite
transformation matrix (equation 17) does not have this property.
Given the coefficients of a finite
rotation matrix, the recommended method (Glassner, 1990) for
extracting the rotation parameters is to calculate
cos θ = (A11 + A22 + A33 − 1)/2, (19)
and then use the arccos function to extract θ. Although the arccos
function does not reveal whether θ is + or −, this is unimportant
in the present context because the curvature is considered to
always be +. The arccos function can introduce significant
ambiguities if the magnitude of θ is greater than
π, and these must be dealt with. The a vector components can then
be obtained from
a_x = (A23 − A32)/(2 sin θ), etc.
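The construction in equation (17) and the extraction in equation (19) can be sketched as a round trip in Python. This is an illustrative sketch rather than the author's implementation: the function names rotation_matrix and axis_angle and the sample axis and angle are hypothetical. The expressions for a_y and a_z, which the text abbreviates as "etc.", are taken here to follow from equation (17) by cyclic permutation of the indices. The extraction assumes 0 < θ < π, so that sin θ is not zero.

```python
import math

def rotation_matrix(axis, theta):
    """Transformation matrix of equation (17) for a unit axis (a_x, a_y, a_z)."""
    ax, ay, az = axis
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [ax * ax * v + c,      ax * ay * v + az * s, ax * az * v - ay * s],
        [ax * ay * v - az * s, ay * ay * v + c,      ay * az * v + ax * s],
        [ax * az * v + ay * s, ay * az * v - ax * s, az * az * v + c],
    ]

def axis_angle(A):
    """Recover (axis, theta) using equation (19); assumes 0 < theta < pi."""
    cos_t = (A[0][0] + A[1][1] + A[2][2] - 1.0) / 2.0
    cos_t = max(-1.0, min(1.0, cos_t))   # guard against rounding outside [-1, 1]
    theta = math.acos(cos_t)
    two_sin = 2.0 * math.sin(theta)
    axis = ((A[1][2] - A[2][1]) / two_sin,   # a_x = (A23 - A32) / (2 sin theta)
            (A[2][0] - A[0][2]) / two_sin,   # a_y, by cyclic permutation (assumed)
            (A[0][1] - A[1][0]) / two_sin)   # a_z, by cyclic permutation (assumed)
    return axis, theta

# Hypothetical sample axis (unit length) and angle.
a_in, theta_in = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0), 0.7
A = rotation_matrix(a_in, theta_in)
a_out, theta_out = axis_angle(A)

# A times its transpose should be the identity, so the inverse is the transpose.
AAt = [[sum(A[i][k] * A[j][k] for k in range(3)) for j in range(3)] for i in range(3)]
print(a_out, theta_out)
```

The diagonal of A A^T also confirms the normalization condition, since each diagonal entry is the sum of squared components of one row.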