Tensors and Ellipsoids

The ellipsoid as a visualization aid for rank-2 symmetric Euclidean tensors, with examples in mechanics, elasticity and optics


  1. Tensors and Ellipsoids
  2. The Inertia Tensor and Rigid Body Motion
  3. The Strain and Stress Tensors and Elastic Constants
  4. The Dielectric Tensor and Crystal Optics
  5. Exercises
  6. References

Tensors and Ellipsoids

An ellipsoid is a quadric surface described in canonical form by the equation ax^2 + by^2 + cz^2 = 1, with a, b and c positive. The lengths of the semiaxes of the ellipsoid are 1/a^(1/2), 1/b^(1/2) and 1/c^(1/2) along the x, y and z axes, respectively. If b = c, the ellipsoid degenerates into a spheroid, and if a = b = c, into a sphere. The center of the ellipsoid is at the origin.

If the axes are rotated to a general position, then the equation for the ellipsoid can be written ax^2 + by^2 + cz^2 + dyz + ezx + fxy = 1, where a, b and c are new constants, as are d, e and f. The absence of linear terms means that we keep the origin at the center of the ellipsoid, since displacing the center will not add anything essential. There are six constants in this expression, and they specify the shape and size of the ellipsoid and the directions of its axes, which remain orthogonal. Any section of an ellipsoid by a plane is an ellipse, notably the principal sections that contain its axes.

A rank-2 symmetrical tensor Φij has six components, three diagonal and three off-diagonal, which occur in symmetric pairs. The equation xiΦijxj = 1 is the equation of an ellipsoid associated with the tensor. We are using index notation, which is explained in Euclidean Tensors. A radius vector x is expressed as its components xi, i = 1, 2, 3 instead of x, y, z.

It is possible to find the rotation that transforms a symmetric rank-2 tensor into the diagonal form Φij = a(i)δij. This will also transform the ellipsoid into canonical form and give the lengths of the semiaxes. To do this, let ni be a unit vector (the components are direction cosines) and form the equation Φijnj = λni, where λ is some constant, which we may have to determine. This expresses the condition that the tensor Φ "operating" on ni gives a vector in the same direction, but of possibly different magnitude. It is the fundamental duty of rank-2 tensors to operate in this way on one vector (by this is meant multiplication followed by contraction) to give another vector. The expression is really three simultaneous equations for the three direction cosines ni.

Therefore, we have the three equations (Φij - λδij)nj = 0 for the three values of i. These are homogeneous equations, and only have the solutions ni = 0, unless the determinant of the coefficients vanishes. If it does, we can solve for the ratios of two of the direction cosines to the remaining one, and then determine them all by the condition that the sum of their squares must be unity. The constant λ must be chosen so that the determinant of the coefficients vanishes, |Φij - λδij| = 0. Since this equation is of third order, there are three roots for λ. If the three roots are equal, then the tensor is already diagonal, and is the constant tensor λδij. The corresponding ellipsoid is just a sphere. If two roots are equal, then the remaining root gives the direction of an axis of symmetry, and the orientation of the axes perpendicular to this axis can be chosen any way one likes. The ellipsoid is then a spheroid, and Φ11 = λ(1), Φ22 = Φ33 = λ(2) = λ(3). Finally, in the general case all three roots are unequal, and we have an ellipsoid with unequal axes, and the diagonal elements of Φ are the three values of λ.

If the three values of λ are unequal, the resulting principal axes will all be orthogonal, that is, at right angles to each other. If two values are equal, then the axis belonging to the distinct value will be orthogonal to the plane of the other two axes, and within that plane any orthogonal pair of axes may be chosen. The values of λ are called eigenvalues ("own values") and the unit vectors belonging to them are called eigenvectors. Working with principal axes simplifies the algebra greatly, and we usually assume that any symmetrical rank-2 tensor has been diagonalized in the way we have discussed. All of these matters are discussed at length in texts on linear algebra.
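The diagonalization described above can be sketched numerically. The following is a minimal illustration, assuming the classical Jacobi method of repeated plane rotations (one of several standard numerical methods, not necessarily the one the reader will prefer); the function names and the sample tensor are illustrative only.

```python
import math

def matmul(x, y):
    """Product of two square matrices stored as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(x):
    return [list(row) for row in zip(*x)]

def jacobi_diagonalize(phi, max_sweeps=50, tol=1e-24):
    """Diagonalize a real symmetric matrix by plane rotations.

    Returns (eigenvalues, axes); axes[k] is the unit eigenvector
    belonging to eigenvalues[k].
    """
    n = len(phi)
    a = [list(map(float, row)) for row in phi]                  # working copy
    v = [[float(i == j) for j in range(n)] for i in range(n)]   # accumulated rotation
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:                                           # already diagonal
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Angle that makes element (p, q) vanish: tan(2*theta) = 2*a_pq/diff
                theta = (math.pi / 4 if diff == 0.0
                         else 0.5 * math.atan(2.0 * a[p][q] / diff))
                c, s = math.cos(theta), math.sin(theta)
                rot = [[float(i == j) for j in range(n)] for i in range(n)]
                rot[p][p], rot[q][q] = c, c
                rot[p][q], rot[q][p] = s, -s
                a = matmul(transpose(rot), matmul(a, rot))      # rotate the tensor
                v = matmul(v, rot)                              # remember the rotation
    eigenvalues = [a[k][k] for k in range(n)]
    axes = transpose(v)               # columns of v are the eigenvectors
    return eigenvalues, axes

if __name__ == "__main__":
    sample = [[4.0, 1.0, 2.0], [1.0, 3.0, 0.0], [2.0, 0.0, 5.0]]  # arbitrary example
    values, axes = jacobi_diagonalize(sample)
    for lam, n_vec in zip(values, axes):
        print(lam, n_vec)
```

Each returned pair should satisfy Φijnj = λni to rounding error, and the semiaxes of the associated ellipsoid are then 1/√λ for each positive eigenvalue.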

We now have enough theoretical background to consider the most important rank-2 tensors met with in classical physics. We'll discuss the inertia tensor, the strain tensor, the stress tensor, and the dielectric tensor in this article. Not in detail, of course, but enough to show how the tensors are defined and used, making their application easier. See the Exercises for practice in tensor diagonalization. There are good numerical methods for the diagonalization of matrices, and the basic theorems are proved in any quantum mechanics text.

The Inertia Tensor and Rigid Body Motion

The inertia tensor in the dynamics of rigid bodies is an excellent example of a rank-2 tensor where the associated ellipsoid aids in the visualization of the motion. We shall start from first principles, using index notation, and find the motion of a free rigid body, that is, a body under the action of no external forces. Any rigid body can be considered as an assembly of mass elements m with fixed positions xi relative to an origin O, in a system of coordinates fixed in the body. The origin O can be a fixed point in space and the body, or the center of mass of the body. In either case, the motion separates into translational and rotational parts. We are not concerned with the translational parts here, and will consider only the rotation. The most general motion of the body is then a rotation with angular velocity ωi. The direction of the vector specifies the direction of the axis of rotation. The angular velocity may change in magnitude and direction with time, in a quite general manner. The body coordinates rotate in the same way, and so do not form an inertial system.

The rate of change of any vector quantity vi in a fixed coordinate system with origin O is [dvi/dt]fix = [dvi/dt]rot + εijkωjvk. This is merely the relation that in vector notation is [dv/dt]fix = [dv/dt]rot + ω x v, illustrated in the diagram, which applies to any vector, not just a radius vector. Accordingly, the velocity of an element of mass m in the inertial system is vi = εijkωjxk, since it is fixed in the rotating system. Summing over all mass elements, the angular momentum is Hi = Σm εijkxjvk = Σm εijkxj εklmωlxm. Using the value of the contraction of two antisymmetric tensor densities, εijkεklm = δilδjm - δimδjl, we find that Hi = [Σm(xkxkδij - xixj)]ωj. The rank-2 symmetric tensor multiplying ωj is the inertia tensor Iij of the body. The diagonal elements are called moments of inertia, and the off-diagonal elements products of inertia. For a continuous body, they may be found by integration, of course. We note that the angular momentum is not necessarily in the direction of the angular velocity.
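As a small numerical illustration of the definition Iij = Σm(xkxkδij - xixj), the sketch below builds the inertia tensor of a set of point masses and compares H = Iω with the direct sum Σm x × (ω × x). The function names and the sample masses are hypothetical, chosen only for the example.

```python
def cross(a, b):
    """Vector product a x b of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def inertia_tensor(masses, positions):
    """I_ij = sum over masses of m (x_k x_k delta_ij - x_i x_j)."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c * c for c in x)
        for i in range(3):
            for j in range(3):
                tensor[i][j] += m * ((r2 if i == j else 0.0) - x[i] * x[j])
    return tensor

def angular_momentum(tensor, omega):
    """H_i = I_ij omega_j."""
    return [sum(tensor[i][j] * omega[j] for j in range(3)) for i in range(3)]

def angular_momentum_direct(masses, positions, omega):
    """H = sum of m x cross (omega cross x), without forming the tensor."""
    total = [0.0, 0.0, 0.0]
    for m, x in zip(masses, positions):
        h = cross(x, cross(omega, x))
        total = [t + m * c for t, c in zip(total, h)]
    return total

if __name__ == "__main__":
    masses = [1.0, 1.0]                                 # assumed sample values
    positions = [[1.0, 2.0, 0.0], [-1.0, -2.0, 0.0]]
    omega = [1.0, 0.0, 0.0]
    I = inertia_tensor(masses, positions)
    print(I)
    print(angular_momentum(I, omega))
    print(angular_momentum_direct(masses, positions, omega))
```

In this example the angular momentum has a component along the 2-axis although the rotation is about the 1-axis, which illustrates that H and ω need not be parallel.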

Now we find the kinetic energy T of the body by adding up the kinetic energies of each element of mass: 2T = Σm vkvk. Using the expression for the space velocity, we again get a contraction of two antisymmetric tensor densities. In fact, 2T = Σm εijkωjxk εilmωlxm = Ijlωjωl. This work is a good example of how much easier it is to use the index notation than vector notation in these things.

Newton's Second Law can be applied in the fixed system to each of the elements of mass, and we find that the net torque Li = ΣεijkxjFk is the rate of change of angular momentum Hi in the fixed system. From the above equations, we have Li = Iij(dωj/dt) + εijkωjIklωl. These are three differential equations for the components of the angular velocity of the body relative to the body axes.

It is always possible to diagonalize the inertia tensor by a suitable choice of the orientation of the body axes. Let us assume this has been done, and the principal moments of inertia are A(i). The index is in parentheses because it is not a tensor index, merely an identifying label. For ease of writing, we may also represent these quantities by A, B and C. Then, Iij = A(i)δij, and the equations of motion of the preceding paragraph become Li = A(i)(dωi/dt) + εijkA(k)ωjωk. These equations are known as Euler's equations.
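A minimal numerical sketch of Euler's equations for the torque-free case Li = 0 follows. It assumes a classical fourth-order Runge-Kutta step (a choice made here for illustration, not taken from the text) and checks that 2T = ΣA(i)ωi^2 and H^2 = Σ(A(i)ωi)^2 stay constant along the solution. The principal moments and initial angular velocity are arbitrary sample values.

```python
def euler_rhs(w, A):
    """d(omega)/dt from Euler's equations with zero torque.

    With L_i = 0, A(i) dw_i/dt = -eps_ijk A(k) w_j w_k, written out per component.
    """
    return [(A[1] - A[2]) * w[1] * w[2] / A[0],
            (A[2] - A[0]) * w[2] * w[0] / A[1],
            (A[0] - A[1]) * w[0] * w[1] / A[2]]

def rk4_step(w, A, dt):
    """One fourth-order Runge-Kutta step of size dt."""
    k1 = euler_rhs(w, A)
    k2 = euler_rhs([w[i] + 0.5 * dt * k1[i] for i in range(3)], A)
    k3 = euler_rhs([w[i] + 0.5 * dt * k2[i] for i in range(3)], A)
    k4 = euler_rhs([w[i] + dt * k3[i] for i in range(3)], A)
    return [w[i] + dt * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]) / 6.0
            for i in range(3)]

def two_T(w, A):
    """Twice the kinetic energy, sum of A(i) w_i^2."""
    return sum(A[i] * w[i] ** 2 for i in range(3))

def h_squared(w, A):
    """Square of the angular momentum, sum of (A(i) w_i)^2."""
    return sum((A[i] * w[i]) ** 2 for i in range(3))

if __name__ == "__main__":
    A = (1.0, 2.0, 3.0)          # assumed principal moments of inertia
    w = [1.0, 0.1, 0.5]          # assumed initial angular velocity (body axes)
    print("start:", two_T(w, A), h_squared(w, A))
    for _ in range(5000):
        w = rk4_step(w, A, 0.001)
    print("end:  ", two_T(w, A), h_squared(w, A))
```

The angular velocity components themselves change with time in the body axes, while the two printed invariants should agree to within the integration error.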

Now we can consider the problem of a body moving freely with one point fixed, which can be taken as the center of mass. An asteroid moving in space is a good example of such a body, since the torques exerted by unequal solar attraction are small and have little effect if the body is rotating rapidly. Such torques indeed cause the axis of the earth to precess, but this is a slow effect. We can safely neglect such torques with an asteroid rotating with periods of hours or less.

First of all, the kinetic energy T of the body must be constant, so 2T = Hiωi = Iijωiωj = constant. The angular momentum is also constant, since no external torques act, so this says that the component of the angular velocity in the direction of H is also a constant. If we consider a space in which the components of the angular velocity are the coordinates, then this equation also defines an ellipsoid in this space, which is similar to the inertia ellipsoid. We shall call this the energy ellipsoid, since its size depends on the kinetic energy of the motion. We shall choose the body axes so that the inertia tensor is diagonalized, so the body axes are also the symmetry axes of the energy ellipsoid. The semiaxes of the energy ellipsoid are √(2T/A(k)). The angular velocity is a vector from the origin O to a point on the energy ellipsoid P, as shown in the diagram at the left.

If the equation for the energy ellipsoid is differentiated, we find that Iijωj dωi = 0, or Hi dωi = 0. Now dωi is a vector in the tangent plane at P, so this says that the angular momentum is perpendicular to the tangent plane. Because the angular momentum is constant, this means that the orientation of the tangent plane remains constant during the motion. Also, since the component of the angular velocity in the direction of the angular momentum is constant, the distance d from the origin O to the tangent plane is constant. Therefore, we find the important result that the tangent plane to the energy ellipsoid is fixed in space. We call it the invariant plane. Since the angular velocity passes through the point of contact P, there is no relative motion between the energy ellipsoid and the invariant plane: the energy ellipsoid must roll on the invariant plane without slipping.

The angular velocity must terminate on the energy ellipsoid in any case, so that ΣA(i)ωi^2 = 2T. For a certain d, it must also lie on the surface obtained by squaring the relation Hd = 2T, which is Σ(A(i)ωi)^2 = 4T^2/d^2. Eliminating the constant terms between these two equations, we find that Σ(A(i)/2T)[(1/d^2) - (A(i)/2T)]ωi^2 = 0, which is the equation of a cone (not in general a circular one) with vertex at O, on which the angular velocity must lie. This cone intersects the energy ellipsoid in a curve called the polhode (Greek: "path of the pole"). This cone rolls without slipping on a similar cone fixed in space determined by the path of the angular velocity in the invariant plane, which is called the herpolhode. We have now arrived at a complete description of the motion of the body, making use of the inertia ellipsoid.

Let's now specialize to the case frequently met when two of the moments of inertia are the same, say A = B, with C different. The case when C < A is shown in the diagram, where the inertia ellipsoid is prolate, and the 3-axis is the symmetry axis. Now the polhode and herpolhode determine circular cones that roll on each other without slipping, and the motion can be easily visualized. The component of angular velocity Ω along the symmetry axis is called the spin. It is given by Ω = ω cos α. The radius of the polhode is ω sinα, and the radius of the herpolhode is ω sin (θ - α). The angle between the symmetry axis and the direction of H is θ. It is related to α by tanα = (C/A) tanθ. The rotation of the symmetry axis around the direction of H is called precession, and its amount is dψ/dt = CΩ/(A cosθ).
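The relations for the symmetric body can be tried out numerically. The sketch below takes assumed values of A, C, ω and α, computes the spin, the angle θ and the precession rate from the formulas above, and compares the precession rate with H/A, which follows from H cos θ = CΩ. The function name and the sample numbers are illustrative only.

```python
import math

def symmetric_top(A, C, omega, alpha):
    """Spin, angle theta and precession rate for a free symmetric body.

    A is the moment about the two equal axes, C the moment about the
    symmetry axis, alpha the angle between omega and the symmetry axis.
    """
    spin = omega * math.cos(alpha)                    # Omega = omega cos(alpha)
    theta = math.atan((A / C) * math.tan(alpha))      # tan(alpha) = (C/A) tan(theta)
    precession = C * spin / (A * math.cos(theta))     # d(psi)/dt
    return spin, theta, precession

def momentum_magnitude(A, C, omega, alpha):
    """|H| from its components along and across the symmetry axis."""
    return math.hypot(A * omega * math.sin(alpha), C * omega * math.cos(alpha))

if __name__ == "__main__":
    A, C = 2.0, 1.0                     # assumed moments, C < A (prolate ellipsoid)
    omega, alpha = 3.0, math.radians(20.0)
    spin, theta, precession = symmetric_top(A, C, omega, alpha)
    print("spin:", spin)
    print("theta (deg):", math.degrees(theta))
    print("precession:", precession, "H/A:", momentum_magnitude(A, C, omega, alpha) / A)
    print("polhode radius:", omega * math.sin(alpha))
    print("herpolhode radius:", omega * math.sin(theta - alpha))
```

For C < A the angle θ comes out larger than α, so the herpolhode radius ω sin(θ - α) is positive, as in the prolate case described above.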

In finding the inertia tensor for bodies made up of simple geometric shapes, the Parallel-Axis Theorem is useful. It says that the moment of inertia about an axis parallel to an axis through the center of gravity is equal to the moment about the axis through the center of gravity plus the total mass of the body times the square of the distance between the axes, or Iii = Iii' + md^2. For products of inertia (off-diagonal elements), the product of inertia is the product of inertia referred to the center of gravity plus the mass times the product of the coordinate differences. That is, Iij = Iij' + mxy when the product of inertia is defined as Σmxy; with the sign convention of the tensor Iij above, whose off-diagonal elements are -Σmxixj, the added term is -mxy. Tables of the moments of inertia of simple figures can be found in handbooks and in texts of engineering mechanics. The moment of inertia of a parallelepiped about an axis through the centroid perpendicular to a face is (m/12)(a^2 + b^2), where a and b are the sides of the face. The centroidal moment of inertia of a sphere is (2/5)mr^2, and of a disk relative to an axis perpendicular to its plane, like a wheel and axle, (1/2)mr^2.

The Strain and Stress Tensors and Elastic Constants

Consider a solid elastic medium that is deformed by forces applied to its boundaries, or by forces exerted directly on the material by external influences. In response to these forces, a general point originally at xi before the forces are applied moves to a point xi + ξi. The displacement ξi is not necessarily small, but its derivatives Eij = ∂iξj we presume are much less than unity. These quantities are called strains, and form a rank-2 tensor that is a function of position. It is convenient to separate Eij into symmetric and antisymmetric parts, Eij = eij + Ωij. The symmetric part eij is called the pure strain tensor, while the antisymmetric part Ωij represents the rotation due to the deformation. The vector associated with Ωij gives the rotational axis and the angle of rotation. We will be mainly interested in the pure strain, which represents the deformation of the medium.

If the medium returns to its initial state when the forces are removed, the deformation is called elastic, and all the strain energy put into the body is recovered. If the deformation is too large, the body does not return to the initial state, mechanical energy is dissipated, and the deformation is called plastic. For solid bodies, deformation is usually elastic for sufficiently small strains. We shall deal with this case exclusively here.

Since the pure strain is symmetric, it can be diagonalized. Let the diagonal elements be a, b, and c. Then, a small parallelepiped with sides dx, dy and dz and volume V = dxdydz becomes a small parallelepiped with sides (1 + a)dx, (1 + b)dy and (1 + c)dz after deformation, of volume V + dV = (1 + a + b + c)dxdydz, so that dV = (a + b + c)V. Quantities of the order of squares of strains have been neglected. The sum a + b + c = dV/V is the relative increase in volume at the point, called the dilatation. Now, a + b + c is just the trace of the strain tensor, eii = ∂iξi, which we know is invariant under rotation.

As an example, consider a body strained as shown in the figure by a shearing force. The only nonzero component of the strain tensor is E12 = a, where a is a small quantity equal to the angle of strain θ. Then the pure strain tensor has e12 = e21 = a/2, and the rotation tensor has Ω12 = -Ω21 = a/2. We note that the deformation is indeed equal to a rotation of a/2 radians about the 3-axis plus an extension along one diagonal of the 1-2 plane and an equal contraction along the other. Indeed, the pure strain tensor has eigenvalues ±a/2, with principal axes at 45° to the 1- and 2-axes. The trace is zero, so there is no dilatation. A pure strain of this sort is called pure shear.
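The separation of Eij into a pure strain and a rotation is easy to check numerically. The following sketch, with hypothetical function names, splits a strain matrix into its symmetric and antisymmetric halves and finds the principal strains in the 1-2 plane; for the shear example with E12 = a it gives e12 = Ω12 = a/2 and principal strains ±a/2.

```python
import math

def split_strain(E):
    """Return (pure strain, rotation): symmetric and antisymmetric parts of E."""
    n = len(E)
    pure = [[0.5 * (E[i][j] + E[j][i]) for j in range(n)] for i in range(n)]
    rotation = [[0.5 * (E[i][j] - E[j][i]) for j in range(n)] for i in range(n)]
    return pure, rotation

def dilatation(E):
    """Relative volume change dV/V, the trace of the strain tensor."""
    return sum(E[k][k] for k in range(len(E)))

def principal_strains_12(pure):
    """Eigenvalues of the 2x2 block of the pure strain in the 1-2 plane."""
    mean = 0.5 * (pure[0][0] + pure[1][1])
    half = math.hypot(0.5 * (pure[0][0] - pure[1][1]), pure[0][1])
    return mean - half, mean + half

if __name__ == "__main__":
    a = 0.01                                   # assumed small shear angle, radians
    E = [[0.0, a, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # only E12 = a
    pure, rotation = split_strain(E)
    print("e12 =", pure[0][1], " rotation12 =", rotation[0][1])
    print("principal strains:", principal_strains_12(pure))
    print("dilatation:", dilatation(E))
```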

Now we must consider the causes of deformation and rotation, stresses. If we imagine a plane surface in the body and that the material on one side of the plane surface is removed, the forces that it exerted on the material on the other side can be represented by forces acting on this plane surface. Let the unit normal vector to the surface be taken as the normal pointing towards the removed material, or outwards. The normal force is positive when it is a pull, or in the direction of the normal, or a tension. There may also be a force in the plane of the surface, perpendicular to the normal, called a shear force, and in any particular case the positive directions of its two components in the surface will be agreed upon. A force per unit area of surface is a stress. If a hydrostatic pressure p acts on the surface, then the normal stress is -p and the shear stresses are zero.

The stresses will, in general, depend on the direction of the normal. For any normal vector ni, we will have stresses Fi that are a function of the ni. This defines a tensor Sij such that Fi = Sijnj, called the stress tensor. That this is a satisfactory definition can be seen by considering the equilibrium of a small tetrahedron, as shown in the figure at the left. The three components of the force on the inclined face of area dS are balanced by equal and opposite forces on the other three sides. Note that the area of each side is nidS, and the force on it will be in the opposite direction to the force on the face dS. To prove this, write out the forces in the 1, 2 and 3 directions separately and show that they balance.

The stress tensor must be symmetric, or there would be unbalanced torques on small elements of the material that would cause them to rotate. In the figure, we take the moments of the forces tending to rotate the cube about the 3-axis that come from the 1-2 component of the tensor, showing that S12 = S21. The same holds for each of the other off-diagonal elements.

We now look for the connection between the stress and pure strain tensors. Since the strains are small compared to unity, we assume that Hooke's Law holds, and the stresses are linear functions of the strains. For most hard solids, Hooke's Law remains a good approximation throughout the elastic range, since the strain is still small when inelasticity begins. A systematic way to find the connection is to consider a deformation energy F defined by ∂F/∂eij = Sij. Hooke's Law then requires that S be proportional to e, or that F be a homogeneous quadratic function of the strains. The most general isotropic quadratic homogeneous function of the strains is F = (1/2)λ(ekk)^2 + μeijeij, where the constants λ and μ are called the Lamé coefficients. If we differentiate F, we find Sij = λekkδij + 2μeij, which is the relation between stress and strain for an isotropic, elastic body.
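The step from the deformation energy to the stress can be checked numerically. The sketch below evaluates F = (1/2)λ(ekk)^2 + μeijeij for an assumed strain and assumed Lamé coefficients, differentiates it by central differences (treating the nine components as independent), and compares the result with Sij = λekkδij + 2μeij. All names and numbers are illustrative.

```python
def deformation_energy(e, lam, mu):
    """F = (1/2) lam (e_kk)^2 + mu e_ij e_ij for an isotropic body."""
    trace = sum(e[k][k] for k in range(3))
    return 0.5 * lam * trace ** 2 + mu * sum(e[i][j] ** 2
                                             for i in range(3) for j in range(3))

def stress(e, lam, mu):
    """S_ij = lam e_kk delta_ij + 2 mu e_ij."""
    trace = sum(e[k][k] for k in range(3))
    return [[lam * trace * (1.0 if i == j else 0.0) + 2.0 * mu * e[i][j]
             for j in range(3)] for i in range(3)]

def stress_by_differences(e, lam, mu, h=1e-6):
    """S_ij = dF/de_ij estimated by central differences."""
    result = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            up = [row[:] for row in e]
            down = [row[:] for row in e]
            up[i][j] += h
            down[i][j] -= h
            result[i][j] = (deformation_energy(up, lam, mu)
                            - deformation_energy(down, lam, mu)) / (2.0 * h)
    return result

if __name__ == "__main__":
    lam, mu = 1.2, 0.8                       # assumed Lame coefficients
    e = [[1e-3, 2e-4, 0.0],                  # assumed symmetric pure strain
         [2e-4, -5e-4, 3e-4],
         [0.0, 3e-4, 2e-4]]
    print(stress(e, lam, mu))
    print(stress_by_differences(e, lam, mu))
```

Because F is quadratic, the central difference is exact apart from rounding, so the two printed tensors should agree closely.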

Poisson took λ = μ, as did Cauchy and others who thought there should be only one elastic constant for an isotropic medium. Their theories were based on central forces between supposed "molecules" that have little in common with our modern molecules. There was an extended debate between those who demanded more elastic constants and those who demanded fewer. In the general case (as we shall discuss below) 21 constants were required by those who favored more, and only 15 by those favoring fewer. Time and experiment have come down on the side of those who think there must be the larger number.

Elastic constants useful in specific problems can be defined in terms of λ and μ. If we assume an isotropic pressure p, then -p = λekk + 2μe(k)(k) = (λ + 2μ/3)dV/V. The bulk modulus k is defined by dV/V = -p/k, so k = λ + 2μ/3. If an axial force F is applied to a rod of length L and cross-sectional area A, the rod lengthens by ΔL = FL/AY, where Y is Young's modulus. At the same time, the rod contracts laterally, and the ratio of the lateral strain to the longitudinal strain is σ, Poisson's Ratio. Under these conditions, we find that σ = λ/(2λ + 2μ) and Y = 2(σ + 1)μ (see Exercise 5). Stability requires only that σ lie between -1 and 1/2, but for ordinary materials Poisson's ratio is between 0 and 1/2, and is usually not far from 1/4, which Poisson assumed was the case. It is 1/2 for a body with no shear strength, such as a liquid. The modulus μ itself is the ratio of a shearing stress to the shearing strain (an angle), so is called the modulus of rigidity. For more details on each of these quantities, refer to any book on Strength of Materials. A quantity called a modulus has the dimensions of a stress, it should be noted.

Steel has Y = 30 x 10^6 psi or 2.07 x 10^12 dyne/cm^2, μ = 12 x 10^6 psi or 8.27 x 10^11 dyne/cm^2, and σ = 0.27. Its bulk modulus, calculated from these figures, is k = 20 x 10^6 psi or 1.38 x 10^12 dyne/cm^2. Water, on the other hand, has μ = 0, since it lacks rigidity, and k = 3.16 x 10^5 psi or 2.18 x 10^10 dyne/cm^2. It is about 63 times more compressible than steel. As for gases, dV/V = -dp/p for an ideal gas at constant temperature, so k = p. At one atmosphere, this is 14.7 psi or 1.013 x 10^6 dyne/cm^2, so that air is about 22,000 times more compressible than water.
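The relations among the elastic constants can be exercised on the figures quoted for steel. The sketch below, with hypothetical function names, recovers λ and σ from Y and μ by inverting Y = 2(1 + σ)μ and σ = λ/(2λ + 2μ), then forms k = λ + 2μ/3. With the rounded values Y = 30 x 10^6 psi and μ = 12 x 10^6 psi this reproduces k = 20 x 10^6 psi; the same rounded values imply σ = 0.25, slightly below the quoted 0.27. One atmosphere is taken as 14.7 psi.

```python
def from_lame(lam, mu):
    """Bulk modulus, Young's modulus and Poisson's ratio from lam and mu."""
    k = lam + 2.0 * mu / 3.0
    sigma = lam / (2.0 * (lam + mu))
    young = 2.0 * (1.0 + sigma) * mu
    return k, young, sigma

def lame_from_young_and_rigidity(young, mu):
    """Invert Y = 2(1 + sigma) mu and sigma = lam/(2 lam + 2 mu) for lam."""
    sigma = young / (2.0 * mu) - 1.0
    lam = 2.0 * mu * sigma / (1.0 - 2.0 * sigma)
    return lam, sigma

if __name__ == "__main__":
    Y_STEEL, MU_STEEL = 30e6, 12e6      # psi, the rounded figures quoted above
    K_WATER = 3.16e5                    # psi, as quoted above
    K_AIR = 14.7                        # psi, one atmosphere (isothermal ideal gas)
    lam, sigma = lame_from_young_and_rigidity(Y_STEEL, MU_STEEL)
    k_steel, young, _ = from_lame(lam, MU_STEEL)
    print("lambda =", lam, "sigma =", sigma)
    print("k (steel) =", k_steel, "Y recovered =", young)
    print("water/steel compressibility:", k_steel / K_WATER)
    print("air/water compressibility:", K_WATER / K_AIR)
```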

Rubber is a curious material with σ ≈ 0.5, which implies that it has very low rigidity, a kind of "solid fluid." For rubber, Y = 100 to 600 psi, and μ = 30 to 200 psi, so its rigidity is indeed small. Glass is on the opposite pole, with Y = 10 x 10^6 psi and μ = 5 x 10^6 psi. Its Poisson's ratio is low, 0.20 to 0.27, which implies that it is quite rigid compared to its longitudinal elasticity.

A medium whose properties depend on direction is often called anisotropic, but a less barbarous word is aelotropic (Greek: change-turning). The above expression for F suggests the generalization to an aelotropic medium by setting F = (1/2)λijkleijekl, which gives Sij = λijklekl. The rank-4 tensor λijkl is called the elastic constant tensor, an operator connecting two rank-2 tensors just as a rank-2 tensor is an operator connecting two vectors. In general, it has 81 components, but there is a great deal of symmetry here, since the energy is unchanged if i,j or k,l are interchanged (the strain tensors are symmetric), or if i,k and j,l are simultaneously interchanged. As a result, there are only 21 independent components at worst. It is confusing to work this out, but all 81 components can be listed, and equal ones sorted out, as a last resort.

The symmetry of crystals will introduce more relations between the constants, reducing their number. Even in a triclinic crystal, the least symmetric, the possibility of choosing convenient coordinates reduces the number to 18. A monoclinic crystal has 12, an orthorhombic crystal 9. A tetragonal crystal has 6, as does a trigonal (rhombohedral) crystal like calcite or quartz. A hexagonal crystal has 5, and a cubic crystal 3. Cubic crystals are often considered to be isotropic in many respects, but they have an extra elastic constant that truly isotropic materials do not have. The constant corresponding to λ splits into two, one the coefficient of the squares of the diagonal elements of e, the other the coefficient of the cross products. This reduction in the number of constants is found by applying the symmetry elements of the crystals, which turns some constants into their negatives, so they must vanish. This is an interesting point, but we cannot go into it further at this time.

An interesting application of this theory is to the propagation of mechanical waves in solids, which has application to seismology, nondestructive testing, and other fields. Such waves are almost always elastic, and have an interesting variety of properties that have been intensively studied. The theory is generally more complicated than that of electromagnetic waves, and includes the subject of surface waves (Rayleigh and Love waves), as well as waves in an infinite medium.

The Dielectric Tensor and Crystal Optics

The relation between the electric displacement D (statcoulombs/cm2) and the electric intensity E (statvolts/cm), considered in the first approximation as a proportionality, will in general be given by Di = εijEj, where the dielectric tensor satisfies εij = εji. The requirement of symmetry comes from several sources, one of which is simply that the tensor should be diagonalizable by an ordinary rotation, which establishes the three orthogonal principal axes of polarization, and the three principal dielectric constants, its eigenvalues. This yields D(k) = ε(k)E(k), where k = 1, 2 and 3. An electric intensity in a principal direction gives a displacement in the same direction. If all three directions are equivalent in the medium, then the three constants are equal: ε(k) = ε, and an electric intensity in any direction creates a displacement in the same direction; the medium is then isotropic.

In crystals belonging to the cubic or isometric system, three equivalent axes can be found, so the medium is isotropic. In the trigonal (rhombohedral), tetragonal and hexagonal systems, two equivalent axes perpendicular to the third can be found. If the unique axis is the 3-axis, then ε(1) = ε(2), while ε(3) is different. In the orthorhombic, monoclinic and triclinic systems, all three principal dielectric constants may differ. In any case, dielectric constants may accidentally be closely the same, so the medium will resemble a more symmetric one. Noncrystalline materials, such as polymer sheets, may be anisotropic as well. Nonisotropic transparent materials may show double refraction, discovered in 1669 by Bartholinus in Iceland spar, or transparent calcite, CaCO3, and first explained by Huygens not long afterwards. Such materials are also called birefringent.

We must first discuss the propagation of plane electromagnetic waves in an anisotropic medium. We shall assume throughout that B = H, or μ = 1, which is true in transparent crystals. Assume that the wave depends on space and time through a factor exp[iω(t - nr·s/c)], where ω is the angular frequency, v = c/n is the phase velocity, r the position vector and s is the unit vector normal to the wavefront. Then we can replace time derivatives by iω and del by -iωns/c in Maxwell's equations. We find: s·D = 0, s·H = 0, ns x H = -D, and ns x E = H. All the terms in these expressions are constant vector amplitudes.

The spatial relations between the vectors used to describe the wave are shown in the diagram at the right. The wavefront is the plane of H and D, with s the wavefront normal. E lies in the plane of s and D making an angle α with D. Since browsers do not yet support the perpendicular symbol, the electric intensity along D will be represented in the text by E+. The Poynting vector S is in the direction of t, the ray unit vector. This gives the direction of energy flow, which is not normal to the wavefront. The phase velocity is vs, with v = c/n, and the ray velocity is vrt, with vr = c/nr. The electric energy density we = E·D/8π = (n/8π)H·(s x E) and the magnetic energy density wm = H·B/8π is given by exactly the same expression, so the total field energy w = (n/c)s·S, where S = (c/4π)(E x H) is the Poynting vector. Now, vr = S/w, so v = vrs·t = vr cos α. The indices of refraction are then related by nr = n cos α.

The magnetic field can be eliminated between the Maxwell curl equations with the important result that D = n^2[E - s(s·E)] = n^2 E+. This is a relation between E and D that must be satisfied by the wave, and is completely independent of the relation between them given by the dielectric tensor. Both relations must be satisfied in the wave. Since E+ = (E·D)D/D^2, this relation can also be written as n^2 = D^2/(E·D). Similarly, it can be shown that nr^2 = (D·E)/E^2.

Applying the diagonalized dielectric tensor (which means everything must be referred to principal axes) we find εkEk = n^2[Ek - sk(s·E)]. The k here is not a tensor index, merely an identifier of one of the three components. Written out fully, we find three homogeneous equations for the field components Ek. Such a system can have non-zero solutions only if the determinant of the coefficients vanishes. This condition will determine the value of n, or the phase velocity, for which both Maxwell's equations and the dielectric relation are satisfied. Fresnel discovered a cunning way to do this that yields a symmetrical equation for v in terms of the components of the wave normal vector.

Rearrange the equation to Ek = [n^2 sk/(n^2 - εk)](s·E). This is valid if the denominator does not vanish. Now multiply by sk and add the equations for k = 1, 2 and 3. The dot product cancels, and we find a sum of three similar terms that adds to 1. Now the sum of the squares of sk is also 1, so subtract this from both sides and combine the terms with the same sk's. Now change from n's and ε's to the velocities, using n = c/v and the principal velocities vk = c/√εk. The final result is s1^2/(v^2 - v1^2) + s2^2/(v^2 - v2^2) + s3^2/(v^2 - v3^2) = 0. This is Fresnel's equation of wave normals. To use it, multiply by the denominators to clear of fractions. The result is a quadratic equation in v^2, which gives two values for any value of s. The squares mean that the wave can travel in either direction. We will see that each pair of solutions for the same s gives us two plane-polarized waves polarized at right angles to each other which, in general, travel at different velocities.
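Clearing Fresnel's equation of wave normals of fractions gives a quadratic in u = v^2, which the following sketch solves directly. The function name is hypothetical, and the principal velocities in the example are arbitrary numbers in units of c, not data for any real crystal.

```python
import math

def phase_velocities(s, v_principal):
    """Two phase velocities for wave normal s from Fresnel's equation.

    Multiplying s1^2/(v^2 - v1^2) + s2^2/(v^2 - v2^2) + s3^2/(v^2 - v3^2) = 0
    by the denominators gives u^2 - B u + C = 0 with u = v^2, for unit s.
    """
    norm = math.sqrt(sum(x * x for x in s))
    s1, s2, s3 = ((x / norm) ** 2 for x in s)       # squared direction cosines
    a, b, c = (v * v for v in v_principal)          # squared principal velocities
    B = s1 * (b + c) + s2 * (a + c) + s3 * (a + b)
    C = s1 * b * c + s2 * a * c + s3 * a * b
    root = math.sqrt(max(B * B - 4.0 * C, 0.0))     # guard against rounding below zero
    return math.sqrt(0.5 * (B - root)), math.sqrt(0.5 * (B + root))

if __name__ == "__main__":
    principal = (1.0, 2.0, 3.0)                     # assumed v1, v2, v3
    print(phase_velocities((1.0, 0.0, 0.0), principal))   # wave along the 1-axis
    print(phase_velocities((1.0, 1.0, 1.0), principal))   # a general direction
```

For a wave travelling along a principal axis the two roots are the principal velocities belonging to the other two axes, which is a convenient check on the signs.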

A similar process in terms of t instead of s yields Fresnel's ray equation, which gives us the ray velocity vr in terms of the ray unit vector. This can be found in Born and Wolf. What is more interesting is to find t in terms of s and the phase velocity. The result is tk = sk{[v^2 + g^2/(v^2 - vk^2)]/√(v^4 + g^2)}, where g^2 = v^2(vr^2 - v^2) = {[s1/(v1^2 - v^2)]^2 + [s2/(v2^2 - v^2)]^2 + [s3/(v3^2 - v^2)]^2}^(-1). From these formulas, we can find the ray vector and the ray (energy) velocity. When the phase and ray velocities are known, then the angle α can be found by a formula given above.

Now we can proceed to consider a geometric construction to find the directions of polarization of a wave travelling in an arbitrary direction in an anisotropic medium. The field energy w is given by 4πw = E·D = D1^2/ε1 + D2^2/ε2 + D3^2/ε3. Taking new variables xk = Dk/√(4πw), we have the equation x1^2/ε1 + x2^2/ε2 + x3^2/ε3 = 1, which is the equation of an ellipsoid whose semiaxes are the square roots of the principal dielectric constants, or the indices of refraction for waves polarized along a principal axis. This ellipsoid has been called the optical indicatrix or the index ellipsoid. Consider the intersection of the plane x1s1 + x2s2 + x3s3 = 0 normal to a wave vector s with the ellipsoid. This intersection will be an ellipse. The axes of this ellipse are the directions of polarization of the two waves with the given wave vector.

It should be clear that once we assume a direction for the wave, the polarization in the perpendicular plane must be carefully chosen so that E and D are coplanar with s, which in general they will not be. This happens for two polarizations that differ by 90°, and the two polarizations will travel with different velocities. This allows the two polarizations to be separated by refraction, as in a Nicol prism.

This is proved by finding the extrema of the vector from the origin to the ellipse, and showing that the result is the same condition that we found above from the combination of Maxwell's equations and the dielectric tensor. Any direction s for which the intersection is a circle is a direction along which a wave may have any polarization, and all polarizations travel at the same velocity. Such a direction is called an optic axis of the crystal. For an ellipsoid with axes all of different lengths, there will be two such directions symmetrically located in the plane of the two axes with the largest and smallest indices on either side of the axis of greatest index. Such media are called biaxial.
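The construction can also be carried out numerically. The sketch below restricts the quadratic form Σxk^2/εk to the plane normal to s, using two orthonormal vectors in that plane, and finds the extreme radii of the resulting ellipse, which are the two indices of refraction for that wave normal. The function names are hypothetical, and the calcite indices used in the example (about 1.658 and 1.486 for sodium light) are approximate values assumed for illustration.

```python
import math

def cross(a, b):
    """Vector product a x b of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def unit(a):
    """Vector a scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in a))
    return [x / norm for x in a]

def section_indices(s, eps):
    """Semiaxes of the section of the index ellipsoid by the plane normal to s.

    eps holds the three principal dielectric constants.
    Returns (smaller index, larger index).
    """
    s = unit(s)
    k = min(range(3), key=lambda i: abs(s[i]))      # coordinate axis least aligned with s
    helper = [1.0 if i == k else 0.0 for i in range(3)]
    u = unit(cross(s, helper))                      # first unit vector in the plane
    w = cross(s, u)                                 # second, orthogonal to s and u

    def form(a, b):
        return sum(a[i] * b[i] / eps[i] for i in range(3))

    m11, m12, m22 = form(u, u), form(u, w), form(w, w)
    mean = 0.5 * (m11 + m22)
    half = math.hypot(0.5 * (m11 - m22), m12)
    # On the ellipse, 1/r^2 ranges between mean - half and mean + half.
    return 1.0 / math.sqrt(mean + half), 1.0 / math.sqrt(mean - half)

if __name__ == "__main__":
    n_o, n_e = 1.658, 1.486                         # assumed approximate calcite indices
    eps = (n_o ** 2, n_o ** 2, n_e ** 2)            # optic axis along the 3-axis
    theta = math.radians(40.0)
    print(section_indices((math.sin(theta), 0.0, math.cos(theta)), eps))
    print(section_indices((0.0, 0.0, 1.0), eps))    # along the optic axis: a circle
```

When s lies along an optic axis the two returned indices coincide, which is the circular section described above.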

If two indices are equal, then a plane perpendicular to the axis of the third index cuts the ellipsoid, which is now a spheroid, in a circle. Therefore, this axis is the optic axis, which is the only one. Such media are called uniaxial. If the third index is greater than the other two, the medium is called positive, and negative otherwise. The two equal indices, or velocities, are called ordinary and denoted by a subscript "o" while the third is called extraordinary and denoted by a subscript "e." If we write Fresnel's equation of wave normals for this case, for a wave travelling in a direction making an angle θ with the optic axis, we easily find the two solutions v^2 = vo^2 and v^2 = vo^2 cos^2 θ + ve^2 sin^2 θ. This gives two surfaces, one a sphere of radius vo for the "ordinary" wave, and one a surface of fourth order, an ovoid, for the "extraordinary" wave. For a positive uniaxial medium, the ovoid is inside the sphere, while for a negative uniaxial medium, it is outside, as shown in the diagram. These figures are not the index ellipsoid.

The distance between the ordinary and extraordinary wave normal surfaces is greatly exaggerated in the diagrams, as can be seen from the actual figures for calcite and quartz that are given. The extraordinary velocity surface is drawn as an ellipse for convenience, but it is not an ellipse, being blunter at the ends of the major and minor axes than an ellipse. Huygens took it for an ellipse, which indeed it closely resembles, and elementary optics texts have followed him in this. This is an error, although it gives the ray direction quite well using Huygens' construction. This is reminiscent of the Bohr atom, in which erroneous concepts give the correct result. In quartz, the two surfaces do not exactly meet on the optic axis, and this is the source of optical activity in quartz, but the effect can be neglected here. The optic axis of a cleavage rhomb of calcite passes through a blunt corner, making equal angles with the three edges and with the three faces that meet there. If you look at a dot through the crystal, two images will be seen. When you rotate the crystal, the ordinary image will remain fixed while the extraordinary image rotates around it. The reason is that the ray velocity of the extraordinary wave is not normal to the wavefronts, but is inclined to the wave normal, lying in the plane containing the wave normal and the optic axis, as shown in the diagram. The ordinary wave is polarized at right angles to the optic axis, while the extraordinary wave is polarized in the plane containing the optic axis, a fact easily checked with a Polaroid filter.

Exercises


1. Diagonalize the symmetric matrix with rows: 6 0 0; 0 34 12; 0 12 41, finding the eigenvalues and the eigenvectors. Sketch the associated ellipsoid. (Leigh Page) Answers: the eigenvalues are 6, 50, and 25. The corresponding eigenvectors are (1,0,0), (0,3/5,4/5), (0,4/5,-3/5).
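Because the first row and column of this matrix contain only the diagonal entry, the x-axis is already a principal axis, and only the lower 2×2 block needs work. The following is a minimal sketch of that step, with a hypothetical helper name, using the quadratic formula for the eigenvalues of a symmetric 2×2 block rather than a library routine.

```python
import math

def eig_sym_2x2(a, b, d):
    """Eigenvalues and unit eigenvectors of the symmetric block
    [[a, b], [b, d]], assuming b != 0."""
    mean = (a + d) / 2.0
    half_gap = math.hypot((a - d) / 2.0, b)
    pairs = []
    for lam in (mean + half_gap, mean - half_gap):
        # First row of (M - lam*1) v = 0:  (a - lam) y + b z = 0,
        # which is satisfied by (y, z) proportional to (b, lam - a).
        y, z = b, lam - a
        norm = math.hypot(y, z)
        pairs.append((lam, (y / norm, z / norm)))
    return pairs

# Lower-right block of the matrix in Exercise 1.
for lam, (y, z) in eig_sym_2x2(34.0, 12.0, 41.0):
    print(f"eigenvalue {lam:g}, eigenvector (0, {y:g}, {z:g})")
```

The two eigenvectors returned are orthogonal, as they must be for a symmetric matrix with distinct eigenvalues. That is a quick check on any hand calculation.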

2. Find the inertia tensor for two masses of 100 g that are 10 cm from the x-axis, but on opposite sides and 10 cm apart vertically. Find the angular momentum when the masses rotate at 600 rpm about the x-axis, and the resulting moment. What rotating torque must be supplied? Check by analyzing the problem as two point masses. Find the principal axes by inspection. Answers: I has rows 20,000 -10,000 0; -10,000 5000 0; 0 0 25,000, in g-cm$^2$. Eigenvalues 25,000, 0, 25,000. H1 = 20,000ω, H2 = -10,000ω, H3 = 0. Torque of magnitude 10,000$\omega^2$ dyne-cm normal to plane of masses. Principal axes: line joining the masses, line normal to this, line at right angles to these two axes.
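The point-mass check can be sketched as follows. The coordinates are an assumption, since the diagram is not reproduced here: the masses are placed at (5, 10, 0) and (-5, -10, 0) cm, which reproduces the inertia tensor quoted in the answer. The helper names are hypothetical. The torque is taken as ω × H, the rate of change of an angular momentum vector that is carried around with the body at constant ω.

```python
import math

def inertia_tensor(masses, positions):
    """I_ij = sum over masses of m (r^2 delta_ij - x_i x_j)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                I[i][j] += m * ((r2 if i == j else 0.0) - r[i] * r[j])
    return I

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

masses = [100.0, 100.0]                              # g
positions = [(5.0, 10.0, 0.0), (-5.0, -10.0, 0.0)]   # cm, assumed layout

I = inertia_tensor(masses, positions)                # g cm^2
omega = [600.0 * 2.0 * math.pi / 60.0, 0.0, 0.0]     # 600 rpm about x, in rad/s
H = mat_vec(I, omega)                                # angular momentum, g cm^2/s
torque = cross(omega, H)                             # dyne cm

for row in I:
    print(row)
print("H =", H)
print("torque =", torque)
```

With this layout the only nonzero torque component is along z, normal to the plane containing the masses and the axis of rotation.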

3. The body shown in the diagram is constructed from three identical cubes of side a and mass m. C is the center of mass of the body. Find the inertia tensor with the axes shown, using the parallel-axis theorem. Diagonalize the tensor, and find the principal moments of inertia and the principal axes. The problem can be solved using rational numbers. The principal axes can be determined by inspection, which will be a check on your solution. Answers: Principal moments of inertia 3/2, 5/6 and 1/2, in units of $ma^2$. Principal axes: rotate x,z 90° about the y-axis.

4. Find the angle between the wave normal and the ray for light perpendicularly incident on a cleavage face of a calcite rhomb. The optic axis passes through a blunt corner where three faces meet with face angles of 101° 55'. From this information, find the inclination θ of the wave normal s to the optic axis. Find the phase and ray velocities for this case. The indices of refraction are $n_o$ = 1.6584, $n_e$ = 1.4864. Answers: θ = 44.610°, v = 0.63836c, $v_r$ = 0.64215c, α = 6.228°.
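A numerical check of this exercise might look like the sketch below. It rests on several assumptions that are not spelled out in the exercise. The three edges at the blunt corner are taken to be symmetric about the optic axis, the wave normal is taken along the face normal, and the ray angle is taken from the standard uniaxial results tan α = (ve² - vo²) sin θ cos θ / v² and vr = v / cos α. The function names are hypothetical.

```python
import math

N_O, N_E = 1.6584, 1.4864            # calcite indices from the exercise
FACE_ANGLE_DEG = 101.0 + 55.0 / 60.0  # 101 deg 55 min

def axis_to_face_normal_angle(face_angle_deg):
    """Angle (degrees) between the optic axis and the normal to a face,
    for three edges at mutual angle face_angle_deg symmetric about the axis."""
    cos_g = math.cos(math.radians(face_angle_deg))
    # phi is the angle between the axis and each edge:
    # cos(gamma) = cos^2(phi) + sin^2(phi) cos(120 deg) = 1 - 1.5 sin^2(phi)
    sin2_phi = (1.0 - cos_g) / 1.5
    cos2_phi = 1.0 - sin2_phi
    # The face normal is the cross product of two adjacent edges.
    cos_theta = math.sqrt((sin2_phi / 4.0) / (cos2_phi + sin2_phi / 4.0))
    return math.degrees(math.acos(cos_theta))

def extraordinary_wave(theta_deg, n_o, n_e):
    """Phase velocity v, ray velocity v_r (units of c) and the angle alpha
    (degrees) between ray and wave normal for the extraordinary wave."""
    theta = math.radians(theta_deg)
    vo2, ve2 = 1.0 / n_o ** 2, 1.0 / n_e ** 2
    s, c = math.sin(theta), math.cos(theta)
    v2 = vo2 * c * c + ve2 * s * s
    alpha = math.atan((ve2 - vo2) * s * c / v2)
    v = math.sqrt(v2)
    return v, v / math.cos(alpha), math.degrees(alpha)

theta_deg = axis_to_face_normal_angle(FACE_ANGLE_DEG)
v, v_r, alpha_deg = extraordinary_wave(theta_deg, N_O, N_E)
print(f"theta = {theta_deg:.3f} deg, v = {v:.5f} c, "
      f"v_r = {v_r:.5f} c, alpha = {alpha_deg:.3f} deg")
```

Under these assumptions the results agree with the quoted answers to about three significant figures or better. A small difference in the last digit of α is to be expected if it is obtained instead from the rounded values of v and the ray velocity.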

5. From the equation giving the stress tensor in terms of the strain tensor, k and μ, find the relation between the traces of the strain and stress tensors, and from this the equation giving the strain tensor in terms of the stress tensor. Use this equation to find the strains when only $\sigma_{11}$ is nonzero. This is the case of uniaxial homogeneous stress in a uniform rod, when both stress and strain tensors are diagonal. Find expressions for Young's modulus, $Y = \sigma_{11}/e_{11}$, and Poisson's ratio, $\sigma = -e_{22}/e_{11} = -e_{33}/e_{11}$, in terms of k and μ.

References


L. Page, Introduction to Theoretical Physics, 3rd ed. (New York: D. Van Nostrand, 1952). Chapters II and III.

M. Born and E. Wolf, Principles of Optics (London: Pergamon Press, 1959). Chapter XIV.

A. E. H. Love, A Treatise on the Mathematical Theory of Elasticity, 4th ed. (New York: Dover, 1944; reprint of the 1927 edition by Cambridge University Press). Contains a good Historical Introduction.

L. D. Landau and E. M. Lifshitz, Theory of Elasticity (London: Pergamon Press, 1959). An excellent concise treatment, using index notation.


Composed by J. B. Calvert
Created 13 October 2002
Last revised 2 June 2009