The ellipsoid as a visualization aid for rank-2 symmetric Euclidean tensors, with examples in mechanics, elasticity and optics

- Tensors and Ellipsoids
- The Inertia Tensor and Rigid Body Motion
- The Strain and Stress Tensors and Elastic Constants
- The Dielectric Tensor and Crystal Optics
- Exercises
- References

An ellipsoid is a quadric surface described in canonical form by the equation ax^{2} + by^{2} + cz^{2} = 1. The lengths of the semiaxes of the ellipsoid are 1/a^{1/2}, 1/b^{1/2} and 1/c^{1/2} along the x, y and z axes, respectively. If b = c, the ellipsoid degenerates into a spheroid, and if a = b = c, into a sphere. The center of the ellipsoid is at the origin.

If the axes are rotated to a general position, then the equation for the ellipsoid can be written ax^{2} + by^{2} + cz^{2} + dyz + ezx + fxy = 1, where a, b and c are new constants, as are d, e and f. There are no linear terms because we keep the origin at the center of the ellipsoid; displacing the center would not add anything essential. There are six constants in this expression, and they specify the shape and size of the ellipsoid and the directions of its axes, which remain orthogonal. Any section of an ellipsoid by a plane is an ellipse, notably the principal sections that contain its axes.

A rank-2 symmetrical tensor Φ_{ij} has six independent components, three diagonal and three off-diagonal, the latter occurring in symmetric pairs Φ_{ij} = Φ_{ji}. The equation x_{i}Φ_{ij}x_{j} = 1 is the equation of an ellipsoid associated with the tensor. We are using index notation, which is explained in Euclidean Tensors. A radius vector **x** is expressed as its components x_{i}, i = 1, 2, 3 instead of x, y, z.
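The correspondence between the six quadric constants and the six tensor components can be made concrete with a short sketch. The function names here are illustrative, not from the text; the only point is that each cross-term coefficient is split equally between the two symmetric off-diagonal entries.

```python
def quadric_to_tensor(a, b, c, d, e, f):
    """Symmetric Phi such that x_i Phi_ij x_j = ax^2 + by^2 + cz^2 + dyz + ezx + fxy."""
    # Each cross-term coefficient is shared by a symmetric pair of entries.
    return [[a,     f / 2, e / 2],
            [f / 2, b,     d / 2],
            [e / 2, d / 2, c]]


def quadratic_form(phi, x):
    """Contract x_i Phi_ij x_j by summing over both repeated indices."""
    return sum(x[i] * phi[i][j] * x[j] for i in range(3) for j in range(3))


# Arbitrary illustrative coefficients and an arbitrary test point.
phi = quadric_to_tensor(2.0, 3.0, 4.0, 1.0, 0.5, -1.0)
x, y, z = 0.3, -0.2, 0.5
direct = 2.0*x*x + 3.0*y*y + 4.0*z*z + 1.0*y*z + 0.5*z*x - 1.0*x*y
contracted = quadratic_form(phi, (x, y, z))
```

The two numbers `direct` and `contracted` agree to rounding error, which is all the index expression x_{i}Φ_{ij}x_{j} claims.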

It is possible to find the rotation that transforms a symmetric rank-2 tensor into the *diagonal* form Φ_{ij} = a^{(i)}δ_{ij}. This will also transform the ellipsoid into canonical form and give the lengths of the semiaxes. To do this, let n_{i} be a unit vector (the components are direction cosines) and form the equation Φ_{ij}n_{j} = λn_{i}, where λ is some constant, which we shall have to determine. This expresses the condition that the tensor Φ "operating" on n_{i} gives a vector in the same direction, but of possibly different magnitude. It is the fundamental duty of rank-2 tensors to operate in this way on one vector (by this is meant multiplication followed by contraction) to give another vector. The expression is really three simultaneous equations for the three direction cosines n_{i}.

Therefore, we have the three equations (Φ_{ij} - λδ_{ij})n_{j} = 0 for the three values of i. These are *homogeneous* equations, and have only the trivial solution n_{i} = 0 unless the determinant of the coefficients vanishes. If it does, we can solve for the ratios of two of the direction cosines to the remaining one, and then determine them all by the condition that the sum of their squares must be unity. The constant λ must therefore be chosen so that |Φ_{ij} - λδ_{ij}| = 0. Since this equation is of third degree in λ, there are three roots. If the three roots are equal, then the tensor is already diagonal, and is the constant tensor λδ_{ij}. The corresponding ellipsoid is just a sphere. If two roots are equal, then the remaining root gives the direction of an axis of symmetry, and the orientation of the axes perpendicular to this axis can be chosen any way one likes. The ellipsoid is then a spheroid, and in principal axes Φ_{11} = λ^{(1)}, Φ_{22} = Φ_{33} = λ^{(2)} = λ^{(3)}. Finally, in the general case all three roots are unequal, and we have an ellipsoid with unequal axes, and the diagonal elements of Φ are the three values of λ.

If the three values of λ are unequal, the resulting *principal axes* will all be *orthogonal*, that is, at right angles to each other. If two values are equal, then the axis belonging to the distinct root will be orthogonal to the plane of the other two axes, which can be chosen orthogonal to each other within that plane. The values of λ are called *eigenvalues* ("own values") and the unit vectors belonging to them are called *eigenvectors*. Working with principal axes simplifies the algebra greatly, and we usually assume that any symmetrical rank-2 tensor has been *diagonalized* in the way we have discussed. All of these matters are discussed at length in texts on linear algebra.
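As a rough numerical illustration of the eigenvalue problem, the sketch below finds the three roots of the cubic for a symmetric 3x3 tensor and converts them to semiaxes 1/√λ. It uses the well-known closed-form trigonometric solution of the cubic rather than any method from the text, and it assumes all eigenvalues are positive so that the surface really is an ellipsoid. The example tensor is arbitrary.

```python
import math


def eigenvalues_sym3(phi):
    """Eigenvalues of a real symmetric 3x3 matrix, largest first.

    Uses the trigonometric solution of the characteristic cubic.
    """
    p1 = phi[0][1] ** 2 + phi[0][2] ** 2 + phi[1][2] ** 2
    if p1 == 0.0:
        # Already diagonal: the diagonal elements are the eigenvalues.
        return sorted((phi[0][0], phi[1][1], phi[2][2]), reverse=True)
    q = (phi[0][0] + phi[1][1] + phi[2][2]) / 3.0          # mean eigenvalue
    p2 = sum((phi[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    # B = (Phi - q*identity) / p has trace 0 and eigenvalues in [-2, 2].
    b = [[(phi[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))                    # guard rounding
    angle = math.acos(r) / 3.0
    lam1 = q + 2.0 * p * math.cos(angle)
    lam3 = q + 2.0 * p * math.cos(angle + 2.0 * math.pi / 3.0)
    lam2 = 3.0 * q - lam1 - lam3                            # trace is invariant
    return [lam1, lam2, lam3]


phi = [[2.0, 1.0, 0.0],
       [1.0, 2.0, 0.0],
       [0.0, 0.0, 5.0]]
lams = eigenvalues_sym3(phi)
semiaxes = [1.0 / math.sqrt(lam) for lam in lams]
```

For this tensor the roots are 5, 3 and 1, so the longest semiaxis (belonging to the smallest eigenvalue) has unit length.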

We now have enough theoretical background to consider the most important rank-2 tensors met with in classical physics. We'll discuss the inertia tensor, the strain tensor, the stress tensor, and the dielectric tensor in this article. Not in detail, of course, but enough to show how the tensors are defined and used, making their application easier. See the Exercises for practice in tensor diagonalization. There are good numerical methods for the diagonalization of matrices, and the basic theorems are proved in any quantum mechanics text.

The inertia tensor in the dynamics of rigid bodies is an excellent example of a rank-2 tensor where the associated ellipsoid aids in the visualization of the motion. We shall start from first principles, using index notation, and find the motion of a free rigid body--that is, a body under the action of no external forces. Any rigid body can be considered as an assembly of mass elements m with fixed positions x_{i} relative to an origin O, in a system of coordinates fixed in the body. The origin O can be a point fixed both in space and in the body, or the center of mass of the body. In either case, the motion separates into translational and rotational parts. We are not concerned with the translational part here, and will consider only the rotation. The most general motion of the body is then a rotation with angular velocity ω_{i}. The direction of the vector specifies the direction of the axis of rotation. The angular velocity may change in magnitude and direction with time, in a quite general manner. The body coordinates rotate in the same way, and so do not form an inertial system.

The rate of change of any vector quantity v_{i} in a fixed coordinate system with origin O is [dv_{i}/dt]_{fix} = [dv_{i}/dt]_{rot} + ε_{ijk}ω_{j}v_{k}. This is merely the relation that in vector notation is [d**v**/dt]_{fix} = [d**v**/dt]_{rot} + **ω** x **v**, which is illustrated in the diagram and applies to any vector, not just a radius vector. Accordingly, the velocity of an element of mass m in the inertial system is v_{i} = ε_{ijk}ω_{j}x_{k}, since it is fixed in the rotating system. Summing over all mass elements, the angular momentum is H_{i} = Σm ε_{ijk}x_{j} ε_{klm}ω_{l}x_{m}. Using the value of the contraction of two antisymmetric tensor densities, ε_{ijk}ε_{klm} = δ_{il}δ_{jm} - δ_{im}δ_{jl}, we find that H_{i} = [Σm(x_{k}x_{k}δ_{ij} - x_{i}x_{j})] ω_{j}. The rank-2 symmetric tensor multiplying ω_{j} is the *inertia tensor* I_{ij} of the body. The diagonal elements are called *moments of inertia*, and the off-diagonal elements *products of inertia*. For a continuous body, they may be found by integration, of course. We note that the angular momentum is not necessarily in the direction of the angular velocity.
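A minimal sketch of the inertia tensor for a handful of point masses may help here. The two-mass "dumbbell" below is an arbitrary example chosen so that the off-diagonal elements do not vanish, and the function names are illustrative only.

```python
def inertia_tensor(masses, positions):
    """I_ij = sum over elements of m (x_k x_k delta_ij - x_i x_j)."""
    inertia = [[0.0] * 3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c * c for c in x)                      # x_k x_k
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                inertia[i][j] += m * (r2 * delta - x[i] * x[j])
    return inertia


def angular_momentum(inertia, omega):
    """H_i = I_ij omega_j."""
    return [sum(inertia[i][j] * omega[j] for j in range(3)) for i in range(3)]


# Two unit masses on a diagonal of the 1-2 plane (illustrative values).
masses = [1.0, 1.0]
positions = [(1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]
inertia = inertia_tensor(masses, positions)
H = angular_momentum(inertia, (1.0, 0.0, 0.0))          # spin about the 1-axis
```

Here the tensor has I_{11} = I_{22} = 2, I_{33} = 4 and I_{12} = I_{21} = -2, so a rotation about the 1-axis gives **H** = (2, -2, 0), which is not parallel to the angular velocity.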

Now we find the kinetic energy T of the body by adding up the kinetic energies of each element of mass: 2T = Σm v_{k}v_{k}. Using the expression for the space velocity, we again get a contraction of two antisymmetric tensor densities. In fact, 2T = Σm ε_{ijk}ω_{j}x_{k} ε_{irs}ω_{r}x_{s} = I_{jr}ω_{j}ω_{r}. This work is a good example of how much easier it is to use the index notation than vector notation in these things.
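The identity 2T = I_{jr}ω_{j}ω_{r} is easy to check numerically. The sketch below compares the element-by-element sum Σm|**ω** x **x**|^{2} with the tensor contraction for an arbitrary set of masses, positions and angular velocity (all values are made up for illustration).

```python
def cross(a, b):
    """Vector product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.5), (-0.5, 1.0, 0.0), (0.0, -1.0, 2.0)]
omega = (0.3, -1.2, 0.7)

# Direct sum over mass elements, each moving with v = omega x x.
two_T_direct = 0.0
for m, x in zip(masses, positions):
    v = cross(omega, x)
    two_T_direct += m * sum(c * c for c in v)

# Same quantity from the inertia tensor I_jr = sum m (x_k x_k delta_jr - x_j x_r).
inertia = [[0.0] * 3 for _ in range(3)]
for m, x in zip(masses, positions):
    r2 = sum(c * c for c in x)
    for j in range(3):
        for r in range(3):
            inertia[j][r] += m * ((r2 if j == r else 0.0) - x[j] * x[r])

two_T_tensor = sum(inertia[j][r] * omega[j] * omega[r]
                   for j in range(3) for r in range(3))
```

The two results agree to rounding error for any choice of inputs.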

Newton's Second Law can be applied in the fixed system to each of the elements of mass, and we find that the net torque L_{i} = Σε_{ijk}x_{j}F_{k} is the rate of change of angular momentum H_{i} in the fixed system. Applying the rule for the rate of change of a vector to H_{i} = I_{ij}ω_{j}, and noting that I_{ij} is constant in the body axes, we have L_{i} = I_{ij}(dω_{j}/dt) + ε_{ijk}ω_{j}I_{kl}ω_{l}. These are three differential equations for the components of the angular velocity of the body relative to the body axes.

It is always possible to diagonalize the inertia tensor by a suitable choice of the orientation of the body axes. Let us assume this has been done, and the principal moments of inertia are A^{(i)}. The index is in parentheses because it is not a tensor index, merely an identifying label. For ease of writing, we may also represent these quantities by A, B and C. Then, I_{ij} = A^{(i)}δ_{ij}, and the equations of motion of the preceding paragraph become L_{i} = A^{(i)}(dω_{i}/dt) + ε_{ijk}A^{(k)}ω_{j}ω_{k}. These equations are known as *Euler's equations*.
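Euler's equations with zero torque can be integrated numerically to watch the invariants at work. The following is only a sketch: it uses a standard fourth-order Runge-Kutta step (a choice made here, not prescribed by the text) with arbitrary principal moments and initial angular velocity, and checks that 2T = ΣA^{(i)}ω_{i}^{2} and H^{2} = Σ(A^{(i)}ω_{i})^{2} stay constant.

```python
def euler_rhs(omega, moments):
    """d(omega)/dt for a torque-free body referred to principal axes."""
    A, B, C = moments
    w1, w2, w3 = omega
    return ((B - C) * w2 * w3 / A,
            (C - A) * w3 * w1 / B,
            (A - B) * w1 * w2 / C)


def rk4_step(omega, moments, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(w + h * ki for w, ki in zip(omega, k))
    k1 = euler_rhs(omega, moments)
    k2 = euler_rhs(shifted(k1, 0.5 * dt), moments)
    k3 = euler_rhs(shifted(k2, 0.5 * dt), moments)
    k4 = euler_rhs(shifted(k3, dt), moments)
    return tuple(w + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for w, a, b, c, d in zip(omega, k1, k2, k3, k4))


def two_T(omega, moments):
    return sum(A * w * w for A, w in zip(moments, omega))


def h_squared(omega, moments):
    return sum((A * w) ** 2 for A, w in zip(moments, omega))


moments = (1.0, 2.0, 3.0)          # illustrative A, B, C
omega = (1.0, 0.1, 0.5)            # illustrative initial angular velocity
two_T0 = two_T(omega, moments)
h2_0 = h_squared(omega, moments)
for _ in range(2000):              # integrate to t = 10 in steps of 0.005
    omega = rk4_step(omega, moments, 0.005)
energy_drift = abs(two_T(omega, moments) - two_T0)
momentum_drift = abs(h_squared(omega, moments) - h2_0)
```

Both drifts stay at the level of the integration error, while the individual components of ω change substantially over the run.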

Now we can consider the problem of a body moving freely with one point fixed, which can be taken as the center of mass. An asteroid moving in space is a good example of such a body, since the torques exerted by unequal solar attraction are small and have little effect if the body is rotating rapidly. Such torques indeed cause the axis of the earth to precess, but this is a slow effect. We can safely neglect such torques with an asteroid rotating with periods of hours or less.

First of all, the kinetic energy T of the body must be constant, so 2T = H_{i}ω_{i} = I_{ij}ω_{i}ω_{j} = constant. The angular momentum is also constant, since no external torques act, so this says that the component of the angular velocity in the direction of **H** is also a constant. If we consider a space in which the components of the angular velocity are the coordinates, then this equation also defines an ellipsoid in this space, which is similar to the inertia ellipsoid. We shall call this the *energy ellipsoid*, since its size depends on the kinetic energy of the motion. We shall choose the body axes so that the inertia tensor is diagonalized, so the body axes are also the symmetry axes of the energy ellipsoid. The semiaxes of the energy ellipsoid are √(2T/A^{(k)}). The angular velocity is a vector from the origin O to a point P on the energy ellipsoid, as shown in the diagram at the left.

If the equation for the energy ellipsoid is differentiated, we find that I_{ij}ω_{j}dω_{i} = 0. Now dω_{i} is a vector in the tangent plane at P, so this says that the angular momentum is perpendicular to the tangent plane. Because the angular momentum is constant, this means that the orientation of the tangent plane remains constant during the motion. Also, since the component of the angular velocity in the direction of the angular momentum is constant, the distance d from the origin O to the tangent plane is constant. Therefore, we find the important result that the tangent plane to the energy ellipsoid is fixed in space. We call it the *invariant plane*. Since the angular velocity passes through the point of contact P, there is no relative motion between the energy ellipsoid and the invariant plane: the energy ellipsoid must roll on the invariant plane without slipping.

The angular velocity must terminate on the energy ellipsoid in any case, so ΣA^{(i)}ω_{i}^{2} = 2T. For a certain d, it must also lie on the surface obtained by squaring the relation Hd = 2T, which is d^{2}Σ(A^{(i)}ω_{i})^{2} = 4T^{2}. Eliminating the constant terms between these two equations, we find that Σ(A^{(i)}/2T)[(1/d^{2}) - (A^{(i)}/2T)]ω_{i}^{2} = 0, which is the equation of a cone (not in general a circular one) with vertex at O, on which the angular velocity must lie. This cone intersects the energy ellipsoid in a curve called the *polhode* (Greek: "path of the pole"). This cone rolls without slipping on a similar cone fixed in space determined by the path of the angular velocity in the invariant plane, which is called the *herpolhode*. We have now arrived at a complete description of the motion of the body, making use of the inertia ellipsoid.

Let's now specialize to the case frequently met when two of the moments of inertia are the same, say A = B, with C different. The case when C < A is shown in the diagram, where the inertia ellipsoid is prolate, and the 3-axis is the symmetry axis. Now the polhode and herpolhode determine circular cones that roll on each other without slipping, and the motion can be easily visualized. The component of angular velocity Ω along the symmetry axis is called the *spin*. It is given by Ω = ω cos α, where α is the angle between the angular velocity and the symmetry axis. The radius of the polhode is ω sin α, and the radius of the herpolhode is ω sin (θ - α). The angle between the symmetry axis and the direction of **H** is θ. It is related to α by tan α = (C/A) tan θ. The rotation of the symmetry axis around the direction of **H** is called *precession*, and its rate is dψ/dt = CΩ/(A cos θ).
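The relations for the symmetric body can be checked with a few lines of arithmetic. This sketch assumes α is the angle between the angular velocity and the symmetry axis (as Ω = ω cos α implies), builds **H** from its components CΩ along the axis and Aω sin α perpendicular to it, and takes the precession rate as H/A. The numerical values are arbitrary.

```python
import math


def symmetric_top(A, C, omega, alpha):
    """Spin, angle theta between H and the symmetry axis, and precession rate."""
    spin = omega * math.cos(alpha)             # component along the symmetry axis
    h_axis = C * spin                          # H along the symmetry axis
    h_perp = A * omega * math.sin(alpha)       # H perpendicular to it
    theta = math.atan2(h_perp, h_axis)
    precession = math.hypot(h_axis, h_perp) / A
    return spin, theta, precession


A, C = 2.0, 1.0                                # prolate case, C < A
omega, alpha = 1.0, 0.2
spin, theta, precession = symmetric_top(A, C, omega, alpha)

polhode_radius = omega * math.sin(alpha)
herpolhode_radius = omega * math.sin(theta - alpha)
```

With C < A the angle θ comes out larger than α, so the angular velocity lies between the symmetry axis and **H**, and the computed values satisfy tan α = (C/A) tan θ and dψ/dt = CΩ/(A cos θ).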

In finding the inertia tensor for bodies made up of simple geometric shapes, the *Parallel-Axis Theorem* is useful. It says that the moment of inertia about an axis parallel to an axis through the center of gravity is equal to the moment about the axis through the center of gravity plus the total mass of the body times the square of the distance between the axes, or I_{ii} = I_{ii}' + md^{2}. For products of inertia (off-diagonal elements), the product of inertia is the product of inertia referred to the center of gravity combined with the mass times the product of the coordinate differences. With the sign convention of the inertia tensor defined above, whose off-diagonal elements are -Σm x_{i}x_{j}, this is I_{ij} = I_{ij}' - mxy, where x and y are the coordinates of the center of gravity along the i and j axes. Tables of the moments of inertia of simple figures can be found in handbooks and in texts of engineering mechanics. The moment of inertia of a rectangular parallelepiped about an axis through the centroid perpendicular to a face is (m/12)(a^{2} + b^{2}), where a and b are the sides of the face. The centroidal moment of inertia of a sphere is (2/5)mr^{2}, and of a disk relative to an axis perpendicular to its plane, like a wheel and axle, (1/2)mr^{2}.

Consider a solid elastic medium that is deformed by forces applied to its boundaries, or by forces exerted directly on the material by external influences. In response to these forces, a general point originally at x_{i} before the forces are applied moves to a point x_{i} + ξ_{i}. The displacement ξ_{i} is not necessarily small, but its derivatives E_{ij} = ∂_{i}ξ_{j} we presume are much less than unity. These quantities are called *strains*, and form a rank-2 tensor that is a function of position. It is convenient to separate E_{ij} into symmetric and antisymmetric parts, E_{ij} = e_{ij} + Ω_{ij}. The symmetric part e_{ij} is called the *pure strain* tensor, while the antisymmetric part Ω_{ij} represents the rotation due to the deformation. The vector associated with Ω_{ij} gives the rotational axis and the angle of rotation. We will be mainly interested in the pure strain, which represents the deformation of the medium.

If the medium returns to its initial state when the forces are removed, the deformation is called *elastic*, and all the strain energy put into the body is recovered. If the deformation is too large, the body does not return to the initial state, mechanical energy is dissipated, and the deformation is called *plastic*. For solid bodies, deformation is usually elastic for sufficiently small strains. We shall deal with this case exclusively here.

Since the pure strain is symmetric, it can be diagonalized. Let the diagonal elements be a, b, and c. Then, a small parallelepiped with sides dx, dy and dz and volume V = dxdydz becomes a small parallelepiped with sides (1 + a)dx, (1 + b)dy and (1 + c)dz after deformation, of volume V + dV = (1 + a + b + c)dxdydz, so that dV = (a + b + c)V. Quantities of the order of squares of strains have been neglected. The sum a + b + c = dV/V is the relative increase in volume at the point, called the *dilatation*. Now, a + b + c is just the trace of the strain tensor, e_{ii} = ∂_{i}ξ_{i}, which we know is invariant under rotation.

As an example, consider a body strained as shown in the figure by a shearing force. The only nonzero component of the strain tensor is E_{12} = a, where a is a small quantity equal to the angle of strain θ. Then the pure strain tensor has e_{12} = e_{21} = a/2, and the rotation tensor has Ω_{12} = -Ω_{21} = a/2. The deformation is therefore equal to a rotation of a/2 radians about the 3-axis plus an extension along one diagonal of the 1-2 plane and an equal contraction along the other. Indeed, the pure strain tensor has eigenvalues ±a/2, with principal axes at 45° to the 1- and 2-axes. The trace is zero, so there is no dilatation. A pure strain of this kind is called *pure shear*.
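The split of the displacement gradient into pure strain and rotation can be sketched in a few lines. The value of a is arbitrary, and the principal strains are found from the 2x2 block in the 1-2 plane, which is all that is needed when the 3-components vanish.

```python
import math


def split_strain(E):
    """Return the symmetric part e_ij and antisymmetric part Omega_ij of E_ij."""
    e = [[0.5 * (E[i][j] + E[j][i]) for j in range(3)] for i in range(3)]
    rot = [[0.5 * (E[i][j] - E[j][i]) for j in range(3)] for i in range(3)]
    return e, rot


a = 1.0e-3                                   # small shear angle (illustrative)
E = [[0.0, a, 0.0],
     [0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
e, rot = split_strain(E)

# Principal strains of the in-plane 2x2 symmetric block.
mean = 0.5 * (e[0][0] + e[1][1])
half = math.hypot(0.5 * (e[0][0] - e[1][1]), e[0][1])
principal = (mean + half, mean - half)

dilatation = e[0][0] + e[1][1] + e[2][2]     # trace of the pure strain
```

The result is e_{12} = Ω_{12} = a/2, principal strains +a/2 and -a/2, and zero dilatation.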

Now we must consider the causes of deformation and rotation, stresses. If we imagine a plane surface in the body and that the material on one side of the plane surface is removed, the forces that it exerted on the material on the other side can be represented by forces acting on this plane surface. Let the unit normal vector to the surface be taken as the normal pointing towards the removed material, or outwards. The normal force is positive when it is a pull, or in the direction of the normal, or a *tension*. There may also be a force in the plane of the surface, perpendicular to the normal, called a *shear force*, and in any particular case the positive directions of its two components in the surface will be agreed upon. A force per unit area of surface is a *stress*. If a hydrostatic pressure p acts on the surface, then the normal stress is -p and the shear stresses are zero.

The stresses will, in general, depend on the direction of the normal. For any normal vector n_{i}, we will have stresses F_{i} that are a function of the n_{i}. This defines a tensor S_{ij} such that F_{i} = S_{ij}n_{j}, called the *stress tensor*. That this is a satisfactory definition can be seen by considering the equilibrium of a small tetrahedron, as shown in the figure at the left. The three components of the force on the inclined face of area dS are balanced by equal and opposite forces on the other three sides. Note that the area of the side perpendicular to the i-axis is n_{i}dS, and the force on it will be in the opposite direction to the force on the face dS. To prove this, write out the forces in the 1, 2 and 3 directions separately and show that they balance.

The stress tensor must be symmetric, or there would be unbalanced torques on small areas that would cause them to rotate. The absence of such rotation causes the stress tensor to be symmetric. In the figure, we take the moments of the forces tending to rotate the cube about the 3-axis that come from the 1-2 component of the tensor, showing that S_{12} = S_{21}. The same holds for each of the other off-diagonal elements.

We now look for the connection between the stress and pure strain tensors. Since the strains are small compared to unity, we assume that Hooke's Law holds, and the stresses are linear functions of the strains. For most solids, Hooke's Law is valid throughout the range of *elastic* deformation, since the strain is still small when inelasticity begins. A systematic way to find the connection is to consider a deformation energy F defined by ∂F/∂e_{ij} = S_{ij}. Hooke's Law then requires that S be linear in e, or that F be a homogeneous quadratic function of the strains. The most general isotropic quadratic homogeneous function of the strains is F = (1/2)λ(e_{kk})^{2} + μe_{ij}e_{ij}, where the constants λ and μ are called the *Lamé* coefficients. If we differentiate F, we find S_{ij} = λe_{kk}δ_{ij} + 2μe_{ij}, which is the relation between stress and strain for an isotropic, elastic body.
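The isotropic stress-strain relation is simple enough to evaluate directly. The sketch below computes S_{ij} from a given strain and checks two consequences: that 2F = S_{ij}e_{ij} for a homogeneous quadratic energy, and that a uniform dilatation produces an isotropic stress of (3λ + 2μ) times the linear strain. The Lamé coefficients and strain components are illustrative values only.

```python
def stress_from_strain(e, lam, mu):
    """S_ij = lam e_kk delta_ij + 2 mu e_ij for an isotropic elastic body."""
    trace = e[0][0] + e[1][1] + e[2][2]
    return [[lam * trace * (1.0 if i == j else 0.0) + 2.0 * mu * e[i][j]
             for j in range(3)] for i in range(3)]


def strain_energy(e, lam, mu):
    """F = (1/2) lam (e_kk)^2 + mu e_ij e_ij."""
    trace = e[0][0] + e[1][1] + e[2][2]
    square = sum(e[i][j] * e[i][j] for i in range(3) for j in range(3))
    return 0.5 * lam * trace * trace + mu * square


lam, mu = 1.2e7, 1.2e7                      # illustrative values, in psi
e = [[1.0e-3, 2.0e-4, 0.0],
     [2.0e-4, -5.0e-4, 1.0e-4],
     [0.0, 1.0e-4, 3.0e-4]]
S = stress_from_strain(e, lam, mu)
work = sum(S[i][j] * e[i][j] for i in range(3) for j in range(3))
F = strain_energy(e, lam, mu)

# Uniform dilatation: equal strain eps along every axis, no shear.
eps = 1.0e-4
uniform = [[eps if i == j else 0.0 for j in range(3)] for i in range(3)]
S_uniform = stress_from_strain(uniform, lam, mu)
```

The off-diagonal stresses depend only on μ, while the diagonal ones pick up the extra term λe_{kk}.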

Poisson took λ = μ, as did Cauchy and others who thought there should be only one elastic constant for an isotropic medium. Their theories were based on central forces between supposed "molecules" that have little in common with our modern molecules. There was an extended debate between those who demanded more elastic constants and those who demanded fewer. In the general case (as we shall discuss below) 21 constants were required by those who favored more, and only 15 by those favoring fewer. Time and experiment have come down on the side of those who think there must be the larger number.

Elastic constants useful in specific problems can be defined in terms of λ and μ. If we assume an isotropic pressure p, then -p = λe_{kk} + 2μe_{(k)(k)} = (λ + 2μ/3)dV/V. The *bulk modulus* k is defined by dV/V = -p/k, so k = λ + 2μ/3. If an axial force F is applied to a rod of length L and cross-sectional area A, the rod lengthens by ΔL = FL/AY, where Y is *Young's modulus*. At the same time, the rod contracts laterally, and the ratio of the lateral strain to the longitudinal strain is σ, *Poisson's Ratio*. Under these conditions, we find that σ = λ/(2λ + 2μ) and Y = 2(σ + 1)μ (see Exercise 5). Poisson's ratio can be between 0 and 1/2, but is usually not far from 1/4, which Poisson assumed was the case. It is 1/2 for a body with no shear strength, such as a liquid. The modulus μ itself is the ratio of a shearing stress to the shearing strain (an angle), so is called the *modulus of rigidity*. For more details on each of these quantities, refer to any book on Strength of Materials. A quantity called a *modulus* has the dimensions of a stress, it should be noted.

Steel has Y = 30 x 10^{6} psi or 2.07 x 10^{12} dyne/cm^{2}, μ = 12 x 10^{6} psi or 8.27 x 10^{11} dyne/cm^{2}, and σ = 0.27. Its bulk modulus, calculated from the values of Y and μ, is k = 20 x 10^{6} psi or 1.38 x 10^{12} dyne/cm^{2}. Water, on the other hand, has μ = 0, since it lacks rigidity, and k = 3.16 x 10^{5} psi or 2.18 x 10^{10} dyne/cm^{2}. It is about 63 times more compressible than steel. As for gases, dV/V = -dp/p for an ideal gas at constant temperature, so k = p. At one atmosphere, this is 14.7 psi or 1.013 x 10^{6} dyne/cm^{2}, so the gas is about 22,000 times more compressible than water.
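The figures for steel can be reproduced from the relations σ = λ/(2λ + 2μ), Y = 2(σ + 1)μ and k = λ + 2μ/3. The sketch below inverts these for σ, λ and k given Y and μ. The conversion factor from psi to dyne/cm^{2} is a rounded value, and one standard atmosphere is taken as 14.7 psi; both are assumptions of this example. Note that the rounded Y and μ give σ = 0.25 exactly, close to but not identical with the measured 0.27.

```python
PSI_TO_DYNE_PER_CM2 = 6.895e4        # rounded conversion factor (assumed)


def moduli_from_young_and_rigidity(young, mu):
    """Return Poisson's ratio, the Lame coefficient lam, and the bulk modulus."""
    sigma = young / (2.0 * mu) - 1.0               # from Y = 2 (sigma + 1) mu
    lam = 2.0 * mu * sigma / (1.0 - 2.0 * sigma)   # from sigma = lam / (2 lam + 2 mu)
    bulk = lam + 2.0 * mu / 3.0
    return sigma, lam, bulk


sigma_steel, lam_steel, k_steel = moduli_from_young_and_rigidity(30.0e6, 12.0e6)
k_steel_cgs = k_steel * PSI_TO_DYNE_PER_CM2

k_water = 3.16e5                     # psi
k_gas = 14.7                         # psi, isothermal ideal gas at one atmosphere
steel_vs_water = k_steel / k_water   # how much more compressible water is
water_vs_gas = k_water / k_gas       # how much more compressible the gas is
```

The ratios come out near 63 and 21,500, matching the rounded comparisons quoted for water and for a gas at atmospheric pressure.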

Rubber is a curious material with σ ≈ 0.5, which implies that it has very low rigidity, a kind of "solid fluid." For rubber, Y = 100 to 600 psi, and μ = 30 to 200 psi, so its rigidity is indeed small. Glass is at the opposite pole, with Y = 10 x 10^{6} psi and μ = 5 x 10^{6} psi. Its Poisson's ratio is low, 0.20 to 0.27, which implies that it is quite rigid compared to its longitudinal elasticity.

A medium whose properties depend on direction is often called *anisotropic*, but a less barbarous word is *aelótropic* (Greek: change-turning). The above expression for F suggests the generalization to an aelotropic medium by setting F = (1/2)λ_{ijkl}e_{ij}e_{kl}, which gives S_{ij} = λ_{ijkl}e_{kl}. The rank-4 tensor λ_{ijkl} is called the *elastic constant tensor*, an operator connecting two rank-2 tensors just as a rank-2 tensor is an operator connecting two vectors. In general, it has 81 components, but there is a great deal of symmetry here, since the energy is unchanged if i,j or k,l are interchanged (the strain tensors are symmetric), or if the pairs ij and kl are interchanged as a whole. As a result, there are only 21 independent components at worst. It is confusing to work this out, but all 81 components can be listed, and equal ones sorted out, as a last resort.
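The "last resort" of listing all 81 components and sorting out the equal ones is a natural job for a few lines of code. In this sketch each index set (i, j, k, l) is reduced to a canonical representative under the three symmetries, and the distinct representatives are counted; the function name is illustrative.

```python
from itertools import product


def canonical(i, j, k, l):
    """Representative of (i, j, k, l) under i<->j, k<->l and (ij)<->(kl)."""
    first = tuple(sorted((i, j)))       # e_ij is symmetric
    second = tuple(sorted((k, l)))      # e_kl is symmetric
    return tuple(sorted((first, second)))   # the two strain factors commute


all_components = list(product(range(3), repeat=4))
classes = {canonical(*indices) for indices in all_components}
n_components = len(all_components)
n_independent = len(classes)
```

There are 6 distinct symmetric index pairs, and the number of unordered pairs of these (with repetition) is 6 x 7/2 = 21, which is what the enumeration finds.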

The symmetry of crystals will introduce more relations between the constants, reducing their number. Even in a triclinic crystal, the least symmetric, the possibility of choosing convenient coordinates reduces the number to 18. A monoclinic crystal has 12, an orthorhombic crystal 9. A tetragonal crystal has 6, as does a trigonal (rhombohedral) crystal like calcite or quartz. A hexagonal crystal has 5, and a cubic crystal 3. Cubic crystals are often considered to be isotropic in many respects, but they have an extra elastic constant that truly isotropic materials do not have. The constant corresponding to λ splits into two, one the coefficient of the squares of the diagonal elements of e, the other the coefficient of the cross products. This reduction in the number of constants is found by applying the symmetry elements of the crystals, which turns some constants into their negatives, so they must vanish. This is an interesting point, but we cannot go into it further at this time.

An interesting application of this theory is to the propagation of mechanical waves in solids, which has application to seismology, nondestructive testing, and other fields. Such waves are almost always elastic, and have an interesting variety of properties that have been intensively studied. The theory is generally more complicated than that of electromagnetic waves, and includes the subject of surface waves (Rayleigh and Love waves), as well as waves in an infinite medium.

The relation between the electric displacement **D** (statcoulombs/cm^{2}) and the electric intensity **E** (statvolts/cm), considered in the first approximation as a proportionality, will in general be given by D_{i} = ε_{ij}E_{j}, where the dielectric tensor satisfies ε_{ij} = ε_{ji}. The requirement of symmetry comes from several sources, one of which is simply that the tensor should be diagonalizable by an ordinary rotation, which establishes the three orthogonal principal axes of polarization, and the three principal dielectric constants, its eigenvalues. This yields D_{(k)} = ε_{(k)}E_{(k)}, where k = 1, 2 and 3. An electric intensity in a principal direction gives a displacement in the same direction. If all three directions are equivalent in the medium, then the three constants are equal: ε_{(k)} = ε, and an electric intensity in any direction creates a displacement in the same direction; the medium is then *isotropic*.

In crystals belonging to the cubic or isometric system, three equivalent axes can be found, so the medium is isotropic. In the trigonal (rhombohedral), tetragonal and hexagonal systems, two equivalent axes perpendicular to the third can be found. If the unique axis is the 3-axis, then ε_{(1)} = ε_{(2)}, while ε_{(3)} is different. In the orthorhombic, monoclinic and triclinic systems, all three principal dielectric constants may differ. In any case, dielectric constants may *accidentally* be very nearly equal, so the medium will resemble a more symmetric one. Noncrystalline materials, such as polymer sheets, may be anisotropic as well. Nonisotropic transparent materials may show *double refraction*, discovered in 1669 by Bartholinus in Iceland spar, or transparent calcite, CaCO_{3}, and first explained by Huygens not long afterwards. Such materials are also called *birefringent*.

We must first discuss the propagation of plane electromagnetic waves in an anisotropic medium. We shall assume throughout that **B** = **H**, or μ = 1, which is true in transparent crystals. Assume that the wave depends on space and time through a factor exp[iω(t - n**r**·**s**/c)], where ω is the angular frequency, v = c/n is the phase velocity, **r** the position vector and **s** is the unit vector normal to the wavefront. Then we can replace time derivatives by iω and del by -iωn**s**/c in Maxwell's equations. We find: **s**·**D** = 0, **s**·**H** = 0, n**s** x **H** = -**D**, and n**s** x **E** = **H**. All the terms in these expressions are constant vector amplitudes.

The spatial relations between the vectors used to describe the wave are shown in the diagram at the right. The wavefront is the plane of **H** and **D**, with **s** the wavefront normal. **E** lies in the plane of **s** and **D**, making an angle α with **D**. The component of the electric intensity along **D**, that is, perpendicular to **s**, will be represented in the text by **E+**. The Poynting vector **S** is in the direction of **t**, the ray unit vector. This gives the direction of energy flow, which is not in general normal to the wavefront. The phase velocity is v**s**, with v = c/n, and the ray velocity is v_{r}**t**, with v_{r} = c/n_{r}. The electric energy density w_{e} = **E**·**D**/8π = (n/8π)**H**·(**s** x **E**) and the magnetic energy density w_{m} = **H**·**B**/8π is given by exactly the same expression, so the total field energy density is w = (n/c)**s**·**S**, where **S** = (c/4π)(**E** x **H**) is the Poynting vector. Now, v_{r} = S/w, so v = v_{r}**s**·**t** = v_{r} cos α. The indices of refraction are then related by n_{r} = n cos α.

The magnetic field can be eliminated between the Maxwell curl equations with the important result that **D** = n^{2}[**E** - **s**(**s**·**E**)] = n^{2}**E+**. This is a relation between **E** and **D** that must be satisfied by the wave, and is completely independent of the relation between them given by the dielectric tensor. Both relations must be satisfied in the wave. Since **E+** = (**E**·**D**)**D**/D^{2}, this relation can also be written as n^{2} = D^{2}/(**E**·**D**). Similarly, it can be shown that n_{r}^{2} = (**D**·**E**)/E^{2}.

Applying the diagonalized dielectric tensor (which means everything must be referred to principal axes) we find ε_{k}E_{k} = n^{2}[E_{k} - s_{k}(**s**·**E**)]. The k here is not a tensor index, merely an identifier of one of the three components. Written out fully, we find three homogeneous equations for the field components E_{k}. Such a system can have non-zero solutions only if the determinant of the coefficients vanishes. This condition will determine the value of n, or the phase velocity, for which both Maxwell's equations and the dielectric relation are satisfied. Fresnel discovered a cunning way to do this that yields a symmetrical equation for v in terms of the components of the wave normal vector.

Rearrange the equation to E_{k} = [n^{2}s_{k}/(n^{2} - ε_{k})] (**s**·**E**). This is valid if the denominator does not vanish. Now multiply by s_{k} and add the equations for k = 1, 2 and 3. The dot product cancels, and we find a sum of three similar terms that adds to 1: Σ n^{2}s_{k}^{2}/(n^{2} - ε_{k}) = 1. Now the sum of the squares of s_{k} is also 1, so subtract this from both sides and combine the terms with the same s_{k}'s, which gives Σ ε_{k}s_{k}^{2}/(n^{2} - ε_{k}) = 0. Now change from n's and ε's to the velocities, using n = c/v and ε_{k} = c^{2}/v_{k}^{2}, where the v_{k} are the principal phase velocities. The final result is s_{1}^{2}/(v^{2} - v_{1}^{2}) + s_{2}^{2}/(v^{2} - v_{2}^{2}) + s_{3}^{2}/(v^{2} - v_{3}^{2}) = 0. This is *Fresnel's equation of wave normals*. To use it, multiply by the denominators to clear of fractions. The result is a quadratic equation in v^{2}, which gives two values for any value of **s**. The squares mean that the wave can travel in either direction. We will see that each pair of solutions for the same **s** gives us two plane-polarized waves polarized at right angles to each other which, in general, travel at different velocities.
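Clearing fractions in Fresnel's equation of wave normals gives a quadratic in v^{2} whose coefficients are easy to write down, and the following sketch solves it. It assumes **s** is a unit vector and takes the three principal phase velocities as inputs; the numerical values are arbitrary and the function names are illustrative.

```python
import math


def phase_velocities_squared(s, v_principal):
    """Two roots v^2 of Fresnel's equation of wave normals for unit normal s."""
    a, b, c = (v * v for v in v_principal)          # v_1^2, v_2^2, v_3^2
    s1, s2, s3 = (x * x for x in s)                 # s_1^2, s_2^2, s_3^2
    # After clearing fractions: x^2 + lin*x + const = 0 with x = v^2,
    # the leading coefficient being s1 + s2 + s3 = 1.
    lin = -(s1 * (b + c) + s2 * (a + c) + s3 * (a + b))
    const = s1 * b * c + s2 * a * c + s3 * a * b
    root = math.sqrt(max(lin * lin - 4.0 * const, 0.0))
    return ((-lin + root) / 2.0, (-lin - root) / 2.0)


def fresnel_sum(s, v_principal, v_squared):
    """Left side of Fresnel's equation; zero at a root (if no denominator vanishes)."""
    return sum(sk * sk / (v_squared - vk * vk) for sk, vk in zip(s, v_principal))


v_principal = (1.0, 0.8, 0.6)                       # illustrative, in units of c
along_axis_3 = phase_velocities_squared((0.0, 0.0, 1.0), v_principal)
oblique = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)         # a unit vector
roots = phase_velocities_squared(oblique, v_principal)
residuals = [fresnel_sum(oblique, v_principal, x) for x in roots]
```

For a wave normal along the 3-axis the two roots are v_{1}^{2} and v_{2}^{2}, the velocities of waves polarized along the other two principal axes. For the oblique normal, both roots make the original sum of fractions vanish.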

A similar process in terms of **t** instead of **s** yields *Fresnel's ray equation*, which gives us the ray velocity v_{r} in terms of the ray unit vector. This can be found in Born and Wolf. What is more interesting is to find **t** in terms of **s** and the phase velocity. The result is t_{k} = s_{k}{[v^{2} + g^{2}/(v^{2} - v_{k}^{2})]/√(v^{4} + g^{2})}, where g^{2} = v^{2}(v_{r}^{2} - v^{2}) = {[s_{1}/(v_{1}^{2} - v^{2})]^{2} + [s_{2}/(v_{2}^{2} - v^{2})]^{2} + [s_{3}/(v_{3}^{2} - v^{2})]^{2}}^{-1}. From these formulas, we can find the ray vector and the ray (energy) velocity. When the phase and ray velocities are known, the angle α can be found from the relation v = v_{r} cos α given above.

Now we can proceed to consider a geometric construction to find the directions of polarization of a wave travelling in an arbitrary direction in an anisotropic medium. The field energy density w = **E**·**D**/4π is given by 4πw = D_{1}^{2}/ε_{1} + D_{2}^{2}/ε_{2} + D_{3}^{2}/ε_{3}. Taking new variables x_{k} = D_{k}/√(4πw), we have the equation x_{1}^{2}/ε_{1} + x_{2}^{2}/ε_{2} + x_{3}^{2}/ε_{3} = 1, which is the equation of an ellipsoid whose semiaxes are the square roots of the principal dielectric constants, or the indices of refraction for waves polarized along a principal axis. This ellipsoid has been called the *optical indicatrix* or the *index ellipsoid*. Consider the intersection of the plane x_{1}s_{1} + x_{2}s_{2} + x_{3}s_{3} = 0 normal to a wave vector **s** with the ellipsoid. This intersection will be an ellipse. The axes of this ellipse are the directions of polarization of the two waves with the given wave vector, and the lengths of its semiaxes are the corresponding indices of refraction.
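The central section of the index ellipsoid can be computed directly. The sketch below, under the assumption that **s** is a unit vector and everything is referred to principal axes, builds two orthonormal vectors in the plane normal to **s**, restricts the quadratic form Σx_{k}^{2}/ε_{k} to that plane, and reads the two semiaxes of the resulting ellipse from the eigenvalues of the 2x2 form. The dielectric constants are made-up values chosen to have simple square roots.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    length = math.sqrt(sum(c * c for c in a))
    return tuple(c / length for c in a)


def section_indices(s, eps):
    """Semiaxes (larger first) of the section of the index ellipsoid normal to s."""
    helper = (1.0, 0.0, 0.0) if abs(s[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(s, helper))     # first unit vector in the plane
    v = cross(s, u)                     # second one, orthogonal to s and u

    def form(p, r):
        # Bilinear form sum_k p_k r_k / eps_k of the ellipsoid equation.
        return sum(p[k] * r[k] / eps[k] for k in range(3))

    m_uu, m_vv, m_uv = form(u, u), form(v, v), form(u, v)
    mean = 0.5 * (m_uu + m_vv)
    half = math.hypot(0.5 * (m_uu - m_vv), m_uv)
    # A semiaxis of the ellipse is 1/sqrt(eigenvalue of the 2x2 form).
    return (1.0 / math.sqrt(mean - half), 1.0 / math.sqrt(mean + half))


eps = (2.25, 2.56, 4.0)                 # principal indices 1.5, 1.6, 2.0
normal_to_axis_3 = section_indices((0.0, 0.0, 1.0), eps)
normal_to_axis_1 = section_indices((1.0, 0.0, 0.0), eps)
```

For a wave normal along a principal axis the section simply returns the other two principal indices, as it should.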

It should be clear that once we assume a direction for the wave, the polarization in the perpendicular plane must be carefully chosen so that **E** and **D** are coplanar with **s**, which in general they will not be. This happens for two polarizations that differ by 90°, and the two polarizations will travel with different velocities. This allows the two polarizations to be separated by refraction or total reflection, as in a Nicol prism.

This is proved by finding the extrema of the length of the radius vector from the origin to the ellipse, and showing that the result is the same condition that we found above from the combination of Maxwell's equations and the dielectric tensor. Any direction **s** for which the intersection is a circle is a direction along which a wave may have any polarization, and all polarizations travel at the same velocity. Such a direction is called an *optic axis* of the crystal. For an ellipsoid with axes all of different lengths, there will be two such directions symmetrically located in the plane of the two axes with the largest and smallest indices on either side of the axis of greatest index. Such media are called *biaxial*.

If two indices are equal, then a plane perpendicular to the axis of the third index cuts the ellipsoid, which is now a spheroid, in a circle. Therefore, this axis is the optic axis, which is the only one. Such media are called *uniaxial*. If the third index is greater than the other two, the medium is called *positive*, and *negative* otherwise. The two equal indices, or velocities, are called *ordinary* and denoted by a subscript "o" while the third is called *extraordinary* and denoted by a subscript "e." If we write Fresnel's equation of wave normals for this case, for a wave travelling in a direction making an angle θ with the optic axis, we easily find the two solutions v^{2} = v_{o}^{2} and v^{2} = v_{o}^{2} cos^{2} θ + v_{e}^{2} sin^{2} θ. This gives two surfaces, one a sphere of radius v_{o} for the "ordinary" wave, and one a surface of fourth order, an ovoid, for the "extraordinary" wave. For a positive uniaxial medium, the ovoid is inside the sphere, while for a negative uniaxial medium, it is outside, as shown in the diagram. These figures are not the index ellipsoid.
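The two solutions of Fresnel's equation for the uniaxial case are easy to tabulate. The sketch below is only an illustration: the function name is hypothetical, velocities are in units of c, the calcite indices are those of Exercise 4, and the quartz indices are commonly quoted sodium-light values that do not appear in the text and are an assumption here.

```python
import math

def wave_normal_speeds(theta_deg, n_o, n_e):
    """Phase velocities (units of c) of the ordinary and extraordinary waves
    for a wave normal at angle theta to the optic axis."""
    th = math.radians(theta_deg)
    v_o, v_e = 1.0 / n_o, 1.0 / n_e
    v_ext = math.sqrt(v_o**2 * math.cos(th)**2 + v_e**2 * math.sin(th)**2)
    return v_o, v_ext

media = {
    "calcite (negative)": (1.6584, 1.4864),
    "quartz (positive)": (1.5443, 1.5534),   # assumed values, not from the text
}
for name, (n_o, n_e) in media.items():
    for theta in (0, 30, 60, 90):
        v_ord, v_ext = wave_normal_speeds(theta, n_o, n_e)
        print(f"{name:20s} theta={theta:2d}  v_o={v_ord:.5f}  v_e(theta)={v_ext:.5f}")
```

The two velocities coincide at θ = 0. For calcite the extraordinary value rises above the ordinary one as θ increases, so the ovoid lies outside the sphere, while for quartz it falls slightly below, so the ovoid lies inside.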

The distance between the ordinary and extraordinary wave normal surfaces is greatly exaggerated in the diagrams, as can be seen from the actual figures for calcite and quartz that are given. The extraordinary velocity surface is drawn as an ellipse for convenience, but it is not an ellipse, being blunter at the ends of the major and minor axes than an ellipse. Huygens took it for an ellipse, which indeed it closely resembles, and elementary optics texts have followed him in this, but this is an error, although it gives the ray direction quite well using Huygens' construction. This is reminiscent of the Bohr atom, in which erroneous concepts give the correct result. In quartz, the two surfaces do not exactly meet on the optic axis, and this is the source of *optical activity* in quartz, but the effect can be neglected here.

The optic axis of a cleavage rhomb of calcite passes through a blunt corner, making equal angles with the edges and faces that meet there. If you look at a dot through the crystal, two images will be seen. When you rotate the crystal, the ordinary image will remain fixed while the extraordinary image rotates around it. The reason is that the ray of the extraordinary wave is not normal to the wavefronts, but is inclined to the wave normal, lying in the plane containing the wave normal and the optic axis, as shown in the diagram. The ordinary wave is polarized at right angles to this plane, and so at right angles to the optic axis, while the extraordinary wave is polarized in the plane, a fact easily checked with a Polaroid filter.
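How closely the extraordinary wave normal surface resembles an ellipse can be estimated directly. The sketch below compares the ovoid v^{2} = v_{o}^{2} cos^{2} θ + v_{e}^{2} sin^{2} θ with an ellipse having the same semiaxes v_{o} and v_{e}, using the calcite indices of Exercise 4. The function names are hypothetical and the one-degree sampling is an arbitrary choice.

```python
import math

n_o, n_e = 1.6584, 1.4864          # calcite, from Exercise 4
v_o, v_e = 1.0 / n_o, 1.0 / n_e    # velocities in units of c

def ovoid(theta):
    """Radius of the extraordinary wave normal surface at angle theta (radians)."""
    return math.sqrt(v_o**2 * math.cos(theta)**2 + v_e**2 * math.sin(theta)**2)

def ellipse(theta):
    """Radius of an ellipse with the same semiaxes v_o and v_e."""
    return 1.0 / math.sqrt(math.cos(theta)**2 / v_o**2 + math.sin(theta)**2 / v_e**2)

thetas = [math.radians(t) for t in range(0, 91)]
rel_diff = [(ovoid(t) - ellipse(t)) / ellipse(t) for t in thetas]
max_rel = max(rel_diff)
theta_max = math.degrees(thetas[rel_diff.index(max_rel)])
print(f"largest relative difference {max_rel:.4%} at theta = {theta_max:.0f} degrees")
```

The two curves agree on the axes, and between them the ovoid lies slightly outside the ellipse. For calcite the largest difference is about 0.6 percent, at 45° to the optic axis, which is why the ellipse is a serviceable approximation.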

1. Diagonalize the symmetric matrix with rows: 6 0 0; 0 34 12; 0 12 41, finding the eigenvalues and the eigenvectors. Sketch the associated ellipsoid. (Leigh Page) Answers: the eigenvalues are 6, 50, and 25. The corresponding eigenvectors are (1,0,0), (0,3/5,4/5), (0,4/5,-3/5).

2. Find the inertia tensor for two masses of 100 g that are 10 cm from the x-axis, but on opposite sides and 10 cm apart vertically. Find the angular momentum when the masses rotate at 600 rpm about the x-axis, and the resulting moment. What rotating torque must be supplied? Check by analyzing the problem as two point masses. Find the principal axes by inspection. Answers: I has rows 20,000 -10,000 0; -10,000 5000 0; 0 0 25,000, in g-cm^{2}. Eigenvalues 25,000, 0, 25,000. H_{1} = 20,000ω, H_{2} = -10,000ω, H_{3} = 0. Torque 10,000ω^{2} dyne-cm normal to plane of masses. Principal axes: line joining the masses, line normal to this, line at right angles to these two axes.

3. The body shown in the diagram is constructed from three identical cubes of side a and mass m. C is the center of mass of the body. Find the inertia tensor with the axes shown, using the parallel-axis theorem. Diagonalize the tensor, and find the principal moments of inertia and the principal axes. The problem can be solved using rational numbers. The principal axes can be determined by inspection, which will be a check on your solution. Answers: Principal moments of inertia 3/2, 5/6 and 1/2, in units of ma^{2}. Principal axes: rotate x,z 90° about the y-axis.

4. Find the angle between the wave normal and the ray for light perpendicularly incident on a cleavage face of a calcite rhomb. The optic axis passes through a blunt corner where three faces meet with face angles of 101° 55'. From this information, find the inclination θ of the wave normal **s** to the optic axis. Find the phase and ray velocities for this case. The indices of refraction are n_{o} = 1.6584, n_{e} = 1.4864. Answers: θ = 44.610°, v = 0.63836c, v_{r} = 0.64215c, α = 6.228°.

5. From the equation giving the stress tensor in terms of the strain tensor, k and μ, find the relation between the traces of the strain and stress tensors, and from this the equation giving the strain tensor in terms of the stress tensor. Use this equation to find the strains when only σ_{11} is nonzero. This is the case of uniaxial homogeneous stress in a uniform rod, when both stress and strain tensors are diagonal. Find expressions for Young's modulus, Y = σ_{11}/e_{11}, and Poisson's ratio, σ = -e_{22}/e_{11} = -e_{33}/e_{11}, in terms of k and μ.
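The quoted answers to Exercises 1, 2 and 4 can be checked with a short numerical sketch. Several things here are assumptions rather than part of the exercises: the use of NumPy, the positions (±5, ±10, 0) cm chosen for the two masses so as to reproduce the quoted inertia tensor, and the reading of the calcite rhomb as three edges equally inclined to the optic axis with the wave normal perpendicular to a cleavage face. The ray angle α is taken from tan α = (1/v) dv/dθ, which follows from the wave normal surface quoted in the text.

```python
import math
import numpy as np

# Exercise 1: eigenvalues and eigenvectors of the symmetric matrix.
A = np.array([[6.0, 0.0, 0.0], [0.0, 34.0, 12.0], [0.0, 12.0, 41.0]])
vals1, vecs1 = np.linalg.eigh(A)            # ascending eigenvalues, column eigenvectors

# Exercise 2: inertia tensor of two point masses (positions are an assumption).
m = 100.0                                    # grams
positions = np.array([[5.0, 10.0, 0.0], [-5.0, -10.0, 0.0]])   # cm
I = sum(m * (r @ r * np.eye(3) - np.outer(r, r)) for r in positions)
omega = np.array([600.0 * 2.0 * math.pi / 60.0, 0.0, 0.0])     # 600 rpm in rad/s
H = I @ omega                                # angular momentum
torque = np.cross(omega, H)                  # torque needed to maintain the rotation

# Exercise 4: calcite rhomb, light normally incident on a cleavage face.
n_o, n_e = 1.6584, 1.4864
phi = math.radians(101.0 + 55.0 / 60.0)      # face angle at the blunt corner
# Edges make an angle beta with the optic axis (taken as z): cos(phi) = 1 - 1.5 sin^2(beta).
beta = math.asin(math.sqrt((2.0 / 3.0) * (1.0 - math.cos(phi))))
e1 = np.array([math.sin(beta), 0.0, math.cos(beta)])
e2 = np.array([math.sin(beta) * math.cos(2 * math.pi / 3),
               math.sin(beta) * math.sin(2 * math.pi / 3),
               math.cos(beta)])
normal = np.cross(e1, e2)
normal = normal / np.linalg.norm(normal)     # face normal = wave normal s
theta = math.acos(normal[2])                 # inclination of s to the optic axis

v_o, v_e = 1.0 / n_o, 1.0 / n_e
v = math.sqrt(v_o**2 * math.cos(theta)**2 + v_e**2 * math.sin(theta)**2)
alpha = math.atan((v_e**2 - v_o**2) * math.sin(theta) * math.cos(theta) / v**2)
v_ray = v / math.cos(alpha)                  # ray velocity

print(vals1)
print(np.round(vecs1, 4))
print(np.round(I), np.round(np.linalg.eigvalsh(I)))
print(H / omega[0], torque / omega[0]**2)
print(math.degrees(theta), v, v_ray, math.degrees(alpha))
```

The eigenvectors returned for Exercise 1 are determined only up to sign. For Exercise 4 the sketch gives θ, v and v_{r} in agreement with the quoted answers, and an angle α near 6.23°, close to the quoted value.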

L. Page, *Introduction to Theoretical Physics*, 3rd ed. (New York: D. Van Nostrand, 1952). Chapters II and III.

M. Born and E. Wolf, *Principles of Optics* (London: Pergamon Press, 1959). Chapter XIV.

A. E. H. Love, *A Treatise on the Mathematical Theory of Elasticity*, 4th ed. (New York: Dover, 1944; reprint of the 1927 edition by Cambridge University Press). Contains a good Historical Introduction.

L. D. Landau and E. M. Lifshitz, *Theory of Elasticity* (London: Pergamon Press, 1959). An excellent concise treatment, using index notation.


Composed by J. B. Calvert

Created 13 October 2002

Last revised 2 June 2009