Quaternions

Hamilton's quaternions, or hypercomplex numbers, are practically forgotten. After reviewing complex numbers, we play with them.


The name of the brilliant William Rowan Hamilton (1805-1865), Irish physicist and mathematician, is familiar to every physicist in the form of the Hamiltonian function in quantum mechanics. Hamilton's analytical mechanics played an important, perhaps essential, role in the development of quantum mechanics. Hamilton also created the quaternion, an interesting mathematical notion, but it finds no application in modern physics, its place taken by other concepts. Curiously, quaternions can be applied to relativity, and some examples will be given below.

To appreciate what quaternions are about, one must first understand complex and imaginary numbers. These are very inappropriate terms, but they are customary and universal. As you know, ordinary numbers are subject to the four basic operations of arithmetic with their usual rules. In fact, these operations and rules pretty much define what a number is, and give it its useful properties. It is found that number pairs $(a,b)$ can also be made to act just like numbers, provided the operations are defined properly. The definitions are $(a,b) + (c,d) = (a+c,\,b+d)$, $(a,b) - (c,d) = (a-c,\,b-d)$, $(a,b)(c,d) = (ac-bd,\,bc+ad)$, and $(a,b)/(c,d) = \left((ac+bd)/(c^2+d^2),\,(bc-ad)/(c^2+d^2)\right)$, where division requires $(c,d) \neq (0,0)$. These look pretty arbitrary, but they are really finely crafted. Work out $[(a,b)/(c,d)](c,d)$ for yourself, and check that it comes out $(a,b)$ as it should. Also, if $(a,b) = (c,d)$, then $a = c$ and $b = d$.
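The rules for number pairs can be checked mechanically. The following is a minimal sketch in Python, not part of the original discussion; the function names `add`, `sub`, `mul` and `div` are invented for the illustration, and exact fractions are used so that the check of $[(a,b)/(c,d)](c,d) = (a,b)$ is free of rounding error.

```python
from fractions import Fraction

def add(p, q):
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def sub(p, q):
    (a, b), (c, d) = p, q
    return (a - c, b - d)

def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c - b * d, b * c + a * d)

def div(p, q):
    (a, b), (c, d) = p, q
    n = c * c + d * d  # zero only when (c, d) = (0, 0)
    return ((a * c + b * d) / n, (b * c - a * d) / n)

# Two arbitrary sample pairs with exact rational components.
p = (Fraction(3), Fraction(-2))
q = (Fraction(1), Fraction(4))

quotient = div(p, q)
recovered = mul(quotient, q)          # should reproduce p
square_of_01 = mul((0, 1), (0, 1))    # the pair (0, 1) squared
```

Here `recovered` comes back equal to `p`, and `square_of_01` is the pair that stands for the ordinary number $-1$.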

Now let us solve the quadratic equation $x^2 = -1$. Let $x = (a,b)$, so that $(a,b)^2 = (a^2-b^2,\,2ab) = (-1,0)$, if we agree to write an ordinary number $a$ as $(a,0)$. Now, $(a,0)+(b,0) = (a+b,0)$, $(a,0)(b,0) = (ab,0)$, and $(a,0)/(b,0) = (a/b,0)$, so this is quite all right. The quadratic equation now gives us $a^2-b^2 = -1$ and $2ab = 0$. The solutions of these equations are $a = 0$, $b = \pm 1$; taking the positive root, $(-1)^{1/2} = (0,1)$. With the new numbers we can solve quadratic equations that baffled us when we had only ordinary numbers. In fact, the new numbers allow the solution of any quadratic equation, and first saw light in this application. A number $(0,a)$ was called imaginary, and $(a,b)$ was called complex, and we must live with it.

It should be obvious that we can write any complex number $(a,b)$ as $(a,0) + (b,0)(0,1)$, and when we do, algebra gives us all the arithmetic properties we wrote down above. This means we can write $(a,b) = a + ib$, where $i$ is shorthand for $(0,1)$, called the imaginary unit.

Complex numbers are very special number pairs. Just associating two numbers together as a 1x2 matrix tells us very little. Only if the numbers obey the algebraic rules given above do they become complex numbers. Another sort of number pair is the vector. For the vector $(a,b)$, we define addition and subtraction as for complex numbers, but not multiplication or division. The inner product is $(a,b)\cdot(c,d) = ac + bd$, and the outer product is $(a,b)\times(c,d) = ad - bc$. Neither of these products results in a new vector. (In three dimensions, the outer product can be defined as a vector product, however.) Multiplication by a scalar is $\alpha(a,b) = (\alpha a,\alpha b)$. The square of the length of a vector is $(a,b)\cdot(a,b) = a^2 + b^2$. Vectors have an obvious geometrical interpretation as displacements. They are not numbers, however.

The complex number $(a,b)$ can be associated with the vector $(a,b)$, so complex numbers can be represented in the plane, called the complex plane. This is an extremely valuable picture of complex numbers, but it should be remembered that complex numbers have properties that vectors do not have. For example, $(0,1)(a,b) = (a,b)(0,1) = (-b,a)$ is the vector $(a,b)$ rotated anticlockwise by 90°. Thus, the imaginary unit $i$ corresponds to a rotation of 90°, and in general the multiplication or division of two complex numbers gives a third of different length and direction in the complex plane. This is very un-vectorlike behaviour. The square of the 'length' of a complex number is $(a,b)(a,-b) = (a^2+b^2,0)$. The number $(a,-b)$ is called the conjugate of $(a,b)$.
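Python's built-in `complex` type follows the same rules, so the rotation by 90° and the squared length are easy to see on a sample number. This is only an illustration; the value $(3,4)$ is an arbitrary choice.

```python
z = complex(3, 4)                    # the pair (3, 4)

rotated = 1j * z                     # multiply by i = (0, 1)
length_squared = z * z.conjugate()   # (a, b)(a, -b)

print(rotated)          # the pair (-4, 3): a quarter turn anticlockwise
print(length_squared)   # the pair (25, 0)
```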

Complex numbers have all the algebraic properties of ordinary numbers, so that, among other things, we can form power series, such as $1 + x + x^2/2! + x^3/3! + \cdots$. This series gives us $e^x$, which we have just extended into the complex domain by simply using complex numbers instead of ordinary ones. Amazing things now appear, such as $e^{ix} = \cos x + i \sin x$. The combination of complex numbers and calculus makes an extremely powerful tool for analytical investigations.
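One way to convince oneself of this is to sum the series numerically for an imaginary argument. The sketch below is an illustration only; the helper `exp_series`, the number of terms and the sample value of $x$ are all choices made for this example.

```python
import math

def exp_series(z, terms=30):
    """Partial sum 1 + z + z^2/2! + ... with the given number of terms."""
    total = 0j
    term = 1 + 0j              # current term z^n / n!, starting at n = 0
    for n in range(terms):
        total += term
        term *= z / (n + 1)    # turn z^n/n! into z^(n+1)/(n+1)!
    return total

x = 0.75
approx = exp_series(1j * x)                    # series evaluated at ix
exact = complex(math.cos(x), math.sin(x))      # cos x + i sin x
error = abs(approx - exact)
```

For an argument this small, thirty terms are far more than enough, and `error` is at the level of rounding.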

We cannot pass by without mentioning the polar representation of complex numbers that comes directly from the vector analogy, and uses the properties of $e^{ix}$. We have $Ae^{ix} = A \cos x + i A \sin x = (A \cos x, A \sin x)$. A complex number can therefore be algebraically represented by $Ae^{i\varphi}$, where $A$ is its magnitude and $\varphi$ its angle. Now multiplication and division are easy: $(Ae^{i\varphi})(Be^{i\psi}) = ABe^{i(\varphi + \psi)}$, for example.
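The polar rule (magnitudes multiply, angles add) can be tried out with the standard `cmath` module. The magnitudes and angles below are arbitrary sample values.

```python
import cmath

u = cmath.rect(2.0, 0.5)    # 2 e^{0.5 i}
v = cmath.rect(3.0, 1.2)    # 3 e^{1.2 i}

# polar() returns (magnitude, angle) of the product.
magnitude, angle = cmath.polar(u * v)
```

Up to rounding, `magnitude` is the product of the two magnitudes and `angle` is the sum of the two angles.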

Finally we can introduce the quaternion $Q = (a,b,c,d) = ia + jb + kc + d$, where, by analogy with complex numbers, we have written the quantities $i, j, k$ that possess the properties $ij = -ji = k$, $jk = -kj = i$, $ki = -ik = j$, $i^2 = j^2 = k^2 = -1$. These hypercomplex units resemble unit vectors along the coordinate axes that combine by the vector product. The quaternions $Q$ behave like numbers, and you can write down the rules of quaternion arithmetic by using the properties of the units $i, j, k$. It is obvious from the properties of $i, j, k$ that $PQ \neq QP$ in general; that is, quaternion multiplication is not commutative. Quaternions are still numbers, but strange ones, and care must be taken in algebra to preserve the order of the factors. The geometric interpretation must be in a four-dimensional space, so it is not as easy to sketch as that for complex numbers. Also, there is no simple problem like quadratic equations for which quaternions give a ready solution. Since multiplication is not commutative, division of two quaternions is ambiguous: $P/Q$ could mean either $PQ^{-1}$ or $Q^{-1}P$, and these generally differ.
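Writing out the product of two quaternions from the rules for the units gives one formula per component. The sketch below is an illustration, not the program mentioned later in this page; it stores a quaternion as a tuple in the order $(a,b,c,d)$ used here, with the real part last, and the name `qmul` is invented for the example.

```python
def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = ia + jb + kc + d."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        d1 * a2 + a1 * d2 + b1 * c2 - c1 * b2,   # i component
        d1 * b2 + b1 * d2 + c1 * a2 - a1 * c2,   # j component
        d1 * c2 + c1 * d2 + a1 * b2 - b1 * a2,   # k component
        d1 * d2 - a1 * a2 - b1 * b2 - c1 * c2,   # real component
    )

I = (1, 0, 0, 0)
J = (0, 1, 0, 0)
K = (0, 0, 1, 0)

print(qmul(I, J), qmul(J, I))   # k and -k: the order of the factors matters
print(qmul(I, I))               # the quaternion for -1
```

The same function works unchanged when the components are floats, fractions or complex numbers, since it uses only addition, subtraction and multiplication.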

The quaternion conjugate to $Q$ is defined as $\tilde{Q} = (-a,-b,-c,d)$, so that $|Q|^2 = Q\tilde{Q} = \tilde{Q}Q = a^2 + b^2 + c^2 + d^2$, the square of the length of the four-dimensional vector $(a,b,c,d)$ in a Euclidean space, which is a pure number. We can allow the components of $Q$ to be complex, in which case the complex conjugate is $Q^* = (a^*,b^*,c^*,d^*)$. We note that if $a = x$, $b = y$, $c = z$, $d = ict$ (where the $c$ in $ict$ is the speed of light, not the component), then $Q\tilde{Q} = x^2 + y^2 + z^2 - c^2t^2 = s^2$, a relativistic invariant. A quaternion can represent a relativistic four-vector! Either the time or the space coordinates must be distinguished by the imaginary unit; here we have chosen to use an imaginary time. The condition for $Q$ to represent a four-vector is that $Q^* = -\tilde{Q}$. In quantum mechanics, we use the imaginary unit for a different purpose, so that if we attack relativistic quantum mechanics with an imaginary time coordinate, it becomes very annoying to keep the two uses separate. So long as we stay away from quantum mechanics, however, the use of $ict$ is convenient (as Minkowski found).
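A small numerical illustration of the conjugate, the squared length and the imaginary-time four-vector follows. It is a sketch under stated assumptions: quaternions are tuples $(a,b,c,d)$ with the real part last, the names `qmul`, `qconj` and `qstar` are invented here, and units with the speed of light equal to 1 are assumed for the sample event.

```python
def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = ia + jb + kc + d."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        d1 * a2 + a1 * d2 + b1 * c2 - c1 * b2,
        d1 * b2 + b1 * d2 + c1 * a2 - a1 * c2,
        d1 * c2 + c1 * d2 + a1 * b2 - b1 * a2,
        d1 * d2 - a1 * a2 - b1 * b2 - c1 * c2,
    )

def qconj(q):
    """Quaternion conjugate: reverse the sign of the i, j, k components."""
    a, b, c, d = q
    return (-a, -b, -c, d)

def qstar(q):
    """Complex conjugate of each component."""
    return tuple(complex(v).conjugate() for v in q)

# A real quaternion: the product with its conjugate is a pure number.
Q = (1, 2, 3, 4)
length_squared = qmul(Q, qconj(Q))          # (0, 0, 0, 1 + 4 + 9 + 16)

# A four-vector with imaginary time: x, y, z = 1, 2, 2 and ct = 5.
R = (1.0, 2.0, 2.0, 5j)
s_squared = qmul(R, qconj(R))[3]            # 1 + 4 + 4 - 25
is_four_vector = qstar(R) == tuple(-v for v in qconj(R))
```

For the sample event the interval is negative, and `is_four_vector` confirms that the conjugation condition holds for real space components and an imaginary time component.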

The rule $\widetilde{AB} = \tilde{B}\tilde{A}$ can be proved by writing out both sides of the equation. As an example of its use, $|AB|^2 = (AB)\widetilde{(AB)} = AB\tilde{B}\tilde{A} = A|B|^2\tilde{A} = |B|^2A\tilde{A} = |B|^2|A|^2$. A transformation $R' = AR$ that preserves the length of $R$ is a rotation in the four-dimensional space. Since $|R'|^2 = R'\tilde{R}' = (AR)\widetilde{(AR)} = AR\tilde{R}\tilde{A} = |A|^2|R|^2$, we must have $|A|^2 = 1$. The same holds for a transformation $R' = RB$, so the general transformation $R' = ARB$ has eight parameters, restricted by two conditions, or six independent parameters in all. This happens to be precisely the number of parameters describing a general rotation in four dimensions.
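Both the conjugation rule and the product rule for lengths can be spot-checked on integer quaternions, where the arithmetic is exact. This is a check on two arbitrary samples, not a proof; the helper names are invented for the example, and quaternions are tuples $(a,b,c,d)$ with the real part last.

```python
def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = ia + jb + kc + d."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        d1 * a2 + a1 * d2 + b1 * c2 - c1 * b2,
        d1 * b2 + b1 * d2 + c1 * a2 - a1 * c2,
        d1 * c2 + c1 * d2 + a1 * b2 - b1 * a2,
        d1 * d2 - a1 * a2 - b1 * b2 - c1 * c2,
    )

def qconj(q):
    a, b, c, d = q
    return (-a, -b, -c, d)

def norm2(q):
    """Squared length a^2 + b^2 + c^2 + d^2."""
    return sum(v * v for v in q)

A = (1, 2, 3, 4)
B = (2, -1, 0, 5)
AB = qmul(A, B)

conj_of_product = qconj(AB)
product_of_conj = qmul(qconj(B), qconj(A))   # note the reversed order
```

The two conjugates agree, and `norm2(AB)` equals `norm2(A) * norm2(B)`.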

Now consider the transformations of the form $R' = AR\tilde{A}$, with $|A|^2 = 1$. Then $(RS)' = ARS\tilde{A} = AR\tilde{A}AS\tilde{A} = R'S'$, since $\tilde{A}A = 1$. Such transformations preserve the products of quaternions as well. There are, however, only three independent parameters. To see what is happening, let $R = (0,0,0,a)$. Then, $R' = A(0,0,0,a)\tilde{A} = aA\tilde{A} = a$. These transformations leave the real component of the quaternion invariant. Such transformations represent rotations in three-dimensional space. The noncommutativity of quaternions is perfectly adapted to represent the noncommutativity of finite rotations in space. Here is an application at last!
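These statements can be tried on a unit quaternion whose components are all 1/2, which keeps the arithmetic exact. The example is an illustration only: the particular $A$, the sample quaternions and the helper names are choices made here, and quaternions are tuples $(a,b,c,d)$ with the real part last.

```python
from fractions import Fraction

def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = ia + jb + kc + d."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        d1 * a2 + a1 * d2 + b1 * c2 - c1 * b2,
        d1 * b2 + b1 * d2 + c1 * a2 - a1 * c2,
        d1 * c2 + c1 * d2 + a1 * b2 - b1 * a2,
        d1 * d2 - a1 * a2 - b1 * b2 - c1 * c2,
    )

def qconj(q):
    a, b, c, d = q
    return (-a, -b, -c, d)

def transform(A, R):
    """R' = A R A~ for a quaternion A of unit length."""
    return qmul(qmul(A, R), qconj(A))

h = Fraction(1, 2)
A = (h, h, h, h)              # unit length: 4 * (1/2)^2 = 1

R = (1, 2, 3, 7)
S = (4, -1, 0, 2)

R_prime = transform(A, R)                     # real part 7 is untouched
product_then_transform = transform(A, qmul(R, S))
transform_then_product = qmul(transform(A, R), transform(A, S))
```

For this particular $A$ the three space components are simply permuted cyclically, `R_prime` being `(3, 1, 2, 7)`, and transforming the product gives the same result as multiplying the transformed factors.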

If we subject $A$ itself to the transformation, $A' = AA\tilde{A} = A$, so $A$ is an invariant, and any quaternion $\alpha A$ will be unchanged under the rotation. Since the real component is invariant in any case, the vector part of $A$ must lie along the axis of the rotation. The three parameters could be the direction of this axis, and the angle of rotation. The four components of $A$ are then chosen to give the required changes in the three space coordinates, and to satisfy the condition of unit length.

To illustrate this with an explicit example, consider a rotation of amount $\varphi$ about the z-axis, where we expect that $x' = x \cos\varphi - y \sin\varphi$, $y' = x \sin\varphi + y \cos\varphi$, $z' = z$. The appropriate quaternion would seem to be $A = (0,0,a,b)$, with $a^2 + b^2 = 1$. Transforming the quaternion $R = (x,y,z,0)$, we find that $x' = (b^2 - a^2)x - 2aby$, $y' = 2abx + (b^2 - a^2)y$, $z' = z$. Comparing coefficients, $b^2 - a^2 = \cos\varphi$ and $2ab = \sin\varphi$, so the solution for $a$ and $b$ is $a = \sin(\varphi/2)$, $b = \cos(\varphi/2)$. The quaternion $A = (0,0,\sin(\varphi/2),\cos(\varphi/2))$ gives a rotation of angle $\varphi$ about the z-axis. There will be similar results for the other axes, and a general rotation can be built from a product of them. The appearance of the half-angles is characteristic.

The product of two transformations, $BA$, is again a transformation, and $A(BC) = (AB)C$. The identity transformation is $(0,0,0,1)$, and the inverse to $A$ is $\tilde{A}$. Therefore, these transformations form a group, the three-dimensional rotation group. Quaternions are well-adapted to studying the rotation group, which is probably their most important classical application.
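The z-axis example is easy to reproduce numerically. The sketch below is an illustration with invented helper names, quaternions stored as tuples $(a,b,c,d)$ with the real part last; the quaternion for the x-axis is written down by analogy with the z-axis result, which is an assumption of this example rather than something derived above.

```python
import math

def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = ia + jb + kc + d."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        d1 * a2 + a1 * d2 + b1 * c2 - c1 * b2,
        d1 * b2 + b1 * d2 + c1 * a2 - a1 * c2,
        d1 * c2 + c1 * d2 + a1 * b2 - b1 * a2,
        d1 * d2 - a1 * a2 - b1 * b2 - c1 * c2,
    )

def qconj(q):
    a, b, c, d = q
    return (-a, -b, -c, d)

def rotate(A, R):
    """R' = A R A~ for a unit quaternion A."""
    return qmul(qmul(A, R), qconj(A))

def z_rotation(phi):
    return (0.0, 0.0, math.sin(phi / 2), math.cos(phi / 2))

def x_rotation(phi):
    # Assumed by analogy with the z-axis case.
    return (math.sin(phi / 2), 0.0, 0.0, math.cos(phi / 2))

# Rotate the point (1, 2, 3) by 0.3 radians about the z-axis.
phi = 0.3
x, y, z = 1.0, 2.0, 3.0
xp, yp, zp, real_part = rotate(z_rotation(phi), (x, y, z, 0.0))

# Two quarter turns applied to the y unit vector in both orders.
Az, Ax = z_rotation(math.pi / 2), x_rotation(math.pi / 2)
y_hat = (0.0, 1.0, 0.0, 0.0)
z_then_x = rotate(qmul(Ax, Az), y_hat)   # Az acts first, then Ax
x_then_z = rotate(qmul(Az, Ax), y_hat)   # Ax acts first, then Az
```

The first result agrees with the expected formulas for $x'$, $y'$, $z'$, and the two orders of quarter turns send the y unit vector to different places, which is the noncommutativity of finite rotations.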

If you are not acquainted with relativity, the remaining paragraphs will be difficult to understand, and you will have to be satisfied with the application of quaternions to rotations. A quaternion representing a relativistic four-vector has the property $R^* = -\tilde{R}$. A Lorentz transformation must preserve this form, as well as the invariant $R\tilde{R}$. If $A$ is real, this is certainly true, since $R'^* = (AR\tilde{A})^* = AR^*\tilde{A} = -A\tilde{R}\tilde{A} = -\tilde{R}'$. But a real $A$ cannot describe the interesting Lorentz transformations, since it leaves the time component unchanged. If $A$ is complex, we then have eight parameters, only six of which can be independent. Therefore, two conditions on $A$ are necessary. Again, we start from $R' = ARB$, and require that $\widetilde{(ARB)} = \tilde{B}\tilde{R}\tilde{A} = -(ARB)^* = -A^*R^*B^*$. Therefore, let $\tilde{B} = A^*$, so that we have $R' = AR\tilde{A}^*$. Now, if $A = A' + iA''$, where $A'$ and $A''$ are real quaternions, invariance of $R\tilde{R}$ requires that $\tilde{A}'A' - \tilde{A}''A'' = 1$, and $\tilde{A}'A'' + \tilde{A}''A' = 0$. These are the two necessary conditions. If $A$ is real, they reduce to the single condition $\tilde{A}A = 1$, as in the case of rotations.
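The two conditions can be read off by writing out $\tilde{A}A$ for a complex $A$. This is a sketch of the step, assuming the normalization $\tilde{A}A = 1$ and taking $A'$ and $A''$ to be the real and imaginary parts of $A$:

$$\tilde{A}A = (\tilde{A}' + i\tilde{A}'')(A' + iA'') = (\tilde{A}'A' - \tilde{A}''A'') + i(\tilde{A}'A'' + \tilde{A}''A') = 1.$$

Both brackets are quaternions with real components, so the first must equal 1 and the second must vanish.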

In the case of the rotations, we know that a vector in the direction of the axis of rotation was unchanged, $R' = R$. In general, some vector $R$ will be changed only in length by the transformation, not in direction, or $R' = \lambda R$, where $\lambda$ is a number, perhaps complex. Since $|R'|^2 = \lambda^2|R|^2$, while $|R'|^2 = |R|^2$ under the transformation, the equation $(\lambda^2 - 1)|R|^2 = 0$ must be satisfied. This was certainly true for the rotations, where $\lambda = 1$, so $R$ could be some vector of nonzero length. In the general Lorentz transformation, it happens that $\lambda$ is not $+1$ or $-1$, so we must have $|R|^2 = 0$. This means that $R$ has zero length, or $x^2 + y^2 + z^2 - c^2t^2 = 0$. This implies that the vector lies on the light cone, the surface separating timelike and spacelike intervals. If you are acquainted with relativity, you will recognize the importance of the light cone. There are two such 'principal axes' of the transformation, which we shall call $P$ and $Q$.

Lanczos gives the details of finding $P$ and $Q$. In short, $P = p(1 + i\mathbf{P})$ and $Q = p(1 + i\mathbf{Q})$, where $\mathbf{P}$ and $\mathbf{Q}$ are real 3-vectors of unit length (written as quaternions with zero real part), and $p$ is found from $2p^2(1 - \cos\theta) = 1$, where $\theta$ is the angle between $\mathbf{P}$ and $\mathbf{Q}$. Then the quaternion of the transformation is $A = \alpha P\tilde{Q} + Q\tilde{P}/\alpha$, where $\alpha$ is a complex parameter. One can show directly that $P' = \alpha\alpha^* P$, $Q' = Q/(\alpha\alpha^*)$, $(PQ)' = (\alpha/\alpha^*)(PQ)$, and $(QP)' = (\alpha^*/\alpha)(QP)$. These are the four principal axes of the general Lorentz transformation.

The familiar Lorentz transformation for motion along the x-axis without rotation is $x' = \gamma(x - \beta ct)$, $y' = y$, $z' = z$, $ct' = \gamma(ct - \beta x)$, where $\beta = v/c$ and $\gamma = (1 - \beta^2)^{-1/2}$. For this transformation, $\cos\theta = -1$, $p = 1/2$, $\alpha = [(1 - \beta)/(1 + \beta)]^{1/4}$, $P = (1 + i\mathbf{i})/2$, $Q = (1 - i\mathbf{i})/2$, where $i$ is the imaginary unit, and $\mathbf{i}$ is the quaternion unit. The Lorentz group is quite complicated, so we shall go no further with it, having shown that quaternions do have an application in modern physics, as well as in classical physics.
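A numerical check of a boost along the x-axis can be made with complex quaternion arithmetic. The sketch below rests on assumptions that are not taken from Lanczos or from the formulas above: the boost quaternion is written directly as $A = \cosh(\eta/2) + i\,\mathbf{i}\sinh(\eta/2)$ with rapidity $\eta$ defined by $\tanh\eta = \beta$, a form chosen here because it reproduces the transformation quoted above. The helper names are invented, quaternions are tuples $(a,b,c,d)$ with the real part last, and the sample event and $\beta$ are arbitrary.

```python
import math

def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = ia + jb + kc + d."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        d1 * a2 + a1 * d2 + b1 * c2 - c1 * b2,
        d1 * b2 + b1 * d2 + c1 * a2 - a1 * c2,
        d1 * c2 + c1 * d2 + a1 * b2 - b1 * a2,
        d1 * d2 - a1 * a2 - b1 * b2 - c1 * c2,
    )

def qconj(q):
    a, b, c, d = q
    return (-a, -b, -c, d)

def qstar(q):
    """Complex conjugate of each component."""
    return tuple(complex(v).conjugate() for v in q)

def boost_x(beta):
    # Assumed form of the boost quaternion (see the text above this block).
    eta = math.atanh(beta)   # rapidity
    return (1j * math.sinh(eta / 2), 0.0, 0.0, math.cosh(eta / 2))

def lorentz(A, R):
    """R' = A R B with B the quaternion conjugate of A*."""
    return qmul(qmul(A, R), qconj(qstar(A)))

def interval(R):
    """x^2 + y^2 + z^2 - (ct)^2, the real component of R R~."""
    return qmul(R, qconj(R))[3]

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
x, y, z, ct = 1.0, 2.0, 3.0, 4.0

A = boost_x(beta)
R = (x, y, z, 1j * ct)                 # four-vector with imaginary time
Rp = lorentz(A, R)

xp, yp, zp = Rp[0].real, Rp[1].real, Rp[2].real
ctp = Rp[3].imag                       # the real component holds i c t'
```

With these choices the transformed coordinates match $\gamma(x - \beta ct)$ and $\gamma(ct - \beta x)$, the interval is unchanged, and $\tilde{A}A$ comes out as 1, as the conditions on a complex $A$ require.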

I have made a program that does complex quaternion arithmetic, which allows numerical calculations of rotations and relativistic four-vectors with very little effort. Keeping track of the eight parameters of a quaternion is very tedious and confusing, so a program like this one is an excellent aid, and one that computers make available to us.

References

C. Lanczos, The Variational Principles of Mechanics (Toronto: University of Toronto Press, 1970), pp. 303-314, uses quaternions to represent the Lorentz transformation.

Alessandro Rosa's page on quaternion transcendental functions, with pretty pictures.



Composed by J. B. Calvert
Created 26 July 2000
Last revised 29 September 2001