Groups for Dummies

Why Study the Mathematical Theory of Groups?

The title of this page recalls a familiar series of instructional books on computer topics that require no previous knowledge of the subject, giving explanations that would be considered too elementary for the notice of the specialist. Of course, they are not for "dummies" but for those intelligent enough to want to learn, and who can learn relying on their own resources. The purpose of this page is to explain the wonderful algebraic properties of matrix representations of finite groups, which are not only delightful, but also useful in many connections. This theory extends naturally to infinite and continuous groups, so it may be considered fundamental and general.

One reason to study groups is the mathematical beauty of the subject, but an equally valid reason is for its practical application, especially in quantum mechanics and related subjects. The source of this application is that groups express symmetry. In the macroscopic world, we observe coarse bodies and represent them by idealized points and curves that approximate reality. In quantum mechanics, on the other hand, symmetries are exact, not approximate, and the idealization is often the reality. Any two electrons are precisely alike, not merely close resemblances as would be two peas in a pod. The ammonia molecule, a nitrogen atom with three hydrogen atoms forming a kind of triangular tent, is precisely symmetrical, not merely closely symmetrical, as would be a molecular model. An isolated atom is precisely spherically symmetric. Each of these examples has macroscopic consequences that are proof of the exact symmetry, and the symmetry is an indispensable tool in the study of them.

The portion of the theory of groups that is presented here is the part that is of use to molecular spectroscopists, usually chemists, in the classification and analysis of the complex infrared and optical spectra of molecules. There are many references addressed to this audience, and the subject is regarded as a difficult one. The fact that it is studied by these unwilling victims is proof of its utility. Physicists need the theory in studying atomic spectroscopy. With their characteristic ingenuity, they developed elaborate methods and rules that avoided the explicit use of group theory. Group theory also arose in connection with elementary particles, and here there was no substitute. In addition, it was arcane and esoteric enough to impress the unenlightened, and many physicists have learned to speak group theory without understanding it in the least.

Group theory, however, is really not very difficult although it uses unfamiliar mathematics such as linear algebra, determinants and matrices. Enough of these supporting topics will be presented to make the general argument clear, it is hoped, but the treatment will not be exhaustive or complete.

What is a Group?

A mathematical group is quite a different thing from a group in everyday language. An everyday group is a set with some distinguishing characteristic that unites its members. A mathematical group is a set of transformations, not objects in the usual sense. A transformation acts on some object and alters it to a different object. For example, it may move an object from one position to another. What it does is quite general and arbitrary; it is only necessary that there is an initial state and a final state with some fixed and definite relation to one another. All the transformations in a group are supposed to act on the same objects, and to be capable of being applied successively. Later, we shall concentrate on one specific kind of transformation that will be quite concrete.

To achieve the status of a group, a set of transformations must satisfy the four group postulates. Three of the four are almost trivial, two of them merely say what cannot be left out of the set. First, the option of doing nothing, the Identity (I) or Einheit (E) transformation, must be included in the set. Second, there must be some way of reversing each transformation, so that if any transformation is included, so is its reverse or inverse. The third easy postulate merely states a general property of abstract transformations. If we denote transformations by letters, and let AB represent the result of doing B first and then A, and call it some transformation D, and further that BC, doing C first then B, is transformation E, then DC = AE. In algebraic notation, (AB)C = A(BC). This is called the associative law, and will always hold naturally unless we define our transformations in some perverse, arbitrary way that might delight a mathematician.

The fourth postulate is what contains all the magic. It states that the set is closed. That is, any transformation we may get by using those in the group must also belong to the group. This postulate can be desperately hard to satisfy with arbitrary choices of transformations. All of the striking mathematical results that we shall find flow from this postulate. A finite group is a set with a finite number g of members. The integer g is called the order of the group.

Everything is known about a group if we know what any two transformations, or group elements, that it contains give when carried out one after the other. This operation is commonly called, and represented algebraically as, multiplication, but it has nothing whatever to do with multiplication in the usual sense, and it is good to thoroughly expunge any such relation from your mind. We have already used this notation above, where AB represents the result of doing B first and then A. Of course, AB could mean doing A first and then B, and you might think this is more normal. Our peculiar choice is a result of considering the group members as operators doing something to what follows them, as in A|>, where |> (whatever it might be) is transformed by A. The order of operators is, as we shall see, important. AB may be different from BA (we will give examples!). When AB = BA, the operators A and B are said to commute.

All possible "products" of group members may be displayed in a square array for finite groups, called the multiplication table by analogy. Let the product AB be found at the intersection of the row beginning with A and the column beginning with B, where the rows and columns are labeled by the group members. It is usual to put the identity, I or E, at the upper left-hand corner. Such tables are severely restricted by the group postulates. One requirement is that each row and column shall be a permutation of the group elements, with none repeated or omitted. Multiplication tables are useful only for groups of small order, but are very graphic representations in these cases. Two groups of the same order that have the same multiplication table (for some correlation between the elements of the two groups) are abstractly the same, and are called isomorphic ("equal form"). As far as group theory is concerned, groups are the same if they are isomorphic. They may be very different in their explicit realizations.
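As a small illustration of the postulates and of the multiplication table, here is a minimal sketch in Python. It assumes, purely for the sake of the example, that the group members are the six rearrangements of three objects, each stored as a tuple; the helper names compose, is_group and is_latin_square are made up here and are not part of any library.

```python
from itertools import permutations

def compose(a, b):
    """Product ab: do b first, then a (the convention used in the text)."""
    return tuple(a[b[i]] for i in range(len(b)))

# Assumed example: all rearrangements of three objects, identity first.
elements = sorted(permutations(range(3)))
identity = elements[0]          # (0, 1, 2) leaves everything where it is

# table[a][b] is the entry in the row beginning with a and the column beginning with b.
table = {a: {b: compose(a, b) for b in elements} for a in elements}

def is_group(elems, tab):
    """Check the four postulates directly on a finite multiplication table."""
    if not all(tab[a][b] in elems for a in elems for b in elems):
        return False            # not closed
    has_identity = all(tab[identity][a] == a == tab[a][identity] for a in elems)
    has_inverses = all(any(tab[a][b] == identity for b in elems) for a in elems)
    associative = all(tab[tab[a][b]][c] == tab[a][tab[b][c]]
                      for a in elems for b in elems for c in elems)
    return has_identity and has_inverses and associative

def is_latin_square(elems, tab):
    """Every row and every column must be a permutation of the elements."""
    rows = all(sorted(tab[a][b] for b in elems) == elems for a in elems)
    cols = all(sorted(tab[a][b] for a in elems) == elems for b in elems)
    return rows and cols

for a in elements:
    print(a, "|", [table[a][b] for b in elements])
print("group:", is_group(elements, table),
      " rows and columns are permutations:", is_latin_square(elements, table))
```

The table is stored as a dictionary of dictionaries so that table[a][b] reads just like "row a, column b" in the text.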

Only one group exists of order 1 and of each prime order, such as 2, 3, 5 and 7. There are two groups each of orders 4 and 6, and five groups of order 8. The number of groups of any given order is severely restricted, but there is no known general formula that gives the number of groups of an arbitrary order.

Often one can select a set of members of a group that is itself closed. Such a set is called a subgroup. Of course, we have the identity alone, I, and the whole group G, but it is more interesting if the order h of the subgroup satisfies 1 < h < g. Then we speak of a proper subgroup. The order h of a subgroup must be a factor of the order g of the group. This is proved by noting that the sets of elements HS and HR, where H runs over all members of the subgroup and S and R are any two fixed group elements, either coincide entirely or have no element in common. We can therefore lay out the whole group in rows of h distinct elements each, until finally nh = g.
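The counting argument can be tried out by brute force. The sketch below again assumes the six rearrangements of three objects as the group (an illustrative choice, not the only one), finds every closed subset, and lays the group out in rows HS for each subgroup. The function names are hypothetical.

```python
from itertools import combinations, permutations

def compose(a, b):
    """Product ab: do b first, then a."""
    return tuple(a[b[i]] for i in range(len(b)))

group = sorted(permutations(range(3)))
identity = tuple(range(3))

def is_subgroup(subset):
    """A finite subset that contains the identity and is closed is a subgroup."""
    s = set(subset)
    return identity in s and all(compose(a, b) in s for a in s for b in s)

# Try every nonempty subset of the six elements (only 63 of them).
subgroups = [frozenset(c)
             for r in range(1, len(group) + 1)
             for c in combinations(group, r)
             if is_subgroup(c)]
orders = sorted(len(h) for h in subgroups)

def rows(h):
    """The distinct rows {HS : H in h}, one candidate row for each S in the group."""
    return {frozenset(compose(x, s) for x in h) for s in group}

for h in sorted(subgroups, key=len):
    n = len(rows(h))
    print(f"h = {len(h)}: n = {n} rows, nh = {n * len(h)}")
```

Because each row is stored as a frozenset, rows that coincide collapse automatically, which is exactly the "all the same or all different" property used in the proof.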

Similarity Transformation

We must now get physical, since trying to express ourselves abstractly will only confuse. Nevertheless, all that we say can be generalized to a more abstract level. We shall suppose that the transformations are those of points in 3-dimensional Euclidean space. We choose orthogonal directions and specify a point or vector by the coordinates (x,y,z). The transformations are of the form (x,y,z) -> (x',y',z') where x' = a_11 x + a_12 y + a_13 z, y' = a_21 x + a_22 y + a_23 z, z' = a_31 x + a_32 y + a_33 z. We have already used subscripts 1,2,3 on the matrix elements a_ij, and may soon write x = x_1, y = x_2 and z = x_3, or x_i as short for any of the three. It is much easier to write formulas with indices, but if you do not immediately understand what is written, you should write it out explicitly with 1,2,3 or even x,y,z.

This form of transformation is called a linear substitution, and is represented by a matrix A = (a_ij). Here, i and j run from 1 to 3, but in general can have any finite range. In the simplest case, we have just x' = ax, and a is just a number. If a matrix corresponds to each transformation, and each transformation corresponds to a group member, we have a matrix representation of the group, or simply a representation, for short. In general, there are very many ways of setting up a matrix representation by deciding on a basis (here, vectors of three dimensions) and operating on it by the group operations. You may think that restricting representations to such matrices is a severe restriction, but it is not. Linear substitutions are a very general kind of transformation (including all rotations and reflections in space, for example), and coordinates are a useful way of expressing many relations. We shall represent the matrix corresponding to the group member A in a certain representation by D(A).

We know that it does not matter how we orient our 3-dimensional coordinate system in space. One system is as good as another, since space is isotropic. If v is a vector in one system, then v' = Sv can be considered as the same vector in a different system, rotated by the linear substitution S. In this case, S will be an orthogonal matrix (inverse equal to transpose), but the transformation we are considering does not have to be so special. The only thing we require of S is that it have an inverse, S⁻¹, so that SS⁻¹ = S⁻¹S = I (i.e., we can go both ways between the two systems at will). Now, let v' = D(A)v give the effect of group member A on an arbitrary vector. In the alternative system, Sv' = SD(A)S⁻¹Sv. Therefore, SD(A)S⁻¹ does the same thing in the alternative system that D(A) does in the original system. This kind of matrix transformation by a nonsingular matrix (one that has an inverse) is called a similarity transformation.

The representation by the g matrices SD(A)S⁻¹ is said to be equivalent to the representation by the g matrices D(A). Which representation we get depends on which basis we use, and we can choose a basis arbitrarily. The two equivalent representations may look nothing alike, having in common only their number of dimensions. It would be very nice to know when two representations are equivalent, and we will see that there is a way to find this out.

A Concrete Example: C3v

It's now time to look at a specific example, and a good example is the symmetry group C3v. The ammonia molecule has this spatial symmetry. A symmetry group is a group of spatial transformations that leaves an object unchanged. C3v is a point group, which is a group in which one point remains fixed under all symmetry operations. Imagine that you are looking at an ammonia molecule, and then look away while an associate moves the molecule (it is a remarkable associate who can reflect the molecule in a plane as well as rotate it). If you look back and the molecule appears as if it had not been moved, then the associate has made a symmetry transformation. These transformations form a group, because they satisfy the four group postulates. Think about this: the associate could have left the molecule unmoved (E); however it was moved, it could be moved back (A⁻¹), and so forth.

As symmetry operations or transformations, we allow rotations, reflection in a plane, and any combination of these. Combinations include inversion in the origin, rotation about an axis plus a reflection in a normal plane (the rotation and reflection do not necessarily have to be symmetry operations themselves), and so forth.

For a representation of the symmetry group, we might choose as a basis the rectangular coordinates x, y and z in space. If all the rotations are about a single axis (as with ammonia), we may consider rotations about an axis normal to the x,y-plane through the origin, and reflections through planes containing the axis (called vertical planes). We will usually need the matrices corresponding to these operations, which are derived in the figure at the right.

There are two ways we can consider any transformation, called active and passive. In the first picture, the axes remain fixed, while the object moves. This is shown at the left, where vector v becomes vector v' when rotated about the origin through an angle θ. In the second picture, the object remains fixed, while the coordinate axes rotate through an angle θ, as shown at the right. Note that the sense of the rotation is opposite in the two cases, but the effect on the coordinates describing the object is the same. Here we shall generally use the passive picture, and think of the effects of choosing a different basis to describe the same object. There is no essential difference, but it is necessary to be consistent. The figure also reviews how a linear substitution is expressed by matrices, and how matrices multiply. The vector is written as a 2x1 matrix, while the transformation is a 2x2 square matrix.

The determinant of the rotation matrix D(θ) is +1, which is characteristic of all rotation matrices, since they do not change the volume. The trace of the rotation matrix, the sum of its diagonal elements, is equal to 2 cos θ, or exp(iθ) + exp(-iθ). The determinant and trace of a matrix are invariant under a similarity transformation, and this turns out to be a very useful fact. Actually, an n x n matrix has n eigenvalues, and they are invariant under a similarity transformation. There is more discussion of eigenvalues and what they mean later. The determinant is the product of all n eigenvalues, and the trace is their sum. We also recall here that the determinant of the product of two matrices is equal to the product of the determinants. All this linear algebra should not put you off--it isn't group theory, but simply necessary to discuss matrix representations.
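The invariance of the trace and determinant is easy to check by hand, or with a few lines of Python. The sketch below takes the rotation matrix for θ = 90° (so that its entries are exact) and an arbitrarily chosen nonsingular S, both of which are assumptions made only for the example, and uses exact fractions to avoid rounding.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def trace(m):
    return m[0][0] + m[1][1]

def inverse(m):
    d = det(m)                       # nonzero because S is nonsingular
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

D = [[F(0), F(-1)], [F(1), F(0)]]    # rotation through 90 degrees: cos = 0, sin = 1
S = [[F(2), F(1)], [F(1), F(3)]]     # arbitrary nonsingular matrix, not orthogonal

D_new = matmul(matmul(S, D), inverse(S))   # the similarity transform S D S^-1

print("D     :", D, " trace", trace(D), " det", det(D))
print("S D S⁻¹:", D_new, " trace", trace(D_new), " det", det(D_new))
```

The transformed matrix no longer looks like a rotation, yet its trace is still 2 cos 90° = 0 and its determinant is still +1.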

The diagram at the left illustrates the symmetry group C3v. The notation C3v is called the Schönflies designation and is used in molecular symmetry. The notation 3m is the Hermann-Mauguin notation used in crystallography. Each expresses the existence of a 3-fold rotation axis and a vertical plane of symmetry. If there is one plane, there then must be three, because of the 3-fold axis. We have made a specific choice of the relation of the symmetry elements to the coordinates, and could just as well have made other assumptions and labelled differently.

The diagram shows how the symmetry elements affect an arbitrary point (x,y) labelled 1 in the figure. The symmetry elements take it into the other points 2 to 6 that are equivalent to it by symmetry, and to no others, showing the closure of the group. The effects of the symmetry operations on the coordinates of the arbitrary point are expressed by the six 2 x 2 matrices shown, which are a representation of C3v. Find the determinants and traces of each of the six matrices. Note that the reflections have determinant -1, instead of the +1 of the rotations.

The multiplication or group table is shown at the right. We agree that the first operation performed is in the top row, and the second in the leftmost column. When we write the product AB, B is the first operation performed, and A is the second. We have mentioned above the reason for this odd convention. Verify that the table is correct by explicitly working out the effect of the operations. Then prove it as well by multiplying the matrices. I could easily have made a mistake in the table! Note that each row and column is a permutation of the six group elements. Find the pairs of elements that do not commute, that is, for which AB is not equal to BA. This group of order 6 is the smallest group with noncommutative transformations, a nonabelian group.
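The exercise can be checked numerically. In the sketch below the six matrices are built from the rotation formula, with the three mirror lines assumed to lie at 0°, 60° and 120° to the x-axis; this placement, and the labels s1, s2, s3, are choices made here for illustration and may differ from the labelling in the figure.

```python
import math

def rot(deg):
    """Rotation about the origin through deg degrees."""
    t = math.radians(deg)
    return ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))

def refl(deg):
    """Reflection in the line through the origin at deg degrees to the x-axis."""
    t = math.radians(2 * deg)
    return ((math.cos(t), math.sin(t)), (math.sin(t), -math.cos(t)))

def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

ops = {"E": rot(0), "C3": rot(120), "C3^2": rot(240),
       "s1": refl(0), "s2": refl(60), "s3": refl(120)}

def name_of(m):
    """Identify a product with one of the six matrices (closure guarantees a match)."""
    for name, d in ops.items():
        if close(m, d):
            return name
    raise ValueError("product is not one of the six matrices")

det = {n: d[0][0] * d[1][1] - d[0][1] * d[1][0] for n, d in ops.items()}
trace = {n: d[0][0] + d[1][1] for n, d in ops.items()}

# table[a][b] is the product ab: b is performed first, then a.
table = {a: {b: name_of(matmul(ops[a], ops[b])) for b in ops} for a in ops}
non_commuting = [(a, b) for a in ops for b in ops if table[a][b] != table[b][a]]

for n in ops:
    print(f"{n:5s} det {det[n]:+.0f}  trace {trace[n]:+.0f}  row: {list(table[n].values())}")
print("ordered pairs that do not commute:", len(non_commuting))
```

Floating-point entries such as cos 120° are only approximate, which is why products are matched to group elements with a small tolerance rather than by exact equality.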

The two-dimensional basis is the smallest dimensionality that could express the noncommutativity faithfully. In fact, the representation has a different matrix for each group element. Such representations are called faithful. A representation certainly does not have to be faithful. Consider the one-dimensional representation of C3v where the unit matrix [1] corresponds to each element. This is indeed a representation, since AB = C implies that D(A)D(B) = D(C) in a trivial way: 1 x 1 = 1. All one-dimensional representations must be commutative (ordinary multiplication commutes), but they can certainly represent a noncommutative group. Can you think of another one-dimensional representation of C3v? How about one that assigns [1] to E and the rotations, and [-1] to the reflections? Let's call our one-dimensional representations A and B respectively, and the two-dimensional faithful representation E (not to be confused with the identity E). These three representations will turn out to be special.

Let us now create a three-dimensional representation by including the z coordinate. In all of the six symmetry operations, z is unchanged, so the corresponding 3 x 3 matrices have a 1 on the diagonal corresponding to z, and the usual six matrices of the E representation completing a 2 x 2 diagonal block. It is easy to see that z by itself is a basis for the A representation, so what we have here may be termed A + E. This representation is manifestly reducible, meaning that the matrices of smaller representations are combined along the diagonal and the bases of the separate smaller representations are not mixed in any symmetry operation. On the other hand, the representations A and E are obviously irreducible, A because it is one-dimensional, and E because there is no way it could be expressed as 2 x 2 diagonal matrices by any choice of basis. For one thing, diagonal matrices always commute, but the representation is faithful, and must contain noncommuting matrices.

A similar group is called D3, or 32. Instead of the three reflections, there are three 2-fold axes in the same positions. As far as the xy-plane goes, these 2-fold axes have the same effect as the reflections. However, they reverse the direction of the z-axis, putting a -1 in the representation matrix instead of the +1 for C3v. Work out the group table for D3, and show that it is the same as that of C3v with suitable relabelling. The two groups are isomorphic, and so their group properties will be identical. Since there is only one nonabelian group of order 6, and both C3v and D3 are of order 6, they must be isomorphic. Now the reducible representation produced by the basis x,y,z is B + E.

Consider 3 objects in a row, labelled 1, 2 and 3. They may be rearranged by exchanging any pair, and this may be done repeatedly. However, there are only six arrangements, which may be called 123, 213, 321, 132, 231 and 312. We have a finite group of transformations of order 6, and this group must be isomorphic to C3v and D3. This group is called the symmetric group on three elements when realized in this way, denoted P3. We can apply it to the x,y,z coordinates, where it will arrange them in different orders. The corresponding 3 x 3 matrices are called permutation matrices, and contain only 0's and 1's, with exactly one 1 in each row and each column. A 1 on the diagonal corresponds to an object that is not moved. The six 3 x 3 permutation matrices are shown at the right. Find their determinants and traces, and verify that they obey the group table when suitably labelled.
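For readers who want to check their answers, the following sketch builds the six permutation matrices and computes their traces and determinants. The objects are numbered 0, 1, 2 instead of 1, 2, 3 to suit Python, and the convention that column j holds the 1 in row p[j] is one possible choice, assumed here.

```python
from itertools import permutations

def compose(a, b):
    """Product ab: do b first, then a."""
    return tuple(a[b[i]] for i in range(len(b)))

def perm_matrix(p):
    """Matrix that sends basis vector j to basis vector p[j]."""
    n = len(p)
    return [[1 if i == p[j] else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

group = sorted(permutations(range(3)))
rep = {p: perm_matrix(p) for p in group}

# The matrices obey the group table if D(A)D(B) = D(AB) for every pair.
is_representation = all(matmul(rep[a], rep[b]) == rep[compose(a, b)]
                        for a in group for b in group)

for p in group:
    print(p, " trace", trace(rep[p]), " det", det3(rep[p]))
print("obeys the group table:", is_representation)
```

The trace simply counts the objects left in place, and the determinant is -1 for a single exchange and +1 for the identity and the cyclic rearrangements.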

The traces of these matrices are exactly the same as the traces of the six matrices of the reducible representation A + E of the same abstract group. We know that a similarity transformation does not change the trace. Is it possible that by choosing some new basis we can change the permutation matrices into the six matrices we found earlier with the same traces? It happens that this is actually the case, and representations with the same traces are equivalent in this sense. Even more exciting, it is also possible to change an arbitrary representation into a representation that is the sum of irreducible representations only, like our A + E or B + E.

As a basis for the representation of our permutation group, consider the sum of all the permutations, 123+213+321+132+231+312. This sum remains the same for any operation of the group, so it is the basis of an A representation. We may be considering a function depending on the coordinates of three identical (boson) particles, f(1,2,3). In this case, the symmetrized combination is the only state that occurs in nature. The six permutations, taken individually, are a basis for a six-dimensional representation of the group. The 6 x 6 matrices of this representation are permutation matrices, since every group member takes each one of them into another, distinct, one (except, of course for the identity). This is called the regular matrix representation of the group. It contains each of the irreducible representations of the group a number of times equal to the dimension of the irreducible representation. For C3v and its isomorphs, this is A + B + E + E, 6 dimensions in all.
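The statement about the regular representation can be checked with traces alone. The sketch below relies on a formula that has not been derived at this point and is quoted here as an assumption: the number of times an irreducible representation occurs is (1/g) times the group sum of the product of its traces with the traces of the given representation (a standard consequence of the orthogonality relations when the traces are real, as they are here). The traces used for A, B and E are those found earlier, expressed for the permutation realization.

```python
from fractions import Fraction
from itertools import permutations

def compose(a, b):
    """Product ab: do b first, then a."""
    return tuple(a[b[i]] for i in range(len(b)))

group = sorted(permutations(range(3)))
g = len(group)

def fixed_points(p):
    """Trace of the 3 x 3 permutation matrix: objects left in place."""
    return sum(1 for i, x in enumerate(p) if i == x)

def sign(p):
    """+1 for even permutations, -1 for odd ones (counted by inversions)."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return -1 if inversions % 2 else 1

# Traces of the three irreducible representations.
irreps = {
    "A": lambda p: 1,                    # every element -> [1]
    "B": sign,                           # exchanges (reflections) -> [-1]
    "E": lambda p: fixed_points(p) - 1,  # since the 3 x 3 matrices are A + E
}

def multiplicities(chi):
    """Assumed formula: n_i = (1/g) * sum over the group of chi_i(R) * chi(R)."""
    return {name: Fraction(sum(chi_i(p) * chi(p) for p in group), g)
            for name, chi_i in irreps.items()}

def regular_trace(a):
    """Trace of the 6 x 6 regular matrix of a: how many b satisfy ab = b."""
    return sum(1 for b in group if compose(a, b) == b)

print("3-dimensional permutation representation:", multiplicities(fixed_points))
print("6-dimensional regular representation:    ", multiplicities(regular_trace))
```

Only the identity has a nonzero trace in the regular representation, which is why each multiplicity comes out equal to the dimension of the irreducible representation.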

What we are looking for is choices of bases for representations that reduce them to the simplest level, and these are the bases of the irreducible representations. The irreducible representations are the property of the abstract group, not of any realization of the group, so the same mathematics applies in many different situations. We are going to find some remarkable properties of irreducible representations. In many cases, it is only necessary to find which irreducible representations we are dealing with, and not to explicitly find the bases or representations.

Yet another realization of the group is the set of substitutions w = z, w = 1/z, w = -1/(z + 1), w = -(z + 1)/z, w = -(z + 1), w = -z/(z + 1). Group multiplication is successive performance. The variables w and z can be complex variables. The substitution w = z is the identity. Work out the isomorphism between this group and C3v, showing which elements correspond. There is nothing more valuable than a variety of concrete examples when you are trying to understand groups and their representations.
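A quick way to work with these substitutions, sketched below, is to store w = (az + b)/(cz + d) as the integer matrix ((a, b), (c, d)). It is a standard fact, used here without proof, that successive performance of two such substitutions corresponds to multiplying their matrices, and that a matrix and its negative give the same substitution. The dictionary keys and helper names are illustrative only.

```python
# Each substitution w = (a z + b) / (c z + d) is stored as ((a, b), (c, d)).
subs = {
    "z":        ((1, 0), (0, 1)),
    "1/z":      ((0, 1), (1, 0)),
    "-1/(z+1)": ((0, -1), (1, 1)),
    "-(z+1)/z": ((-1, -1), (1, 0)),
    "-(z+1)":   ((-1, -1), (0, 1)),
    "-z/(z+1)": ((-1, 0), (1, 1)),
}

def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def normalize(m):
    """m and -m are the same substitution; keep the one whose first nonzero entry is positive."""
    first = next(x for row in m for x in row if x != 0)
    s = 1 if first > 0 else -1
    return tuple(tuple(s * x for x in row) for row in m)

names = {normalize(m): n for n, m in subs.items()}

def product(a, b):
    """Name of the substitution obtained by doing b first and then a."""
    return names[normalize(matmul(subs[a], subs[b]))]   # KeyError would mean no closure

table = {a: {b: product(a, b) for b in subs} for a in subs}
order = {a: next(k for k in range(1, 7)
                 if normalize(_power := matmul(subs[a], subs[a]) if k == 2 else
                              subs[a] if k == 1 else
                              matmul(subs[a], matmul(subs[a], subs[a]))) == normalize(subs["z"]))
         for a in subs}

for a in subs:
    print(f"{a:9s} order {order[a]}  row: {list(table[a].values())}")
```

The elements of order 3 play the role of the rotations of C3v and the three of order 2 play the role of the reflections, which is the starting point for writing down the isomorphism.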

Another 2-dimensional matrix representation of C3v is shown at the left. Verify that the matrices satisfy the group table of C3v, and work out the isomorphism. These matrices will be useful in what follows. They are, in fact, equivalent to the E representation that was given above. Take a general 2 x 2 matrix S, and form the products DS and SD, then work out the consequences of DS = SD, which means that S commutes with a matrix D of the representation. S commutes with the identity in any case. If D is the fourth matrix shown, commutation requires that S = [a b, b a]. Then, taking D as the second matrix, we find that b = -b, or b = 0, so S = [a 0, 0 a] = aI, a multiple of the identity. Therefore, only a multiple of the identity commutes with all the matrices of this representation. As will be shown below (Schur's Lemma), this guarantees that the representation is irreducible (in this case, that it cannot be reduced to two one-dimensional representations).

As a further example, consider the point group C4v. Start with a 4-fold axis and one vertical reflection plane. By performing successive transformations beginning with just these 5 elements, show that you generate a group of order 8 that includes two separate classes of reflections at 45° from each other. Show that the rotations also fall into two classes, so there are 5 classes in all (you may have to read further to understand this; come back after you have read about classes). The point group D4 is isomorphic to C4v. Find the group table and prove that the group is nonabelian. Find one 2-dimensional representation, as for C3v. There are also four one-dimensional representations made up of +1's and -1's that you should be able to work out. In order to get representations of more than two dimensions, one must go to groups with higher-fold rotation axes in more than one direction, such as the symmetry group of the tetrahedron (T).
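The generation of C4v from a few starting elements can be imitated in a short program. This sketch assumes the 4-fold axis is the z-axis and the starting mirror plane is the xz-plane, so that in the x,y basis the two generators are small integer matrices; the grouping into classes uses the similarity transformation SAS⁻¹ explained in the section on classes.

```python
def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

IDENTITY = ((1, 0), (0, 1))
C4 = ((0, -1), (1, 0))       # rotation through 90 degrees about the z-axis (assumed)
SIGMA = ((1, 0), (0, -1))    # reflection in the xz-plane, y -> -y (assumed)

def generate(generators):
    """Keep multiplying until no new matrices appear (closure)."""
    elems = set(generators)
    while True:
        new = {matmul(a, b) for a in elems for b in elems} - elems
        if not new:
            return elems
        elems |= new

group = generate([C4, SIGMA])

def inverse(a):
    return next(b for b in group if matmul(a, b) == IDENTITY)

def conjugacy_class(a):
    """All elements S A S^-1 as S runs over the group."""
    return frozenset(matmul(matmul(s, a), inverse(s)) for s in group)

classes = {conjugacy_class(a) for a in group}
nonabelian = any(matmul(a, b) != matmul(b, a) for a in group for b in group)

print("order of the group:", len(group))
print("class sizes:", sorted(len(c) for c in classes))
print("nonabelian:", nonabelian)
```

The eight matrices produced are themselves a 2-dimensional representation of the group, so the same run answers two parts of the exercise at once.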

Classes

In C3v, we found that one plane of reflection was carried into the other two by the rotations, and that a rotation of 120° is carried into a rotation of 240° by a reflection, and vice-versa. All of this merely corresponds to the choice of a new basis by acting on the old one with a member of the group, which generates a similarity transformation. If we form SAS⁻¹ where A and S are any elements, the elements of the group fall into distinct sets, called classes. In an abelian group, SAS⁻¹ = A, so every element is on its own. In C3v, however, SσS⁻¹ = σ' and SCS⁻¹ = C', where σ and σ' may be different reflections, and C and C' different rotations. The three reflections fall into one class, as do the two rotations. The identity E is always in its own class, since it commutes with all group elements. In a nonabelian group, the number of classes is less than the order of the group. The elements in a class are all similar in some respect, but arranged differently as required by the group structure.
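Sorting a group into classes is a mechanical job, as the following sketch shows. It uses the six rearrangements of three objects as a stand-in for C3v (the two are the same abstract group); the function names are hypothetical.

```python
from itertools import permutations

def compose(a, b):
    """Product ab: do b first, then a."""
    return tuple(a[b[i]] for i in range(len(b)))

def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)

group = sorted(permutations(range(3)))

def conjugacy_class(a):
    """All elements S A S^-1 as S runs over the whole group."""
    return frozenset(compose(compose(s, a), inverse(s)) for s in group)

classes = {conjugacy_class(a) for a in group}

def fixed_points(p):
    """Number of objects left in place (the trace of the permutation matrix)."""
    return sum(1 for i, x in enumerate(p) if i == x)

for c in sorted(classes, key=len):
    print(len(c), "element(s):", sorted(c),
          " objects left in place:", {fixed_points(p) for p in c})
```

The identity stands alone, the two cyclic rearrangements (the "rotations") form a class, and the three exchanges (the "reflections") form another.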

In the representations we looked at above, the traces of the matrices corresponding to the members of a class were the same. This is certainly not surprising, since the members of a class are connected by similarity transformations, and the trace is invariant under a similarity transformation. The abstract group at the bottom of C3v and D3 and P3 has three classes. In P3, one class is the identity, another the three exchanges of a pair (the odd permutations), and the third the two cyclic permutations (which, together with the identity, are the even permutations). It is no accident that it also has three irreducible representations, since it happens that the number of inequivalent irreducible representations is equal to the number of classes, as we shall prove.

We have already mentioned the subgroup. If a subgroup consists of whole classes, it is called an invariant subgroup. This means that any similarity transformation of the elements of the subgroup gives only the elements of the subgroup. The group can then be divided into n sets of h elements that then act like group elements themselves, and this group of order n is called the factor group with respect to the invariant subgroup.

In C3v, the identity E and the two rotations C and C² form an invariant subgroup. The factor group, of order 2, is isomorphic to the group called C2. C3v is said to be homomorphic onto C2, where 3 elements of C3v correspond to each element of C2, a 3 to 1 mapping. It is easy to see that any representation of the factor group is a non-faithful representation of the group. For C3v, these are the A and B one-dimensional representations. C2 is abelian, and can have only one-dimensional representations.
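The factor group can be made concrete with a short sketch, again using the rearrangements of three objects in place of C3v. The cyclic rearrangement c below plays the part of the 120° rotation; that identification, like the helper names, is an assumption made for illustration.

```python
from itertools import permutations

def compose(a, b):
    """Product ab: do b first, then a."""
    return tuple(a[b[i]] for i in range(len(b)))

def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)

group = sorted(permutations(range(3)))
c = (1, 2, 0)                                  # stands in for the rotation C
H = frozenset({(0, 1, 2), c, compose(c, c)})   # E, C and C squared

# Invariant subgroup: every S x S^-1 with x in H is again in H.
invariant = all(compose(compose(s, x), inverse(s)) in H for s in group for x in H)

# The n = g/h sets of h elements that act as the elements of the factor group.
cosets = {frozenset(compose(s, x) for x in H) for s in group}
K = next(x for x in cosets if x != H)          # the set of three "reflections"

def coset_product(x, y):
    """Multiply every member of x by every member of y."""
    return frozenset(compose(a, b) for a in x for b in y)

factor_table = {(x, y): coset_product(x, y) for x in cosets for y in cosets}
well_defined = all(p in cosets for p in factor_table.values())

# A representation of the factor group gives the non-faithful representation B.
B = {p: (1 if p in H else -1) for p in group}
B_is_representation = all(B[compose(a, b)] == B[a] * B[b] for a in group for b in group)

print("invariant:", invariant, " sets:", len(cosets), " products are sets again:", well_defined)
print("K times K is H:", factor_table[(K, K)] == H)
```

The two sets multiply exactly like +1 and -1, which is the group table of C2.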

Orthogonality Relations

We now come to the remarkable orthogonality relations between the matrix elements of irreducible representations. These relations are of great practical and theoretical use. We shall proceed in the reverse order to that generally used by mathematicians, as in the reference by Wigner, in an attempt to motivate the discussion. It is the almost invariable procedure to present mathematical reasoning in a logical and progressive sequence that appears wonderful and impressive, starting from the fundamentals and ending with the desired result. However, this is not a route that one would follow to attempt to prove the result. One gets the answer by any means possible, then constructs the impressive logic that is a straight and level road to the answer.

The Greeks invented the method they called analysis, which began with the answer and proceeded to the question, and were very proud of it, since it was very powerful in discovering new knowledge. It is as if there were a temple in the woods, at an unknown location, and one had the problem of discovering a path to it from outside the woods. If you start at the temple, and work your way out, then it is easy to find a direct route to it. If you start at the outside, you hardly know in which direction to proceed, and can only encounter the temple with luck after a random search among the trees.

The orthogonality relation is shown at the right. Written out with indices, it reads Σ D(m)(R)*_ij D(n)(R)_kl = (g/h) δ_mn δ_ik δ_jl. The D's are the matrices of two irreducible representations m and n, which may be the same. The sum is a group sum, over all elements R of the group of order g. The indices of the matrix elements are i,j,k and l, which run from 1 to the dimension h of the representation. The right-hand side of the relation is zero unless we are considering the same matrix element (i = k and j = l) of the same irreducible representation (m = n).

Note that the first D is complex conjugated. The reason for this is that the matrices of a representation are always chosen to be unitary for simplicity. It must be proved that this is possible. In a unitary representation, D(A⁻¹) = D(A)†, the hermitian conjugate, which is the transposed complex conjugate. Matrices corresponding to inverse elements appear in the proof of the orthogonality relations, and these are converted to complex conjugates in the usual expression of the relations, which is certainly more convenient. A unitary matrix is the generalization of the orthogonal matrix, where the inverse equals the transpose, to complex bases. A transformation by a unitary matrix preserves the scalar product of two vectors. In quantum mechanics, it preserves "matrix elements" and the normalization of states.

Verify that the orthogonality relations hold for the representations we have found for C3v, D3 and P3. If one of the irreducible representations is A, where the matrices are all [1], what do the orthogonality relations imply for the other irreducible representations? Check that the group sum of each of their matrix elements is zero. What do the orthogonality relations give for the A representation taken with itself? It is quite obvious that the group sum is just the order of the group.
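The verification is tedious by hand but quick by machine. The sketch below checks every combination of irreducible representations and indices for C3v, using the form Σ D(m)(R)_ij D(n)(R)_kl = (g/h) δ_mn δ_ik δ_jl; the complex conjugate can be dropped because all the matrices here are real. The placement of the mirror lines at 0°, 60° and 120° is an assumption made for the example.

```python
import math
from itertools import product

def rot(deg):
    t = math.radians(deg)
    return ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))

def refl(deg):
    """Reflection in the line at deg degrees to the x-axis."""
    t = math.radians(2 * deg)
    return ((math.cos(t), math.sin(t)), (math.sin(t), -math.cos(t)))

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

elements = [rot(0), rot(120), rot(240), refl(0), refl(60), refl(120)]
g = len(elements)

# The three irreducible representations, listed in the same element order.
irreps = {
    "A": [((1.0,),) for _ in elements],            # [1] for every element
    "B": [((round(det(m)),),) for m in elements],  # [1] rotations, [-1] reflections
    "E": elements,                                 # the faithful 2 x 2 matrices
}

def group_sum(m, n, i, j, k, l):
    return sum(a[i][j] * b[k][l] for a, b in zip(irreps[m], irreps[n]))

def expected(m, n, i, j, k, l):
    h = len(irreps[m][0])                          # dimension of representation m
    return g / h if (m == n and i == k and j == l) else 0.0

worst = 0.0
for m, n in product(irreps, repeat=2):
    hm, hn = len(irreps[m][0]), len(irreps[n][0])
    for i, j in product(range(hm), repeat=2):
        for k, l in product(range(hn), repeat=2):
            worst = max(worst, abs(group_sum(m, n, i, j, k, l) - expected(m, n, i, j, k, l)))

print("largest deviation from the orthogonality relations:", worst)
```

Any deviation reported is at the level of floating-point rounding, since entries such as cos 120° are not represented exactly.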

Now to the proof. Consider the matrix M given by the group sum M = Σ D(R)XD'(R⁻¹). Here, we have distinguished a second irreducible representation by the prime, and X is any suitable matrix whatever. By suitable, we mean that if D is of dimension m and D' of dimension n, then X must be m x n so that the matrix multiplication makes sense. If m and n are not the same, then X is rectangular, not square. What we are doing here is setting up a group sum that looks like the orthogonality relations. Now, D(S)M = Σ D(SR)XD'(R⁻¹) = Σ D(SR)XD'(R⁻¹)D'(S⁻¹)D'(S) = Σ D(SR)XD'((SR)⁻¹)D'(S) = MD'(S), where we have used D'(R⁻¹)D'(S⁻¹) = D'((SR)⁻¹). Be sure to understand why this is so. A sum over all SR is the same as a sum over all R; it's just the same elements in perhaps some different order. We are using the power of group closure here, and it is powerful indeed.

We have just found a matrix M such that D(S)M = MD'(S) for any S. The hermitian conjugate of this relation, for unitary representations, is M†D(S⁻¹) = D'(S⁻¹)M†. Now multiply from the left by M to get MM†D(S⁻¹) = MD'(S⁻¹)M† = D(S⁻¹)MM†. This holds for any S, so MM† commutes with every matrix of the irreducible representation D(S). We can (and will) prove that any such matrix must be a multiple of the identity matrix. This result is known as Schur's Lemma, and is the key to the proof.

Now we have MM† = cI, where c is a constant and I the identity matrix. If M is not square (the dimensions of the representations not the same), then make a square matrix N by adding rows or columns of zeros. Since NN† = MM† = cI as well, and the determinant of N vanishes, the determinant of the left-hand side is zero. This must equal the determinant of the right-hand side, which is, therefore, also zero, and thus c = 0. If the two representations have different dimensions, then M = 0.

If the dimensions are the same, then M is square, and can have a nonvanishing determinant. In that case c is not equal to zero, and M has an inverse. We then have $MD'(S)M^{-1} = D(S)$, and the two representations are equivalent. If c is equal to zero, then $MM^\dagger = 0$, and so M = 0. To see that this is true, simply write out a diagonal element of the product in terms of matrix components to find $\sum_k |M_{ik}|^2 = 0$ for each i, where the sum is over k.

We now know that our matrix M that involved the arbitrary matrix X is equal to zero unless the irreducible representations are equivalent. If we now take each component of X in turn as 1, and the others zero, we get precisely the orthogonality relations for two inequivalent representations: every group sum $\sum_R D(R)_{ij}D'(R^{-1})_{kl}$ vanishes. We must still consider the case c ≠ 0, which can occur when the representations are of the same dimension. This will show explicitly how we get from X to the orthogonality relations.

In this case, $M = \sum_R D(R)\,X\,D(R^{-1}) = cI$, where c may, of course, depend on what is chosen for X. Let us choose any one element of X, say $X_{jk} = 1$, and all the rest zero. There are $h^2$ of these X's, where h is the dimension of the representation D. Then, writing out the matrix product with indices, we find $M_{il} = \sum_R D(R)_{ij}D(R^{-1})_{kl} = c_{jk}\delta_{il}$. Now set l = i and sum over i from 1 to h. The result is $\sum_R \sum_i D(R^{-1})_{ki}D(R)_{ij} = \sum_R \delta_{jk} = g\,\delta_{jk} = c_{jk}h$, where g is the order of the group, or $c_{jk} = (g/h)\delta_{jk}$. Now that we know what c is, we can write $\sum_R D(R)_{ij}D(R^{-1})_{kl} = (g/h)\,\delta_{jk}\delta_{il}$, the desired orthogonality relation. If the representation is unitary, we can put the hermitian conjugate for the inverse, and find $\sum_R D(R)_{ij}D(R)^*_{lk} = (g/h)\,\delta_{il}\delta_{jk}$, the usual expression.
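The form of the relation with the inverse does not need unitarity, and this can be seen with exact integer arithmetic. The sketch below assumes an integer-valued, non-unitary realization of the two-dimensional irreducible representation of the group of order 6, generated by the hypothetical matrices `C3` and `SIGMA`. These are chosen for convenience and are not the matrices of the text. For each unit matrix X it forms the group sum and should give (g/h) times the identity when j = k, and zero otherwise.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

IDENT = [[1, 0], [0, 1]]
C3 = [[0, -1], [1, -1]]      # assumed generator of order 3
SIGMA = [[0, 1], [1, 0]]     # assumed generator of order 2
C3_SQ = matmul(C3, C3)
GROUP = [IDENT, C3, C3_SQ, SIGMA, matmul(C3, SIGMA), matmul(C3_SQ, SIGMA)]

def inverse_in_group(a):
    """The element b with a b = identity, which exists by the group postulates."""
    return next(b for b in GROUP if matmul(a, b) == IDENT)

def group_sum(x):
    """M = sum over R of D(R) X D(R^-1)."""
    total = [[0, 0], [0, 0]]
    for r in GROUP:
        term = matmul(matmul(r, x), inverse_in_group(r))
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

# X has a single 1 in position (j, k); here g = 6 and h = 2, so g/h = 3.
for j in range(2):
    for k in range(2):
        x = [[1 if (a, b) == (j, k) else 0 for b in range(2)] for a in range(2)]
        print((j, k), group_sum(x))
```

Since the sum is linear in X, a general X gives (g/h) times the trace of X times the identity.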

Schur's Lemma

The proof of the orthogonality relations depended on the fact that any nonzero matrix that commutes with all the matrices of an irreducible representation is a multiple of the identity--that is, a diagonal matrix with all of its elements equal. Of course, such a matrix always commutes with any matrix, so aside from this obvious case, we can say that no nonconstant matrix commutes with all the matrices of an irreducible representation. This fact is called Schur's Lemma.

Suppose we have a matrix M that commutes with all the matrices A of an irreducible representation, or AM = MA for any A. The hermitian adjoint of this expression is $M^\dagger A^\dagger = A^\dagger M^\dagger$. If we multiply on both sides by A, and use $AA^\dagger = A^\dagger A = I$, we find $AM^\dagger = M^\dagger A$. If a matrix commutes with all A, so does its hermitian conjugate, and so, further, do the hermitian matrices $M + M^\dagger$ and $i(M - M^\dagger)$. If these are multiples of the identity, then so is M. It is, therefore, only necessary to prove the lemma for a hermitian matrix.

This makes things rather easy, since any hermitian matrix can be diagonalized by a similarity transformation by some matrix V: $V^{-1}MV = d$, where d is a diagonal matrix. A representation equivalent to the original one is formed by the matrices $V^{-1}AV = B$. Multiplying AM = MA on the left by $V^{-1}$ and on the right by V, we find Bd = dB. That is, the matrices B of the equivalent representation commute with the diagonal matrix d. In components, this means $B_{ij}d_{jj} = d_{ii}B_{ij}$. If two diagonal elements $d_{ii}$ and $d_{jj}$ are different, this means that $B_{ij}$ must be zero. Hence, the matrices B are composed of diagonal blocks, with no components linking the bases corresponding to the different blocks. B, then, is reducible. Since B, by hypothesis, is irreducible, this cannot happen, and all the diagonal elements $d_{ii}$ must be the same. Thus d is a constant matrix, and $M = VdV^{-1} = d$ is the same constant matrix, since constant matrices commute with any matrix. Schur's Lemma is now proved.
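Schur's Lemma can be illustrated by brute force on a small example. The sketch below assumes an integer realization of the two-dimensional irreducible representation of the group of order 6 (the generators `C3` and `SIGMA` are hypothetical choices, not the matrices of the text), and searches all 2 x 2 matrices with entries from -2 to 2 for those that commute with both generators. For contrast, it also checks that the all-ones matrix commutes with the generators of the reducible representation by 3 x 3 permutation matrices, although it is not a multiple of the identity.

```python
from itertools import product

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def commutes_with_all(m, mats):
    return all(matmul(m, a) == matmul(a, m) for a in mats)

# Irreducible case: commuting with the generators is enough,
# since every group element is a product of generators.
C3 = [[0, -1], [1, -1]]
SIGMA = [[0, 1], [1, 0]]
candidates = ([[a, b], [c, d]] for a, b, c, d in product(range(-2, 3), repeat=4))
commutant = [m for m in candidates if commutes_with_all(m, [C3, SIGMA])]
only_scalars = all(m[0][0] == m[1][1] and m[0][1] == 0 and m[1][0] == 0
                   for m in commutant)
print(commutant)
print(only_scalars)

# Reducible case: permutation matrices for a 3-cycle and a transposition.
P_CYCLE = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
P_SWAP = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
ONES = [[1, 1, 1] for _ in range(3)]
ones_commutes = commutes_with_all(ONES, [P_CYCLE, P_SWAP])
print(ones_commutes)
```

A nonconstant commuting matrix such as the all-ones matrix is thus a sign of reducibility, which is how the lemma is often used in practice.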

We speak rather glibly of diagonalization, but do not give an actual example, only assume that it can be done. Diagonalization is a standard computational process in linear algebra. What we are doing is finding the basis or coordinate system in which a matrix takes its simplest form, which is diagonal. The diagonal elements $d_i$ are called the eigenvalues of the matrix A, and the corresponding vectors $v_i$ which satisfy the equation $Av_i = d_iv_i$ are called its eigenvectors. The operator A changes them at most in length, not in direction. If a matrix is hermitian, then the eigenvalues are real and the eigenvectors are mutually orthogonal, and can be chosen orthonormal. An n x n hermitian matrix has n such eigenvectors. The computational procedure is first to find the eigenvalues by solving the equation |A - dI| = 0 for the d's, and then to find the eigenvectors from Av = dv for each of the n d's. We actually find just the directions of the eigenvectors; their lengths can be chosen any way we want. The eigenvectors then make up the columns of the transforming matrix U, where $U^{-1}AU = d$. U is even unitary, if we began with an arbitrary orthonormal basis. For the proofs, see any linear algebra text. It is easy to see why the eigenvalues are invariant under a similarity transformation now. As a physical example, a moment of inertia tensor in mechanics can be diagonalized to find the principal axes (symmetry axes are always principal axes) and the principal moments of inertia. It is much easier to handle these three quantities than nine quantities that are not independent of each other. The restriction to hermitian (or symmetric) matrices is rather important. Trying to diagonalize an arbitrary matrix may be opening a can of worms. In some cases, two hermitian matrices can be simultaneously diagonalized by the same unitary transformation, which is even more fun.
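To make the procedure concrete, here is a small sketch for a real symmetric 2 x 2 matrix, where the equation |A - dI| = 0 is just a quadratic. The example matrix and the helper name `eig_sym_2x2` are illustrative assumptions. The formula used for the eigenvectors requires a nonzero off-diagonal element.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def eig_sym_2x2(a, b, c):
    """Eigenvalues and unit eigenvectors of [[a, b], [b, c]], assuming b != 0."""
    tr, det = a + c, a * c - b * b
    root = math.sqrt(tr * tr - 4 * det)      # |A - dI| = d^2 - tr d + det = 0
    eigenvalues = [(tr - root) / 2, (tr + root) / 2]
    eigenvectors = []
    for d in eigenvalues:
        vx, vy = b, d - a                    # solves the first row of (A - dI)v = 0
        norm = math.hypot(vx, vy)
        eigenvectors.append((vx / norm, vy / norm))
    return eigenvalues, eigenvectors

A = [[2.0, 1.0], [1.0, 2.0]]
values, vectors = eig_sym_2x2(A[0][0], A[0][1], A[1][1])

# The eigenvectors are the columns of U; U is orthogonal, so U^-1 = U^T.
U = [[vectors[0][0], vectors[1][0]],
     [vectors[0][1], vectors[1][1]]]
diagonal = matmul(matmul(transpose(U), A), U)
print(values)
print(diagonal)
```

For this matrix the eigenvalues come out as 1 and 3, with eigenvectors along the two diagonals of the plane, and the transformed matrix is diagonal to within rounding error.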

Unitarity

The hermitian adjoint (transposed complex conjugate) of a unitary matrix is equal to its inverse, as we have mentioned above. This implies that the matrix must have a nonzero determinant, since this is required for the existence of an inverse. All the kinds of transformations we have been considering will be represented by matrices with nonzero determinants, and will have inverses, as required by the group postulates. However perverse our choice of bases, the resulting representations are always equivalent to a representation by unitary matrices, a property we have assumed above whenever convenient. A unitary transformation is a very restricted kind of transformation, and it is not obvious that any transformation we are concerned with can be expressed as unitary. Note that we are not saying that any representation we may come up with is unitary, but simply that it is equivalent to some unitary representation (which may well replace the original one in our further investigations).

Unitary transformations are particularly important in quantum mechanics, since they preserve the normalization of wave functions, and the matrix elements of the hermitian operators that represent physical magnitudes. Orthogonal matrices, the analogues of unitary matrices for real elements, represent rotations in n-dimensional spaces, which do not change the lengths of vectors or the angles between them. A unitary matrix can be thought of as representing a sort of transcendental rotation in complex spaces, which also does not change lengths or angles.

The proof that any representation (by matrices with a nonvanishing determinant) is equivalent to a unitary representation proceeds by explicitly finding the similarity transformation that produces the unitarity. We do this by first diagonalizing the hermitian matrix $H = \sum AA^\dagger$, where the sum is over the group, to find $d = U^{-1}HU$. All the diagonal components of d are real and positive, so we may create a new diagonal matrix where each component is the reciprocal of the square root of the corresponding component of d, and call it $d^{-1/2}$. Then, $I = d^{-1/2}\,d\,d^{-1/2} = d^{-1/2}U^{-1}HU\,d^{-1/2}$ (not a similarity transformation!). The equivalent representation $B = d^{-1/2}U^{-1}AU\,d^{1/2}$ is then unitary, which we can prove by explicit evaluation of $BB^\dagger$, using the expression for I we found above. In the sum over the group elements that defines H, we use group closure. The algebra is given in more detail in Wigner, but you can work it out for yourself. Work with the intermediate representation $U^{-1}AU$ so the U's do not appear.
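The construction can be followed step by step on a small case. The sketch below assumes a real, non-unitary realization of the two-dimensional representation of the group of order 6, generated by the hypothetical matrices `C3` and `SIGMA`. For real matrices the hermitian conjugate is the transpose, and the 2 x 2 symmetric matrix H is diagonalized in closed form. The second factor in `unitarize` is the square root of d, not its reciprocal, as a similarity transformation requires.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def is_unitary(m, tol=1e-12):
    p = matmul(m, transpose(m))
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(2) for j in range(2))

IDENT = [[1.0, 0.0], [0.0, 1.0]]
C3 = [[0.0, -1.0], [1.0, -1.0]]     # assumed generator of order 3
SIGMA = [[0.0, 1.0], [1.0, 0.0]]    # assumed generator of order 2
C3_SQ = matmul(C3, C3)
GROUP = [IDENT, C3, C3_SQ, SIGMA, matmul(C3, SIGMA), matmul(C3_SQ, SIGMA)]

# H = sum over the group of A A^dagger
H = [[0.0, 0.0], [0.0, 0.0]]
for a in GROUP:
    term = matmul(a, transpose(a))
    H = [[H[i][j] + term[i][j] for j in range(2)] for i in range(2)]

# Diagonalize the symmetric matrix H (its off-diagonal element is nonzero here).
p, q, r = H[0][0], H[0][1], H[1][1]
root = math.sqrt((p - r) ** 2 + 4 * q * q)
d = [(p + r - root) / 2, (p + r + root) / 2]
columns = []
for lam in d:
    vx, vy = q, lam - p
    norm = math.hypot(vx, vy)
    columns.append((vx / norm, vy / norm))
U = [[columns[0][0], columns[1][0]], [columns[0][1], columns[1][1]]]

D_INV_SQRT = [[d[0] ** -0.5, 0.0], [0.0, d[1] ** -0.5]]
D_SQRT = [[d[0] ** 0.5, 0.0], [0.0, d[1] ** 0.5]]

def unitarize(a):
    """B = d^(-1/2) U^-1 A U d^(1/2), with U^-1 = U^T since U is orthogonal."""
    inner = matmul(matmul(transpose(U), a), U)
    return matmul(matmul(D_INV_SQRT, inner), D_SQRT)

B = [unitarize(a) for a in GROUP]
print(is_unitary(C3), all(is_unitary(b) for b in B))
```

The original matrices fail the unitarity test, while the transformed ones should pass it and still multiply in the same way, since they differ from the originals only by a similarity transformation.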

We have now worked through the usual proofs, which are found in their normal order in Wigner. One begins with unitarity, then proves Schur's Lemma, and following this the theorem on matrices connecting representations of different dimensions. Finally, the orthogonality relations are proved, which form the useful output of the effort. The results are really remarkable. We see that they follow mainly from group closure by irresistible mathematical logic, and could not have been anticipated at the beginning.

Characters

We are aware that the sum of the diagonal elements of a matrix, the trace, is not changed by a similarity transformation. Hence, the traces of the matrices of a representation are the same for any equivalent representation, and so are a signature of the representation. They are so useful that they receive a special name, the characters of the representation. Since the members of a class are connected by similarity transformations, all the matrices representing the members of a class have the same trace, or character. Thus, character is a class function.
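Both statements are easy to check on a small example. The sketch below assumes an integer realization of the two-dimensional representation of the group of order 6 (the generators `C3` and `SIGMA` and the change of basis `V` are hypothetical choices). It lists the traces element by element and then repeats the calculation after a similarity transformation.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

IDENT = [[1, 0], [0, 1]]
C3 = [[0, -1], [1, -1]]
SIGMA = [[0, 1], [1, 0]]
C3_SQ = matmul(C3, C3)
# Order: identity, the two threefold rotations, the three reflections.
GROUP = [IDENT, C3, C3_SQ, SIGMA, matmul(C3, SIGMA), matmul(C3_SQ, SIGMA)]

traces = [trace(m) for m in GROUP]
print(traces)  # [2, -1, -1, 0, 0, 0]

# An equivalent representation V^-1 A V has the same traces.
V = [[1, 1], [0, 1]]
V_INV = [[1, -1], [0, 1]]
similar = [matmul(matmul(V_INV, m), V) for m in GROUP]
print([trace(m) for m in similar] == traces)
```

The traces fall into three groups, one for each class, although the individual matrices within a class are different.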

Orthogonality relations for characters are easily found from the orthogonality relations for matrix elements by summing over diagonal elements. The result is $\sum_R \chi^{(i)}(R)^*\,\chi^{(j)}(R) = g\,\delta_{ij}$, where i and j label two irreducible representations. Note that the group sum is over elements, so that each element of a class must be included. It can also be written as a sum over classes, if a weighting function w(k) for the kth class equal to the number of group members in the class is included.

The characters may be considered as vectors in a euclidean "class space" with a dimension equal to the number of classes. The orthogonality relations show that these vectors are orthogonal to each other. Since the maximum number of orthogonal vectors in an n-dimensional space is n, the number of inequivalent irreducible representations can be no larger than the number of classes. In fact, it can be shown (by similar arguments) that the number of inequivalent irreducible representations can be no less than the number of classes, so this number is equal to the number of classes.

The group C3v has three classes, and we found three inequivalent irreducible representations--A, B and E. The characters are (1,1,1), (1,-1,1) and (2,0,-1). Each of these class vectors has a length squared (modulus) of 6, the order of the group (remember that the class weights are 1,3,2), and they are mutually orthogonal. The characters of the permutation matrices are (3,1,0). The modulus of this class vector is 12, twice the order of the group, which shows that the representation is reducible. This means that there exists a similarity transformation that will convert all six matrices to block-diagonal, explicitly reduced, form. The characters of these matrices will be the sums of the characters of the irreducible representations that are present, which we believe are A and E.
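The arithmetic of this paragraph can be checked in a few lines. The sketch below uses the characters and class weights quoted above. The function name `class_product` is just a label for the weighted sum over classes, and no complex conjugation is needed because these characters are real.

```python
WEIGHTS = (1, 3, 2)   # number of elements in each class, in the order used above
CHARACTERS = {
    "A": (1, 1, 1),
    "B": (1, -1, 1),
    "E": (2, 0, -1),
}
PERMUTATION = (3, 1, 0)   # characters of the permutation matrices

def class_product(chi1, chi2):
    """Weighted sum over classes, equal to the group sum of chi1(R) chi2(R)."""
    return sum(w * x * y for w, x, y in zip(WEIGHTS, chi1, chi2))

for name1, chi1 in CHARACTERS.items():
    for name2, chi2 in CHARACTERS.items():
        print(name1, name2, class_product(chi1, chi2))

print("permutation modulus:", class_product(PERMUTATION, PERMUTATION))
```

The diagonal entries of the table should all equal 6 and the others 0, while the permutation representation gives 12.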

If the ith representation appears $a_i$ times in the reducible representation, the characters of the reducible representation are $\chi(R) = \sum_i a_i\,\chi^{(i)}(R)$. The constants $a_i$ can be found from the orthogonality of the characters just like the constants in a Fourier series. Just multiply both sides by $\chi^{(j)}(R)^*$ and sum over the group. We find that $a_j = (1/g)\sum_R \chi^{(j)}(R)^*\,\chi(R)$. This process is called character analysis, and allows us to find what irreducible representations are included in any arbitrary representation. Often, this is all we need to know. Carry this out for the permutation matrices to confirm our suspicions. Also, consider the two 2-dimensional matrix representations of C3v that we gave above, and show that they are irreducible and equivalent.
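Character analysis is mechanical enough to write as a short function. The sketch below assumes the C3v characters and class weights (1,3,2) quoted earlier and applies the formula for the multiplicities to the permutation representation with characters (3,1,0). The name `multiplicities` is illustrative, and exact fractions are used so that a non-integer result would show up as a warning sign.

```python
from fractions import Fraction

WEIGHTS = (1, 3, 2)            # class sizes; their sum is the group order g
IRREDUCIBLE = {
    "A": (1, 1, 1),
    "B": (1, -1, 1),
    "E": (2, 0, -1),
}

def multiplicities(chi):
    """a_j = (1/g) * sum over classes of w(k) * chi_j(k) * chi(k), for real characters."""
    g = sum(WEIGHTS)
    return {name: Fraction(sum(w * x * y for w, x, y in zip(WEIGHTS, chi_j, chi)), g)
            for name, chi_j in IRREDUCIBLE.items()}

permutation = multiplicities((3, 1, 0))
print({name: int(a) for name, a in permutation.items()})

# A two-dimensional representation with characters (2, 0, -1) contains E once.
print({name: int(a) for name, a in multiplicities((2, 0, -1)).items()})
```

The permutation representation should come out as A once and E once, with B absent.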

Every time you meet irreducibility, review what it means. Most significantly, it means that there is no way to take linear combinations of the basis to get a simpler representation of the action of the group transformations. In quantum mechanics, the basis may consist of certain states of a system whose symmetry is expressed by the group. If the states belong to an irreducible representation, there must be as many as the dimensionality of the representation, and they must have exactly the same energy. If the symmetry is broken, the energies may now be different, but perhaps not wildly different if the symmetry is only weakly broken. Of course, states may accidentally have the same energy when there is no reason to force equality. Sometimes what appears as accidental may actually reflect a larger symmetry that is not recognized, as in the case of the hydrogen atom, where the exact 1/r radial potential gives a larger symmetry group, O4, in place of the usual rotational symmetry group O3.

All you need for character analysis is some way to find the characters of the representation in which you are interested, and the characters of the irreducible representations of the group. The former can often be found by carrying out each transformation explicitly and noting what remains fixed, since only the diagonal elements of the matrices are needed for the trace. For the point groups, tables of characters are available, and one example is included in the References below. The characters are the property of the abstract group, independently of its realization, but these tables usually repeat the tables for isomorphic groups because the labelling of the elements is different. Isomorphic groups have the same character table. The converse is not strictly true: the dihedral group of order 8 and the quaternion group have identical character tables but are not isomorphic.
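For a representation by permutation matrices, "noting what remains fixed" is literally a count of fixed points, since a permutation matrix has a 1 on the diagonal exactly where an object is left in place. The sketch below, with illustrative helper names, does this for all six permutations of three objects and groups the results by cycle type, which labels the classes.

```python
from itertools import permutations

def fixed_points(p):
    """Number of objects left in place: the trace of the permutation matrix."""
    return sum(1 for i, image in enumerate(p) if i == image)

def cycle_type(p):
    """Sorted cycle lengths of the permutation p, given as a tuple of images."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths))

characters = {}
for p in permutations(range(3)):
    characters.setdefault(cycle_type(p), set()).add(fixed_points(p))

for kind, values in sorted(characters.items()):
    print(kind, values)
```

Every class should show a single value: 3 for the identity, 1 for the three transpositions and 0 for the two cyclic permutations, which are the characters (3,1,0) of the permutation representation.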

Conclusion

We have now introduced finite groups, proved the orthogonality relations for the irreducible representations, and explained what character analysis is. On the way, a number of examples have been given that illustrate the theory. There are many interesting applications, but a proper consideration of them would detain us too long at this point. This gives the idea, however, of one of the most beautiful areas of algebra, opened up by Schur, Frobenius and Burnside many years ago. The theory of representations goes on to much more complex and difficult areas, but what is presented here is enough for many practical applications. If we did more, it would be with the symmetric group, which is worthy of study and quite interesting. Continuous groups, such as the rotation and Lorentz groups, are another important extensive field, with properties that are generalizations of those for finite groups.

The method we have followed is due to I. Schur, who revealed his results in 1905. Schur's method works directly with the representation matrices, and can be extended to continuous groups. G. Frobenius had followed a different route in his papers from 1896 and later, using the analogy between a finite group and an algebra whose bases are the group members, and whose structure is given by the group table. Frobenius's method is powerful and beautiful, but it is applicable only to finite groups. Algebras, it might be said, are difficult to understand, unlike the matrix algebra that was widely known by physicists. Frobenius's method is explained in the text by Littlewood. The Lie theory of continuous groups is a further development, where algebras once again arise. It is very important for further applications of group theory, especially to elementary particles. As far as the rotation group goes (the theory of angular momentum), physicists can make do very well with algebra (not algebras!).

References

Only a few outstanding references are mentioned here, which will lead to others. There is a very large literature on group theory, with works ranging from lucid to obscure. Those by physicists are more comprehensible than those by mathematicians, and are more likely to contain useful and entertaining results. Those by mathematicians are more likely to be rigorous and correct.

E. P. Wigner, Group Theory and its Application to the Quantum Mechanics of Atomic Spectra (New York: Academic Press, 1959), especially Chapter 9, pp 72-87. Wigner's explanations are always crystal clear to me. He is interested in explaining, not concealing, and understands things deeply.

P. W. Atkins, M. S. Child and C. S. G. Phillips, Tables for Group Theory (Oxford: Oxford University Press, 1970). Character tables for point groups.

D. Schonland, Molecular Symmetry (London: D. Van Nostrand, 1965). A clear treatment of the application of group theory to molecular spectra. Also includes character tables for point groups.

M. Hamermesh, Group Theory (Reading, MA: Addison-Wesley, 1962). A classical group theory text for physicists. Possibly not as lucid as it could be, but worth study.

D. E. Littlewood, The Theory of Group Characters and Matrix Representations of Groups, 2nd ed. (Oxford: Clarendon Press, 1940). Contains character tables for symmetric groups, and explains the Frobenius algebra. Includes introductions to matrices, algebras and groups. A mathematician's text.

I. Schensted, A Short Course on the Application of Group Theory to Quantum Mechanics (Ann Arbor, MI: NEO Press, 1965). This little softcover gem is probably unobtainable now, but was once widely appreciated. It is from the days when group theory was becoming important in elementary particle physics, though there is no mention of this application. Like this paper, it covers Schur's theory and character analysis.