From J. J. Thomson's demonstration of the particle nature of the electron in 1897 until the creation of modern quantum mechanics by Heisenberg and Schrödinger in 1926, the electron was considered to be a small classical particle described by classical mechanics and relativity, as modified by Planck's quantum theory and other ad hoc quantum properties as seemed necessary. This theory, which may be called Lorentz's electron theory, was more successful than it had any right to be, explaining many of the phenomena of what has become known as "modern physics." It is still used quite generally for practical purposes, and still forms the basis of most physicists' concepts of the electron. The limitations of this method were becoming obvious just as quantum mechanics was introduced. Quantum mechanics corrects and extends the theory, giving it a firm foundation as a model of the microscopic behavior of matter. This article will explain those aspects of the particle model of the electron that are not treated elsewhere on this site, up to the explanation of the Thomas precession, a relativistic effect of some mystery that is a triumph of the theory. In many cases, the particle theory is much easier to comprehend, and its results are confirmed by the better but much more complicated fully quantum-mechanical analysis.
The particle electron was conceived as a small material particle of mass 9.10939 x 10⁻²⁸ g carrying a negative electrical charge of magnitude e = 4.803207 x 10⁻¹⁰ esu. The electromagnetic force on a particle of charge q and mass m is given by the Lorentz force, f = q[E + (1/c)v x B], where E is the electric field in statvolt/cm, B the magnetic field in gauss, and v the velocity in cm/s. I will use Gaussian (cgs) units here and in what follows, but will give the equivalents in MKS units in most cases. The use of MKS units in atomic physics can be inconvenient, especially when magnetic fields are considered. Otherwise simple equations sprout meaningless conversion factors with ε₀ and μ₀ and 4π's in them, and dimensional analysis is confused. I will avoid this whole mess by using Gaussian units. The conversions 10,000 gauss = 1 T and 300 V = 1 statvolt are useful in expressing results in practical units. The electronic charge in MKS is e = 1.602177 x 10⁻¹⁹ C. (Multiply by c/10, where c is in cm/s, to get esu.) The electron-volt (eV) unit of energy is 1.602 x 10⁻¹⁹ J or 1.602 x 10⁻¹² erg. For more information on units, see Monopoles.
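The conversions just quoted can be checked with a few lines of Python. This is only a sketch: the function names are my own, and the constants are the rounded values used in this article.

```python
# Gaussian <-> MKS conversions quoted in the text (rounded constants).
C_CM_PER_S = 2.99792458e10   # speed of light, cm/s
E_COULOMB = 1.602177e-19     # electronic charge, C


def coulomb_to_esu(q_coulomb):
    """Multiply by c/10 (c in cm/s) to convert coulombs to esu."""
    return q_coulomb * C_CM_PER_S / 10.0


def gauss_to_tesla(b_gauss):
    """10,000 gauss = 1 T."""
    return b_gauss / 1.0e4


def ev_to_erg(energy_ev):
    """1 eV = e joules (numerically), and 1 J = 1e7 erg."""
    return energy_ev * E_COULOMB * 1.0e7


e_esu = coulomb_to_esu(E_COULOMB)
print(f"e = {e_esu:.6e} esu")
print(f"1 eV = {ev_to_erg(1.0):.4e} erg")
```

The charge comes out within rounding of the 4.803207 x 10⁻¹⁰ esu given above.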
In MKS units, the Lorentz force drops the factor c (making relativistic considerations a little more inconvenient), and the force is in newton instead of dyne. This is a fully relativistically-correct force equation (unlike the others commonly appearing in nonrelativistic electrodynamics) that is Lorentz invariant. However, it can be used with Newton's equations of motion dv/dt = f/m to describe the motion of the particle electron in arbitrary electric and magnetic fields. This is done in the article Charged Particle Dynamics, to which the reader is referred for this very important application.
The equations of motion dv/dt = (q/m)[E + (1/c)v x B] can be expressed in a very simple form by transformation to a frame of reference rotating with a constant angular velocity ω. If v' is the velocity in this rotating frame, then in the inertial frame v = v' + ω x r'. Taking the time derivative, dv/dt = dv'/dt + 2ω x v' + ω x (ω x r'). We recognize the "fictitious" Coriolis and centrifugal accelerations in this expression. When this is put into the equations of motion, we find dv'/dt = (q/m)E + (q/mc)v' x [B + (2mc/q)ω] + (1/m)(ω x r') x [(q/c)B + mω] after a little manipulation.
If the angular velocity of rotation is chosen to be ω = -(q/2mc)B, the Coriolis term disappears. What is left of the last term, (q/2mc)(ω x r') x B, is of second order in B, and can be neglected unless the field is very strong. All that remains is dv'/dt = (q/m)E. In the rotating system, the electron moves, to first order in B, as if it were under the influence of E alone! This also holds for an assembly of more than one particle, so long as they all have the same (q/m). All electrons, in fact, have exactly the same mass and charge, which really follows from quantum mechanics and the concept of identical particles. Indeed, this is verified experimentally to an extremely high precision. The angular velocity ωL = -(q/2mc)B is called the Larmor angular velocity, and the movement itself is called Larmor precession. Note that the axis of rotation is the direction of the magnetic field, and that the senses are opposite for positive and negative charges.
An electron has q = -e, so ωL = (e/2mc)B for electrons. A little confusion may be introduced because the charge on the electron is negative. It should be remembered that for a positive particle, the rotation is in the opposite direction. The quantity e/2mc = 8.7942 x 10⁶ esu-s/g-cm is called the gyromagnetic ratio. To see why this name is used, consider an electron moving with velocity v in a circle of radius r. The period of rotation is 2πr/v, so the average current is ev/2πr. The magnetic moment of a current loop is the product of the current, i/c, and the area A of the loop in cm², or μ = (i/c)A. The current i is in esu/s, and the c converts it to emu/s. Our revolving electron then has a magnetic moment of μ = (ev/2πrc)(πr²) = evr/2c. The angular momentum of the revolving electron is l = mvr, so μ = emvr/2mc = (e/2mc)l. Therefore, e/2mc is the ratio of the magnetic moment to the angular momentum of the particle. For an electron, the negative charge makes the two opposite in direction, so that really μ = -(e/2mc)l. The angular momentum of a charged particle is always accompanied by a magnetic moment.
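As a numerical check on the gyromagnetic ratio, the following sketch evaluates e/2mc and compares it with the ratio μ/l for a revolving electron. The function names and the sample orbit (v = 10⁸ cm/s, r = 10⁻⁸ cm) are illustrative assumptions, not values from the text.

```python
E_ESU = 4.803207e-10     # electronic charge, esu
M_G = 9.10939e-28        # electron mass, g
C_CM_S = 2.99792458e10   # speed of light, cm/s


def gyromagnetic_ratio(q=E_ESU, m=M_G):
    """q/2mc in Gaussian units."""
    return q / (2.0 * m * C_CM_S)


def larmor_angular_velocity(b_gauss):
    """Magnitude of the Larmor angular velocity (e/2mc)B in rad/s."""
    return gyromagnetic_ratio() * b_gauss


def orbital_moment(v, r):
    """Magnetic moment evr/2c of an electron circling at speed v, radius r."""
    return E_ESU * v * r / (2.0 * C_CM_S)


def orbital_angular_momentum(v, r):
    """Angular momentum l = mvr of the same orbit."""
    return M_G * v * r


gamma = gyromagnetic_ratio()
v_sample, r_sample = 1.0e8, 1.0e-8   # hypothetical orbit, cm/s and cm
ratio = orbital_moment(v_sample, r_sample) / orbital_angular_momentum(v_sample, r_sample)
print(f"e/2mc = {gamma:.4e}, mu/l = {ratio:.4e}")
print(f"Larmor angular velocity at 10,000 gauss: {larmor_angular_velocity(1.0e4):.3e} rad/s")
```

The ratio μ/l does not depend on the orbit chosen, which is the point of the name.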
The magnetic moment just mentioned can be realized as a current loop, with the direction of the moment normal to the area in the direction a screw would advance when rotated in the direction of a positive revolving charge, and of magnitude (i/c)A. Alternatively, it is the limit of equal and opposite magnetic poles keeping the product μ = md constant as d → 0. Magnetic poles are an idealization, since no particles with magnetic charge have yet been discovered.
A uniform magnetic field B exerts a torque N = μ x B tending to turn the moment in the direction of the magnetic field. Since the magnitude of this torque is μB sin θ, the energy U = ∫Ndθ as a function of the angle can be found by integration, U = -μB cos θ, or U = -μ · B, where the zero is taken as the state in which the moment is normal to the field.
The equation of motion for a body of angular momentum L acted upon by a torque N is dL/dt = N. Here, μ = (q/2mc)L, so dL/dt = μ x B = (q/2mc)L x B = -(q/2mc)B x L. This means that L precesses about B with angular velocity -(q/2mc)B. For a positive charge, the precession is in the sense that would advance a screw in the direction opposite to B. For an electron, with q = -e, the precession is in the opposite sense, right-handed about B. This is, of course, just the Larmor precession that was described above. Since the magnitude of the torque is N = μB sin θ, and the magnitude of dL/dt for a precession at angular velocity ω is ωL sin θ (here ωL is the product of ω and L), the result ω = μB/L = (q/2mc)B is not difficult to derive, showing the close connection between the gyromagnetic ratio and the Larmor angular frequency.
The interaction energy can also be interpreted as rotational energy. Using index notation, with summation over repeated indices, the rotational energy of a body with a symmetrical inertia tensor I_jk is W = (1/2)I_jk ω_j ω_k. If ω_j → ω_j + ω_j', then W' = (1/2)I_jk ω_j ω_k + I_jk ω_j ω_k' + (1/2)I_jk ω_j' ω_k', making use of the symmetry of the inertia tensor. If ω' is much less than ω, the final term is negligible, so the change in energy is U = W' - W = I_jk ω_j ω_k'. The angular momentum of the body is I_jk ω_j = L_k, so with ω' the Larmor angular velocity -(q/2mc)B, U = L_k ω_k' = -(q/2mc)L · B = -μ · B, as above.
The most celebrated success of the particle electron model was the Bohr atom of 1913, coming just after Rutherford established that the atomic nucleus was small and massive. The idea of electrons orbiting this sun-like nucleus as planets was a persuasive one. It probably would have been decisively rejected on stability grounds if the theory did not give, astoundingly, the correct numbers for the hydrogen spectrum. We now know that the planetary analogy is quite incorrect, and in atoms even a wave-packet electron soon spreads uniformly, but the correctness of the results commanded respect.
Suppose an electron revolves in a circular orbit of radius r around a nucleus of charge +Ze with speed v. Newton's second law gives mv²/r = Ze²/r², so mv²r = Ze². Bohr then made the quantum ansatz that the angular momentum was an integral multiple of h/2π = h': mvr = nh'. Dividing these two expressions yields v = Ze²/nh'. Then, we easily find that r = nh'/mv = nh'/(mZe²/nh') = n²h'²/mZe². The ground state of hydrogen has Z = 1 and n = 1, so v = e²/h' and r = h'²/me². The first number, divided by c, is the ratio v/c for the ground state of hydrogen, known better as the fine structure constant α = e²/ch' = 1/137.04. The second number is the Bohr radius 0.052918 nm.
In fact, the ground state of hydrogen is a minimum-uncertainty state of zero orbital angular momentum, absolutely nothing like an orbiting electron. The Bohr radius is, however, rather close to the "size" of the hydrogen atom as found from scattering experiments. The energy of the electron in a state of quantum number n is the sum of its kinetic energy and potential energy, E = mv²/2 - Ze²/r. Substituting our values for r and v, we find E = -mZ²e⁴/2n²h'². These energies are exactly correct (except for some very small corrections for things unknown at the time). In theoretical spectroscopy, wave numbers σ = 1/λ in cm⁻¹ are generally used in place of frequencies as more convenient in size. In terms of wave numbers and Planck's quantum of radiation, E = hν = hcσ, or σ = E/hc. We then have, taking the absolute value of the energy, σ = Z²e⁴m/4πch'³n² = RZ²/n², where R is the Rydberg constant, which had been determined experimentally with great precision.
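The Bohr formulas are easy to evaluate numerically. The sketch below uses v = Ze²/nh', r = n²h'²/mZe² and E = -mZ²e⁴/2n²h'² in Gaussian units; the function names are my own, and the value of h' (1.054573 x 10⁻²⁷ erg-s) is an assumed rounded constant not quoted in the text.

```python
import math

E_ESU = 4.803207e-10     # electronic charge, esu
M_G = 9.10939e-28        # electron mass, g
C_CM_S = 2.99792458e10   # speed of light, cm/s
HBAR = 1.054573e-27      # h' = h/2pi in erg-s (assumed rounded value)


def bohr_speed(n, z=1):
    """Orbital speed v = Z e^2 / (n h')."""
    return z * E_ESU**2 / (n * HBAR)


def bohr_orbit_radius(n, z=1):
    """Orbit radius r = n^2 h'^2 / (m Z e^2), in cm."""
    return n**2 * HBAR**2 / (M_G * z * E_ESU**2)


def bohr_energy(n, z=1):
    """Energy E = -m Z^2 e^4 / (2 n^2 h'^2), in erg."""
    return -M_G * z**2 * E_ESU**4 / (2.0 * n**2 * HBAR**2)


def term_wavenumber(n, z=1):
    """Term value sigma = |E| / hc in cm^-1, with h = 2 pi h'."""
    return abs(bohr_energy(n, z)) / (2.0 * math.pi * HBAR * C_CM_S)


alpha = bohr_speed(1) / C_CM_S
a0_cm = bohr_orbit_radius(1)
rydberg = term_wavenumber(1)
print(f"1/alpha = {1.0 / alpha:.3f}")
print(f"Bohr radius = {a0_cm * 1.0e7:.6f} nm")
print(f"R = {rydberg:.1f} cm^-1")
```

With these rounded constants the Rydberg constant agrees with the measured value to a few parts in a million, which gives some idea of why the result commanded respect.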
Bohr then assumed that the observed spectrum was the consequence of transitions between states of different n. While in any state, the atom did not radiate. This was an astonishing assertion, but it was suggested by the empirical expression of spectral lines as differences in "term values," which up to then had no good explanation. For a transition from the state n₁ to the state n₂, this gave σ = RZ²(1/n₂² - 1/n₁²), in emission if n₁ > n₂, and in absorption if n₂ > n₁. Efforts to find harmonic relationships between spectral lines, as in vibrations of material bodies, had been wholly and completely unsuccessful. For large quantum numbers, however, the energy differences between levels correspond to orbital frequencies. Bohr laid the search for harmonics to rest once and for all. The Rydberg constant was now expressed in terms of elementary constants, and gave the precise value 109,737.315 cm⁻¹. This is the greatest success of the Bohr model, and is astonishing.
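As an illustration of the term-difference formula σ = RZ²(1/n₂² - 1/n₁²), here is a small sketch that computes the first Balmer line of hydrogen (n₁ = 3 to n₂ = 2). The function names are assumptions of mine, and the Rydberg constant used is the infinite-nuclear-mass value quoted above, so the wavelength differs slightly from a measured one.

```python
RYDBERG_CM = 109737.315   # Rydberg constant, cm^-1


def line_wavenumber(n_lower, n_upper, z=1):
    """Wave number R Z^2 (1/n_lower^2 - 1/n_upper^2) of the emitted line, cm^-1."""
    return RYDBERG_CM * z**2 * (1.0 / n_lower**2 - 1.0 / n_upper**2)


def wavelength_nm(sigma_cm):
    """Vacuum wavelength in nm for a wave number in cm^-1 (1 cm = 1e7 nm)."""
    return 1.0e7 / sigma_cm


h_alpha_sigma = line_wavenumber(2, 3)
h_alpha_nm = wavelength_nm(h_alpha_sigma)
print(f"H-alpha: {h_alpha_sigma:.2f} cm^-1, {h_alpha_nm:.2f} nm")

# Series limit of the Balmer series (n_upper -> infinity) is R/4.
balmer_limit_sigma = RYDBERG_CM / 4.0
print(f"Balmer limit: {balmer_limit_sigma:.2f} cm^-1")
```

The result is close to the red hydrogen line near 656 nm; the small remaining difference comes from the finite mass of the nucleus and from measuring wavelengths in air.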
The Bohr model was refined by Sommerfeld to include elliptical orbits and relativity, and even then it gave excellent results for hydrogen-like spectra. The Bohr model failed rather completely when used for more than one valence electron, or for molecules, however. This is characteristic of particle electron analyses, which can give very good numbers in simple cases, but cannot be extended to more complex situations, since the analogy is very imperfect. Quantum mechanics gives a much more satisfactory account of the hydrogen atom, and the theory can be extended to more complex atoms and to molecules as well.
Quantum mechanics introduces a better description of the angular momentum, introducing the quantum number l, which for a state of principal quantum number n goes from 0 up to n-1. The orbital angular momentum is space-quantized so that its value along any direction can take only the values lh', (l-1)h', ..., -lh' = mh', where m is the magnetic quantum number corresponding to l. It is called magnetic because these states are separated in energy when in a magnetic field, as will be discussed in the next section. Any state can be specified uniquely by n, l, m. We shall see later that electrons have a fourth degree of freedom, spin, which has s = 1/2 and ms = ±1/2. The letters s,p,d,f,g,h,... are used for l = 0,1,2,3,... for historical reasons. Therefore, in hydrogen, n = 1 corresponds to an s state, n = 2 to s and p states, n = 3 to s, p and d states, and so on. There are 2l+1 states for any value of l, labelled by their m values. For any value of n the total number of states is n2, where an s is one state, a p 3 states, a d 5 states, and so on. This degeneracy (several states with exactly the same energy) occurs only for hydrogen-like spectra, and is a result of a 4-dimensional symmetry depending on the 1/r potential. For all other spectra, the states of different l are separated in energy.
Whatever the complicated electronic motions there were that produced observed spectral lines, the application of a uniform magnetic field should produce changes because of the Larmor precession. A general vibration can be resolved into three harmonic components at right angles. Suppose the magnetic field is in the +z-direction. Then the vibration in the z-direction will be unaffected by the magnetic field, while the x and y motions will precess with the Larmor angular velocity (e/2mc)B in a right-handed sense about the +z-axis. An x-vibration x = a cos ωt can be expressed as the sum of two circular motions (a/2)e^(iωt) and (a/2)e^(-iωt) rotating in opposite directions, and similarly for a y-vibration. The clockwise motion will be speeded up by the Larmor precession to ω + ωL, and the anticlockwise motion will be slowed down by the same amount.
When we look at the light emitted in a direction normal to the magnetic field, we should see one component polarized in the direction of the magnetic field ("p" polarization) of unchanged frequency, and two components polarized perpendicularly to the direction of the field ("s" polarization), one at a smaller frequency (or energy) and one at a larger, from what we know of the radiation from a linear dipole. If we look along the magnetic field, say with B pointing towards us, we should see two circularly polarized lines, the right-circularly polarized one of greater frequency, the left-circularly polarized line of smaller frequency. There will be no line of unchanged frequency, since an electron does not radiate in its direction of motion. The change in wave number from the original frequency will be Δσ = ωL/2πc = eB/4πmc² = 4.6686 x 10⁻⁵ B cm⁻¹, with B in gauss. We will denote this Zeeman shift by Z for short.
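The size of the Zeeman shift Z = eB/4πmc² is easily evaluated. In this sketch the function name is my own and B is taken in gauss.

```python
import math

E_ESU = 4.803207e-10     # electronic charge, esu
M_G = 9.10939e-28        # electron mass, g
C_CM_S = 2.99792458e10   # speed of light, cm/s


def zeeman_shift(b_gauss):
    """Classical Zeeman shift eB / (4 pi m c^2) in cm^-1 for B in gauss."""
    return E_ESU * b_gauss / (4.0 * math.pi * M_G * C_CM_S**2)


shift_per_gauss = zeeman_shift(1.0)
triplet_width = 2.0 * zeeman_shift(1.0e4)   # separation of the two outer lines
print(f"Z per gauss = {shift_per_gauss:.4e} cm^-1")
print(f"Triplet width at 10,000 gauss = {triplet_width:.2f} cm^-1")
```

Even at 10,000 gauss the whole pattern is less than 1 cm⁻¹ wide, so high resolution is needed to see it.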
The pattern of the central p-polarized line flanked by the two s-polarized lines is called a normal Zeeman triplet. Even for a field of 10,000 gauss, approaching the largest that can conveniently be produced by an iron-core electromagnet, the width of the triplet will be only 0.93 cm-1. For comparison, a green line of wavelength 555 nm has σ = 18,018 cm-1, so it requires great spectral resolution to see a Zeeman triplet. At first, Zeeman was only able to note a broadening of a line, and the polarization of its edges, to indicate that something was happening. Since then, increased resolution has allowed the examination of many Zeeman patterns.
It was a great puzzle when not the normal triplet, but more complex symmetrical patterns were found. Only in the case of singlet lines, for which the spin quantum number S was zero, was a normal triplet observed. The overall size of the effect was, however, correctly predicted by the classical analysis. These patterns were eventually completely explained by a semiclassical analysis due to Landé, called the vector model of the atom, and based on the newly-postulated spin of the electron. It turned out that the electron was not just a charged particle, but that it had an intrinsic angular momentum whose component along any direction could only be ±h'/2. This resembled the angular momentum of a rotating body, hence the term "spin," but any explanation of it as a classical rotation of the electron fails. It was shown by Dirac to be a consequence of a proper relativistic treatment of the electron.
We would expect the magnetic moment corresponding to the spin to be opposite in direction, and of magnitude (e/2mc)(h'/2), but Dirac showed that it was actually twice this, eh'/2mc, which is the magnetic moment corresponding to an orbital angular momentum of one full unit, h'. The magnetic moment eh'/2mc erg/gauss is called the Bohr magneton, a convenient unit for atomic magnetic moments. The gyromagnetic ratio of the electron can be written g(e/2mc), where the g-factor is 2. The complication in the Zeeman effect is that the orbital angular momentum and the spin angular momentum tend to precess at different rates, since they have different g-factors.
We'll consider the case of one electron, which will include all the essential features. The case of two or more electrons is similar, and can be found in the References. The total angular momentum j is the sum of the orbital angular momentum l and the spin angular momentum s. It is a constant of the motion, and its magnitude is specified by the quantum number j. Its vector magnitude is √[j(j+1)] h' = j*h', and its component along any axis can have the values mh' = jh', (j-1)h', (j-2)h', ..., -jh', where m is called the magnetic quantum number. There is actually a set of 2j+1 states of equal energy in the absence of an external field, and the atom may be in any one of them, or in a linear combination. The orbital and spin angular momenta are described by quantum numbers l and s = 1/2 in a similar way. These quantum numbers are constants, but their projections on any direction in space are not. The total angular momentum can be viewed as the vector sum j = l + s, where j can take the values l + 1/2 and l - 1/2 only, if l > 0. When l = 0, j = 1/2 only, and the angular momentum is due entirely to the spin. The projections of l and s on j are constant.
What happens is that only the projections of the magnetic moments on the direction of j are constant, and the sum of the projections is the total magnetic moment corresponding to j. The orbital angular momentum and the spin angular momentum will not precess independently about B, but the total angular momentum and its associated magnetic moment will. The actual ratio of the resultant magnetic moment to the total angular momentum can be expressed as g(e/2mc), where g will be between 1 and 2. By considering a vector diagram for the angular momenta and the magnetic moments, the Landé g-factor can be shown to be given in terms of the quantum numbers by g = 1 + [j(j+1) + s(s+1) - l(l+1)]/2j(j+1). When s = 0, g = 1, while if l = 0, g = 2. s = 0 does not occur for one electron, of course, but it does for two electrons, in which the states are either singlets with S = 0 or triplets with S = 1. Capital letters are used for more than one electron. In LS-coupling, the individual orbits couple to L and the spins to S, and then the two resultants couple to J. The g-factor is given by the same formula.
For example, a ²P₃/₂ state has s = 1/2, l = 1 and j = 3/2 in the usual notation. In labelling states, S, P, D and F mean l = 0, 1, 2 and 3, the superscript is 2S + 1, and the subscript is J. The g-factor for this state is g = 1 + [(3/2)(5/2) + (1/2)(3/2) - (1)(2)]/2(3/2)(5/2) = 4/3. In a magnetic field B, each of the 4 states now has a different energy (4Z/3)m, which makes a symmetrical set of equally-spaced states. Similarly, a ²S₁/₂ state has g = 2 since l = 0, and splits into two states of energies 2Zm = +Z and -Z. Each spectral line in the observed Zeeman pattern of the line that is a transition between the P state and the S state starts on one of the P levels and ends on one of the S levels, obeying the selection rule Δm = ±1 for s-polarized lines, and 0 for p-polarized lines. The selection rules are given by quantum mechanics, and are conditions that the transition matrix elements not vanish from symmetry. The consequent pattern is composed of six lines instead of three, as shown in the diagram. Relative intensities of the lines are shown above the arrows representing the transitions.
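The bookkeeping for a Zeeman pattern can be sketched in a few lines, using exact fractions so that the shifts come out as multiples of Z. The function names are my own, and I take g = 0 for a state with j = 0 as a convention, since such a state does not split. Intensities are not computed.

```python
from fractions import Fraction as F


def lande_g(j, l, s):
    """Lande g-factor 1 + [j(j+1) + s(s+1) - l(l+1)] / 2j(j+1)."""
    j, l, s = F(j), F(l), F(s)
    if j == 0:
        return F(0)   # convention: a j = 0 state has no splitting
    return 1 + (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))


def m_values(j):
    """Magnetic quantum numbers j, j-1, ..., -j."""
    j = F(j)
    return [j - k for k in range(int(2 * j) + 1)]


def zeeman_pattern(upper, lower):
    """Line shifts in units of Z for states given as (j, l, s) tuples.

    Returns a sorted list of (shift, polarization), with "p" for
    delta m = 0 and "s" for delta m = +1 or -1.
    """
    g_upper, g_lower = lande_g(*upper), lande_g(*lower)
    lines = []
    for m_up in m_values(upper[0]):
        for m_low in m_values(lower[0]):
            delta_m = m_up - m_low
            if delta_m == 0:
                polarization = "p"
            elif abs(delta_m) == 1:
                polarization = "s"
            else:
                continue   # forbidden by the selection rule
            lines.append((g_upper * m_up - g_lower * m_low, polarization))
    return sorted(lines)


doublet_pattern = zeeman_pattern((F(3, 2), 1, F(1, 2)), (F(1, 2), 0, F(1, 2)))
singlet_pattern = zeeman_pattern((1, 1, 0), (0, 0, 0))
for shift, polarization in doublet_pattern:
    print(f"{str(shift):>5} Z  ({polarization})")
```

For the transition from ²P₃/₂ to ²S₁/₂ this gives six lines at ±1/3 (p), ±1 and ±5/3 (s) in units of Z, while a singlet transition with S = 0 gives the normal triplet at 0 and ±1.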
All of the many Zeeman patterns that are observed can be calculated by this method, and the results agree very well with the observations. This is excellent verification of the spin and angular momentum of the electron, and its anomalous g value of 2, as predicted by the Dirac theory. The magnetic interaction energy is, in practice, much less than the dependence of the energy on the relative orientations of l and s, which is called the spin-orbit interaction energy. A very strong magnetic field can succeed in decoupling the orbital and spin motions so that they precess independently in the field. In this case, the two precessional energies simply add with their individual g-factors, so the wavenumber shift is Zm + 2Zms, where ms = ±(1/2). For one electron, the levels are simply Zm ± Z, plus a correction for the spin-orbit interaction. This is called the Paschen-Back effect, and is rather hard to observe. A field of 32,000 gauss was required to see the effect in the hydrogen line at 656.3 nm.
Hydrogen-like single-electron states can be labelled by the quantum numbers n, l, j and m, where m is now the magnetic quantum number associated with j, not l. These states are often called orbitals, and linear combinations of them are used to construct approximate wave functions for many-electron systems (many can mean 2). Energy levels in more complex spectra can be labelled by L and J, as well as by the value of S that is coupled with L to form J. L and S are thought of as vector sums of the orbital and spin angular momenta of the individual electrons, which is a reasonable approximation in many cases. The letters S, P, D, F and so on are used to signify the L value, 2S+1 is placed as a superscript before this letter, and J as a subscript following.
The energies of the two states ²P₃/₂ and ²P₁/₂ of the same principal quantum number n for one electron differ in the relative orientation of the spin and orbital angular momenta. In the first state, they are more or less parallel, and in the second state more or less opposed. In more complicated atoms, the energies of states with different L and S have different electrostatic energies, but in hydrogen the difference in the energies of the states j = l ± 1/2 can be expected to be small, and due principally to the different energies of the spin in the magnetic field that appears in the rest frame of the electron that is moving in the electrostatic field of the nucleus. This magnetic field is E x (v/c). Since E = (Ze/r³)r, the magnetic field is seen to be proportional to the orbital angular momentum of the electron. In fact, B = l(Zeh'/mcr³). The Larmor precession frequency in the rest frame of the electron is then ωL = -g(e/2mc)(Zeh'/mcr³)l. We have put g in this equation for more generality. For an electron, g = 2.
When the energy due to this precession is calculated, it turns out to be twice as large as the observed value, as if the electron had g = 1. Since this is definitely not the case, there must be something wrong. What this is has to do simply with relativistic kinematics. The rest system of the electron is obtained from the fixed inertial system by using a relative velocity v that is continually changing in direction. Let the rest system and the inertial system both have the same origin for simplicity. Since Lorentz transformations do not commute, the effect is that the successive Lorentz transformations deriving the electron rest system from the fixed inertial system do not leave the coordinate axes parallel. That is, the electron rest system is continuously rotating as the electron revolves about the nucleus (in our particle picture), and the rate of rotation depends on the acceleration of the electron. This rotation is called the Thomas precession, after L. H. Thomas, who first recognized it in 1926. Since the acceleration is also due to the nuclear electric field, this kinematic precession has exactly the same form as the spin-orbit precession, and happens to be half as large and in the opposite direction. The precession angular velocity in the inertial system is then the sum of these two angular velocities, which gives the result ω = - (e/2mc)(Zeh'/mcr3)l. This value of the spin-orbit interaction agrees with observations.
The interaction energy is the product of this precessional angular velocity and the spin angular momentum h's, or ΔU = (Z/2)(g - 1)(eh'/mc)² r⁻³ l·s. To apply this result, we need an expression for l·s in terms of the quantum numbers j, l and s, and a suitable value of r must be used. Squaring j = l + s, we find j(j+1) = l(l+1) + 2l·s + s(s+1), using the proper quantum-mechanical expressions for the squares of the momenta. Then, the dot product is seen to be (1/2)[j(j+1) - l(l+1) - s(s+1)]. The best way to handle r⁻³ is to average it over the orbital electron density. The result for hydrogen-like atoms is avg(r⁻³) = (Z/a₁)³/[n³ l(l+1/2)(l+1)].
Putting this all together, the spin-orbit interaction energy is Γ = (a/2)[j(j+1) - l(l+1) - s(s+1)], where the constant a = Rα²Z⁴/[n³ l(l+1/2)(l+1)] cm⁻¹, in terms of the Rydberg constant R (109737 cm⁻¹) and the fine-structure constant α (1/137). For Z = 1, the numerator is 5.84 cm⁻¹. For the level n = 2, l = 1 we have the states ²P₃/₂ and ²P₁/₂. From our formula, Γ = 0.122[j(j+1) - 11/4], which gives +0.122 for j = 3/2 and -0.244 for j = 1/2. The "doublet" is shown at the right. The numbers to the left of the states give the degeneracy. The level of smallest j lies lowest, and the "center of gravity" is at the level without spin-orbit coupling, since 4(0.122) + 2(-0.244) = 0. This is called a "normal doublet." The total splitting is 0.366 cm⁻¹, a very small amount. Careful experiments give 0.364 cm⁻¹, so the agreement is excellent.
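The doublet numbers can be reproduced with a short calculation. This sketch evaluates a = Rα²Z⁴/[n³ l(l+1/2)(l+1)] and Γ = (a/2)[j(j+1) - l(l+1) - s(s+1)]; the function names are mine, and I have assumed α = 1/137.036, a slightly more precise value than the 1/137 quoted, so the last digit may differ by rounding.

```python
RYDBERG_CM = 109737.315   # Rydberg constant, cm^-1
ALPHA = 1.0 / 137.036     # fine-structure constant (assumed value)


def splitting_constant(n, l, z=1):
    """a = R alpha^2 Z^4 / [n^3 l (l + 1/2)(l + 1)] in cm^-1; requires l > 0."""
    return RYDBERG_CM * ALPHA**2 * z**4 / (n**3 * l * (l + 0.5) * (l + 1))


def spin_orbit_shift(n, l, j, s=0.5, z=1):
    """Gamma = (a/2)[j(j+1) - l(l+1) - s(s+1)] in cm^-1."""
    a = splitting_constant(n, l, z)
    return 0.5 * a * (j * (j + 1) - l * (l + 1) - s * (s + 1))


upper = spin_orbit_shift(2, 1, 1.5)   # 2P3/2
lower = spin_orbit_shift(2, 1, 0.5)   # 2P1/2
splitting = upper - lower
center_of_gravity = 4 * upper + 2 * lower   # weighted by 2j + 1
print(f"Gamma(3/2) = {upper:+.4f} cm^-1, Gamma(1/2) = {lower:+.4f} cm^-1")
print(f"Total splitting = {splitting:.4f} cm^-1")
```

The weighted sum of the two shifts vanishes, which is the "center of gravity" rule mentioned above.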
In more complex atoms where the orbital angular momenta are coupled to L and the spins to S, and these coupled to a total angular momentum J (LS-coupling), there is also a spin-orbit coupling depending on L·S, but the splitting factor a may be a result of several effects. It can be described, at least empirically, as we have done for the magnetic interaction of a single electron. The states arising from one L and S are called a "multiplet," and the multiplicity is 2S + 1, the number of possible spin orientations, and so of J values. Each state of given J = L+S, L+S-1, ..., |L-S| is a set of 2J+1 states of the same energy when B = 0 that can be observed in the Zeeman effect. LS-coupling occurs only when states of different S are well-separated in energy. S affects the symmetry of the state, which in turn affects the electrostatic repulsion between the electrons. States of high S have the electrons well separated, and so have lower energies. A similar effect occurs for L, so states with the highest value of S lie lowest, and of these, those with the highest value of L lie lowest. This is called Hund's Rule, and was observed experimentally long before it was explained by theory.
Since matter is composed largely of electrons, the interaction of electromagnetic radiation with electrons is a subject of considerable interest and utility. Electromagnetic radiation covers a wide range of frequency f, or what is equivalent, of quantum energy hf. The spectrum includes radio wave, microwave, infrared, visible, ultraviolet, X-ray and gamma-ray regions, with which the reader is generally familiar. Wave models for lower frequencies give way to particle models at higher frequencies, but radiation is the same thing whatever the frequency, and is described accurately by quantum mechanics.
The most notable consequence of the quantum nature is that energy transfers are in units of hf or h'ω, which seem to be the creation and annihilation of "quanta" or photons. This "particle model" is useful, but only represents a few properties of radiation, leaving out the essential role of phase, and cannot be carried far. A classical wave model is much more useful, but it must always be used with an eye to quantum behavior. In the photon model, the photon has an energy h'ω and a momentum h'ω/c, so its rest mass is zero, from the relativistic relation E² - (pc)² = (mc²)² satisfied by the energy-momentum 4-vector. It also happens that the photon has an angular momentum h' ("spin 1") that takes only the values ±h' with respect to the direction of propagation.
In the wave model, the electric field of a monochromatic wave can be represented by εEe^(ik·r - iωt). k is the wave vector, in the direction of propagation, and of magnitude k = 2π/λ, where λ is the wavelength, and ω/k = c, the speed of propagation, about 3 x 10¹⁰ cm/s. ε is a unit vector called the polarization vector, which must be normal to k. Two such polarization vectors at right angles are sufficient to describe any state of polarization of the radiation in a unique way. The energy density in the wave is |E|²/8π, and the energy flux is c|E|²/8π erg/cm²-s. There is a magnetic field B perpendicular to E that oscillates in phase with it, but the magnetic field has a negligible influence on the motion of an electron on which the wave falls.
Linear polarization results when the x and y vibrations are in phase. Circular polarization results when the x and y vibrations are equal in amplitude but differ in phase by 90°. The polarization vector εL = (1/√2)(εx + iεy) represents left-circular polarization with time dependence e^(-iωt), and εR = (1/√2)(εx - iεy) represents right-circular polarization. Right-circular polarization means that the electric vector rotates clockwise as you face the oncoming radiation. Ordinary unpolarized radiation can be considered to be an incoherent superposition (that is, with an arbitrary phase difference) of linear polarizations at right angles, or of right- and left-circularly polarized components, with half the power in each of the polarizations.
Now we look at the radiation from an accelerated charge. The power radiated into solid angle dΩ by a charge with acceleration a = dv/dt is dP/dΩ = (e²/4πc³)|ε·a|², where ε is the polarization vector of the radiation. For a derivation of this formula, see Jackson. To make the meaning of the formula clearer, consider an electron moving along the z-axis with z = Ae^(-iωt). The magnitude of the acceleration is a = ω²A, so dP/dΩ = [(Aeω²)²/4πc³]|ε·εz|².
Consider radiation in a direction making an angle θ with the z-axis. One polarization vector can lie in the xy-plane normal to the direction of propagation, the other can lie in the plane defined by the direction of propagation and the z-axis, as shown in the diagram. We note that there will be no radiation of the first polarization, since the dot product is zero. The other polarization vector makes an angle 90° - θ with the z-axis, so the square of the dot product gives sin²θ. Therefore, dP/dΩ = C sin²θ. We see that there is no radiation along the axis of the vibration, and that the radiation is a maximum in the xy-plane. It is easy to integrate over all directions to find that P = (8π/3)C. These results are probably familiar.
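The integration over directions can be checked numerically. This sketch sums sin²θ over the sphere with the midpoint rule, taking dΩ = 2π sin θ dθ; the function name and the number of steps are arbitrary choices of mine.

```python
import math


def dipole_pattern_integral(n_steps=2000):
    """Integrate sin^2(theta) over all solid angle by the midpoint rule."""
    d_theta = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * d_theta
        # Ring of solid angle 2 pi sin(theta) d_theta at polar angle theta.
        total += math.sin(theta) ** 2 * 2.0 * math.pi * math.sin(theta) * d_theta
    return total


numerical = dipole_pattern_integral()
exact = 8.0 * math.pi / 3.0
print(f"numerical = {numerical:.6f}, 8 pi / 3 = {exact:.6f}")
```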
If the plane wave ε'Ee^(ik·r - iωt) falls on an electron, the acceleration is a = -(eE/m)ε', omitting the exponential factor. From the radiation formula, averaged over a cycle of the oscillation (which supplies a factor 1/2), we then have dP/dΩ = (e⁴E²/8πm²c³)|ε·ε'|² = (cE²/8π)(e²/mc²)²|ε·ε'|². The first factor is the incident energy flux. If we divide by this, we get the cross-section for scattering into solid angle dΩ, so dσ/dΩ = (e²/mc²)²|ε·ε'|².
The quantity e²/mc² has the dimensions of a length. Suppose the electron was constructed by starting with a conducting sphere of radius r, and bringing in its charge e from infinity in small increments dq. This will require more and more work as the charge builds up. In fact, the increment in energy is dE = (q/r)dq. Integrating from q = 0 to q = e, we find that E = e²/2r. Einstein tells us that E = mc², so we have r = e²/2mc². If all the mass of the electron were due to the electrostatic energy in assembling it, then this would be roughly the size of the electron. Of course, this is not true, and the electron's mass is not totally electromagnetic, but it is a good story. Therefore, e²/mc² is called the classical electron radius r₀, and has the value 2.82 x 10⁻¹³ cm. This is roughly a nuclear size, but we know that actual electrons cannot be confined in such a small volume. It is tempting to think of an electron as a little hard sphere rattling around in the region where its wave function is nonzero, but the question of what the electron looks like is unlikely to be answered by any such picture, and our confusion has no practical effect.
The scattering cross section is dσ/dΩ = r₀² |ε·ε'|². Let's suppose that the wave vector of the unpolarized incident wave is directed along the +z-axis, so that the electric fields lie in the xy-plane, and their polarizations are defined by the unit vectors in the coordinate directions, with half the power in each. Polarization directions can be chosen for radiation scattered into θ,φ with one in the xy-plane, and the other in the plane of the scattered wave vector and the incident wave vector. These will be, respectively, ε₁ = -εx sinφ + εy cosφ, and ε₂ = cosθ(εx cosφ + εy sinφ) - εz sinθ. It is now easy to pick out the contributions from the two incident polarizations. The result is cos²θ cos²φ + sin²φ for incident polarization along x, and cos²θ sin²φ + cos²φ for incident polarization along y. On adding, φ drops out, and we have just 1 + cos²θ, which must be divided by two, since half the power is in each incident polarization.
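The polarization sum is easy to check numerically. The following is only a sketch, assuming the scattered polarization vectors ε₁ = (-sinφ, cosφ, 0) and ε₂ = (cosθ cosφ, cosθ sinφ, -sinθ), and incident polarizations along x and y; the function names are mine, chosen for illustration.

```python
import math

def polarization_sum(theta, phi):
    """Sum of |eps . eps'|^2 over the two scattered polarizations.

    Returns (x_term, y_term) for incident polarization along x and along y.
    The scattered polarization vectors below are the assumed forms stated
    in the lead-in, both normal to the scattered wave vector.
    """
    eps1 = (-math.sin(phi), math.cos(phi), 0.0)
    eps2 = (math.cos(theta) * math.cos(phi),
            math.cos(theta) * math.sin(phi),
            -math.sin(theta))
    # Dotting with the x or y unit vector just picks out a component.
    x_term = eps1[0] ** 2 + eps2[0] ** 2
    y_term = eps1[1] ** 2 + eps2[1] ** 2
    return x_term, y_term

def unpolarized_factor(theta, phi):
    """Average over the two incident polarizations (half the power in each)."""
    x_term, y_term = polarization_sum(theta, phi)
    return 0.5 * (x_term + y_term)

if __name__ == "__main__":
    theta, phi = 0.7, 1.3
    print(unpolarized_factor(theta, phi), 0.5 * (1 + math.cos(theta) ** 2))
```

The two printed numbers should agree for any θ and φ, which shows that φ really does drop out of the sum.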
Our final result is dσ/dΩ = (r₀²/2)(1 + cos²θ), called the Thomson scattering formula. The total cross section is σ = (8π/3)r₀² = 0.665 x 10⁻²⁴ cm². It is a small cross-section, of the size of nuclear cross-sections, independent of frequency, and back scattering is as important as forward scattering. For light, Rayleigh scattering from density fluctuations of a size small compared to the wavelength is far more probable, with its λ⁻⁴ wavelength dependence, which gives the blue of the sky.
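For readers who like to see the numbers come out, here is a minimal sketch that evaluates the classical electron radius and integrates the Thomson formula over all directions. It uses the Gaussian values of e and m quoted at the start of this article and the standard value of c; the function names are illustrative only.

```python
import math

E_ESU = 4.803207e-10   # electron charge, esu
M_E = 9.10939e-28      # electron mass, g
C = 2.99792458e10      # speed of light, cm/s

def classical_electron_radius():
    """r0 = e^2 / (m c^2), in cm."""
    return E_ESU ** 2 / (M_E * C ** 2)

def thomson_differential(theta):
    """dsigma/dOmega = (r0^2 / 2)(1 + cos^2 theta), in cm^2 per steradian."""
    r0 = classical_electron_radius()
    return 0.5 * r0 ** 2 * (1.0 + math.cos(theta) ** 2)

def thomson_total(n_steps=20000):
    """Integrate over solid angle with dOmega = 2 pi sin(theta) dtheta (midpoint rule)."""
    d_theta = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * d_theta
        total += thomson_differential(theta) * 2.0 * math.pi * math.sin(theta) * d_theta
    return total

if __name__ == "__main__":
    print("r0    =", classical_electron_radius(), "cm")
    print("sigma =", thomson_total(), "cm^2")
```

The radius comes out near 2.82 x 10⁻¹³ cm, and the numerical integral agrees with (8π/3)r₀², about 0.665 x 10⁻²⁴ cm².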
We also note that the scattered radiation is unchanged in frequency. If there were a significant momentum transfer, we would expect the electron to recoil and take some energy with it, so that the scattered radiation would be of smaller frequency. This, in fact, becomes important when the photon momentum ħω/c is not negligible compared to mc. The condition for the validity of the Thomson formula is also ħω/mc² << 1, but energy is not really the significant parameter. When this condition does not hold, we must consider the recoil of the electron, which we shall do in the next section.
The Thomson formula applies to a single free electron. If we have scattering from Z electrons, we can add the scattered powers to obtain a cross section Zσ provided the scattered amplitudes are incoherent, which means that in the average value of the square of the amplitudes the cross terms average to zero due to random phases, so that |Σa|² = Z|a|². If the phases are not random, then |Σa|² can be as large as |Za|² = Z²|a|². This is coherent scattering, where the intensity may be gathered into sharp peaks, as in X-ray scattering by crystals. In Thomson scattering, the cross section may peak sharply in the forward direction when the scattered amplitudes add coherently, as from the Z electrons in a single atom.
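The difference between incoherent and coherent addition can be illustrated with a small numerical experiment. This is only a sketch under simple assumptions: each of the Z scatterers contributes an amplitude of unit magnitude, and "random phases" means phases drawn uniformly from 0 to 2π. The function names and the fixed random seed are mine.

```python
import cmath
import math
import random

def summed_intensity(phases):
    """|sum of unit amplitudes exp(i*phase)|^2 for the given list of phases."""
    total = sum(cmath.exp(1j * p) for p in phases)
    return abs(total) ** 2

def average_incoherent_intensity(z, trials=2000, seed=0):
    """Average |sum a|^2 over many sets of z independent random phases."""
    rng = random.Random(seed)  # seeded so the result is repeatable
    acc = 0.0
    for _ in range(trials):
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(z)]
        acc += summed_intensity(phases)
    return acc / trials

if __name__ == "__main__":
    z = 6
    print("coherent  :", summed_intensity([0.3] * z))         # all phases equal
    print("incoherent:", average_incoherent_intensity(z))     # random phases
```

With equal phases the intensity is Z² times that of a single scatterer, while the random-phase average settles close to Z, since the cross terms average away.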
In 1922, Arthur Holley Compton studied the energy distribution of X-rays scattered at 90° by a graphite target. In addition to X-rays of unchanged wavelength, he found a peak due to X-rays of slightly larger wavelength, whose position varied with the scattering angle. This became known as the Compton Effect, which Compton explained as an elastic collision of an X-ray photon with an atomic electron. The recoiling electron carried away some of the energy of the incident photon, leaving a photon of smaller energy. A simple calculation on the basis of the conservation of energy and momentum gave the wavelength of the scattered X-ray as a function of scattering angle, which was fully confirmed by experiment.
A diagram of such a collision in the laboratory system is shown at the right. The electron of rest mass m is initially at rest, while the incident X-ray photon approaches from the left. The scattered photon leaves at an angle φ, the recoil electron at an angle θ. For each particle, the energy and momentum are shown in the diagram. These quantities are the relativistic values, with β = v/c and γ = 1/√(1 - β²), where v is the recoil velocity of the electron. Conservation of energy gives h(f - f') = mc²(γ - 1), where f is the frequency of the X-ray before collision, and f' the frequency after. Conservation of momentum is expressed by two equations. One is f - f'cosφ = (γmc²β/h) cosθ = K cosθ, and the other is f'sinφ = K sinθ. Squaring the two equations and adding, we get f² - 2ff'cosφ + f'² = K², which eliminates the angle of recoil θ. Subtracting the square of the energy-conservation equation, (f - f')² = [mc²(γ - 1)/h]², gives 2ff'(1 - cosφ) = K² - [mc²(γ - 1)/h]². The right-hand side reduces to 2(mc²/h)²(γ - 1) = 2(mc²/h)(f - f'), using γ²(β² - 1) = -1, and the conservation of energy again. The result is f - f' = ff'(h/mc²)(1 - cosφ). Since f = c/λ, (f - f')/ff' = (1/c)(λ' - λ), so finally we get λ' - λ = (h/mc)(1 - cosφ), a quite elegant result.
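The result can be checked against the conservation laws directly. The sketch below takes the scattered wavelength from λ' - λ = (h/mc)(1 - cosφ), finds the electron's recoil momentum from momentum conservation, and then tests whether relativistic energy is conserved. The constants are standard cgs values; the function names and the sample wavelength of 0.0711 nm are illustrative assumptions, not data taken from Compton's experiment.

```python
import math

H = 6.62607e-27        # Planck's constant, erg s
M_E = 9.10939e-28      # electron mass, g
C = 2.99792458e10      # speed of light, cm/s

def compton_scattered_wavelength(lam, phi):
    """Scattered wavelength (cm) for incident wavelength lam (cm) and angle phi (rad)."""
    return lam + (H / (M_E * C)) * (1.0 - math.cos(phi))

def energy_residual(lam, phi):
    """Relative failure of energy conservation; should vanish to rounding error."""
    lam2 = compton_scattered_wavelength(lam, phi)
    e_in, e_out = H * C / lam, H * C / lam2
    # Electron momentum components fixed by momentum conservation.
    px = H / lam - (H / lam2) * math.cos(phi)
    py = -(H / lam2) * math.sin(phi)
    rest = M_E * C ** 2
    e_electron = math.sqrt((px ** 2 + py ** 2) * C ** 2 + rest ** 2)
    return (e_in + rest - e_out - e_electron) / e_in

if __name__ == "__main__":
    lam = 7.11e-9  # 0.0711 nm, an assumed sample X-ray wavelength
    for deg in (45, 90, 135, 180):
        phi = math.radians(deg)
        shift = compton_scattered_wavelength(lam, phi) - lam
        print(deg, shift, energy_residual(lam, phi))
```

The shift at 90° is just h/mc, and the energy residual is zero to within rounding at every angle, as the algebra requires.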
Consulting several references, I found that the authors did not present this algebra, merely waving their arms and usually saying that it was complicated, which probably means that they had trouble doing it. Therefore, I have given it here to show that it is simple (provided you do it in the order shown). It is also easy to use nonrelativistic mechanics, where the kinetic energy of the recoil electron is mv²/2 and its momentum is mv. The manipulations are much like those above, except that near the end you get (f - f')[1 - h(f - f')/2mc²] instead of just (f - f'). Then you argue that the second term in the brackets (half the energy transfer divided by the rest energy of the electron) is very small and can be neglected whenever nonrelativistic mechanics is a good approximation, and get the same result as the relativistic theory. It is easier just to use relativistic mechanics and get an exact result!
The quantity (h/mc) is called the Compton wavelength, with the value 2.4263 x 10⁻³ nm. It is the increase in wavelength at a scattering angle of 90°, such as Compton used, and is typical. For visible radiation, with λ = 500 nm, it amounts to about 5 parts per million, so is generally lost in the line width. In scattering from bound electrons, the mass in most scattering events is the atom mass, so there will be no Compton effect because of negligible recoil. For both these reasons, the Compton effect is not observed with visible light, but it still occurs whenever there is a recoil.
Radiation with photon energies from about 1 keV to 100 keV, or wavelengths from around 1 nm to 0.01 nm, is generally called X-rays, and radiation of greater energy γ-rays. The Compton effect is easily detected for X-rays, since the Compton wavelength is a considerable fraction of the wavelength (from 0.002 to 0.2). A photon whose wavelength equals the Compton wavelength has an energy hc/λ = mc² = 0.511 MeV, exactly the rest energy of the electron. The ratio (hf/mc²) varies from 0.002 to 0.2 for X-rays, so the effect of electron recoil can be important.
A competing process is the complete transfer of the energy of the photon to an electron, which is called the photoelectric effect. It is easy to show that energy and momentum cannot be conserved in this process, so it occurs only when the excess momentum can be simultaneously transferred to the atom. The theory is involved, so we will not consider it here. Another competing process, for photons of energy greater than 1.022 MeV, is pair production, in which the photon disappears and is replaced by an electron and a positron, its antiparticle. Considering these things would take us too far from our aims, but the reader can find them discussed where the absorption of gamma rays in matter is discussed. X-rays can eject electrons from atoms, creating positive ions, and tracks of recoil electrons can even be seen in a cloud chamber.
The cross-section for Compton scattering is very closely related to the Thomson formula. In fact, dσ/dΩ = r₀²(k'/k)² |ε·ε'|², where k' and k are the magnitudes of the wave vectors for the scattered and incident radiation. This formula holds for spinless particles. The theory is rather involved, finally leading to the famous Klein-Nishina formula for the general case, so a description will not be attempted here.
Now, k/k' = λ'/λ = (λ + λ' - λ)/λ = 1 + (λ' - λ)/λ = 1 + (hf/mc²)(1 - cosφ). Therefore, the factor k'/k = 1/[1 + (hf/mc²)(1 - cosφ)]. At φ = 180°, the factor is 1/[1 + 2hf/mc²]. This shows that the cross section is reduced for backscattering. If hf = mc² (0.5 MeV gamma-rays), the cross section is smaller by a factor (1/3)² = 0.11.
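The effect of recoil on the angular distribution is easy to tabulate. This sketch assumes the spinless form of the cross section, in which the Thomson result is multiplied by (k'/k)², with k'/k = 1/[1 + (hf/mc²)(1 - cosφ)]. It is not the full Klein-Nishina formula, and the function names are mine.

```python
import math

def wavevector_ratio(x, phi):
    """k'/k for photon energy x = hf/mc^2 and scattering angle phi (rad)."""
    return 1.0 / (1.0 + x * (1.0 - math.cos(phi)))

def recoil_reduction(x, phi):
    """Factor (k'/k)^2 multiplying the Thomson cross section (spinless case)."""
    return wavevector_ratio(x, phi) ** 2

if __name__ == "__main__":
    for x in (0.002, 0.2, 1.0):          # soft X-rays, hard X-rays, hf = mc^2
        row = [round(recoil_reduction(x, math.radians(d)), 4) for d in (0, 90, 180)]
        print(x, row)
```

Forward scattering is never reduced, since there is no recoil at φ = 0, while the backward direction is suppressed more and more as hf approaches mc².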
We now should have a good picture of what happens when electromagnetic radiation falls on an electron. Compton scattering supplements Thomson scattering by allowing for recoil of the electron. It is very interesting to see how far classical concepts can give reasonable results, when adjusted here and there to account for quantum effects.
Electrons are more than just electromagnetic particles, and run in a mysterious company of their own. One of the astounding properties of radioactivity is that atoms emit electrons of high energy, which were named "beta rays" early on. Some are orbital electrons that were already present, but others are new electrons produced in the nucleus, and these include not only negative electrons, but their positive antiparticles, positrons. It is impossible for electrons to exist as such in the nucleus, so any electrons coming from the nucleus must be newborn.
Although electrons cannot exist completely inside the nucleus, the wave functions of s electrons are not zero there, so there is some "handle" on them for nuclear processes. Sometimes a nucleus can make a transition from an initial state to a more stable state by emitting a photon, just as an excited atom emits radiation when the electrons rearrange themselves more comfortably. The energy differences are, however, much larger, from many keV up to MeV instead of eV, so the photons are gamma rays. It is usually easier for the nucleus to grasp the handle on an orbital electron and fling it away than to emit a photon, though the processes compete. These internal conversion electrons have definite energies, and leave the atom in an excited state from which it usually emits characteristic X-rays. The process is a nuclear one, not the emission of a gamma ray followed by the photoelectric effect on its exit from the atom.
In some cases, the nucleus would be more stable if a neutron were replaced by a proton, or a proton by a neutron. For example, if there were an unpaired proton and an unpaired neutron in the nucleus, such a change would result in all the nucleons being paired up, probably a more stable configuration. In most naturally radioactive isotopes, the nucleus has too many neutrons as the result of a previous alpha-decay, and would like to change one into a proton to even up the neutron-proton ratio. Fermi showed that there was a force acting between neutrons, protons and electrons that would allow a neutron to create a positive and negative charge within itself, then hold on to the positive charge while kicking the negative charge out as an electron. This force had to be of very short range, so it was called the weak force because it had no effect except in nuclear close quarters, where, of course, it is locally powerful. It is strong enough to accelerate an electron to MeV energies over nuclear distances, it must be remembered, in opposition to electrical forces that would tend to attract the electron.
Although the change in nuclear energy in beta-decay was quite definite, the emitted electrons had a puzzling continuous distribution of energies. The maximum energy agreed with the change in nuclear energy levels, but the electron energy went down to quite small values, as if energy were being lost. The only reasonable answer was that another particle was emitted that carried off the missing energy, but that this particle could not be detected by the means then available. Since then, this particle has indeed been observed, though its existence was already accepted as a fact because of overwhelming circumstantial evidence. It was called the neutrino, of zero or very nearly zero mass and spin 1/2, like the electron's. There is a whole family of neutrinos and antineutrinos, related to electrons and their fatter relatives the muon and tauon. The neutrino emitted in negative beta-decay is an electron antineutrino. Electrons and electron neutrinos both have the property of "electron-ness," the electron of +1, the electron antineutrino of -1. If an electron antineutrino is created simultaneously with an electron in beta decay, then the net change in electron-ness is 0, or electron-ness is conserved, which appears to be a natural rule. In beta decay, nucleon number is also conserved, 1 neutron to 1 proton.
Negative beta decay is, symbolically, n0 → p+ + e- + ν'0, which conserves charge, nucleon number, electron number, energy and momentum. The prime distinguishes an antiparticle, and the charge is written as an exponent. Fermi calculated the electron energy spectrum to be expected under many different conditions, and his results agreed with experiment. This amounted to all but conclusive proof of the existence of the neutrino. This alone would have secured Fermi's place as one of the greatest physicists of all time, but it is only one remarkable contribution of many.
Some artificially created isotopes have too many protons, and would like to see fewer. This can be accomplished by positive beta decay, p+ → n0 + e'+ + ν0, where an antielectron and a neutrino are emitted. Except that the positron is helped out by the positive nuclear charge, it is the same as negative beta decay in principle. In this case, a related process is possible, p+ + e- → n0 + ν0. This is called K-capture, since the electron is usually one of the s electrons in the K shell, but can be any s electron. Now, only the neutrino comes out, and the atom emits the usual X-rays to put its electrons in order. If the half-life for positron emission is long, K-capture is the preferred method of decay.
These reactions can be reversed, such as ν0 + n0 → p+ + e-, but the probability of such reactions is extremely small. There is a lot of nuclear activity in the sun, and a huge number of neutrinos are emitted which bathe the earth. It was expected that this large flux would permit the neutrino to be detected, and would even throw light on nuclear processes in the sun. Disappointingly, very few such inverse neutrino reactions were observed, and it followed that the neutrino flux from the sun was much less than expected. It is now thought that neutrinos from the sun are not pure electron neutrinos, which would cause the reactions, but in a state that mixes them with mu or tau neutrinos, which would not. As the neutrinos move, they oscillate back and forth between being electron neutrinos and mu neutrinos. When they get to the earth, they are in their mu phases, and so do not cause reactions. This was a great relief to solar theorists.
A few decades ago, the electromagnetic and weak forces were shown to be aspects of the same force, the unified electro-weak force, mediated by a heavy vector boson (particle of integral spin) as well as the massless photon. All the fermions (particles of half-integral spin) subscribe to the electro-weak force, which is responsible for beta decay as well as the electromagnetic field. This seems to me to be the last significant progress in physical theory, which has now gone off into speculations that are interesting but not conclusive, and some of which are not even provable. It would be very satisfying if some of this activity explained even the least of the mysteries that still surround the matters considered in this article, but it seems totally detached from reality.
Particle accelerators are an important application of the classical point electron model. The fields of a charge in arbitrary motion are studied in Relativistic Electrodynamics and the Field of a Point Charge, and the electron synchrotron is described there. The first machine accelerating electrons to high energies was the betatron, first operated successfully by D. W. Kerst in 1940. A much larger machine was put into operation at General Electric in 1945, producing 100 MeV electrons.
The betatron accelerates electrons without the use of a static electric field. Instead, the accelerating field is produced by a changing magnetic field that also serves to maintain the electrons in a circular orbit of fixed radius as they are accelerated. The electric field can be found from Faraday's Law, curl E = -(1/c)∂B/∂t, which in integral form is ∫E·ds = -(1/c)(d/dt)∫B·dA = -(1/c)dΦ/dt. With cylindrical symmetry, this is 2πrE = -(1/c)dΦ/dt. The tangential force on the electron is -eE = (e/2πrc)dΦ/dt, which equals the rate of change of tangential momentum p = γmv, where γ = 1/√(1 - v²/c²). Integrating from t = 0, p = 0, Φ = 0 to t, we find that eΦ/2πrc = γmv. This is the first expression for the momentum, coming from Faraday's Law.
The magnetic field also holds the electron in a circular orbit, and equating the magnetic force to the centripetal acceleration times the mass gives γmv²/r = eBv/c, or γmv = reB/c. Note that γ does not change in circular motion, since v is constant, so we can do this. This is a second expression for the momentum, coming from Newton's Law. If we equate our two expressions for the momentum γmv, we find eΦ/2πrc = reB/c, or B = Φ/2πr². This is the condition that the magnetic field at the orbit, B, will be just right to keep the accelerating electrons in an orbit of fixed radius. Since the average value of the magnetic field inside the orbit is Φ/πr² = <B>, we have <B> = 2B, the famous 1:2 condition. The average field inside the orbit must be twice the value of the field at the orbit. This can be arranged by properly shaping the pole faces, as shown in the diagram. It is a good exercise to verify that the sense of revolution in the orbit is consistent with the magnetic field, and that the induced electric field is in the correct direction to accelerate the electrons when the flux increases. Don't forget that the electronic charge is negative.
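To get a feeling for the magnitudes, here is a minimal sketch that turns the two momentum expressions into numbers in Gaussian units. The orbit field of 4000 gauss and orbit radius of 84 cm are illustrative values I have assumed, not the specifications of any particular machine, and the function names are hypothetical.

```python
import math

E_ESU = 4.803207e-10     # electron charge, esu
M_E = 9.10939e-28        # electron mass, g
C = 2.99792458e10        # speed of light, cm/s
ERG_PER_MEV = 1.602177e-6

def orbit_momentum(b_orbit, radius):
    """p = e B r / c (g cm/s) for orbit field b_orbit (gauss) and radius (cm)."""
    return E_ESU * b_orbit * radius / C

def kinetic_energy_mev(b_orbit, radius):
    """Relativistic kinetic energy of the electron on the orbit, in MeV."""
    pc = orbit_momentum(b_orbit, radius) * C
    rest = M_E * C ** 2
    return (math.sqrt(pc ** 2 + rest ** 2) - rest) / ERG_PER_MEV

def required_flux(b_orbit, radius):
    """Flux through the orbit demanded by the condition B = Phi / (2 pi r^2), gauss cm^2."""
    return 2.0 * math.pi * radius ** 2 * b_orbit

if __name__ == "__main__":
    b, r = 4000.0, 84.0   # assumed illustrative values
    phi = required_flux(b, r)
    print("kinetic energy:", kinetic_energy_mev(b, r), "MeV")
    print("average field :", phi / (math.pi * r ** 2), "gauss")
```

With these assumed values the kinetic energy comes out at about 100 MeV, and the average field inside the orbit is twice the field at the orbit, as the 1:2 condition requires.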
Electrons injected into the vacuum chamber may have orbits that rise above and below the midplane, and which oscillate radially as well. The magnetic field must be shaped so that these betatron oscillations are stable. The betatron introduced these considerations into accelerator design. As the magnetic field increases during acceleration, the oscillations are damped and a well-defined electron beam is produced. The maximum energy for a betatron is about 500 MeV. The electrons are allowed to strike a target at the end of each cycle of acceleration, where they produce powerful X-rays. In fact, the main use of betatrons is as an X-ray source. The electrons also radiate as they revolve in their orbits, producing synchrotron radiation. This radiation is treated at some length in the article on Relativistic Electrodynamics.
The magnet core of the General Electric betatron of 1945 weighed 130 tons. The large amount of iron required is one of the principal disadvantages of betatrons, especially in large sizes. The electron synchrotron overcame this disadvantage, since its magnets have only a guidance function.
While considering the magnificent contributions to science made in the past by organizations like General Electric, Bell Labs and Radio Corporation of America, it is sad to realize that the present-day successors will never again have the skills and intellectual resources to be able to do the like.
H. E. White, Introduction to Atomic Spectra (New York: McGraw-Hill, 1934). A classic introduction to the vector model and atomic spectra. See especially chapters 8 and 10.
J. D. Jackson, Classical Electrodynamics, 2nd ed. (New York: John Wiley & Sons, 1975). Section 11.8, pp. 541-547, explains the Thomas precession thoroughly. There is also an excellent account of Thomson scattering.
I. Kaplan, Nuclear Physics (Cambridge, MA: Addison-Wesley, 1955). A very accessible account of beta-decay is given in Chapter 14. Chapter 15 treats the passage of gamma-rays through matter, including Compton scattering. The algebra is given in Chapter 6. This excellent text, full of basic information, holds its value even today.
Composed by J. B. Calvert
Created 1 June 2003
Last revised 9 June 2003