\documentclass{article}
\usepackage{chicago}
\usepackage{latexsym}
\newcommand{\ie}{\mbox{i.\,e.\,\ }}
\newcommand{\iec}{\mbox{i.\,e.\,}}
\newcommand{\role}{r\^{o}le }
\newcommand{\eg}{\mbox{e.\,g.\,\ }}
\newcommand{\egc}{\mbox{e.\,g.\,}}
\newcommand{\etc}{etc.\,\ }
\newcommand{\vctr}[1]{\ensuremath{\mathbf{ #1 }}}
\newcommand{\ddt}{\ensuremath{\frac{\dr{}}{\dr{t}}}}
\newcommand{\dbd}[2]{\ensuremath{\frac{\dr{#1}}{\dr{#2}}}}
\newcommand{\pbp}[2]{\ensuremath{\frac{\partial #1}{\partial #2}}}
\newcommand{\pbpbp}[3]{\ensuremath{\frac{\partial^2 #1}{\partial #2 \partial #3}}}
\newcommand{\ket}[1]{\ensuremath{\left| #1 \right\rangle}}
\newcommand{\bra}[1]{\ensuremath{\left\langle #1 \right|}}
\newcommand{\bk}[2]{\ensuremath{\left\langle #1 | #2 \right\rangle}}
\newcommand{\proj}[2]{\ensuremath{\ket{#1} \bra{#2}}}
\newcommand{\tpk}[2]{\ensuremath{\ket{#1}\!\otimes\!\ket{#2}}}
\newcommand{\tpb}[2]{\ensuremath{\bra{#1}\!\otimes\!\bra{#2}}}
\newcommand{\matel}[3]{\ensuremath{\bra{#1} #2 \ket{#3}}}
\newcommand{\hilbert}[1]{\ensuremath{\mathcal{#1}}}
\newcommand{\op}[1]{\ensuremath{\widehat{\textsf{\ensuremath{#1}}}}}
\newcommand{\opad}[1]{\ensuremath{\op{#1}^{\dagger}}}
\newcommand{\id}{\op{\mathsf{1}}}
\newcommand{\denop}{\ensuremath{\rho}}
\newcommand{\tr}{\textsf{Tr}}
\newcommand{\nrm}{\frac{1} {\sqrt{2} } }
\newcommand{\mc}[1]{\ensuremath{\mathcal{#1}}}
\newcommand{\dr}[1]{\ensuremath{\mathrm{d} #1\,}}
\newcommand{\vct}[2]{\ensuremath{\left( \begin{array}{c} #1 \\ #2 \end{array} \right)}}
\newcommand{\be}{\begin{equation}}
\newcommand{\ee}{\end{equation}}
\newcommand{\e}[1]{\mathrm{e}^{#1}}
\begin{document}
\title{QFT, Antimatter, and Symmetry}
\author{David Wallace}
\maketitle
\begin{abstract}
A systematic analysis is made of the relations between the symmetries of a classical field and the symmetries of the one-particle quantum system that results from quantizing that field in regimes where interactions are weak. The results are applied to gain a greater insight into the phenomenon of antimatter.
\end{abstract}
\section{Introduction}
Quantum mechanics comes with a $U(1)$ symmetry built in. All quantum-mechanical theories are formulated on a complex Hilbert space; all quantum-mechanical systems obey a Schr\"{o}dinger equation that is invariant under phase transformations.
Classical field theory does not have a $U(1)$ symmetry built in, but one can be put in by hand. That is, ``complex'' fields are just as valid as real fields in classical field theory: they are just ordered pairs of real fields. And if the transformation of that ordered pair which corresponds to ``complex multiplication'' is a symmetry of the field equations, then the classical field theory has a $U(1)$ symmetry.
When the two sorts of $U(1)$ symmetries are present in the same theory --- that is, when we try to quantize a classical complex field --- it seems that interesting things happen. The quantization process seems to have to be changed; the quanta which emerge seem to come in particle and antiparticle varieties; much confusion ensues.
Indeed, sometimes one can get the impression that the rules for quantizing a classical field are fundamentally different depending on whether that field is real or complex. Folk history --- and, to some extent, real history --- may seem to support this: applying the real-field quantization process to a complex field leads to the pathology of negative energy states, which (it can seem) can be removed only by the ad hoc reinterpretation of those solutions as antiparticles.
By contrast, my theme in this paper is that the same quantization process applies to real and complex fields, but that needless confusion ensues as long as we forget that the two $U(1)$ symmetries are \emph{distinct}. One of them is a universal feature of all quantum theories; the other is specific to certain field theories, and in foundational issues as widely separated as the nature of the gauge principle and the origins of antimatter, it is vital that we make a careful distinction between the two.
In fact, to properly understand the quantization process and the origin of antimatter, it is necessary to develop a general theory of how classical symmetries behave under quantization, and the main content of this paper is exactly that. To be more precise: my main result is a systematic account of exactly what the relation is between the symmetries of a classical field theory and those of the corresponding quantum-mechanical particle.
Nothing here is exactly new: in particular, the importance of distinguishing two forms of complex symmetry in QFT has been stressed by \citeN{saunderscomplexnumbers}, and of course the details of what happens when one or other classical field is quantized can be teased out of innumerable QFT textbooks. But there is, I think, some value in appreciating the systematic shape of quantization theory as it applies to symmetries. A great deal that can seem like black magic is thereby displayed as simple and natural.
By way of motivation, I begin my account (in sections \ref{sect2}--\ref{sect3}) by considering two puzzling aspects of modern quantum mechanics. In section \ref{sect2}, I sharply criticise the standard presentation of the gauge principle in quantum mechanics; in section \ref{sect3} I try to persuade the reader just why antimatter is prima facie so puzzling.
In section \ref{linearclassical} I develop the classical theory of linear fields and their symmetries, as a precursor to the quantum theory of those fields which I present in section \ref{linearquantum}. Real field theories, of course, are not linear, and I sketch the connection between linear and nonlinear results in section \ref{nonlinear}. (My discussion takes place in Lagrangian QFT --- \iec, the sort used in theoretical particle physics and solid-state physics --- and is only partially applicable to axiomatic and algebraic approaches to QFT. See~\citeN{bakerhalvorson} for a discussion of antimatter from the algebraic perspective.)
Sections \ref{irreducibility} and \ref{generalise} are the heart of the paper. Here I set out the details of what the symmetries of the classical field theory do and do not entail for symmetries of the corresponding quantum-mechanical particles; in the process, I clarify just why quantizing complex fields leads to the antimatter phenomenon, and return to the question of the gauge principle. Sections \ref{irreducibility}--\ref{generalise} deal with small symmetries; section \ref{CPT} extends the theory to parity, conjugation and time reversal symmetries. In section \ref{conclusion} I make a few concluding remarks.
At various points in the argument I make fairly extensive use of the theory of complexification (of vector spaces, and of representations upon those spaces). I sketch the details in the main part of the paper, but to avoid breaking up the argument I have relegated a full and careful development of these results to an Appendix. In another appendix I apply the theory I have presented to the Standard Model of particle physics. (This material is relegated to an appendix since it is considerably more technical than the rest of the paper, and makes extensive use of results in QFT which space prevents me from explaining in detail.)
\section{First preamble: the gauge principle in non-relativistic quantum mechanics}\label{sect2}
There is a standard textbook argument to motivate the gauge principle in quantum mechanics, which goes something like this.
\begin{enumerate}
\item
Firstly, it is noted that global phase transformations of the wavefunction are not empirically detectable, and so $\psi(x)$ and $\mathrm{e}^{i \theta}\psi(x)$ are really just different representations of the same physical state.\footnote{Roughly, by a ``global'' transformation I mean one specified by a finite set of parameters; by a ``local'' transformation I mean one specified by a finite set of spacetime functions. But in any case, in this paper I make use of them only to motivate my analysis; they play no part in the analysis proper.}
\item Then it is argued that this should continue to be so if the phase transformation is performed not on the \emph{whole} wavefunction but only on a part of it --- or, equivalently, that $\psi(x)$ and $\mathrm{e}^{i \theta}\psi(x)$ should continue to be different mathematical representations of the same state \emph{even if $\theta$ is position-dependent}. The motivation for this move is normally that the probability of finding a particle at position $x$ is $|\psi(x)|^2$, a quantity which is invariant under even position-dependent phase transformations.
\item It is then noted that the Schr\"{o}dinger equation is not invariant under such a transformation, so a new (classical) field, the $\vctr{A}$-field (often called the \emph{connection}), is introduced to compensate for this. In more detail, $\nabla$ is replaced with $(\nabla-i q \vctr{A})$, and a phase transformation $\psi\rightarrow\mathrm{e}^{i\theta}\psi$ is required to be accompanied by the replacement of $\vctr{A}$ with $\vctr{A}+q^{-1}\nabla \theta$.
\item Finally, a (classical) dynamics is introduced for \vctr{A}, given by a Lagrangian density whose form is not forced by the argument but is in fact taken to be the usual Lagrangian of vacuum electromagnetism.
\end{enumerate}
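
The compensation described in step 3 can be checked in one line. The following is a sketch under the conventions stated there (phase factor $\e{i\theta}$, derivative operator $\nabla-iq\vctr{A}$); the shift of \vctr{A} is written as an unknown $\vctr{a}$, a symbol introduced here only for illustration.
\[
\bigl(\nabla-iq(\vctr{A}+\vctr{a})\bigr)\bigl(\e{i\theta}\psi\bigr)
=\e{i\theta}(\nabla-iq\vctr{A})\psi+i\,\e{i\theta}(\nabla\theta-q\vctr{a})\psi .
\]
The unwanted second term vanishes exactly when $q\vctr{a}=\nabla\theta$, so the shift is fixed by the gradient of $\theta$; its sign and its factor of $q$ depend on how $\theta$ and $q$ are normalised. Because the shift is a pure gradient, $\nabla\times\vctr{A}$ is unchanged by the transformation.
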
It's not a very good argument. It's long been recognised that the last step (in which \vctr{A} becomes a dynamical player) needs further motivation: all that the first three steps motivate is the existence of a static, possibly even flat (\iec, satisfying $\nabla \times \vctr{A}=0$) connection (see, \egc, \citeNP{anandanbrown}). But in fact the problems begin earlier than this.
\begin{enumerate}
\item For a start, there is something rather awkward about the setting in which the argument is presented. We are dealing with a \emph{quantum} system (nonrelativistic particle mechanics), yet we are motivated to introduce a \emph{classical} connection (with associated classical dynamics). To be sure, we are used to the unfortunate process of constructing a theory classically (applying various arguments to justify its form) and then quantizing it, but something seems yet more unfortunate about constructing our theories at the part-quantized level. We could, of course, see the argument as occurring after first quantization (which gives us quantum particle mechanics) but before second quantization (which gives us quantum field theory, including a properly quantum theory of the connection). But it has long been recognised that ``second quantization'' is not a perspicuous way to understand quantum field theory, and that we do better to begin with a classical field and quantize it only once.
\item Consider the second step: going from a global to a local symmetry. This is motivated by the claim that \emph{position} measurements are invariant under local symmetries; but position measurements are not the only measurements that can be made, and more seriously, the squared modulus of the wavefunction is not the only physically relevant property it has. Why not apply the gauge principle to the momentum-space representation of the quantum state? That gives a perfectly well-defined physical theory, but one in which the ``magnetic field'' would be a field on momentum space rather than physical space, and so would be very different to the magnetic fields actually observed in nature.
\item Even supposing that position measurements really are preferred in a defensible sense, we still have problems. For the wavefunction, in general, is not defined on physical space at all: for a system of $N$ particles it is defined on $3N$-dimensional \emph{configuration space}. And if we perform a phase transformation where the phase is not only spatial-position-dependent but \emph{configuration-space-position-dependent}, again this will leave all probabilities of measurements of particle position unaffected. Again we will get a perfectly well-defined theory this way (with a connection on configuration space rather than on physical space); again this theory bears little resemblance to anything actually found in the world.
\item Changing tack: how do neutral particles fit into the argument? It is supposed to apply to the wavefunction of any one-particle system, yet neutral particles do not interact at all with the \vctr{A}-field. (I ignore magnetic effects here: they are in any case not predicted by the gauge argument).
\item Finally, the argument creates a worrying disanalogy between \emph{electromagnetic} gauge theory and other gauge theories (such as the SU(3) theory that is quantum chromodynamics (QCD) and the spontaneously broken SU(2) theory which contributes to the weak interaction). Superficially the arguments look similar: in QCD, for instance, particles have an internal (``colour'') degree of freedom, and global rotations of that degree of freedom are a symmetry of the system's dynamics. We can localise that symmetry if we introduce a (rather complicated) connection, an su(3)-valued vector field,\footnote{That is, by a vector field whose components are elements of su(3) rather than real numbers (or, if a coordinate-free definition is desired, a field of linear maps from one-forms into su(3)).} and we can postulate a dynamics for it just as for the electromagnetic connection.
But the SU(3) symmetry here is a very different object from the phase symmetry which grounds electromagnetism according to the standard argument. Only very special theories have an SU(3) internal symmetry, whereas \emph{every} quantum theory has global phase as a symmetry. In fact, calling it a ``symmetry'' at all is generous: it is natural, and common, to formulate QM in forms which eliminate it entirely (in terms of density operators, for instance, or of rays on Hilbert space). Global SU(3), by contrast, is a symmetry in the fullest sense, as much so as translation or rotation. (Note that SU(3) symmetry generates perfectly respectable conserved quantities; the conserved quantity associated with global phase symmetry, by contrast, is trivial.)
\end{enumerate}
In fact, it seems to me that the standard argument feels convincing only because, when using it, we forget what the wave-function really is. It is not a complex classical field on spacetime, yet the standard argument, in effect, assumes that it is. This in turn suggests that the true home of the gauge argument is not non-relativistic quantum mechanics, but classical field theory.
On the other hand, the standard argument \emph{works}. If the gauge principle is really understood at the field-theoretic level, presumably the electromagnetic interaction in non-relativistic quantum mechanics must be understood by thinking of that theory as a low-energy special case of quantum field theory. But then, what explains the success of the standard argument when applied \emph{directly} to non-relativistic quantum mechanics? ``Coincidence'' feels unsatisfactory.
\section{Second preamble: antimatter}\label{sect3}
So much for \emph{local} internal symmetries of quantum theory. A mystery of a rather different kind emerges when we consider just the \emph{global} symmetries. Consider: from a classical perspective a complex field is just a real field with an internal degree of freedom,\footnote{This is not \emph{quite} true, even classically. Massless spinor fields are irreducibly complex, and Lorentz transformations apply phase shifts to the field as well as rotating it on spacetime. But nothing similar happens for massive spinor fields (as the Majorana representation of the Dirac equation demonstrates), or for scalar and vector fields, so this phenomenon cannot be the root of the difference between the $U(1)$ internal symmetry and other internal symmetries.} not different in kind to, say, a field taking values in $\mathrm{R}^3$ or $\mathrm{C}^2$. In terms of symmetries, the complex field just has an internal $U(1)$ symmetry, whereas the field taking values in $\mathrm{R}^3$ has an $SO(3)$ internal symmetry and the field taking values in $\mathrm{C}^2$ has a $U(2)$ internal symmetry.
What happens when we quantize theories with internal symmetries like these? In general, things proceed rather as we would expect: the particles that emerge from the quantization have an internal degree of freedom. So quarks, for instance, have a three-dimensional internal degree of freedom which we call colour, resulting from the SU(3) symmetry of the quark field. It is convenient to pick a basis in that internal space and label its vectors \emph{red, green, blue}, but that choice of basis is arbitrary --- we could just as easily have chosen \emph{$\nrm$(red+green), $\nrm$(red-green), blue}. We might expect, then, that something similar would happen when we quantize a theory with a $U(1)$ symmetry: that is, we might expect to find such theories have a two-dimensional internal degree of freedom, with no particularly preferred basis in it.
If we did expect that, we would be astonished. What actually happens, of course, is that quantizing a complex field produces \emph{antimatter}. To be sure, the existence of matter and antimatter versions of the quantized field's particles is an additional degree of freedom, but it behaves altogether differently from the SU(3) case. There is nothing at all arbitrary about our preference for the \emph{matter, antimatter} basis over the \emph{$\nrm$(matter + antimatter), $\nrm$(matter - antimatter)} basis, and nothing at all arbitrary about saying that our world is made up of one of the two kinds of particles rather than of a linear superposition of the two kinds.\footnote{The reader who doubts this is invited to perform the following experiment, preferably by remote control and far away from any population centre: prepare a few kilograms of particles which are either (a) all definitely matter or all definitely antimatter, or (b) all definitely in a matter-antimatter superposition; isolate them from their environment; wait for a few milliseconds. If any living thing remains within a quarter-mile of your laboratory, you have case (a). (At the mathematical level, if each individual atom is in a superposition $(1/\sqrt{2})(\ket{\mbox{matter}}+\ket{\mbox{antimatter}})$, the tensor product of such states for a macroscopic number $N$ of such atoms can be expanded into states containing $M$ antimatter atoms and $N-M$ matter atoms; basic combinatorics tells us that the expansion is dominated by terms containing $\sim N/2$ of each, and quantum electrodynamics tells us that the matter particles will quickly annihilate the antimatter particles in each superposition.)}
So we have a puzzle: why does quantizing a complex field not give rise to particles with an internal symmetry? (Or, if you prefer: why does quantizing fields with other internal symmetries not give rise to some generalisation of antimatter?)
Hopefully, you should by now have the impression --- both from antimatter, and from the gauge argument --- that something odd and interesting happens in the quantization of complex fields. We now turn to the main business of the paper: establishing just how that quantization process works.
\section{Linear classical fields and their symmetries}\label{linearclassical}
We begin our analysis with linear field theories, partly because in the classical context they are the simplest field theories there are, and partly because of the well-known duality between a classical linear field theory and a one-particle quantum theory. Any such theory can be specified by:
\begin{itemize}
\item A semi-Riemannian manifold \mc{M}, representing spacetime. For simplicity, I will always take \mc{M} to be \emph{Minkowski} spacetime, although most of the results generalise to other spacetimes.
\item A real or complex vector space \mc{V}, and an associated \mc{V}-bundle (that is, vector bundle with typical fibre \mc{V}) over \mc{M}, whose sections represent kinematically possible fields. (The technical details of vector bundles play little role in what follows: the reader can, if desired, safely replace ``section of a \mc{V}-bundle over \mc{M}'' with ``smooth function from \mc{M} to \mc{V}.'')
\item A Lagrangian density, quadratic in the fields and their first derivatives, which determines which kinematically possible fields are dynamically possible via the Euler-Lagrange equations.
\end{itemize}
Examples include:
\begin{description}
\item[A. Real Klein-Gordon theory:] Here the vector space is just the real line, so that fields are smooth real-valued functions on \mc{M}. The Lagrangian density is
\be
\mc{L}=\frac{1}{2}(\partial_\mu \varphi \partial^\mu \varphi + m^2 \varphi^2)
\ee
and the associated equation of motion is
\be
(\partial^\mu \partial_\mu + m^2)\varphi=0.
\ee
\item[B. Complex Klein-Gordon theory:] Here instead we take fields to be smooth \emph{complex-valued} functions on \mc{M}, use the Lagrangian density
\be
\mc{L}=\frac{1}{2}(\partial_\mu \varphi^* \partial^\mu \varphi + m^2 \varphi^*\varphi)
\ee
and obtain the same equation of motion as in the real case.
\item[C. Klein-Gordon theory with internal degrees of freedom:] Here we take \mc{V} to be any finite-dimensional real vector space (whose vectors we write $v^a$) on which $h_{ab}$ is a positive-definite metric. The fields are now \mc{V}-valued functions, the Lagrangian is
\be
\mc{L}=\frac{1}{2}(h_{ab}(\partial_\mu \varphi^a \partial^\mu \varphi^b) + m^2 h_{ab}\varphi^a\varphi^b)
\ee
and the equation of motion is
\be
(\partial^\mu \partial_\mu + m^2)\varphi^a=0.
\ee
(The further generalisation to complex Klein-Gordon theory with internal degrees of freedom is straightforward.)
\item[D. Weyl spinor theory:] Here \mc{V} is a two-dimensional complex vector space (more precisely, the vector bundle for the theory is the spin bundle over \mc{M}) on which $\epsilon_{ab}$ is a completely antisymmetric 2-tensor. Fields can then be thought of as ordered pairs of complex functions on \mc{M}, and we write them as $\phi^a$. One possible form of the Lagrangian density is now
\be
\mc{L}=\epsilon_{ab}\phi^a(\partial_0+\sigma^i \partial_i)\phi^b
\ee
(where $\sigma^1,\sigma^2,\sigma^3$ are the Pauli matrices)
and the equation of motion is in any case
\be
(\partial_0+\sigma^i \partial_i)\phi^b=0.
\ee
Alternatively, if we take the Lagrangian to be
\be
\mc{L}=\epsilon_{ab}\phi^a(\partial_0-\sigma^i \partial_i)\phi^b
\ee
then the equations of motion are
\be
(\partial_0-\sigma^i \partial_i)\phi^b=0.
\ee
So there are actually two kinds of Weyl fields, referred to as ``left-handed'' and ``right-handed'' in recognition of the fact that one is a mirror image of the other.
\item[E. Dirac spinor theory:] Dirac theory can be written in a number of ways, but for our purposes a convenient way is to use the bispinor formalism: take \mc{V} as the direct sum of two two-dimensional complex vector spaces. A field is then an ordered pair $(\phi^a,\chi^b)$ of complex 2-vector fields. I omit the detailed form of the Lagrangian and simply give the dynamical equations:
\be
(\partial_0+\sigma^i \partial_i)\phi^b-im \chi^b=0;\,\,\,\,
(\partial_0-\sigma^i \partial_i)\chi^b-im \phi^b=0.
\ee
In effect, a Dirac field is a left-handed and a right-handed Weyl field, coupled together so as to restore mirror symmetry.
\item[F. Majorana spinor theory:] If we impose the condition $\chi^a=i\sigma^2 \phi^{*a}$ on the Dirac spinor, we obtain the single equation
\be
(\partial_0+\sigma^i \partial_i)\phi^b+m\sigma^2 \phi^{*b}=0.
\ee
\item[G. Real vector theory:] Here \mc{V} is a copy of Minkowski spacetime (more precisely, the vector bundle for the theory is the tangent bundle over \mc{M}). Fields are then vector fields $A^\mu$ in the usual sense of that term; the Lagrangian is
\be
\frac{1}{4}(\partial_\mu A_\nu-\partial_\nu A_\mu)(\partial^\mu A^\nu-\partial^\nu A^\mu)+\frac{1}{2}m^2 A_\mu A^\mu
\ee
and the equations of motion are
\be
\partial_\nu A^\nu=0;\qquad(\partial^\mu \partial_\mu + m^2)A^\nu=0.
\ee
Replacing the Lagrangian with
\be
\frac{1}{4}(\partial_\mu A^*_\nu-\partial_\nu A^*_\mu)(\partial^\mu A^\nu-\partial^\nu A^\mu)+\frac{1}{2}m^2 A^*_\mu A^\mu
\ee
generalises this theory to a complex vector theory.
\item[H. Free gauge boson theory:] Here $\mc{V}=\mc{V}_{vect}\otimes \mathbf{g}$, where $\mc{V}_{vect}$ is a copy of Minkowski spacetime and $\mathbf{g}$ is the Lie algebra of the real, finite-dimensional Lie group \mc{G} (more precisely, the vector bundle for the theory is the tensor product of the tangent bundle over \mc{M} with the trivial $\mathbf{g}$-bundle over $\mc{M}$). Fields are then $\mathbf{g}$-valued vector fields $A^\mu$; the Lagrangian is
\be
\frac{1}{4}\tr\{(\partial_\mu A_\nu-\partial_\nu A_\mu)(\partial^\mu A^\nu-\partial^\nu A^\mu)\}
\ee
and the equations of motion are
\be
\partial^\mu \partial_\nu A^\nu- \partial_\nu \partial^\nu A^\mu=0.
\ee
\end{description}
Further generalisations are possible (albeit rarely physically relevant), and internal degrees of freedom can be added for vector and spinor fields in the same manner as for scalar fields. It should be noted that, in all these cases, the components of the various solutions always obey the Klein-Gordon equation.
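
The last remark can be checked directly in the simplest spinor case. As a sketch for the Weyl field of example D, and assuming the signature in which $\partial^\mu\partial_\mu=\partial_0^2-\partial_i\partial_i$, apply the opposite-handed operator to the equation of motion:
\begin{eqnarray*}
0 &=& (\partial_0-\sigma^j\partial_j)(\partial_0+\sigma^i\partial_i)\phi^b \\
  &=& (\partial_0^2-\sigma^j\sigma^i\partial_j\partial_i)\phi^b \\
  &=& (\partial_0^2-\partial_i\partial_i)\phi^b ,
\end{eqnarray*}
where the cross terms cancel because the $\sigma^i$ are constant, and the last step uses $\sigma^i\sigma^j+\sigma^j\sigma^i=2\delta^{ij}$ together with the symmetry of $\partial_j\partial_i$. Each component of $\phi^b$ therefore obeys the Klein-Gordon equation with $m=0$. For the Dirac field the same manipulation, with the second equation used to eliminate $\chi^b$, brings in the mass term.
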
So: it appears that free fields come in real and complex varieties, and also come with and without internal degrees of freedom. But this paper will take a somewhat different perspective: namely, \emph{complex fields are just special cases of real fields}. After all, any complex vector space is also a real vector space (of twice the dimension); any complex-linear theory is also a real-linear theory. Indeed, from this perspective, complex Klein-Gordon theory is just a special case of real Klein-Gordon theory with an internal degree of freedom.
To elaborate: a complex vector space is, by definition, an additive group of vectors together with a rule for multiplying vectors by complex numbers such that the multiplication rule obeys certain constraints. That rule can be broken down into two parts: a rule for multiplying vectors by \emph{real} numbers, together with a real-linear map from vectors to vectors which represents multiplication by $i$. If we write that map as \textbf{J}, we have that
\be
(\alpha + i \beta)\vctr{v}=\alpha \vctr{v}+\beta \textbf{J}\vctr{v}.
\ee
And in fact, if \mc{V} is \emph{any} real vector space with a real-linear map $\mathbf{J}:\mc{V}\rightarrow\mc{V}$ satisfying $\mathbf{J}^2=-1$, this equality suffices to make \mc{V} into a complex vector space; for this reason, any such map is known as a \emph{complex structure} (a more careful mathematical treatment may be found in the Appendix).
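
As a concrete illustration (a sketch, with the component names $\varphi_1,\varphi_2$ chosen here for convenience rather than taken from the text), let \mc{V} be the real plane, with a complex scalar field written as a pair of real fields via $\varphi=\varphi_1+i\varphi_2$. Multiplication by $i$ sends $\varphi_1+i\varphi_2$ to $-\varphi_2+i\varphi_1$, so
\[
\mathbf{J}=\left(\begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array}\right),\qquad
\mathbf{J}\vct{\varphi_1}{\varphi_2}=\vct{-\varphi_2}{\varphi_1},\qquad
\mathbf{J}^2=\left(\begin{array}{cc} -1 & 0 \\ 0 & -1 \end{array}\right)=-1 .
\]
With this choice the rule above gives $(\alpha+i\beta)(\varphi_1,\varphi_2)=(\alpha\varphi_1-\beta\varphi_2,\ \alpha\varphi_2+\beta\varphi_1)$, which is ordinary complex multiplication written in real components.
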
So no true generalisation of real linear field theory is gained by allowing complex fields as a separate case. Rather: a complex linear field is just a real-linear field defined on a \mc{V}-bundle such that (i) there is a complex structure \textbf{J} on \mc{V}, and (ii) if $\varphi$ is a solution to the equations of motion, so is $\mathbf{J}\varphi$.
Of course, (ii) is just another way of saying that multiplication by \textbf{J} is a symmetry of the theory, and this brings us on to the more general question of what the symmetries of a field theory are. For our purposes, a symmetry may be taken to be any smooth fibre-preserving map $f$ of the field bundle onto itself which takes dynamically possible fields to other dynamically possible fields. (Recall that a fibre-preserving map is one where two vectors initially in the same fibre are not taken to different fibres, so that $f$ induces a diffeomorphism $\overline f$ on \mc{M}; in less geometric language, a fibre-preserving map is a map $\overline f$ of the spacetime \mc{M} onto itself, together with a rule, for each spacetime point $x$, taking vectors at $x$ to vectors at $\overline f(x)$).
We will, however, be interested mainly in \emph{Lagrangian} symmetries: symmetries with the property that the Lagrangian density, evaluated at $\overline f(x)$ with respect to the new field $f\cdot\phi$, is equal to the Lagrangian density evaluated at $x$ with respect to the old field $\phi$. These in turn can usefully be divided into three categories:
\begin{enumerate}
\item \emph{Rigid internal symmetries}, where $f\cdot \phi(x)$ depends only on $\phi(x)$: all such symmetries can be written as
\be
f \cdot \phi(x)=U(\phi(x))
\ee
for some fixed $U:\mc{V}\rightarrow \mc{V}$.
%is generated by some smooth map of \mc{V} to itself: $f\cdot \phi (x)$=$U (\phi(x))$ for some fixed $U:\mc{V}\rightarrow \mc{V}$.
\item \emph{Gauge internal symmetries}, where $f\cdot \phi(x)$ depends on $\phi(x)$ and $x$: all such symmetries can be written as
\be
f \cdot \phi(x)=U(x)(\phi(x))
\ee
where $U(x)$ is some spacetime-dependent map from $\mc{V}$ to itself.
\item \emph{Spacetime symmetries}, where $f\cdot\phi(x)$ is not determined by $x$ and $\phi(x)$ alone.
\end{enumerate}
(My distinction between `rigid' and `gauge' symmetries is basically the same as the distinction between `global' and `local'; I resort to neologisms instead of using the standard terminology in order to avoid getting bogged down in semantic questions as to whether this is the ``right'' definition of a local symmetry.)
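
To fix ideas, here is one instance of each category, drawn from the examples above; the parameter names $\alpha$, $\Lambda$ and $a^\mu$ are introduced here only for illustration.
\begin{eqnarray*}
\mbox{rigid internal (example B):} & & (f\cdot\varphi)(x)=\e{i\alpha}\varphi(x),\quad \alpha \mbox{ constant};\\
\mbox{gauge internal (example H):} & & (f\cdot A)^\mu(x)=A^\mu(x)+\partial^\mu\Lambda(x);\\
\mbox{spacetime (any example):} & & (f\cdot\varphi)(x)=\varphi(x-a),\quad a^\mu \mbox{ constant}.
\end{eqnarray*}
In the first case $U$ is the same map at every point. In the second, $U(x)$ is the shift $v^\mu\mapsto v^\mu+\partial^\mu\Lambda(x)$, with $\Lambda$ a $\mathbf{g}$-valued function; it leaves $\partial_\mu A_\nu-\partial_\nu A_\mu$ unchanged because partial derivatives commute. In the third, the new field at $x$ is fixed by the old field at a different point, $x-a$.
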
As is well known, the symmetries of a theory form a group; clearly, the rigid internal symmetries form a subgroup of that group.
As usual, we also distinguish between ``small'' and ``large'' symmetries: the former, but not the latter, lie in the connected component of the symmetry group containing the identity.
Spacetime symmetries are usually generated by some underlying symmetry of the spacetime metric: in the examples above, the familiar (small) Poincar\'{e} symmetries are spacetime symmetries. My concern will mostly be with internal symmetries, and indeed mostly with \emph{rigid} internal symmetries. For brevity, in fact, I will drop the terms ``small'' and ``rigid'', so that an internal symmetry is small and rigid unless otherwise stated. In particular, I use ``internal symmetry group'' to denote that connected component of the group of rigid symmetries which contains the identity.
In the cases considered above, for instance:
\begin{itemize}
\item In cases $A$, $G$ and $F$ (real scalar and vector fields, and the Majorana spinor field), the internal symmetry group is trivial.
\item In cases $B$, $D$ and $E$ (complex scalar fields, and Weyl and Dirac spinor fields), the internal symmetry group is $U(1)$: the action of $\cos \theta \id + \sin \theta \mathbf{J}$ is a symmetry for any value of $\theta$, and $\theta$ generates the same transformation as $\theta + 2N \pi$.
\item In case $C$ (scalar fields with internal degrees of freedom), the internal symmetry group is $SO(N)$, where $N$ is the dimension of \mc{V}. (Note that, as a special case, when $N=2$ the internal symmetry group is $SO(2)\simeq U(1)$: as promised, complex scalar field theory is a special case of $C$.)
\item In case $H$ (free gauge boson theory), the internal symmetry group is $SO(\dim \mathbf{g})$, where $\dim \mathbf{g}$ is the dimension of the Lie algebra. Notice that this need not coincide with (the identity-containing component of) \mc{G}, the Lie group of which $\mathbf{g}$ is the Lie algebra. If $\mc{G}=SU(2)$, for instance, the internal symmetry group is $SO(3)$, which at least is locally isomorphic to $SU(2)$; if $\mc{G}=SU(3)$, then the internal symmetry group is $SO(8)$, which is a considerably larger group.
\end{itemize}
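
Two of the claims in this list can be checked by short calculations (a sketch; the symbol $R(\theta)$ is shorthand introduced here). Writing $R(\theta)=\cos\theta\,\id+\sin\theta\,\mathbf{J}$ and using $\mathbf{J}^2=-1$,
\begin{eqnarray*}
R(\theta)R(\theta') &=& (\cos\theta\cos\theta'-\sin\theta\sin\theta')\,\id+(\sin\theta\cos\theta'+\cos\theta\sin\theta')\,\mathbf{J}\\
&=& R(\theta+\theta'),
\end{eqnarray*}
so the $R(\theta)$ compose like the phases $\e{i\theta}$, and $R(\theta+2\pi)=R(\theta)$. For case $H$, recall that $SO(n)$ has dimension $n(n-1)/2$ and $SU(n)$ has dimension $n^2-1$, so that
\[
\dim SU(2)=3=\dim SO(3),\qquad \dim SU(3)=8,\qquad \dim SO(8)=28 .
\]
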
In the rest of this paper, I will need some further assumptions about the symmetry groups of the fields I consider. Firstly, I will assume that the internal symmetry group is compact (all physical fields seem to have this property). As is well known, this entails the existence of an inner product on \mc{V} invariant under internal symmetry transformations.\footnote{To construct this inner product explicitly, let $\{v,w\}$ be an arbitrary inner product on \mc{V}, and define
\[
(v,w)=\int_{\mc{G}}\dr{\mu}(g)\{g\cdot v,g \cdot w\},
\]
where \mc{G} is the internal symmetry group, $g\cdot v$ is the action of $g\in \mc{G}$ on the vector $v$, and $\mu$ is the Haar measure.}
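
The invariance of the averaged inner product can be verified in one step. As a sketch, take any fixed $h\in\mc{G}$ and use the invariance of the Haar measure under $g\mapsto gh$ (for a compact group the Haar measure is both left- and right-invariant):
\[
(h\cdot v,h\cdot w)=\int_{\mc{G}}\dr{\mu}(g)\{gh\cdot v,gh\cdot w\}
=\int_{\mc{G}}\dr{\mu}(g')\{g'\cdot v,g'\cdot w\}=(v,w),
\]
with $g'=gh$. Positivity of $(v,v)$ for $v\neq 0$ is inherited from the arbitrary inner product being averaged, since the integrand is positive for every $g$.
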
Secondly, I will assume that the internal symmetries (rigid and gauge) commute with the spacetime symmetries. (In the absence of supersymmetry, the Coleman-Mandula theorem proves that this must be the case; see, \egc, \citeN{wessbagger}.)
Finally, I will assume that the internal symmetry group acts \emph{irreducibly} on \mc{V}. In a sense this is no restriction at all: if the internal group acts reducibly (so that $\mc{V}=\mc{V}_1 \oplus \mc{V}_2$, and each $\mc{V}_i$ is invariant under the action of $g$) then we could perfectly well regard the theory as two fields instead of one, with one field taking values in $\mc{V}_1$ and the other in $\mc{V}_2$. One shallow reason for the requirement, then, is just that it is in some sense more ``natural'' to regard reducible theories as multi-field theories\footnote{This was basically the motivation adopted by Wigner in his classic~\citeyear{wignerclassification} classification of quantum \emph{particles} in terms of \emph{irreducible} representations of the Poincar\'{e} group.}; in the quantum case, renormalisation offers a deeper reason, as will be seen in section \ref{nonlinear}.
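The invariance of the group-averaged inner product used above can be checked in one line. As a sketch, assuming only that the Haar measure $\mu$ of a compact group is invariant under right translation $g\rightarrow gh$, for any $h\in\mc{G}$ we have
\[
(h\cdot v,h\cdot w)=\int_{\mc{G}}\dr{\mu}(g)\{gh\cdot v,gh \cdot w\}=\int_{\mc{G}}\dr{\mu}(g')\{g'\cdot v,g' \cdot w\}=(v,w),
\]
where $g'=gh$. Positivity of $(\cdot,\cdot)$ follows because the integrand is positive for every $g$ when $v=w\neq 0$.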
\section{Quantizing linear field theories}\label{linearquantum}
We are interested in extracting the properties of particles from a quantized
field theory, and particle phenomenology is associated with a free field theory
--- that is, with a field theory with linear dynamical equations. According
to the conventional position in theoretical particle physics (which I refer to hereafter as \emph{Lagrangian QFT}), this is not because of any
pathology of interacting field theories, but simply because particles are a useful
approximation, appropriate only in regimes where the interactions are relatively
small and can be treated perturbatively. According to various more formal approaches to QFT
(notably, algebraic quantum field theory), it is because we do
not understand how to quantize interacting field theories at all, the success of
Lagrangian QFT notwithstanding.
The Lagrangian position will come to the forefront in the next section, but for now, we can confine our attention
to the free-field case. We will also assume that the theory has no gauge symmetries (such symmetries are generally dealt with by adopting some gauge-fixing convention prior to quantization, though there are ways which preserve the gauge symmetry at the quantum level).
Because these theories have linear field equations, the solutions of these equations can be expressed as a sum of so-called \emph{normal-mode} solutions: solutions of the form
\be
\phi(x,t)=f_k(x)(C_k \exp(-i \omega_k t)+C_k^* \exp(+i \omega_k t))
\ee
for some function $f_k$ (which may take values in the internal space of the theory, for instance by being spinor-, vector-, or su(3)-valued), some complex number $C_k$, and some positive real number $\omega_k$ (for details, see any text on field theory). A general solution to the equations can then be written as
\be
\phi(x,t)=\int \dr{k}\left[f_k(x)C_k \exp(-i \omega_k t)+f_k^*(x)C_k^* \exp(+i \omega_k t)\right]
\ee
where the ``integral'' over $k$ is schematic, and might include discrete sums and/or continuous integrals.
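As a concrete illustration (an assumed example, not drawn from the general framework above): for the real Klein-Gordon field of mass $m$ on Minkowski spacetime, with field equation $(\partial_t^2-\nabla^2+m^2)\phi=0$, one may take the mode label $k$ to be a wave-vector and $f_k(x)=\e{i k\cdot x}$. Substituting $\phi=f_k(x)\exp(-i\omega_k t)$ into the field equation gives
\[
(-\omega_k^2+k\cdot k+m^2)\,\e{ik\cdot x}\exp(-i\omega_k t)=0,
\]
so that $\omega_k=+\sqrt{k\cdot k+m^2}$, and the general real solution is
\[
\phi(x,t)=\int \dr{^3k}\left[C_k\,\e{i(k\cdot x-\omega_k t)}+C_k^*\,\e{-i(k\cdot x-\omega_k t)}\right].
\]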
Quantizing these theories, in outline, is like quantizing any theory: the end product should be a Hilbert space (call it \mc{F}) representing the possible states of the field theory, together with various operators $\op{\psi}(x,t)$, $\partial_\mu \op{\psi}(x,t)$ which are the quantizations of the classical observables. (Since these field theories have internal degrees of freedom, their operators will have indices which range over those internal degrees of freedom.)
The quantization can be performed in a variety of ways, but the outcome is the same: the Hilbert space of the quantum field theory is the (symmetric or antisymmetric) \emph{Fock space}
\be
\mc{F}=\bigoplus_{N=0}^\infty \mc{S}_N\mc{H}^N_{1P}.
\ee
Here:
\begin{enumerate}
\item $\mc{S}_N$ is the $N$-fold symmetrisation or antisymmetrisation operator;
\item $\mc{H}^N$ is the $N$-fold tensor product of $\mc{H}$ (so $\mc{H}^3=\mc{H}\otimes \mc{H}\otimes \mc{H}$, for instance);
\item $\mc{H}_{1P}$ is the \emph{one-particle Hilbert space}.
\end{enumerate}
The symmetric case, of course, corresponds to bosons; the antisymmetric case, to fermions. So the problem of how to quantize a free field theory reduces to two questions: whether the field is bosonic or fermionic, and what the one-particle subspace is. My concern in this paper is entirely with the latter.
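To illustrate the notation (a sketch for the two-particle sector only, with $\psi$ and $\chi$ arbitrary one-particle states introduced here for the example): for $\psi,\chi\in\mc{H}_{1P}$,
\[
\mc{S}_2(\psi\otimes\chi)=\frac{1}{2}\left(\psi\otimes\chi\pm\chi\otimes\psi\right),
\]
with the upper sign for bosons and the lower for fermions. In the fermionic case, setting $\chi=\psi$ gives $\mc{S}_2(\psi\otimes\psi)=0$: no two-particle state has both particles in the same one-particle state, which is the exclusion principle.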
The one-particle subspace can be constructed from the space of solutions to the classical equations of motion in a fairly algorithmic way (again, the algorithm can be derived in a variety of ways). We begin with the real-linear space $\mc{S}$ of classical solutions to the field equations; recall that each element of \mc{S} is a section of a \mc{V}-bundle over \mc{M}. Then we complexify \mc{S}: that is, replace it with the space $\mc{S}^\mc{C}=\mc{S}\oplus \mc{S}$, equipped with the complex structure \textbf{J} defined by $\mathbf{J}(v,w)=(-w,v)$. This is equivalent to complexifying the \mc{V}-bundle (to produce a bundle whose typical fibre is the complexification $\mc{V}^\mc{C}$ of \mc{V}) and taking $\mc{S}^\mc{C}$ to be those sections of this bundle which satisfy the dynamical equations. (Again, a more careful mathematical discussion of complexification can be found in the Appendix).
Next, we fix a foliation of \mc{M} by hyperplanes, and a direction of increasing time along that foliation. This defines a time coordinate $t$, and any element of \mc{S} can thus be written as $\psi(\vctr{x},t)$, where $\psi(\vctr{x},t)$ is a vector in \mc{V}. We can then construct the Fourier transform $\hat\psi(x,\omega)$ of $\psi(x,t)$ with respect to $t$, and thus divide any solution into its positive and negative frequency parts: $\psi=\psi_+ +\psi_-$, where $\hat\psi_+(x,\omega)$ is non-zero only for $\omega>0$ and $\hat\psi_-(x,\omega)$ is non-zero only for $\omega<0$. This process is uniquely defined and linear, so it divides $\mc{S}^\mc{C}$ into positive and negative frequency subspaces $\mc{S}^\mc{C}_+$ and $\mc{S}^\mc{C}_-$. We discard the negative-frequency subspace and work only with the positive-frequency one.
In terms of the modal analysis above, the positive-frequency solutions are the solutions of the form
\be\phi(x,t)=\int \dr{k}f_k(x)C_k \exp(-i \omega_k t).
\ee
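For a single mode the splitting can be carried out by hand. As a sketch, adopting the convention (consistent with the modal form above) that $\exp(-i\omega t)$ with $\omega>0$ counts as positive-frequency, the real solution
\[
\phi(x,t)=2A f(x)\cos(\omega t)=A f(x)\exp(-i \omega t)+A f(x)\exp(+i \omega t)
\]
(with $A$ real, $f$ a real mode function and $\omega>0$) has positive-frequency part $\phi_+=A f(x)\exp(-i\omega t)$ and negative-frequency part $\phi_-=A f(x)\exp(+i\omega t)$. Here $\phi_-=\phi_+^*$, as must be the case for any real solution, so a real solution can be recovered from its positive-frequency part alone.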
Next, we provide $\mc{S}^\mc{C}_+$ with an inner product, which we do in two steps. Firstly, if $f$ and $g$ are positive-frequency solutions to the complex Klein-Gordon equation, we define
\be
(f,g)=\int \dr{^3k} \frac{1}{\omega(k)}\hat f^*(k,\omega(k))\hat g(k,\omega(k))
\ee
where now $\hat f(k,k_0)$ is the Fourier transform of $f(x,t)$ with respect to both $x$ and $t$, and where $\omega(k)=+\sqrt{k\cdot k+m^2}$. This inner product is well known to be invariant under (small) Poincar\'{e} transformations.
And secondly, if $h_{ab}$ is a (real) inner product on $\mc{V}$ invariant under the action of the internal symmetry group (recall that the existence of such an inner product is entailed by the compactness of the internal symmetry group), we can define
\be
\langle\varphi,\psi\rangle=h_{ab}(\varphi^{a*},\psi^b)
\ee
for any $\varphi,\psi\in \mc{S}^\mc{C}_+$.
The result is a (complex) inner product for the complex vector space $\mc{S}^\mc{C}_+$, invariant under both internal and spacetime symmetries. Lastly, we turn $\mc{S}^\mc{C}_+$ into a Hilbert space by completing in this norm; the resultant Hilbert space is the one-particle space.
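Combining the two steps, at least for fields each of whose components satisfies the Klein-Gordon equation, and writing $\hat\varphi^a(k,\omega(k))$ for the Fourier transform of the component $\varphi^a$ (notation assumed here for illustration), the inner product reads explicitly
\[
\langle\varphi,\psi\rangle=\int \dr{^3k} \frac{1}{\omega(k)}\,h_{ab}\,\hat\varphi^{a*}(k,\omega(k))\,\hat\psi^b(k,\omega(k)).
\]
If an internal symmetry acts by $\varphi^a\rightarrow R^a{}_c\varphi^c$, with $R$ real and independent of $x$ and $t$, then invariance of $h$, that is $h_{ab}R^a{}_cR^b{}_d=h_{cd}$, gives $\langle R\varphi,R\psi\rangle=\langle\varphi,\psi\rangle$ directly.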
The functions in this space at least look like conventional one-particle wave-functions. In particular, in the trivial case of a scalar field with no internal degrees of freedom, they are simply complex functions on \mc{M}; at any given time, they are just square-integrable complex functions on $\Sigma$. In the more general case of a field with internal degrees of freedom, they are maps from \mc{M} to the complexification $\mc{V}^\mc{C}$ of \mc{V}.
It is a somewhat complicated business to explain in exactly what sense these functions really \emph{can} be treated as wave-functions, but the bottom line is that actual physical practice is indeed to treat them as wave-functions, and that this practice appears to be justifiable. (See, \egc, chapters 1--3 of~\citeN{waldQFT} for the details, and~\citeN{wallaceqftloc} for an extended discussion.)
For our purposes, the crucial result is this:
\begin{quote}
Single particles of a quantized free field obey the complexified version of the original field's dynamical equations, with the extra restriction that they must be positive-frequency, and the Hilbert space norm for their wavefunctions is invariant under symmetry transformations.
\end{quote}
Unsurprisingly, there is a close relation between the symmetries of a classical field and the symmetries of the associated one-particle quantum theory. Since:
\begin{enumerate}
\item any classical symmetry leaves the dynamical equations of the classical linear field theory invariant;
\item any real transformation which leaves the dynamical equations invariant will also leave their complexification invariant;
\item the internal symmetries of the one-particle Hilbert space are exactly those Hilbert-space-norm-preserving transformations which leave the one-particle dynamical equations invariant;
\item the one-particle dynamical equations are just the complexification of the classical dynamical equations
\end{enumerate}
it follows that a classical symmetry is a symmetry of the one-particle subspace iff it takes positive-frequency solutions to positive-frequency solutions. In particular, the (small) Poincar\'{e} symmetries have this property; so do the internal symmetries.
It is natural to define internal symmetries of the one-particle theory in exactly the same way as for the classical theory. It follows that the internal symmetry \emph{group} of the one-particle theory is the same as for the classical theory, and the \emph{action} of that group is the \emph{complexification} of the action of the group on the classical theory (that is, the extension, by complex linearity, of the action on the real space of classical solutions to the complex space of positive-frequency solutions).
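A simple instance (a sketch for the two-component case, using the standard rotation matrix as an assumed parametrisation): for a two-dimensional real internal space \mc{V} with internal symmetry group $SO(2)$, a rotation through $\theta$ acts on the classical field by a real matrix, and the complexified action is given by the same matrix acting on $\mc{V}^{\mc{C}}$. There it is diagonalisable:
\[
\left(\begin{array}{cc}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{array}\right)\vct{1}{\mp i}=\e{\pm i\theta}\vct{1}{\mp i}.
\]
So although the real representation is irreducible, its complexification splits into two one-dimensional subspaces, on which the rotation acts as multiplication by $\e{i\theta}$ and $\e{-i\theta}$ respectively.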
\section{Nonlinear field theories}\label{nonlinear}
If all fields were linear, the world would be boring: it is through nonlinearity that interactions enter physics. The true significance of linear field theory is that (i) in some circumstances (such as interactions with an external potential) it is a sufficient approximation to nonlinear theory in its own right; (ii) more importantly, in many circumstances we can treat the non-linear part of a theory as a perturbation of the linear part. The full details of how this works in quantum field theory are well beyond the scope of this paper (and of limited relevance to its goals); however, some important insights can be gained from a semi-classical discussion of the process. For more details, see the appropriate sections of, \egc,~\citeN{peskinschroeder}, \citeN{chengli}, or \citeN{coleman}.
It will be helpful to begin with a simple example: consider the Lagrangian
\be
\mathrm{L}=\frac{1}{2}\dot x^2-V(x)
\ee
for a single one-dimensional particle, and suppose that $V$ is a smooth function with a global minimum at $x_0$. Then by elementary Taylor expansion, the Lagrangian can be written as
\be
\mathrm{L}=\frac{1}{2}\dot x^2-\frac{1}{2}V''(x_0)(x-x_0)^2 - \delta V(x-x_0)-V(x_0)
\ee
where $\delta V(x)$ is $o(x^2)$ (that is, $\delta V(x)/x^2 \rightarrow 0$ as $x \rightarrow 0$). In other words, provided that we are interested in motion sufficiently close to $x_0$ --- where ``sufficiently close'' will depend on the precise form of $V$ --- the Lagrangian of the theory is close to a simple harmonic oscillator, oscillating around $x_0$. If we quantize the theory, we would expect --- at least for a certain set of states --- that the theory can be treated as a harmonic oscillator together with a small correction term which can be analysed via perturbation theory.
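As a worked instance (the quartic potential here is chosen purely for illustration), take $V(x)=\lambda(x^2-a^2)^2$ with $\lambda,a>0$, which has a global minimum at $x_0=a$ (and another at $-a$). Writing $y=x-x_0$,
\[
V(x_0+y)=\lambda(2ay+y^2)^2=4\lambda a^2y^2+4\lambda a y^3+\lambda y^4,
\]
so that $V(x_0)=0$, $\frac{1}{2}V''(x_0)=4\lambda a^2$ and $\delta V(y)=4\lambda a y^3+\lambda y^4$, which is indeed $o(y^2)$. Near $x_0$ the system is therefore a harmonic oscillator of angular frequency $\sqrt{V''(x_0)}=2a\sqrt{2\lambda}$, perturbed by cubic and quartic terms.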
Since a (bosonic) free-field quantum theory is --- mathematically speaking --- just a collection of harmonic oscillators, particles in quantum field theory can be understood in much the same way. Consider, for instance, the real Klein-Gordon theory with the quadratic mass term replaced by a more general potential:
\be
\mc{L}=\frac{1}{2}(\partial_\mu \varphi \partial^\mu \varphi)+ V(\varphi).
\ee
If $V$ has a minimum at $\varphi_0$, we can make the coordinate transformation $\varphi\rightarrow\varphi-\varphi_0$ (so that the minimum now lies at $\varphi=0$), and rewrite the Lagrangian
as
\be
\mc{L}=\frac{1}{2}(\partial_\mu \varphi \partial^\mu \varphi + V''(\varphi_0)\varphi^2)+\delta V (\varphi)+V(\varphi_0),
\ee
where again $\delta V(\varphi)$ tends to zero at least as fast as $\varphi^3$ as $\varphi$ tends to zero. This theory now has the form of the Klein-Gordon equation (with $m=\sqrt{V''(\varphi_0)}$) together with a dynamically irrelevant constant term and a perturbation term $\delta V(\varphi)$; prima facie we might expect that the perturbation can be treated as small, so that the theory can be analysed perturbatively as a free-field theory --- that is, a many-particle theory --- together with an interaction term which can be understood as generating scattering between particles.
This expectation is naive, though. In fact, the perturbation does not normally generate \emph{small} terms: it generates \emph{infinite} terms. This is the notorious ``problem of infinities'' of quantum field theory: the second-order and higher-order terms in the perturbative expansion for interacting quantum field theories, calculated formally, are infinitely large. The term ``problem'', however, is a misnomer (at least from the point of view of Lagrangian QFT): the difficulty can be resolved in two steps. Firstly, the \emph{infinities} need to be understood as a consequence of naively assuming that the field theory can be defined for arbitrarily short lengthscales. If some sort of short-distance cutoff is imposed (most crudely, by replacing the continuum of spacetime points with a lattice) then the higher-order terms in the perturbative expansion become finite (albeit very large). This raises the problem that the dynamics become very sensitive to the details of the cutoff; in fact, though, it turns out that the effects of those details can be absorbed into adjustments to a very few parameters (basically the mass, the overall magnitude of the fields, and a small number of parameters determining the interaction term). For instance, the Lagrangian density of the scalar theory above can be rewritten as
\be
\mc{L}=\frac{1}{2}(\partial_\mu \varphi_R \partial^\mu \varphi_R + m_R^2\varphi_R^2)+\delta V_R (\varphi_R)+\mbox{constant},
\ee
where $\delta V_R$ really is ``small'' in perturbation-theory terms.\footnote{In fact, the form of $\delta V_R$ is sharply constrained by the renormalisation process: if we expand it as a power series, for instance, all but the $\varphi_R^3$ and $\varphi_R^4$ terms will have vanishingly small dynamical influence except at energy scales close to the cutoff scale; see \citeN{binney} or \citeN{peskinschroeder} for details. This plays no further role in my analysis, however.}
Of course, since the ``bare'' --- \iec, pre-adjustment --- values of these parameters are not experimentally accessible, what we actually measure is the renormalised parameters. Technical details of the renormalisation process can be found in, \egc, \citeN[pp.\,315--346]{peskinschroeder}, \citeN[pp.\,31--66]{chengli}, \citeN[pp.\,353--374]{binney} or \citeN[pp.\,99--112]{coleman}; for a more detailed conceptual discussion, see \citeN{wallaceconceptualqft}.
For the purposes of this paper, it is crucial to ask how symmetries of the full theory translate into symmetries of the linearised theory. At first sight, it might appear that \emph{any} symmetry of the former would be a symmetry of the latter; however, this fails to take into account the possibility that the point of expansion for the linear theory is itself not invariant under a symmetry transformation. This is the famous phenomenon of \emph{spontaneous symmetry breaking}: a classic example is the following special case\footnote{Given renormalisation, it actually isn't a particularly special case! --- this is the most general possible form of a renormalised complex scalar field theory.} of the scalar theory with internal degrees of freedom (case C in my earlier taxonomy):
\be
\mc{L}=\frac{1}{2}\langle\partial_\mu \varphi, \partial^\mu \varphi\rangle+\alpha \langle\varphi,\varphi\rangle+\beta \langle\varphi,\varphi\rangle^2,
\ee
where $\alpha$ and $\beta$ are real numbers with $\beta>0$ and $\langle\cdot,\cdot\rangle$ is an inner product for the internal space \mc{V}.
Elementary calculus tells us that the location of the minimum of the constant-potential (that is, constant-$\varphi$) part of this Lagrangian
depends on the sign of $\alpha$. If it is positive, the minimum occurs at $\varphi=0$. If it is negative, $\varphi=0$ is actually a maximum, and the minima occur at $\langle\varphi,\varphi\rangle=-\alpha/2\beta$.
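Explicitly (a short check of the statement above, writing $s=\langle\varphi,\varphi\rangle\geq 0$ as shorthand): the constant-field potential is $V(s)=\alpha s+\beta s^2$, so
\[
\frac{\mathrm{d}V}{\mathrm{d}s}=\alpha+2\beta s,\qquad \frac{\mathrm{d}^2V}{\mathrm{d}s^2}=2\beta>0.
\]
For $\alpha>0$ the derivative is positive for all $s\geq 0$, so the minimum lies at the boundary $s=0$, \iec, at $\varphi=0$. For $\alpha<0$ the derivative vanishes at $s=-\alpha/2\beta>0$, where $V=-\alpha^2/4\beta<0=V(0)$.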
In the former case, we can linearise about $\varphi=0$. The Lagrangian (in unrenormalised form) will be the complex Klein-Gordon Lagrangian with $m=\sqrt{2 \alpha}$ together with a perturbation proportional to $\varphi^4$. In the latter case, however, there is a continuous family of minima: if $\varphi$ is a minimum, so is $\mathbf{R}\varphi$, where $\mathbf{R}$ is any rotation operator, and we can choose to linearise about any one of them. If, for instance, we choose to linearise about $\varphi_0=(\sqrt{-\alpha/2\beta},0)$ then the Lagrangian density becomes
\be
\mc{L}=\sum_{a