An introduction to Special Relativity

This page is incomplete and under construction. It is based on lectures given to my class of undergraduate students.

English translation of Einstein's 1905 SR paper: this is worth a look.

# Lecture 0: What is a vector?

In high school, one is told that "a vector is something that has magnitude and direction." While this might make one comfortable working with vectors, it does not do justice to the importance of identifying something as a vector. A 100g orange thrown up in the air is not a vector, right?

# Lecture 1: Towards Special Relativity

## Electric and magnetic fields necessarily mix under boosts

Consider the following situation seen by observer one.

Two masses, $m_1,m_2$, one of which has charge $q$ and one has no charge, are both lying at rest. A constant magnetic field $\mathbf{B}$ is also present.

Observer one sees no force acting on either of the particles and hence both particles remain at rest. Now consider a second observer who is moving with a constant velocity $\mathbf{u}$ w.r.t. observer one. Here is what the second observer will see.

Two masses, $m_1,m_2$, one of which has charge $q$, both moving with constant velocity $-\mathbf{u}$.

We expect the second observer to agree with observer one on all forces; in particular, they will agree that no force acts on either particle. Such observers will be termed mutually inertial. Let us see the implications of this for the two particles and, in particular, for the charged particle. Observer two will see a particle moving with velocity $-\mathbf{u}$. Naively, observer two expects the charged particle to experience a force given by the Lorentz force law:

(1)
\begin{align} -q \left(\mathbf{u}\times \mathbf{B}\right)\ , \end{align}

while the uncharged particle will have no force. Clearly, this cannot be correct. First, there can be no pseudo-forces involved as they would affect the uncharged particle as well. Thus, the additional force that would cancel this must be electromagnetic in origin as it affects only charged particles. Second, this problem is present even when the velocity $\mathbf{u}$ is taken to be arbitrarily small — thus there is no need to consider exotic effects due to high speeds. A simple solution is to assume that the electromagnetic field seen by observer two is of the form:

(2)
\begin{align} \big(\mathbf{E}'(\mathbf{u},\mathbf{B}),\mathbf{B}'(\mathbf{u},\mathbf{B})\big)\ . \end{align}

Let us first study the conditions that this field has to satisfy:

1. In the limit $\mathbf{u}\rightarrow 0$, one has $\big(\mathbf{E}'(\mathbf{u},\mathbf{B}),\mathbf{B}'(\mathbf{u},\mathbf{B})\big)=(\mathbf{0},\mathbf{B})$.
2. Linearity of Maxwell's equations implies that they must be linear functions of $\mathbf{B}$.
3. The correction at first-order in $\mathbf{u}$ must be proportional to $(\mathbf{u}\times \mathbf{B})$. Since this term is a polar vector, it can only appear as the first-order correction in the expansion for $\mathbf{E}'$ and there is no term at first-order in the expansion for $\mathbf{B}'$.
4. Finally, the total electromagnetic force on the charged particle must vanish.

Putting all this together, we obtain

(3)
\begin{eqnarray} \mathbf{E}' &=& \mathbf{0} + \mathbf{u}\times \mathbf{B}+ \mathcal{O}(u^2) \\ \mathbf{B}' &=& \mathbf{B} + \mathcal{O}(u^2) \end{eqnarray}

It is easy to see that the charged particle has no force acting on it to order $u$. In fact, at first order in $\mathbf{u}$ and linear in $\mathbf{B}$, $\mathbf{u}\times\mathbf{B}$ is the only vector that can be constructed from $\mathbf{u}$ and $\mathbf{B}$!
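The cancellation can be checked numerically. The following is a minimal sketch in SI units; the values of `q`, `u` and `B` are arbitrary illustrative assumptions, and the helper names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lorentz_force(q, E, v, B):
    """SI Lorentz force q (E + v x B) on a charge q moving with velocity v."""
    vxB = cross(v, B)
    return tuple(q * (E[i] + vxB[i]) for i in range(3))


# Illustrative values (assumptions, not taken from the lecture).
q = 1.5
u = (0.3, -0.2, 0.1)   # velocity of observer two relative to observer one
B = (0.0, 0.4, 2.0)    # magnetic field seen by observer one
v = tuple(-x for x in u)  # velocity of the charge as seen by observer two

# Naive expectation of observer two: only B is present, giving -q (u x B) as in (1).
naive_force = lorentz_force(q, (0.0, 0.0, 0.0), v, B)

# First-order fields of (3): E' = u x B and B' = B.
E_prime, B_prime = cross(u, B), B
corrected_force = lorentz_force(q, E_prime, v, B_prime)

print("naive force    :", naive_force)
print("corrected force:", corrected_force)
```

The naive force is non-zero, while the force computed with the fields of (3) vanishes, as required for the two observers to agree.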

Let us now ask a different question for the same set-up considered above. We know that observer one will see a non-zero electric field due to the charge. For simplicity, let us assume that it has been placed at the origin of observer one's coordinate system. The electric and magnetic fields due to this point charge are

(4)
\begin{align} \mathbf{E} = k \frac{q\mathbf{x}}{r^3}\textrm{ and } \mathbf{B}=\mathbf{0}\ , \end{align}

where $k=1/4\pi\epsilon_0$ and $r=|\mathbf{x}|$. What about the electric and magnetic fields seen by observer two? Again, let us assume that the charged particle was at the origin of observer two's coordinate system at the initial time. The charge's location at a later time $t$ will be $\mathbf{x}' =-\mathbf{u} t$. No term can be added at order $u$, since $\mathbf{u}\times \mathbf{E}$ is an axial vector and not a polar vector. Taking into account the shift in the position of the charge, the electric field seen by observer two is then given by

(5)
\begin{align} \mathbf{E}'(\mathbf{x}',t') = k \frac{q\ (\mathbf{x}'+\mathbf{u}t)}{{|\mathbf{x}'+\mathbf{u}t|}^3} + \mathcal{O}(u^2)= \mathbf{E}(\mathbf{x},t)+ \mathcal{O}(u^2)\ , \end{align}

where we indicate possible corrections at order $u^2$ that may appear. The magnetic field due to a moving charge is given by the Biot-Savart law. One has

(6)
\begin{align} \mathbf{B}' = \frac{\mu_0}{4\pi} \frac{-q\mathbf{u}\times (\mathbf{x}'+\mathbf{u}t)}{{|\mathbf{x}'+\mathbf{u}t|}^3} = -\frac{\mathbf{u}\times \mathbf{E}}{c^2} + \mathcal{O}(u^2)\ , \end{align}

where we used $\mu_0\epsilon_0=1/c^2$.
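As a quick numerical check of the last step in (6), the sketch below compares the Biot-Savart expression with $-\mathbf{u}\times\mathbf{E}/c^2$ in SI units. The charge, velocity and field point are arbitrary assumptions, and the numerical values of $\mu_0$ and $\epsilon_0$ are approximate; $c^2$ is computed from them so that $\mu_0\epsilon_0=1/c^2$ holds by construction.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


MU0 = 4.0e-7 * math.pi           # vacuum permeability (approximate SI value)
EPS0 = 8.8541878128e-12          # vacuum permittivity (approximate SI value)
K = 1.0 / (4.0 * math.pi * EPS0)
C2 = 1.0 / (MU0 * EPS0)          # c^2, from mu0 * eps0 = 1 / c^2

# Illustrative values (assumptions, not taken from the lecture).
q = 2.0e-9
u = (3.0, -1.0, 2.0)
R = (0.1, 0.2, -0.05)            # x' + u t: from the charge to the field point
r3 = math.sqrt(sum(x * x for x in R)) ** 3

# Coulomb field of the charge, as in (5) at lowest order.
E = tuple(K * q * x / r3 for x in R)

# Biot-Savart field of a charge moving with velocity -u.
uxR = cross(u, R)
B_biot_savart = tuple(MU0 / (4.0 * math.pi) * (-q) * x / r3 for x in uxR)

# The same field written as -(u x E) / c^2.
uxE = cross(u, E)
B_from_E = tuple(-x / C2 for x in uxE)

print(B_biot_savart)
print(B_from_E)
```

The two tuples agree up to floating-point rounding.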

Again, we can use linearity of the two sets of results to get the general transformation relating an arbitrary electromagnetic field $(\mathbf{E},\mathbf{B})$ seen by observer one to the electromagnetic field $(\mathbf{E}',\mathbf{B}')$ seen by observer two. We get

(7)
\begin{eqnarray} \mathbf{E}'(\mathbf{x}',t') &=& \mathbf{E}(\mathbf{x},t) + \mathbf{u}\times \mathbf{B}(\mathbf{x},t)+ \mathcal{O}(u^2)\ , \\ \mathbf{B}'(\mathbf{x}',t') &=& \mathbf{B}(\mathbf{x},t) - \frac{\mathbf{u}}{c^2}\times \mathbf{E}(\mathbf{x},t)+ \mathcal{O}(u^2)\ , \end{eqnarray}

where we have added the space and time coordinates for the two observers to make the formulae valid for varying electromagnetic fields. It is not hard to see that both $\mathbf{E}\cdot \mathbf{B}$ and $(\mathbf{E}\cdot \mathbf{E} - c^2 \mathbf{B}\cdot\mathbf{B})$ are invariant to order $u$.
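The invariance to order $u$ can be illustrated numerically: if the first-order change vanishes, the residual change in each quantity should scale like $u^2$, so halving $u$ should reduce it by a factor of about four. The sketch below assumes arbitrary field values and an arbitrary value `c = 2.0` in made-up units.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def boost_fields(E, B, u, c):
    """First-order transformation (7): returns (E', B')."""
    uxB, uxE = cross(u, B), cross(u, E)
    E_p = tuple(E[i] + uxB[i] for i in range(3))
    B_p = tuple(B[i] - uxE[i] / c**2 for i in range(3))
    return E_p, B_p


def invariants(E, B, c):
    """The two quantities E.B and E.E - c^2 B.B."""
    return dot(E, B), dot(E, E) - c**2 * dot(B, B)


# Illustrative values (assumptions, not taken from the lecture).
c = 2.0
E = (1.0, 2.0, -0.5)
B = (0.3, -0.7, 1.1)
n = (0.6, 0.0, 0.8)   # unit vector along the boost


def change(eps):
    """Change of both quantities under a boost with velocity eps * n."""
    u = tuple(eps * x for x in n)
    E_p, B_p = boost_fields(E, B, u, c)
    before, after = invariants(E, B, c), invariants(E_p, B_p, c)
    return after[0] - before[0], after[1] - before[1]


d_full, d_half = change(1e-3), change(5e-4)
ratios = (d_full[0] / d_half[0], d_full[1] / d_half[1])
print("changes at u = 1e-3:", d_full)
print("ratio when u is halved:", ratios)
```

Both ratios come out close to 4, which is the behaviour expected of a pure $\mathcal{O}(u^2)$ residual.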

Conclusion: We see that electric and magnetic fields mix under transformations relating the electromagnetic fields seen by different mutually inertial observers. This is a direct consequence of Newton's first law. The above derivation of the mixing does not require any drastic assumption beyond linearity of Maxwell's equations and the fact that electric fields are polar vectors and magnetic fields are axial vectors. In particular, we did not assume that the speed $u$ was comparable to the speed of light.

### Maxwell's equations in Gaussian units

Gaussian units are more natural in special relativity. We have already seen that electric and magnetic fields can mix — it makes sense to measure them in the same units. Note that in SI units, $E/B$ has dimensions of velocity. Gaussian units can be obtained from SI units in three steps:

1. Consider Coulomb's law: $\mathbf{E}=kq \tfrac{\hat{e}_r}{r^2}$. In SI units, $k=1/4\pi\epsilon_0$ and charge is measured in coulombs. In Gaussian units, $k=1$ and charge is measured in statcoulombs. Thus, one way to go from SI units to Gaussian units is to set $\epsilon_0=(4\pi)^{-1}$.
2. Recall that $\mu_0\epsilon_0=1/c^2$. Thus, in Gaussian units, $\mu_0=4\pi/c^2$.
3. Carry out the rescaling: $\mathbf{B}\rightarrow \mathbf{B}/c$.

This completes the conversion from SI units to Gaussian units.
Exercise: Show that the Lorentz Force law in Gaussian units takes the following form.

(8)
\begin{align} \mathbf{F}=q\left(\mathbf{E}+\frac{\mathbf{v}}c \times \mathbf{B}\right) \end{align}

Exercise: Show that Maxwell's equations take the following form in Gaussian units.

(9)
\begin{eqnarray} \nabla\cdot \mathbf{E}=4\pi\rho \quad &,& \quad \nabla\times \mathbf{B}-\frac1c \frac{\partial \mathbf{E}}{\partial t} = \frac{4\pi}c \mathbf{J}\ , \\ \nabla\cdot \mathbf{B}=0~~~ \quad &,& \quad \nabla\times \mathbf{E}+\frac1c \frac{\partial \mathbf{B}}{\partial t} =0\ . \end{eqnarray}

Remark 1: Note the disappearance of the constant $\epsilon_0$. It must be considered a conversion factor, like the one that connects metres to yards or the Boltzmann constant $k_B$, which connects temperature to energy. Maxwell's equations thus have one relevant constant, $c$, which turns out to be the speed of a propagating electromagnetic wave; this is the fancy way of referring to the speed of light.
Remark 2: The constant $c$ can be used to measure time in units of length. Define $x^0=ct$. We see that all time derivatives appear with a factor of $c$ and hence can be written as $\partial/\partial x^0$. In fact, if we turn off sources, i.e., set $(\rho,\mathbf{J})$ to zero, there are no factors of $c$ remaining.
Remark 3: The mixing between electric and magnetic fields takes a nice symmetric form in Gaussian units.

(10)
\begin{eqnarray} \mathbf{E}'(\mathbf{x}',t') &=& \mathbf{E}(\mathbf{x},t) + \frac{\mathbf{u}}c\times \mathbf{B}(\mathbf{x},t)+ \mathcal{O}(u^2)\ , \\ \mathbf{B}'(\mathbf{x}',t') &=& \mathbf{B}(\mathbf{x},t) - \frac{\mathbf{u}}{c}\times \mathbf{E}(\mathbf{x},t)+ \mathcal{O}(u^2)\ , \end{eqnarray}

### Scalar and Vector potentials

The two source-free equations given in the second line of (9) can be ‘solved’ in terms of the vector potential $\mathbf{A}$ and the scalar potential $\phi$ as follows:

(11)
\begin{align} \mathbf{B}=\nabla \times \mathbf{A} \quad,\quad \mathbf{E}= -\nabla \phi - \frac1c \frac{\partial \mathbf{A}}{\partial t} \ . \end{align}

The remaining two Maxwell equations become

(12)
\begin{eqnarray} -\nabla^2 \phi + \frac{\partial^2 \phi}{c^2\partial t^2}- \frac{\partial}{c\partial t}\left(\frac{\partial\phi}{c\partial t} +\nabla\cdot \mathbf{A}\right) &=&4\pi\rho \quad,\\ -\nabla^2 \mathbf{A} + \frac{\partial^2 \mathbf{A}}{c^2\partial t^2} +\nabla \left(\frac{\partial\phi}{c\partial t} +\nabla\cdot \mathbf{A}\right) &=& \frac{4\pi}c \mathbf{J} \end{eqnarray}

Defining the D'Alembertian operator

(13)
\begin{align} \Box \equiv \frac{\partial^2}{c^2 \partial t^2} - \nabla^2 \ , \end{align}

and

(14)
\begin{align} \partial \cdot A = \frac{\partial\phi}{c\partial t} +\nabla\cdot \mathbf{A} \ , \end{align}

we can rewrite the remaining Maxwell's equations as

(15)
\begin{eqnarray} \Box \phi- \frac{\partial}{c\partial t}\left(\partial \cdot A\right) &=&4\pi\rho \quad,\\ \Box \mathbf{A} +\nabla \left(\partial \cdot A\right) &=& \frac{4\pi}c \mathbf{J} \end{eqnarray}

Since we have shown that the $\mathbf{E}$ and $\mathbf{B}$ fields can mix, we can now work out how the vector and scalar potentials mix. Again based on symmetries, we can write an ansatz for the mixing:

(16)
\begin{eqnarray} \mathbf{A}'(\mathbf{x}',t') &=& \mathbf{A}(\mathbf{x},t) + a_1\ \frac{\mathbf{u}}{c}\ \phi(\mathbf{x},t) + \mathcal{O}(u^2)\ , \\ \phi'(\mathbf{x}',t')&=& \phi(\mathbf{x},t) + a_2\ \frac{\mathbf{u}}{c}\cdot \mathbf{A}(\mathbf{x},t) + \mathcal{O}(u^2)\ , \end{eqnarray}

where $a_1$ and $a_2$ are constants to be determined. It is easy to see that the terms written above are linear in $\mathbf{u}$ and transform nicely under parity as well. We will also need to relate the coordinates in the two frames. Again, transformations under rotations and parity constrain what we can write down. Let us define $x^0=ct$, so that both space and time are measured in units of length; this makes it easy to check the dimensionality of various terms.

(17)
\begin{eqnarray} \mathbf{x}' &=& \mathbf{x} - \frac{\mathbf{u}}{c}\ ct + \mathcal{O}(u^2)\ , \\ {x^0}' &=& x^0 + b_1 \frac{\mathbf{u}}{c}\cdot \mathbf{x} + \mathcal{O}(u^2)\ , \end{eqnarray}

where we included a term that mixes time with spatial coordinates — this comes with a coefficient $b_1$. When $b_1=0$ and there are no $\mathcal{O}(u^2)$ corrections, we recover the transformation of Galilean boosts with velocity $\mathbf{u}$ — it is important to remember that Galilean boosts do not need us to assume that the speed is small i.e., $|\mathbf{u}|\ll c$. We would like to see if the mixing of the electromagnetic fields leads us naturally to Galilean boosts (it doesn't but we have not shown that yet).

The above transformation of coordinates implies that the temporal and spatial derivatives transform as

(18)
\begin{eqnarray} \frac{\partial}{\partial {x^0}'} & =& \frac{\partial}{\partial x^0} + \frac{\mathbf{u}}{c}\cdot \nabla \\ \nabla' &=& \nabla - b_1 \frac{\mathbf{u}}{c} \ \frac{\partial}{\partial x^0} \ . \end{eqnarray}

With these preparations, we can compute how $\mathbf{B}' = \nabla' \times \mathbf{A}'$ and $\mathbf{E}' = -\nabla' \phi' - \tfrac{\partial \mathbf{A}'}{\partial {x^0}'}$ can be written in terms of the unprimed variables. We begin with $\mathbf{B}'$

(19)
\begin{align} \nabla' \times \mathbf{A}' =\nabla\times \mathbf{A} - b_1 \frac{\mathbf{u}}{c} \times \frac{\partial \mathbf{A}}{\partial x^0} - a_1\ \frac{\mathbf{u}}{c}\times \nabla \phi(\mathbf{x},t) \end{align}

We recover the transformation law for $\mathbf{B}'$ if $a_1=-1$ and $b_1=-1$. Now consider $\mathbf{E}'$:

(20)
\begin{align} -\nabla' \phi' - \frac{\partial \mathbf{A}'}{\partial x^{0'}}&= -\nabla \phi +b_1 \frac{\mathbf{u}}{c} \ \frac{\partial\phi}{\partial x^0}-a_2\ \nabla\left(\frac{\mathbf{u}}{c}\cdot \mathbf{A} \right) \\ & \quad - \frac{\partial \mathbf{A}}{\partial x^0}- \frac{\mathbf{u}\cdot \nabla}{c} \mathbf{A}-a_1\frac{\partial \phi}{\partial x^{0}}\ \frac{\mathbf{u}}{c} \end{align}

In the above equation, line one on the RHS corresponds to the terms from $-\nabla' \phi'$ and the second line corresponds to the terms from $- \frac{\partial \mathbf{A}'}{\partial x^{0'}}$. It is easy to see that the terms involving $\frac{\partial \phi}{\partial x^{0}}$ cancel if $a_1=b_1$, which is something we already obtained. Next, using the identity $\mathbf{u}\times (\nabla \times \mathbf{A})=\nabla(\mathbf{u}\cdot \mathbf{A})-(\mathbf{u}\cdot \nabla) \mathbf{A}$, we recover the term involving $\mathbf{B}$ provided that $a_2=-1$. In summary, we have fixed the three constants to be

(21)
\begin{align} a_1=a_2= b_1=-1\ . \end{align}

We thus see that the coordinates of space and time transform as follows:

(22)
\begin{eqnarray} {x^0}' &=& x^0 - \frac{\mathbf{u}}{c}\cdot \mathbf{x} + \mathcal{O}(u^2)\ , \\ \mathbf{x}' &=& \mathbf{x} - \frac{\mathbf{u}}{c}\ x^0 + \mathcal{O}(u^2)\ . \end{eqnarray}

The surprise is that it is incompatible with Galilean boosts since $b_1\neq 0$. Since the above formula holds only for small speeds, we will call it an infinitesimal Lorentz boost. We also have obtained the transformation law for the scalar and vector potentials between the two frames.

(23)
\begin{eqnarray} \phi'(\mathbf{x}',t')&=& \phi(\mathbf{x},t) -\ \frac{\mathbf{u}}{c}\cdot \mathbf{A}(\mathbf{x},t) + \mathcal{O}(u^2)\ , \\ \mathbf{A}'(\mathbf{x}',t') &=& \mathbf{A}(\mathbf{x},t) -\ \frac{\mathbf{u}}{c}\ \phi(\mathbf{x},t) + \mathcal{O}(u^2)\ , \end{eqnarray}

Remark: It is similar to the one we have obtained for the coordinates with $\mathbf{A}$ playing a role similar to $\mathbf{x}$ and $\phi$ playing the role of $x^0$.
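The values $a_1=a_2=b_1=-1$ can be tested numerically. The sketch below picks an arbitrary smooth four-potential (a hypothetical choice made only for illustration), builds the potentials of observer two from (22) and (23), differentiates both sets by central differences, and compares the result with the Gaussian-unit mixing (10). All function names, the sample event `X` and the step `h` are assumptions.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pot(X):
    """Hypothetical smooth four-potential (phi, A1, A2, A3) at X = (x0, x1, x2, x3)."""
    x0, x1, x2, x3 = X
    return [math.sin(x0 + 0.5 * x1) * math.cos(x2) + 0.3 * x3 * x0,
            math.cos(x0) * x2,
            math.sin(x3 - x0),
            x1 * x2 + 0.2 * x0 * x0]


def boost(beta, sign=1.0):
    """Matrix I + sign * lambda(beta); sign = +1 is the infinitesimal boost (25)."""
    M = [[1.0 if r == s else 0.0 for s in range(4)] for r in range(4)]
    for i in range(3):
        M[0][i + 1] = M[i + 1][0] = -sign * beta[i]
    return M


def apply(M, v):
    """Matrix-vector product for a 4x4 matrix and a 4-component list."""
    return [sum(M[r][s] * v[s] for s in range(4)) for r in range(4)]


def fields(potential, X, h=1e-5):
    """E = -grad(phi) - dA/dx0 and B = curl(A), by central differences."""
    d = []  # d[k][m] = d(potential^m) / d(X^k)
    for k in range(4):
        X_plus, X_minus = list(X), list(X)
        X_plus[k] += h
        X_minus[k] -= h
        d.append([(p - m) / (2 * h)
                  for p, m in zip(potential(X_plus), potential(X_minus))])
    E = tuple(-d[i][0] - d[0][i] for i in (1, 2, 3))
    B = (d[2][3] - d[3][2], d[3][1] - d[1][3], d[1][2] - d[2][1])
    return E, B


beta = (2e-4, -1e-4, 1.5e-4)        # u / c, taken small
X = [0.4, 0.3, -0.2, 0.7]           # an event in observer one's coordinates
X_primed = apply(boost(beta), X)    # the same event for observer two, as in (22)


def pot_primed(Y):
    """Potentials of observer two at primed coordinates Y, using (22) and (23)."""
    X_back = apply(boost(beta, -1.0), Y)   # first-order inverse of (22)
    return apply(boost(beta), pot(X_back))


E, B = fields(pot, X)
E_p, B_p = fields(pot_primed, X_primed)

bxB, bxE = cross(beta, B), cross(beta, E)
err_E = max(abs(E_p[i] - (E[i] + bxB[i])) for i in range(3))
err_B = max(abs(B_p[i] - (B[i] - bxE[i])) for i in range(3))
print("residual in E':", err_E)
print("residual in B':", err_B)
```

The residuals are of order $u^2/c^2$, far smaller than the first-order mixing terms themselves. Changing the sign of any of the three constants in `boost` would leave a residual at first order instead.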

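Define the following four-vectors, with $\mu=0,1,2,3$, by combining a scalar under rotations with a (three-)vector.

(24)
\begin{align} x^\mu \equiv (x^0,\mathbf{x}) \quad,\quad A^\mu \equiv (\phi,\mathbf{A})\ . \end{align}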

We can write an identical transformation law for both four-vectors as follows:

(25)
\begin{align} \begin{pmatrix} v^{'0} \\ v^{'1} \\ v^{'2} \\ v^{'3} \end{pmatrix}= \begin{pmatrix} 1 & -\tfrac{u_1}{c} & -\tfrac{u_2}{c} & -\tfrac{u_3}{c} \\ -\tfrac{u_1}{c} & 1 & 0 & 0 \\ -\tfrac{u_2}{c} & 0 & 1 & 0 \\ -\tfrac{u_3}{c}& 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} v^0 \\ v^1 \\ v^2 \\ v^3 \end{pmatrix} +\mathcal{O}(u^2)\ , \end{align}

where $v^\mu$ can be either $x^\mu$ or $A^\mu.$

### What are the higher order corrections to the first-order mixing?

The idea is to study the invariances of Maxwell's equations. These are those coordinate transformations under which Maxwell's equations do not change form. In other words, we expect both $(\mathbf{E},\mathbf{B})$ and $(\mathbf{E}',\mathbf{B}')$ to solve Maxwell's equations in the coordinate systems of observers one and two respectively. This should, in principle, let us determine the relationship to all-orders in $u$. However, we shall postpone that for later and use the fact that coordinate transformations between frames should form a (continuous) group and that the group is of Lie type. For Lie groups, there is a natural connection between infinitesimal and finite transformations. The infinitesimal transformations are identified with Lie algebras and the finite ones are obtained by taking the exponential of the infinitesimal one.

Recalling how one defines three-vectors, i.e., vectors under usual rotations, as objects that transform like the spatial coordinates (displacements to be precise), we define four-vectors by the above transformation law. However, we only have the infinitesimal form. Can we obtain the finite form? We use a trick that relates Lie groups to Lie algebras: the exponential map. Let $A$ be an $m\times m$ matrix (not to be confused with the vector potential). Then, one has

(26)
\begin{align} e^A = \sum_{n=0}^\infty \frac{A^n}{n!} = I_m + A + \cdots \ , \end{align}

where $I_m$ is the $m\times m$ identity matrix. We observe that the infinitesimal Lorentz boost is of the form $I_4 + \lambda(\mathbf{u})+ \cdots$ where $\lambda$ is the matrix

(27)
\begin{align} \lambda(\mathbf{u}) = \begin{pmatrix} 0 & -\tfrac{u_1}{c} & -\tfrac{u_2}{c} & -\tfrac{u_3}{c} \\ -\tfrac{u_1}{c} & 0 & 0 & 0 \\ -\tfrac{u_2}{c} & 0 & 0 & 0 \\ -\tfrac{u_3}{c}& 0 & 0 & 0 \end{pmatrix}\ . \end{align}

Thus, it is natural to guess that the finite form of a Lorentz boost might be given by

(28)
\begin{eqnarray} \Lambda &=& e^\lambda = I_4 + \lambda + \cdots \ , \textrm{ or } \\ {\Lambda^\mu}_\nu &=& {\big[\exp(\lambda)\big]^\mu}_\nu = \delta^\mu_\nu + {\lambda^\mu}_\nu + \cdots \ . \end{eqnarray}

Then, a four vector transforms as

(29)
\begin{align} {v'}^\mu = {\Lambda^\mu}_\nu \ v^\nu \ , \end{align}

where we have assumed the Einstein summation convention.
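One way to see why the exponential is a natural guess is to look at composition. The sketch below sums the series (26) numerically for the matrix (27) and checks two things under the assumption that (28) is the right finite form: two collinear exponentiated boosts compose into a single one whose parameter is the sum, whereas the truncated matrices $I_4+\lambda$ do so only up to second-order terms; and for small parameters the exponential reduces to $I_4+\lambda$. The parameter values and helper names are illustrative assumptions.

```python
def identity():
    return [[1.0 if r == s else 0.0 for s in range(4)] for r in range(4)]


def mat_mul(P, Q):
    return [[sum(P[r][k] * Q[k][s] for k in range(4)) for s in range(4)]
            for r in range(4)]


def mat_add(P, Q):
    return [[P[r][s] + Q[r][s] for s in range(4)] for r in range(4)]


def max_diff(P, Q):
    return max(abs(P[r][s] - Q[r][s]) for r in range(4) for s in range(4))


def lam(beta):
    """The matrix (27), with beta standing for u / c."""
    M = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        M[0][i + 1] = M[i + 1][0] = -beta[i]
    return M


def mat_exp(M, terms=40):
    """Partial sum of the exponential series (26)."""
    result, term = identity(), identity()
    for n in range(1, terms):
        term = [[x / n for x in row] for row in mat_mul(term, M)]
        result = mat_add(result, term)
    return result


# Two collinear boost parameters (illustrative values).
a = (0.3, 0.0, 0.0)
b = (0.5, 0.0, 0.0)
a_plus_b = tuple(x + y for x, y in zip(a, b))

# Exponentiated boosts compose into the boost with the summed parameter.
group_error = max_diff(mat_mul(mat_exp(lam(a)), mat_exp(lam(b))),
                       mat_exp(lam(a_plus_b)))

# The truncated form I + lambda fails to do so at second order.
naive_error = max_diff(mat_mul(mat_add(identity(), lam(a)),
                               mat_add(identity(), lam(b))),
                       mat_add(identity(), lam(a_plus_b)))

# For a small parameter, the exponential agrees with I + lambda up to O(u^2).
small = (1e-3, 0.0, 0.0)
small_error = max_diff(mat_exp(lam(small)), mat_add(identity(), lam(small)))

print(group_error, naive_error, small_error)
```

Note that it is the parameter appearing in the exponent that adds under composition; how it relates to the physical relative velocity at finite speeds is left to the exercise that follows.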

Exercise: Obtain the matrix ${\Lambda^\mu}_\nu$ for the simple case of a boost in the $x_1$ direction i.e., with $u_2=u_3=0$ by computing the matrix exponential.

### Symmetries of Maxwell's equations

Define the four-derivative

(30)
\begin{align} \partial_\mu \equiv \frac{\partial}{\partial x^\mu}\ , \end{align}

so that $\partial_\mu x^\mu=4$ and $\partial_\mu A^\mu=(\partial\cdot A)$.

Maxwell's equations in this four-vector notation take the form of a single equation in terms of the four-vector potential

(31)
\begin{align} \boxed{\phantom{\bigg|} \Box A^\mu - \partial^\mu (\partial_\rho A^\rho)= \frac{4\pi}c J^\mu \ , } \end{align}

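where $\partial^\mu=(\partial/\partial x^0, -\nabla)$ and $J^\mu=(c\rho,\mathbf{J})$, so that the $\mu=0$ and $\mu=i$ components reproduce the two equations in (15).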

### The Electromagnetic field strength

The electric and magnetic fields can be combined to form a single entity called the Electromagnetic Field Strength defined as follows:

(32)
\begin{align} F^{\mu\nu} \equiv \partial^\mu A^\nu - \partial^\nu A^\mu\ . \end{align}

Thus, one has

(33)
\begin{eqnarray} F^{0i} &=& \partial_0 A_i + \nabla_i \phi = - E_i\ \\ F^{ij} &=& -\nabla_i A_j +\nabla_j A_i = - \epsilon_{ijk} B_k\ . \end{eqnarray}

We will later see that this is a second-rank antisymmetric tensor under Lorentz transformations.
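To make the index bookkeeping in (33) concrete, here is a small sketch that arranges given $\mathbf{E}$ and $\mathbf{B}$ components into the $4\times 4$ array $F^{\mu\nu}$. The function name and the sample field values are assumptions made for illustration.

```python
def field_strength(E, B):
    """Contravariant F^{mu nu} as a 4x4 nested list, following (33)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = -E[i]   # F^{0i} = -E_i
        F[i + 1][0] = E[i]    # antisymmetry: F^{i0} = +E_i
    # F^{ij} = -epsilon_{ijk} B_k, filled in for the cyclic triples (i, j, k).
    for i, j, k in ((1, 2, 3), (2, 3, 1), (3, 1, 2)):
        F[i][j] = -B[k - 1]
        F[j][i] = B[k - 1]
    return F


# Illustrative field values (assumptions).
E = (1.0, 2.0, 3.0)
B = (4.0, 5.0, 6.0)
F = field_strength(E, B)
for row in F:
    print(row)
```

The first row and column carry the electric field and the spatial block carries the magnetic field, with the diagonal vanishing by antisymmetry.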

More to come