----------------------------------------------------------------------
Part III: A Bit About General Relativity
======================================================================
This is PART III of the "Relativity and FTL Travel" FAQ. It is
an "optional reading" part of the FAQ in that the FTL discussion in
PART IV does not assume that the reader has read the information
discussed below. If your only interest in this FAQ is the
consideration of FTL travel with relativity in mind, then you may only
want to read PART I and PART IV.
In this part, we take a look at general relativity. The
discussion is rather lengthy, but I hope you will find it
straightforward and easy to follow. The subject of GR is quite new to this
FAQ, and your comments on the usefulness, ease of reading, etc. for
this part of the FAQ would be appreciated.
For more information about this FAQ (including copyright
information and a table of contents for all parts of the FAQ), see the
"Relativity and FTL Travel--Introduction to the FAQ" portion which
should be distributed with this document.
Contents of PART III:
5. Introduction to General Relativity
5.1 Reasoning for its Existence
5.2 The "New Inertial Frame"
5.3 Manifolds, Geodesics, Curvature, and Local Flatness
5.4 The Invariant Interval
5.5 A Bit About Tensors
5.6 The Metric Tensor and the Stress-Energy Tensor
5.7 Applying these Concepts to Gravity
5.7.1 The Basic Idea
5.7.2 Some Notes on the Physics and the Math
5.7.3 First Example: Back to SR
5.7.4 Second Example: Stars and Black Holes
5.8 Experimental Support for GR
5. Introduction to General Relativity
Thus far, we have confined our talks to the realm of what is
known as Special Relativity (or SR). In this section I will introduce
a few of the main concepts in General Relativity (or GR). The
difference between the two is basically that GR deals with how
relativity applies to gravitation. As it turns out, our concept of
how gravity works must be changed because of relativity, and GR
explains the new concept of gravity. It is called "General"
relativity because if you look at General Relativity in the case where
there is little or no gravity, you get Special Relativity (SR is a
special case of GR).
Now, GR is a heavily mathematical theory, and while I will try to
simply give the reader some understanding of the physical notions
underlying the theory, some mathematics will inevitably come into
play. I will, however, try to give simple, straightforward
explanations of where the math comes from and how it helps explain the
theory. I will start by discussing why we might even think that
gravity and relativity are related in the first place. This will lead
us to change our concepts of space and time in the presence of
gravity. To discuss this new concept of space-time, we will need to
introduce the idea of mathematical constructs known as Tensors. The
two tensors we will talk about specifically are called the Metric
Tensor and the Stress-Energy Tensor. Once we have discussed these
concepts, we will look at how it all comes together to produce the
basic ideas behind the theory of general relativity. We will also
consider a couple of examples to illustrate the use of the theory.
Finally, we will mention some of the experimental evidence which
supports general relativity.
5.1 Reasoning for its Existence
To start off our discussion, I want to indicate why one would
reason that gravity and relativity are connected. While I could start
with a somewhat unbelievable thought experiment to explain the first
point I want to make, perhaps it will be better if I just tell you
about actual experimental evidence. We thus start by considering an
experiment in which a light beam is emitted from Earth and rises in
the atmosphere to some point where the light is detected. When one
performs this experiment, one finds that the energy of the light
decreases as it rises.
So, what does this have to do with our view of relativity and
gravity? Well, let's reason through the situation: First, we note
that the energy of light is related to its frequency. (If you think
of light as a wave with crests and troughs, and if you could make note
of the crests and troughs as they passed you, then you could calculate
the frequency of the wave as 1/dt, where dt is the time between the
point when one crest passes you and the point when the next crest
passes you.) So, if the energy of the light decreases (and thus its
frequency decreases), then dt (the time between crests) must increase.
Let's then consider a frame of reference sitting stationary on the
Earth. We will look at a space-time diagram in this frame which shows
the paths that two crests would take as the light travels away from
the Earth.
In Diagram 5-1 I have drawn indications of the paths the two
crests might take. The diagram shows distance above the Earth as
distance in the positive x direction, so as time goes on, the two
crests rise (move in the positive x direction) and eventually meet the
detector. Now, we don't know what the gravity of the Earth might do
to the light. We thus want to generalize our diagram by allowing for
the possibility that the paths of the crests might be influenced in
some unknown way by gravity. So, I have drawn a haphazard path for
the two crests marked with question marks. The actual paths don't
matter for our argument, but what does matter is this: whatever
gravity does to the light, it must act the same way on both crests.
Therefore, the two haphazard paths are drawn the same way.
                        Diagram 5-1
            t                         # = detector's path
            |                  #
            |                  ?
            |               ?  #
            |  second    ?     #  dt-final
            |   crest  ?       ?
            |        ?      ?  #
            |    ?       ?     #
            |  ?       ?       #
            ?        ? first   #
 dt-initial |    ?     crest   #
            |  ?               #
------------?------------------#------> x (distance above surface)
            |                  #
            |                  #
As we see in the diagram, because gravity acts the same way on
both crests, the time between them when they leave the surface (dt-
initial) is the same as the time between them when they are detected
(dt-final). Thus, our diagram does not predict that the energy of the
light should change, but experimental evidence shows it does.
According to special relativity, this frame of reference we have drawn
is an inertial frame (we are ignoring the Earth's motion around the
sun here) which should explain the geometry of the situation, but does
not. That indicates that SR must be changed in light of gravity.
However, we have yet to show that SR must be completely thrown out.
What if there were another way to define an inertial frame such
that its geometry would explain the above situation and other
situations which occur in the presence of a gravitational field? That
is what we will consider next.
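The argument around Diagram 5-1 can be checked with a small numerical sketch. Everything in it is an illustrative assumption: the function crest_height is a made-up "haphazard" path, the units are arbitrary, and the only feature that matters is that both crests use the same path. Under those assumptions the spacing between the crests at the detector comes out equal to the spacing at the surface, which is exactly the prediction that disagrees with experiment.

```python
import math

def crest_height(t_since_emission):
    """Made-up 'haphazard' path: height reached a time t after a crest
    leaves the surface. It is always increasing, but otherwise arbitrary."""
    t = t_since_emission
    return t + 0.3 * math.sin(2.0 * t)

def arrival_time(emission_time, detector_height):
    """Time at which a crest emitted at emission_time reaches the detector.
    Both crests use the same crest_height, so only the start time differs."""
    lo, hi = 0.0, 100.0          # bracket for the travel time
    for _ in range(200):         # bisection on the increasing path
        mid = 0.5 * (lo + hi)
        if crest_height(mid) < detector_height:
            lo = mid
        else:
            hi = mid
    return emission_time + 0.5 * (lo + hi)

dt_initial = 0.5                 # time between crests at the surface
detector_height = 10.0           # must lie below crest_height(100.0)
dt_final = (arrival_time(dt_initial, detector_height)
            - arrival_time(0.0, detector_height))
print(dt_initial, dt_final)
```

Changing crest_height to any other increasing function leaves the conclusion untouched, since the second crest simply repeats the first crest's journey a time dt_initial later.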
5.2 The "New Inertial Frame"
Before starting this section, I want to mention something to the
reader: in the end, when gravity is concerned, we will not be able to
find a single inertial frame of reference which will correctly explain
the geometry of all situations. This will be the actual deathblow to
special relativity. At the beginning of the discussion in this
section, it will look as if the situation is hopeful, and that by
defining a proper inertial frame, SR will be saved. However, later in
this section, we will see where this all falls apart, and I wanted the
reader to realize this from the beginning.
Now, rather than consider a frame of reference which is sitting
stationary on the surface of the Earth, let's consider one which is
freely falling in Earth's gravity near the surface of the Earth. Why
would we want to consider this frame? Okay, let's first address that
question.
In special relativity, an inertial frame was one which moved at a
constant velocity. Thus, an accelerating frame was not an inertial
frame in SR. Consider, then, a frame of reference inside a spaceship
which is accelerating at a constant rate. The ship we are considering
will be far away from any gravitational fields. If you were to
release an object inside that ship, the object would continue to
travel at the velocity it had when you let it go. However, the ship
would continue to accelerate past that velocity, and the bottom of the
ship would soon catch up to the object you released. So, if you are
on this ship, and you release an object, then to you the object seems
to accelerate at a constant rate toward the bottom of the ship.
Notice that the rate of acceleration the object seems to have is that
of the ship, regardless of the object's mass or composition. This is
very similar to the situation in which you release an object while
sitting stationary on the Earth's surface. All such objects near the
Earth's surface accelerate towards the Earth at a constant rate,
regardless of their mass or composition. So we can argue that the
frame of reference which is sitting stationary on the Earth's surface
seems to have the properties of a non-inertial frame (like that of the
accelerating ship).
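As a rough sketch of the accelerating ship, the following assumes ordinary low-speed (Newtonian) motion, so that the floor's position is simply 0.5*a*t^2 ahead of where constant velocity would put it. The function names are invented for illustration. Note that the mass of the released object never appears anywhere, which is the whole point.

```python
def floor_position(t, ship_accel, v0=0.0):
    """Position of the ship's floor at time t after the release; the ship
    had velocity v0 at release and accelerates uniformly afterwards."""
    return v0 * t + 0.5 * ship_accel * t**2

def released_object_position(t, release_height, v0=0.0):
    """A released object feels no force, so it keeps the velocity v0 it
    had when it was let go. Its mass and composition never enter."""
    return release_height + v0 * t

def height_above_floor(t, ship_accel, release_height, v0=0.0):
    """What someone riding in the ship sees: the object's height above the floor."""
    return (released_object_position(t, release_height, v0)
            - floor_position(t, ship_accel, v0))

# Assumed numbers: ship accelerating at 9.8 m/s^2, object released 2 m up.
for t in (0.0, 0.2, 0.4, 0.6):
    print(t, height_above_floor(t, 9.8, 2.0))
```

The height seen inside the ship works out to release_height - 0.5*a*t^2 whatever v0 is, which is the same rule a dropped object follows near the Earth's surface.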
Now, think of the object being released in the above situation.
Once you release the object, it continues on at the velocity it had
when you released it. It continues at a constant velocity. It is in
an inertial frame. If this situation is quite similar to the one on
Earth, then we might argue that an object released near the surface of
the Earth--an object in free-fall--is also in an inertial frame.
For another important argument, consider a point we mentioned
above--gravity creates the same rate of acceleration for all objects
released at a given point with a given initial velocity. This fact is
what distinguishes gravity from all other forces in nature. With the
other three forces (electromagnetism, the strong nuclear force, and
the weak nuclear force) the motion of an object in the presence of the
force depends on the composition of the object. For example,
electromagnetism doesn't act on neutral particles, but does act on
charged ones. However, when we consider gravity, the path taken by an
object which is released with a given velocity in a gravitational
field does not depend on the composition of the object. Thus, if you
are in a freely falling frame of reference (one in which you are only
being acted on by gravity), then any object you release will follow
the same trajectory that you are following. It will not move with
respect to you, and it will seem to you as if no force is acting on
it. So, the freely falling frame looks, again, like an inertial frame
of reference.
Finally, let's consider the "light rising in the presence of
Earth's gravity" experiment. As it turns out (though I won't go into
the proof) if the light is detected while it is still relatively close
to the Earth, and we consider the experiment in a frame of reference
which is freely falling near the Earth's surface, then in that frame,
the light does not lose energy. Thus, in the freely falling frame of
reference, Diagram 5-1 correctly depicts the geometry of the
situation.
And so, things are looking deceptively hopeful. At this point it
looks as if we could simply consider free falling frames as inertial,
and the space-time diagrams we have drawn throughout our discussions
would thus work just fine in the presence of gravity, as long as we
understand that they are drawn in free falling frames.
However, there is a problem. To illustrate why, consider the
accelerating ship we were discussing earlier, but let the ship be
very, very tall. No matter how tall the ship is, an object dropped at
the top of the ship will accelerate at the same rate as an object
dropped at the bottom of the ship. However, general gravitational
fields don't work this way. Objects in a weaker gravitational field
(further from the Earth, for example) accelerate at a different rate
than those in a stronger field. Now, as long as you are close to the
surface of the Earth, you won't notice the different acceleration
rates for objects dropped at different heights. However, if you drop
one object close to the surface of the Earth and the other object far
above the first, then they will accelerate at different rates. If you
consider the frame of reference of one of the objects, the other
object will be accelerating in that frame. Thus, while our previous
discussion would have us call both of these frames inertial, one frame
is accelerating in the other frame of reference.
Similarly, consider dropping two objects from different sides of
the Earth. Because they will each fall towards the center of the
Earth, they will be accelerating in different physical directions.
Thus, they will each be accelerating in the other object's frame.
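To put rough numbers on the first of these two effects, here is a sketch that uses Newton's inverse-square law as a stand-in for the gravitational field (an assumption made only for illustration). The constants are the commonly quoted values for the Earth, and the function names are invented.

```python
GM_EARTH = 3.986004418e14   # m^3/s^2, Earth's gravitational parameter
R_EARTH = 6.371e6           # m, mean radius of the Earth

def free_fall_acceleration(height):
    """Newtonian free-fall acceleration (m/s^2) at a given height (m)
    above the surface."""
    return GM_EARTH / (R_EARTH + height) ** 2

def relative_acceleration(h_low, h_high):
    """Acceleration of the higher object as seen from the lower object's
    freely falling frame (both dropped on the same vertical line)."""
    return free_fall_acceleration(h_low) - free_fall_acceleration(h_high)

# Two objects 10 m apart barely differ; two objects 1000 km apart clearly do.
print(relative_acceleration(0.0, 10.0))
print(relative_acceleration(0.0, 1.0e6))
```

The first number is tiny, which is why a freely falling frame looks inertial over a small region; the second is a sizeable fraction of the surface acceleration, which is why it stops looking inertial over a large one.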
And so, we note that a free falling frame seems much like an
inertial frame as long as you are close to the origin of the frame;
however, if you consider a point further away, the frame does not
represent the inertial frame at that far away point. Not only that,
but if you consider a frame which starts far from the Earth, then that
frame will eventually fall into an area with a stronger gravitational
field. Thus, as time goes on, the frame of reference changes from
representing an inertial frame far from the Earth to representing
another inertial frame close to the Earth. So, the extent to which
the free falling frame represents an inertial frame at the point it
was originally dropped depends on how long you consider that frame.
In other words, the free falling frame only represents a good inertial
frame for a limited time.
In the end, we see that free falling frames can be considered as
inertial frames only over a small distance and for a small period of
time. We call them "local" inertial frames ("local" meaning in space
as well as time). It is similar to noting that locally on the
surface of a sphere, a plane closely represents a good coordinate
system for the surface of the sphere. However, globally--as you extend
that plane--it stops being a good coordinate system for the curved
surface of the sphere. Similarly in relativity, there is no way to
define a single, rigid frame of reference which has the properties of
an inertial frame everywhere within a gravitational field. In special
relativity, such frames existed, but with gravity involved, we must
rethink the situation.
We will now continue this rethinking process by discussing
concepts which can be used to describe space-time in the presence of
gravity. We will begin by discussing some general ideas which will
help us explain the geometry of space-time.
5.3 Manifolds, Geodesics, Curvature, and Local Flatness
Before we discuss space-time in the presence of gravity, we need
to understand some basic geometric concepts which we will use. We
will develop these concepts by considering normal, spatial geometry
which can be fully grasped using common sense. Applying these
concepts to space-time becomes less intuitive (in part because we
still aren't that used to thinking of time as just another dimension);
therefore, developing them using normal spatial geometry will be
beneficial.
First, we introduce the term "manifold". Basically, for our
purposes, you can think of a manifold as a fancy term for a space.
The space around us that you are used to thinking of can be called a
three dimensional manifold. The surface of a sheet of paper is a two
dimensional manifold, as is the surface of a cylinder or the surface
of a sphere.
Next, we look at a particular type of path on a manifold. This
path is called a geodesic, and it is essentially the path which takes
the shortest distance between two points on the manifold. On a piece
of paper (a flat manifold) the shortest distance between two points is
found by following the path of a straight line. However, for a
sphere, the shortest distance between two points would be traveled by
following a curve known as a great circle. If you imagine cutting a
sphere directly in half and then putting it back together, then the
cut mark on the surface of the sphere would be a great circle. If you
move along the surface of a sphere between two points, then the
shortest path you could take would lie on a great circle. Thus, a
great circle on a sphere is basically equivalent to a line on a flat
manifold--they are both geodesics on their respective manifolds.
Similarly, on any other manifold there would be a path to follow
between two points such that you would travel the shortest distance.
Such a path is a geodesic on that manifold.
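A quick numerical sketch makes the point about great circles concrete. It compares two routes between a pair of points at the same latitude: the great circle route, computed with the standard haversine formula, and the route that stays on the circle of constant latitude. The function names and the sample points are chosen only for illustration.

```python
import math

def great_circle_distance(lat1, lon1, lat2, lon2, radius):
    """Shortest distance along the sphere between two points (haversine
    formula). All angles are in radians."""
    a = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * radius * math.asin(math.sqrt(a))

def distance_along_parallel(lat, lon1, lon2, radius):
    """Length of the route that stays on the circle of constant latitude."""
    return radius * math.cos(lat) * abs(lon2 - lon1)

R = 1.0
lat = math.radians(60.0)
lon_a, lon_b = 0.0, math.radians(90.0)
geodesic = great_circle_distance(lat, lon_a, lat, lon_b, R)
parallel = distance_along_parallel(lat, lon_a, lon_b, R)
print(geodesic, parallel)
```

The great circle route comes out shorter, even though on a flat map the constant-latitude route is the one that looks like a straight line.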
Next, we introduce the concept of the curvature of a manifold.
When we discuss this concept, we are talking about an intrinsic
property of the geometry of a manifold. To demonstrate what I mean,
let's consider the surface of a cylinder. You can create such a
surface by taking a flat sheet of paper and rolling it up. While the
two dimensional surface will then look curved in our three dimensional
perspective, the geometry of the surface is exactly the same as the
geometry of the flat sheet of paper from which it was made. If you
were a two dimensional creature confined to live on the two
dimensional surface of the cylinder, then you could not perform an
experiment which would prove that your geometry was that of a three
dimensional cylinder rather than a flat sheet of paper. Thus, though
a cylinder looks curved from our three dimensional perspective, it has
no intrinsic curvature to its geometry.
On the other hand, consider a sphere. You cannot bend a flat
sheet of paper around a sphere without crumpling the paper. The
geometry on the surface of a sphere will then be different from the
geometry of a flat sheet of paper. To distinctly show this, let's
consider a couple of two dimensional creatures who are confined to the
surface of a sphere. Say that they stand next to one another on the
two dimensional surface and begin walking parallel to one another. As
they continue to walk, each will continue in what seems to him to be a
straight line. If they do this--if each of them believes that they
are following a straight line from one step to the next--then they
will follow the path of a geodesic on the sphere. As we said earlier,
this means that they will each follow a great circle. But if they
each follow a great circle on the surface of a sphere, then they will
eventually come towards one another and meet. Now, they started out
moving on parallel paths, and they each believed that they were
walking in a straight line, but their paths eventually came together.
This would not be the case if they performed this experiment on a flat
sheet of paper. Thus, creatures who are confined to live on the two
dimensional surface of a sphere could tell that the geometry of their
space was different from the geometry of a flat piece of paper. That
intrinsic difference is due to the curvature of the sphere's surface.
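The walkers' experiment can be sketched numerically. The setup below is one particular choice made for illustration: both walkers start on the equator, a small distance apart, and each walks due north along his own meridian (a great circle). Their separation follows from the haversine formula with both walkers at the same latitude.

```python
import math

def separation_on_sphere(initial_gap, distance_walked, radius):
    """Great-circle distance between two walkers who start initial_gap
    apart on the equator and each walk due north along a meridian.
    Valid up to the pole, i.e. distance_walked <= pi * radius / 2."""
    lat = distance_walked / radius            # latitude reached by both
    half_dlon = 0.5 * initial_gap / radius    # half their longitude gap
    return 2.0 * radius * math.asin(math.cos(lat) * math.sin(half_dlon))

def separation_on_flat_sheet(initial_gap, distance_walked):
    """On a flat sheet, parallel straight lines keep their spacing."""
    return initial_gap

R = 100.0
quarter_way_round = math.pi * R / 2.0         # equator to pole
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    s = fraction * quarter_way_round
    print(s, separation_on_sphere(1.0, s, R), separation_on_flat_sheet(1.0, s))
```

On the sphere the gap shrinks steadily and closes at the pole, while on the flat sheet it never changes. Both results use only measurements made within the surface.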
This, then, is what we want to note about curvature: The
curvature of a manifold is an intrinsic property of the geometry of
the manifold itself. It is intrinsic because it is part of the
manifold, regardless of whether the manifold is considered in
higher dimensions. In fact, just because a manifold may look "curved" in a
higher dimension, that doesn't mean that its intrinsic geometry is
different from that of a flat manifold (i.e. its geometry can still
be flat--like the cylinder). Thus, the test of whether a manifold is
curved does not have anything to do with higher dimensions, but with
experiments that could be performed by beings confined on that
manifold. (Specifically, if two parallel lines do not remain parallel
when extended on the manifold, then the manifold possesses curvature).
This is important to us in our discussion of space-time in the
presence of gravity. It means that the curvature of the four
dimensional manifold of space-time in which we live can be understood
without having to worry about or even speculate on the existence of
any other dimensions.
As a final note in this introduction to manifolds, I want to
mention a bit about local flatness. Note that even though a manifold
can be curved, on a small enough portion of that manifold, it is
fairly flat. For example, we can represent a city on our curved Earth
by using a flat map. The map will be a very good representation of
the city because it is a very small piece of the curved manifold.
Earlier I mentioned that over a small enough piece of space-time in
the presence of gravity, you can define a frame of reference which is
still very similar to an inertial reference frame in special
relativity. This gives an indication as to why the geometry of space-
time in special relativity is that of a flat manifold, while with
general relativity, space-time is said to be curved in the presence of
gravity.
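The city-map remark can be given rough numbers. The sketch below compares distances read off a simple flat map with true distances on a sphere the size of the Earth. The "flat map" here is an assumed, deliberately simple one: east-west and north-south offsets are scaled at the midpoint latitude and treated as the legs of a flat right triangle. The starting latitude of 40 degrees is arbitrary.

```python
import math

def great_circle_distance(lat1, lon1, lat2, lon2, radius):
    """True shortest distance on the sphere (haversine formula, radians)."""
    a = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * radius * math.asin(math.sqrt(a))

def flat_map_distance(lat1, lon1, lat2, lon2, radius):
    """Distance measured on a flat map drawn around the midpoint latitude."""
    mid_lat = 0.5 * (lat1 + lat2)
    east = radius * math.cos(mid_lat) * (lon2 - lon1)
    north = radius * (lat2 - lat1)
    return math.hypot(east, north)

def relative_map_error(span_degrees, radius=6371.0):
    """Fractional disagreement between map and sphere for two points that
    are span_degrees apart in both latitude and longitude (radius in km)."""
    lat1, lon1 = math.radians(40.0), 0.0
    lat2 = lat1 + math.radians(span_degrees)
    lon2 = lon1 + math.radians(span_degrees)
    true = great_circle_distance(lat1, lon1, lat2, lon2, radius)
    flat = flat_map_distance(lat1, lon1, lat2, lon2, radius)
    return abs(flat - true) / true

print(relative_map_error(0.1))    # roughly city sized
print(relative_map_error(40.0))   # roughly continent sized
```

For the small patch the flat map is an excellent stand-in for the curved surface; for the large patch the disagreement is easy to notice.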
Later we will see how the concepts discussed here will help us in
explaining gravity and relativity. Next, however, we want to discuss
another property of manifolds which itself will tell us everything we
want to know about a particular manifold. We will call this property
the invariant interval.
5.4 The Invariant Interval
Here we will basically be discussing distances on manifolds, and
what we can learn about a manifold based on how we calculate distances
on that manifold. We start by discussing the le[...] "L" (L is positive if
we move east). Next, we need to move north or south on the sphere to
reach P. The distance we move north or south to reach P will be
called "H" (H is positive if we move north). That gives us our
coordinate system. Every point on the sphere can now be represented
by an L-H coordinate pair. The "grid" on the surface of the sphere
which represents this coordinate system would be made of latitude and
longitude lines such as those on a globe.
Next, we need to figure out what infinitesimal distance (ds)
would be associated with moving a small distance in L (dL) and a small
distance in H (dH). For the sake of time, I'll just give the answer
here (Note, R is the radius of the sphere):
ds^2 = dH^2 + [cos(H/R)]^2*dL^2
Remember what this represents. If you start at some point (L,H) on
the sphere, and you move a small distance in L (dL) and a small
distance in H (dH) then the shortest distance along the sphere between
your first position and your second position would be ds. Note that
this distance depends on your H position (because of the "cos(H/R)"
part of the equation). (This is an interesting point because as soon
as you start moving from one position to the next, the equation for ds
becomes slightly different. We basically think of this difference as
negligible as long as dH is very small, but, in fact, the equation is
only correct when dL and dH are truly "infinitesimal". Such concepts are
generally covered in calculus, and for our purposes, we will just
claim that the equation is practically true as long as dL and dH are very
small.)
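To see an interval of this kind at work, the sketch below adds up many small ds values along a path. It assumes that L is a distance measured along the equator and H a distance measured along a north-south line, in which case the interval takes the form ds^2 = dH^2 + [cos(H/R)]^2*dL^2. The function names and the sample path are illustrative only.

```python
import math

def interval(H, dL, dH, radius):
    """Small distance ds for small coordinate steps dL and dH taken at
    north-south position H (assumed form: ds^2 = dH^2 + cos(H/R)^2 dL^2)."""
    return math.sqrt(dH**2 + (math.cos(H / radius) * dL)**2)

def path_length(points, radius):
    """Add up ds over consecutive (L, H) points, evaluating the interval
    at the midpoint of each small step."""
    total = 0.0
    for (L0, H0), (L1, H1) in zip(points, points[1:]):
        total += interval(0.5 * (H0 + H1), L1 - L0, H1 - H0, radius)
    return total

R = 1.0
steps = 1000
H_fixed = R * math.pi / 3.0          # 60 degrees north of the equator
# Once around the sphere at constant H: L runs over one full equator length.
loop = [(2.0 * math.pi * R * k / steps, H_fixed) for k in range(steps + 1)]
print(path_length(loop, R), 2.0 * math.pi * R * math.cos(H_fixed / R))
```

The summed length agrees with the circumference of the latitude circle found from ordinary geometry, 2*pi*R*cos(H/R), even though the sum uses nothing but the interval and the coordinates.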
So now, we come to an important point in this section. What if I
told you that I could find another coordinate system on the sphere
using two independent coordinates (a and b) such that the invariant
interval on the sphere would be given by the following:
ds^2 = da^2 + db^2?
(Note: by "independent coordinates" I mean that you can always change
your position in one coordinate independent of any change in the
other.)
Here I'll try to show that my claim cannot be true, because it
would imply that the sphere and a flat sheet of paper have the same
geometry, regardless of how I try to define "a" and "b" on the
sphere.
First, what if I draw a normal grid on a flat piece of paper and
label the axes "a" and "b". "Big deal," you might say, "you could
just as easily label them 'L' and 'H', which were the coordinates you
really did use on the sphere." AH, but here is the difference between
the two labelings. The invariant interval along the flat sheet of
paper would be da^2 + db^2 and dL^2 + dH^2 for the two labelings,
respectively. In the second case, we obviously see that the geometry
of the sphere is different from the geometry of the flat grid (because
the invariant interval on the sphere is different from the "dL^2 +
dH^2" invariant interval on the flat grid). However, I have claimed
that the invariant interval on the sphere using my new a-b system is
"da^2 + db^2". That would make its physical geometry the same as
that of the flat sheet of paper--which cannot be the case.
Considering this example, let's make some general points: First,
consider some manifold, M1. On M1, we have some coordinate system,
S1. Next we consider two very-nearby points on M1 (call the points P
and Q). If we know the distance between P and Q along each of the
coordinates (like dx and dy, for example), then we can find some
function for ds (the shortest distance on M1 between the very-nearby
points) using the coordinates in S1. Now, consider a second manifold,
M2. If a coordinate system, S2, can be defined on that manifold such
that ds has the same functional form in S2 as it did using the S1
coordinate system on M1, then the geometry of the two manifolds must
be identical.
This indicates that the geometry of a manifold is completely
determined if one knows the form of the invariant interval using a
particular coordinate system on that manifold. And, there you have
it. In fact, starting with the form of the invariant interval in some
coordinate system on a manifold, we can determine the curvature of the
manifold, the path of a geodesic on the manifold, and everything we
need to know about the manifold's geometry.
Now, the mathematics used to describe these properties involves
geometric constructs known as tensors. In fact, the invariant
interval on a manifold is directly related to a tensor known as the
metric tensor on the manifold, and we will discuss this a bit later.
First, I want to give a very brief introduction to tensors in general.
5.5 A Bit About Tensors
In this section I will introduce just a few basic ideas which
will give the reader a feeling for what tensors are. This is simply
meant to provide a minimum amount of information to those who do not
know about tensors.
Basically, a tensor is a geometrical entity which is identified
by its various components. To give a solid example, I note that a
vector is a type of tensor. In an x-y coordinate system, a vector has
one component which points in the x direction (its x component) and
another component which points in the y direction (its y component).
If you consider a vector defined in three dimensional space, then it
will also have a z component. Similarly a tensor in general
is defined in a particular space which has some number of dimensions.
The number of dimensions of the space is also called the number of
dimensions of the tensor. Note that vectors have a component for each
individual (one) dimension, and they are called tensors of rank 1.
For other tensors, you have to use two of the dimensions in order to
specify one component of the tensor. In x-y space, such a tensor
would have an xx component, an xy component, a yx component, and a yy
component. In three-space, it would also have components for xz, zx,
yz, zy, and zz. Since you have to specify two of the dimensions for
each component of such a tensor, it is called a tensor of rank 2.
Similarly, you can have third rank tensors (which have components for
xxx, xxy, ...), fourth rank tensors, and so on.
So that you aren't confused, I want to explicitly note that the
dimensionality of a tensor (the number of dimensions of the space in
which the tensor is defined) is independent of the rank of the tensor
(the number of those dimensions that have to be used to specify each
component of the tensor). In any dimensional space, we can have a
tensor of rank 0 (just a number by itself, because it is not
associated in any way with any of the dimensions), a tensor of rank 1
(like a vector--it has a component for every one dimension you can
specify), a tensor of rank 2 (it has a component for every pair of
dimensions you can specify), etc.
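The bookkeeping for rank and dimension can be spelled out in a few lines. This is only a counting sketch: it lists the labels of the components (xx, xy, and so on), not the tensors themselves, and the function name is invented.

```python
from itertools import product

def component_labels(axes, rank):
    """Every way of choosing `rank` dimensions (repeats allowed, order
    matters) from the given axes; each choice labels one component."""
    return ["".join(combo) for combo in product(axes, repeat=rank)]

print(component_labels("xy", 2))         # rank 2 in two dimensions
print(len(component_labels("xyz", 2)))   # rank 2 in three dimensions
print(len(component_labels("xyz", 0)))   # rank 0: one number, no labels
```

In general a tensor of a given rank in an n dimensional space has n raised to the power of the rank components, so the rank and the dimension really are separate choices.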
Now we look at a very important property of tensors. In fact, it
is the property which really defines whether a set of components make
up a tensor. This property involves the question of how the tensor's
components change when you change the coordinate system you are using
for the space in which the tensor is defined. So, let's consider an
example in two dimensional space where you go from some coordinate
system (call the coordinates x and y) to some other coordinate system
(call these coordinates x' and y'). There will be some sort of
relationship between the two systems. For example, say we start at
some point in this space such that our coordinates are x,y and x',y'
(depending on which coordinate system you are using). Now, say we
move an "infinitesimal distance" in x (using the first coordinate
system). Call that distance dx. When we do so, we will have changed
our x' position (using the second coordinate system) by some
infinitesimal amount, dx'. Also, we will have changed our y' position
by some amount dy'. We can use these concepts of infinitesimal
changes to define some relationships between the two systems. We can
answer the question "how does x' change when x changes at this point"
by noting the ratio, dx'/dx. Similarly we can write dx/dx' to denote
how much x changes with changes in x' at some point, and dy'/dx
denotes how y' changes with changes in x. All together there are four
of these ratios which denote how the x' and y' coordinates change with
changes in x and y:
dx'/dx, dx'/dy, dy'/dx, and dy'/dy.
Similarly, there are four more to denote how x and y change with
changes in x' and y':
dx/dx', dx/dy', dy/dx', and dy/dy'.
In general the values of these ratios will depend on where you are, so
each ratio is a function of x and y (or x' and y', if you like).
Now, we have these ratios which help us relate one coordinate
system to another. If we have a tensor defined in this space, then we
must be able to use those ratios to find out how the tensor's
components themselves change when we go from considering them in one
coordinate system to considering them in the other. Let's consider a
tensor of rank 1 (a vector) in a two dimensional space. Let the
vector, call it V, have an x component (V_x) and a y component (V_y).
Then, the rules for finding the x' and y' components of the vector at
some point are the following:
V_x' = dx'/dx V_x + dx'/dy V_y
and
V_y' = dy'/dx V_x + dy'/dy V_y.
That is the way in which this type of first rank tensor must transform
from one coordinate system to another. Note that we can write the
above equations by using the following:
V_a' = SUM(b = x,y) [da'/db V_b]
In that expression, "a" can be either x or y (so we actually have two
equations). Also, the right side of the equation is a summation where
the first term in the summation is found by letting b = x, and the
second term is found by letting b = y. Further, we could make this
expression more general by noting that it will be true for a space
with higher dimensions when we let "a" be any one of those dimensions
and let the sum with b extend over all the dimensions.
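Here is a sketch of this transformation rule for one concrete pair of coordinate systems. As an assumption made purely for illustration, the primed system is taken to be polar coordinates, so that x' = r and y' = theta. The four ratios dx'/dx, dx'/dy, dy'/dx, and dy'/dy are then known functions of position, and the result can be checked by actually taking a tiny step along V and watching how r and theta change.

```python
import math

def polar_from_cartesian(x, y):
    """The primed coordinates (r, theta) of the point (x, y)."""
    return math.hypot(x, y), math.atan2(y, x)

def transform_contravariant(x, y, v_x, v_y):
    """Components of V in the primed (r, theta) system at the point (x, y),
    using V_a' = SUM(b = x,y) [da'/db V_b]."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    dr_dx, dr_dy = x / r, y / r          # dx'/dx and dx'/dy, with x' = r
    dth_dx, dth_dy = -y / r2, x / r2     # dy'/dx and dy'/dy, with y' = theta
    v_r = dr_dx * v_x + dr_dy * v_y
    v_theta = dth_dx * v_x + dth_dy * v_y
    return v_r, v_theta

def displacement_check(x, y, v_x, v_y, eps=1e-7):
    """Take a tiny step eps*V in x-y and report the change in r and theta
    per unit eps; this should agree with transform_contravariant."""
    r0, th0 = polar_from_cartesian(x, y)
    r1, th1 = polar_from_cartesian(x + eps * v_x, y + eps * v_y)
    return (r1 - r0) / eps, (th1 - th0) / eps

print(transform_contravariant(3.0, 4.0, 1.0, 2.0))
print(displacement_check(3.0, 4.0, 1.0, 2.0))
```

The two printed pairs agree to within the accuracy of the small step, which is what it means for the components of a small displacement to transform in this way.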
The fact that the physical components of a vector do actually
transform this way is what makes the vector a tensor. However, we
should note that not all types of vectors transform this way.
To show this is so, first we will consider a function which has a
value at every point in x-y space. Call the function f(x,y). Such a
function is a 0 rank tensor, because at any point in the space, it has
some single, numerical value (it does not have components for x and y
like a vector does--you can't ask "what's its value in the x
direction", or "what's its value in the y direction", because it has
only a single number at any point). Note that if we change to another
coordinate system, the value of f at some physical point in the space
will not change. Because it has no x or y component, it is invariant
when you change coordinate systems, as are all 0 rank tensors. This
is the way all 0 rank tensors must transform when you change
coordinate systems--they must be invariant.
Now, back to the point that there are other types of vectors
which do not transform as discussed earlier. Let's take the above
function at some point and ask "how does it change with small changes
in x?" If the function changes by an amount df when we move to
another x location a distance dx away, then we can write the
expression df/dx to tell how f changes with x. We can do the same in
y and have the expression df/dy. Then we could define a vector (call
it G) which has an x component (G_x) equal to df/dx at every point in
x and y, while it has a y component (G_y) equal to df/dy at every
point. Now, what if we do this same procedure in the x'-y' coordinate
system? We will end up with the x' and y' components of the G vector
such that G_x' = df'/dx' and G_y' = df'/dy'. Because of the way this
vector is defined, it turns out that it transforms as follows:
G_x' = dx/dx' G_x + dy/dx' G_y
and
G_y' = dx/dy' G_x + dy/dy' G_y
As before, we can rewrite these two equations as follows:
G_a' = SUM(b = x, y) [db/da' G_b]
Note that we are using ratios like db/da' rather than da'/db (which we
used earlier). That means that this is a different type of vector
(because it transforms in a different way). The vector we discussed
earlier (V) is called a contravariant vector, and the fact that it
transforms as discussed earlier is what defines it as that type of
vector. The G vector is called a covariant vector, and it is defined
as such because of the way it transforms. Usually, we express which
type of vector we have by the way we denote its components. For
contravariant vectors, we denote their components by putting their
indexes (the x or the y) in superscripts:
V^{x} and V^{y},
while we denote the components of covariant vectors by putting their
indices in subscripts:
G_x and G_y
With this notation, the two different transformations begin to
take on an easy to remember form. See if you can't figure out how the
"upper" indices and the "lower" indices match up on both sides of the
two transformation equations when they are written as follows:
V^{a'} = SUM(b = x,y) [da'/db V^{b}]
and
G_a' = SUM(b = x,y) [db/da' G_b]
Notice that the superscript (or subscript) on one side remains "upper"
(or "lower") in the ratio on the other side. Also, note that the
summation is always over the index which is repeated on the right
side, once in an "upper" position and once in a "lower" position.
This basic "formula" helps to produce equations for all transformations
in tensor analysis (note this in the next part of this section).
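To see the covariant rule at work on the kind of vector G described above, here is a small Python sketch. The function f(x, y) = x^2 * y and the coordinate change (x' = 2x + y, y' = 3y) are assumptions chosen only for illustration. The sketch transforms G with the ratios db/da' and then compares the result against derivatives taken directly in the primed coordinates (estimated with a small central difference).

```python
# Hypothetical setup for illustration only:
#   scalar function f(x, y) = x^2 * y
#   coordinate change x' = 2x + y, y' = 3y
#   (so x = x'/2 - y'/6 and y = y'/3)

def f(x, y):
    return x * x * y

def old_from_new(xp, yp):
    """Return (x, y) for the point whose primed coordinates are (xp, yp)."""
    return xp / 2.0 - yp / 6.0, yp / 3.0

def f_new(xp, yp):
    """The same scalar function, evaluated via the primed coordinates."""
    return f(*old_from_new(xp, yp))

# Ratios db/da' for this change (constants, since the change is linear).
DINV = {
    ("x", "x'"): 0.5, ("x", "y'"): -1.0 / 6.0,
    ("y", "x'"): 0.0, ("y", "y'"): 1.0 / 3.0,
}

def transform_covariant(G):
    """Apply G_a' = SUM(b = x,y) [db/da' G_b] for a' = x' and y'."""
    return {
        a: sum(DINV[(b, a)] * G[b] for b in ("x", "y"))
        for a in ("x'", "y'")
    }

x, y = 1.0, 2.0
G = {"x": 2.0 * x * y, "y": x * x}   # df/dx and df/dy at (x, y)
G_rule = transform_covariant(G)

# Independent check: differentiate f directly in the primed coordinates.
xp, yp = 2.0 * x + y, 3.0 * y
h = 1e-6
G_direct = {
    "x'": (f_new(xp + h, yp) - f_new(xp - h, yp)) / (2.0 * h),
    "y'": (f_new(xp, yp + h) - f_new(xp, yp - h)) / (2.0 * h),
}
print(G_rule, G_direct)
```

The two results should agree to within the accuracy of the finite difference. If you instead applied the da'/db ratios (the contravariant rule) to G, the numbers would not match the direct derivatives for this coordinate change.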
It is interesting to note that when we change between the normal
spatial coordinate systems we are used to using (Cartesian coordinates,
related to one another by a rotation, for example), db/da' = da'/db,
and there is no distinction between covariant and contravariant
vectors. However, in other systems, the difference is there and must
be considered.
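The equality db/da' = da'/db can be checked for a simple case. The following Python sketch assumes the primed system is the x-y system rotated by some angle (the 30 degrees used here is an arbitrary choice of mine), writes out both sets of ratios, and compares them.

```python
import math

def rotation_ratios(angle):
    """Ratios for a rotated Cartesian system (illustrative assumption):
         x' =  x*cos(angle) + y*sin(angle)
         y' = -x*sin(angle) + y*cos(angle)
    Returns (forward, inverse), where forward[(a', b)] = da'/db and
    inverse[(b, a')] = db/da'."""
    c, s = math.cos(angle), math.sin(angle)
    forward = {("x'", "x"): c, ("x'", "y"): s,
               ("y'", "x"): -s, ("y'", "y"): c}
    # Inverse change: x = x'*c - y'*s,  y = x'*s + y'*c
    inverse = {("x", "x'"): c, ("x", "y'"): -s,
               ("y", "x'"): s, ("y", "y'"): c}
    return forward, inverse

forward, inverse = rotation_ratios(math.radians(30.0))
same = all(
    math.isclose(forward[(a, b)], inverse[(b, a)])
    for a in ("x'", "y'") for b in ("x", "y")
)
print(same)
```

Because every da'/db equals the corresponding db/da', the contravariant and covariant rules give identical results for such a rotation.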
Finally, we note that higher rank tensors are also defined by the
way they transform from one coordinate system to another. For example,
consider a second rank tensor, U. It could be
that both of its indices are associated with the contravariant type of
transformation (note: the following actually denotes four equations
because a'b' can be set to x'x', x'y', y'x', or y'y'):
U^{a'b'} = da'/dx * db'/dx U^{xx} + da'/dx * db'/dy U^{xy}
+ da'/dy * db'/dx U^{yx} + da'/dy * db'/dy U^{yy}
= SUM(c & e vary over all dimensions) [da'/dc * db'/de U^{ce}]
Or they could both be associated with the covariant type of
transformation:
U_{a'b'} = SUM(c,e) [dc/da' * de/db' U_{ce}]
Or it could be a mix of the two:
U^{a'}_{b'} = SUM(c,e) [da'/dc * de/db' U^{c}_{e}]
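The fully contravariant rule (both indices "upper") can be tried out numerically. In this Python sketch the coordinate change (x' = 2x + y, y' = 3y) and the two sample vectors are made up for illustration. The only extra fact used is that multiplying the components of two contravariant vectors gives a second rank contravariant tensor, which is standard but is not something stated in this FAQ.

```python
# Hypothetical coordinate change, chosen only for illustration:
#   x' = 2x + y,   y' = 3y   (the ratios da'/db are constants)
D = {
    ("x'", "x"): 2.0, ("x'", "y"): 1.0,
    ("y'", "x"): 0.0, ("y'", "y"): 3.0,
}
OLD = ("x", "y")
NEW = ("x'", "y'")

def transform_vector(V):
    """First rank rule: V^{a'} = SUM(b) [da'/db V^{b}]."""
    return {a: sum(D[(a, b)] * V[b] for b in OLD) for a in NEW}

def transform_rank2(U):
    """Second rank rule: U^{a'b'} = SUM(c,e) [da'/dc * db'/de U^{ce}]."""
    return {
        (a, b): sum(D[(a, c)] * D[(b, e)] * U[(c, e)]
                    for c in OLD for e in OLD)
        for a in NEW for b in NEW
    }

# Build a sample second rank tensor from two made-up vectors.
V = {"x": 1.0, "y": 2.0}
W = {"x": 3.0, "y": -1.0}
U = {(c, e): V[c] * W[e] for c in OLD for e in OLD}

U_new = transform_rank2(U)

# Cross-check: transform V and W separately, then multiply components.
V_new, W_new = transform_vector(V), transform_vector(W)
U_check = {(a, b): V_new[a] * W_new[b] for a in NEW for b in NEW}
print(U_new)
```

The four transformed components agree with the products of the separately transformed vectors, which is what the double sum is designed to guarantee.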
And that about ends our discussion on tensors. To sum up, they
are geometric entities which have components denoted by some number of
indices. Each index can be any of the dimensions in which the tensor
is defined, and the number of indices needed to specify a component of
a tensor is called the tensor's rank. We are familiar with 0 and 1
rank tensors (numbers--or "scalars"--and vectors). Finally, the way
one transforms a tensor from one coordinate system to another depends
on the type of tensor, and it (in fact) defines the tensor itself.
Each index of a tensor will transform in either a contravariant way or
a covariant way.
These are the basic ideas behind tensors, and they allow us to
define some very powerful mathematics. If you are familiar with the
usefulness of vectors, then you have touched the surface of the
usefulness of tensors in general. In the following section, we will
look at two particular tensors, and we will see that they can be quite
useful.
5.6 The Metric Tensor and the Stress-Energy Tensor
Now that we have had a glimpse at tensors, let's consider a
couple that will be important to us. The first is called the metric
tensor. I mentioned a couple of sections ago that this tensor is
related to the invariant interval for a certain coordinate system on a
given manifold. So, let's go back and look at the two specific
invariant intervals which we introduced. First, in normal, x-y,
Cartesian coordinates, we have the following:
ds^2 = dx^2 + dy^2
Second, on the surface of a sphere, using the L-H coordinate system
which we defined, we have this:
ds^2 = (1/R^2)*dL^2 + [cos(L/R)/R]^2*dH^2
Now, let's make this more general by considering an arbitrary, two
dimensional manifold and an arbitrary coordinate system on that
manifold. Let's call the coordinates "a" and "b". Now, in general,
the invariant interval on this manifold is defined in terms of the
square of that interval (ds^2). The equation for ds^2 involves the
infinitesimal distances da and db in second order combinations. By
second order combinations, I mean, for example, da^2 or da*db. Thus,
in general, the invariant interval will have the following form (note:
the g components are generally functions of a and b):
ds^2 = g_aa*da^2 + g_ab*da*db + g_ba*db*da + g_bb*db^2
In that equation you see the four components of the metric tensor
in this two dimensional, a-b coordinate system. They are the "g's" in
the equation. For our x-y coordinate system, we have
g_xx = 1, g_xy = 0, g_yx = 0, g_yy = 1
For our L-H coordinate system, we have
g_LL = (1/R^2), g_LH = 0, g_HL = 0, g_HH = [cos(L/R)/R]^2
So, we can construct the invariant interval if we know the metric
tensor for a coordinate system on a manifold. Now, remember that we
said that the form of the invariant interval for a particular
coordinate system tells us everything there is to know about the
manifold for which those coordinates are valid. So, now we see that
all we need to know is the form of the metric tensor. Once we know g,
we know the geometry of the manifold. Using tensor analysis, we can
take the metric tensor and find an equation for geodesics on the
manifold. We can use it to find out all about the curvature of the
manifold. We can even use it to find the dot product (we will discuss
this a bit later) of two vectors in a particular coordinate
system.
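To make the step from metric components to the invariant interval concrete, here is a short Python sketch. It simply evaluates the double sum over g_ab*da*db. The metric components are the ones written above; the sample displacements and the values R = 2 and L = 0 are arbitrary choices of mine.

```python
import math

def interval_squared(g, d):
    """ds^2 = SUM over a and b of g_ab * da * db.
    g maps index pairs to metric components; d maps each coordinate
    name to its (small) displacement."""
    return sum(g[(a, b)] * d[a] * d[b] for a in d for b in d)

# x-y Cartesian metric from the text.
g_xy = {("x", "x"): 1.0, ("x", "y"): 0.0,
        ("y", "x"): 0.0, ("y", "y"): 1.0}

def g_LH(L, R):
    """L-H metric components as written in the text, evaluated at L."""
    return {("L", "L"): 1.0 / R**2, ("L", "H"): 0.0,
            ("H", "L"): 0.0, ("H", "H"): (math.cos(L / R) / R) ** 2}

# For the flat x-y metric the components are constant, so the formula
# also works for displacements that are not small.
ds2_flat = interval_squared(g_xy, {"x": 3.0, "y": 4.0})

# R = 2 and L = 0 are arbitrary sample values, not from the FAQ.
ds2_sphere = interval_squared(g_LH(L=0.0, R=2.0), {"L": 0.1, "H": 0.2})
print(ds2_flat, ds2_sphere)
```

The flat case reproduces the familiar 3-4-5 triangle (ds^2 = 25). In the L-H case the answer depends on where you are, because g_HH changes with L.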
Another thing the metric allows us to do is something generally
called "raising" or "lowering" indices. Basically, if you consider a
tensor with a contravariant index (which transforms in a particular
way as discussed earlier), then there is another corresponding tensor
which has a covariant index (and vice versa). For example, consider
the tensor A^{a}, which has a contravariant index, a. There is a
corresponding covariant tensor, A_a, which can be found using the
metric of the space (and coordinate system) we are dealing with. Here
is an example of how you find it (finding A_x when you know A^{x} and
A^{y}) for a coordinate system with some arbitrary coordinates, x and
y:
A_x = g_xx A^{x} + g_xy A^{y}
For a general space and coordinate system, you can write this rule as
follows (remember, "a" can be any one dimension in the space, so this
represents a number of equations):
A_a = SUM(b varies over all dimensions) g_ab A^{b}
Similarly, if you know the covariant form of A (A_a) you can find the
contravariant form by using the following:
A^{a} = SUM(b varies over all dimensions) g^{ab} A_b
But that equation involves the contravariant form of the metric
(g^{ab}). In the invariant interval, the metric is expressed in its
covariant form (g_ab). It is therefore important for the reader to
remember as we discuss various metrics below, that for all of them we
have the following (this simple rule works because those metrics have
g_ab = 0 whenever a doesn't = b; it is not true of metrics in general):
g^{ab} = 1/g_ab if a = b
g^{ab} = 0 if a doesn't = b
Thus, using the metric tensor, one can "raise" or "lower" any
index of a tensor. Remember, what one is really doing is finding a
form of that tensor which transforms in a different way.
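Here is a minimal Python sketch of lowering an index and then raising it again. The metric and the vector components are made up for illustration, and the metric is deliberately diagonal so that the simple 1/g_ab rule above applies. For a metric with non-zero off-diagonal components you would need the full matrix inverse instead, which this sketch does not attempt.

```python
def lower_index(g, A_up):
    """A_a = SUM(b) g_ab A^{b}, with g holding the covariant metric."""
    names = list(A_up)
    return {a: sum(g[(a, b)] * A_up[b] for b in names) for a in names}

def raise_index(g_inv, A_down):
    """A^{a} = SUM(b) g^{ab} A_b, with g_inv holding g^{ab}."""
    names = list(A_down)
    return {a: sum(g_inv[(a, b)] * A_down[b] for b in names) for a in names}

def contravariant_metric_diagonal(g, names):
    """g^{ab} for a DIAGONAL metric only: 1/g_aa if a = b, else 0."""
    for a in names:
        for b in names:
            if a != b and g[(a, b)] != 0.0:
                raise ValueError("shortcut only valid for diagonal metrics")
    return {(a, b): (1.0 / g[(a, b)] if a == b else 0.0)
            for a in names for b in names}

# Made-up diagonal metric and vector, for illustration only.
names = ("x", "y")
g = {("x", "x"): 4.0, ("x", "y"): 0.0,
     ("y", "x"): 0.0, ("y", "y"): 0.25}
A_up = {"x": 1.5, "y": 8.0}

A_down = lower_index(g, A_up)                    # covariant components
g_inv = contravariant_metric_diagonal(g, names)  # g^{ab}
A_back = raise_index(g_inv, A_down)              # should recover A_up
print(A_down, A_back)
```

Lowering and then raising returns the original components, which is a quick sanity check that g_ab and g^{ab} are being used consistently.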
With this example of how the metric can be used, we will end our
discussion of this tensor. To sum up, the metric tensor on a manifold
is a very important entity which not only tells us all about the
manifold's geometry, but which also provides a very powerful tool
which allows us to deal with that geometry mathematically.
The second tensor we want to mention is the stress-energy tensor.
I don't want to get too deep into a discussion of the stress-energy
tensor, but the reader should know a couple of key points. With the
stress-energy tensor, we see our first example of a tensor explicitly
defined in four dimensional space-time (though later we will look at
the metric tensor defined in 4-d space-time). The stress-energy
tensor (T) is also a tensor of rank 2 (like the metric tensor), which
gives it 16 components in 4 dimensions. Sometimes we express such a
tensor in the form of a matrix as follows:
         | T^{tt}  T^{tx}  T^{ty}  T^{tz} |
         | T^{xt}  T^{xx}  T^{xy}  T^{xz} |
T^{ab} = | T^{yt}  T^{yx}  T^{yy}  T^{yz} |
         | T^{zt}  T^{zx}  T^{zy}  T^{zz} |
There you can see the 16 different components. Now, each of these
components tells us something about the distribution and "flow" of
energy and momentum in a region. More precisely, T contains
information about all the stresses and pressures and momenta in a
region. For example, the "tt" component of the stress-energy tensor
would be the density of the energy in the region (the amount of
energy--including mass energy--per unit volume).
As to why the stress-energy tensor is important to us, that will
be discussed further in a bit. However, here we can note the
following in order to pull us back towards our discussion of
relativity and gravity: In Newtonian physics, gravity was caused by
the density of mass in an area. However, in SR we find that mass is
just a form of energy, and so we might think that the "tt" component
of the stress-energy tensor would be the right thing to look at when
it comes to gravity. However, if we write a rule using only one
component of a tensor, then the rule will be frame-dependent, because
the value of that component depends on your coordinate system (or
frame of reference in space-time). In short, gravity would not be an
invariant theory, and it would require a preferred frame if we based
it only on the "tt" component of T. However, if we use all the
components of a tensor to form our theory, then (as it turns out) the
theory can be made frame-independent. Einstein thus considered the
possibility that the whole stress-energy tensor would need to play a
part as the source of gravity. Add to this some insight on curved
manifolds and you end up with general relativity, as we will see.
5.7 Applying these Concepts to Gravity
Now that we have discussed manifolds and their properties along
with some of the basic concepts of tensors, let's see how all of this
applies to relativity and gravitation. First, I will go over the main
ideas which lead us from what we have discussed so far to a general
relativistic theory. After that, I want to mention a few notes on the
physics and the mathematics we will be using given the concepts we
have gone over. Next, we will go back and look again at special
relativity while applying a bit of our new knowledge. This will show
that GR is indeed general, because when applied to space-time without
the presence of gravity it will explain a special case--special
relativity. Finally, we will look quickly at a specific application
of the GR concepts to a space-time in which there is a gravitational
field. This application will focus on a particular class of stars and
black holes.
5.7.1 The Basic Idea
Let's get started with the basic ideas which combine the concepts
we have discussed to produce GR. Here I will simply state the main
ideas without an explanation of their application. You will get some
feel for their application in our two examples to follow.
So, here are the main claims of GR which involve the concepts we
have discussed. First, the space-time in which we live is a four
dimensional manifold. On that manifold there is a metric tensor (or
just "a metric") which describes the geometry of space-time. The
metric can be used to find geodesics on the space-time manifold, and
when an object goes from one point in space-time to another point in
space-time (note: these are not just two points in space, but two
points in space-time) with nothing but gravity acting on it, it
follows a geodesic. Also, the metric can be used to find the invariant
interval between two space-time points. That interval (recall) can
generally be expressed as
ds^2 = SUM(a & b vary over space and time dimensions) g_ab*da*db
Second, consider a vector in our four dimensional space. Such a
vector (usually called a four-vector) has four components, three
relating to space and one relating to tim
======================================================================
Relativity and FTL Travel
by Jason W. Hinson (