Observables in quantum mechanics are generally treated on the premise that
the values they take (their eigenvalues) are quantities of such a kind as may
be scaled and added: that is, tensor, vector or scalar quantities. However,
General Relativity requires us to describe the universe as a smooth manifold,
in which context position is not a vector quantity: at best, one can represent
positions (in modest-sized chunks of space-time) by vectors (using a chart),
but then the notions of addition and scaling this induces depend on your
choice of representation.
The need for an observable's eigenvalues to be addable and scalable arises
from the observable being a linear map from S = {system states} to V⊗S
where V is the space of nominally possible values for the observable, among
which the eigenvalues are the feasible (or, indeed, observable) possibilities.
[The observable, (V⊗S: Q |S), then has to satisfy a structural relationship
with a given symmetric metric, an antilinear mapping (dual(S): g |S) for
which, for any s, t in S, t·g(s) and s·g(t) are mutually conjugate; the
constraint says that, for any s, t in S, Q(s)·g(t) and Q(t)·g(s) must also be
mutually conjugate; Q is then described as hermitian.] When the observable is
position on our smooth manifold, M, V gets replaced by M, which is not a
linear space, and the ⊗ tensor space combiner is not available to us: we
cannot ask which mappings (M: |dual(S)) are linear. Nonetheless, the position
observable is clearly meaningful, so how may we describe it?
The natural way to approach this is to look at the conventional description and seek the extent to which it is free of the requirement that the observable is a vector quantity. To this end, note that the orthodox treatment in terms of diagonalisation of an observable is alternatively described as decomposition of the observable into a sum, each term of which is the (if necessary tensor) product of an eigenvalue of the observable with a projection operator (projector) which selects the eigenspace for that eigenvalue.
The sum of just the projectors, without multiplication by their eigenvalues, delivers the identity linear operator and these projectors commute with one another (the product of any distinct pair is zero). For any set of values in the space within which the observable's values lie, there is a projector equal to the sum of the projectors for those eigenvalues which lie in the set given.
The probability, when the system being described is in some given state, of finding the observable to have a value in some set turns out to be the result of contracting this projector for the set with the state's bra (on the left; its image, in dual(S), under g) and ket (on the right; the member of S representing the state). One may obtain a hermitian operator (actually a projector), with trace 1, from this bra and ket: their (tensor) product the other way round. The probability just cited is then the trace of the product of this hermitian operator with the projector for the given set of values for the observable.
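The trace formula just described can be sketched in a few lines. This is only a minimal illustration under assumptions of my own: a three-dimensional state space in which the observable is already diagonal, made-up eigenvalues and ket, and hypothetical helper names (`projector_for`, `probability_by_trace` and so on) that come from no library.

```python
# Assumed toy model: basis vector i is an eigenvector of the observable with
# eigenvalue EIGENVALUES[i]; the eigenvalue 2 is deliberately repeated.
EIGENVALUES = [1, 2, 2]
DIM = len(EIGENVALUES)

def projector_for(values):
    """Sum of the eigenspace projectors whose eigenvalues lie in `values`."""
    return [[1 if i == j and EIGENVALUES[i] in values else 0
             for j in range(DIM)] for i in range(DIM)]

def matmul(a, b):
    """Ordinary matrix product of two DIM x DIM matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(DIM))
             for j in range(DIM)] for i in range(DIM)]

def trace(a):
    return sum(a[i][i] for i in range(DIM))

def ket_bra(ket):
    """The ket-times-bra operator; the bra is the conjugate of the ket."""
    return [[ket[i] * ket[j].conjugate() for j in range(DIM)]
            for i in range(DIM)]

# A unit ket: 0.36 + 0.2304 + 0.4096 = 1.
KET = [0.6 + 0j, 0.48j, 0.64 + 0j]
STATE = ket_bra(KET)  # hermitian, trace 1

def probability_by_trace(values):
    """Trace of (projector for the set) times (ket-bra operator)."""
    return trace(matmul(projector_for(values), STATE)).real

def probability_by_contraction(values):
    """Bra on the left, projector in the middle, ket on the right."""
    proj = projector_for(values)
    return sum(KET[i].conjugate() * proj[i][j] * KET[j]
               for i in range(DIM) for j in range(DIM)).real

print(probability_by_trace({1}), probability_by_trace({2}))
```

Up to floating-point rounding, the two routes agree, and the probabilities for the sets {1} and {2} add up to that for {1, 2}, which is 1 because the projector for the whole set of eigenvalues is the identity.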
If we do not know the state of the system but, instead, know a real
probability measure over the possible states (loosely a probability density
for what state the system is in) then we can integrate this ket tensor bra
product using that measure: the
result will also be a hermitian operator with trace 1 (but not, as far as I
can see, necessarily a projector). Consequently, it is more natural to
describe the state of the system in terms of a hermitian with trace 1 rather
than in terms of definite state vectors: the probability of finding the system
(for which we had a probability measure over possible states) to have its
value for an observable in some set is still the trace of the product of the
projector for the observable to be in that set times the hermitian operator
with trace 1 associated with the system's state.
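To see that the blurred operator keeps unit trace and hermiticity while, in general, losing idempotence, here is a small sketch. It assumes a two-dimensional state space and a discrete probability measure (two kets with weight one half each); the kets, the weights and the helper names are all illustrative choices of mine.

```python
def ket_bra(ket):
    """The ket-times-bra operator; the bra is the conjugate of the ket."""
    n = len(ket)
    return [[ket[i] * ket[j].conjugate() for j in range(n)] for i in range(n)]

def mix(weighted_kets):
    """Integrate ket x bra against a discrete probability measure."""
    n = len(weighted_kets[0][1])
    total = [[0j] * n for _ in range(n)]
    for weight, ket in weighted_kets:
        op = ket_bra(ket)
        for i in range(n):
            for j in range(n):
                total[i][j] += weight * op[i][j]
    return total

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def is_hermitian(a, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - a[j][i].conjugate()) <= tol
               for i in range(n) for j in range(n))

def is_idempotent(a, tol=1e-12):
    n = len(a)
    square = matmul(a, a)
    return all(abs(square[i][j] - a[i][j]) <= tol
               for i in range(n) for j in range(n))

# One definite state, and an even mixture of two different unit kets.
pure = mix([(1.0, [0.6 + 0j, 0.8j])])
mixed = mix([(0.5, [1 + 0j, 0j]), (0.5, [0.6 + 0j, 0.8j])])

for name, op in (("pure", pure), ("mixed", mixed)):
    print(name, trace(op).real, is_hermitian(op), is_idempotent(op))
```

Both operators are hermitian with trace 1 (to rounding), but only the one built from a single ket is a projector; the probability of an outcome is read off either in the same way, as the trace of its product with the relevant projector.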
The decomposition of the identity into a sum of mutually commuting projectors, each of which commutes with our observable, can alternatively be regarded as a (generalised) probability measure on the space within which the observable's values lie. The values taken by the measure lie in a commuting algebra of hermitian projectors. If the values taken by the observable do lie in some vector space (which will, naturally, be real given a hermitian observable) then the expected value of this probability measure comes to the observable itself. However, all other aspects of its use are liberated from dependence on the vectorial nature of the values taken by the observable.
Consequently, we can describe the position observable on a smooth manifold
as a probability measure on that manifold, taking values in the space of
hermitian projectors on the Hilbert space via which we describe our quantum
mechanical states. You can, roughly, think of this as a probability density
for the position, albeit the probabilities delivered are not real numbers
between 0 and 1. Such numbers may be obtained, however, by
contracting the projector delivered with the unit-trace hermitian operator
describing the state of the system under study and taking the trace of the
product. For any measurable subset of the smooth manifold, the measure yields
a projector; the whole manifold is measurable and its projector is the
identity on our space S of states. Any subset of the manifold in which the
particle definitely isn't yields the zero projector. Any measurable
(e.g. smooth and bounded) mapping, f, from the manifold to some fixed linear
space U may be integrated using the measure to produce a linear map
(U⊗S: |S) which may be contracted with the hermitian operator
describing the system state to yield an expected value, in U, of f.
We have S = {system states} represented as the unit sphere in a Hilbert space H with hermitian metric encoded by antilinear iso (dual(H)| g |H); we can do linear algebra in H and thus, to a certain degree, in S. We have a smooth manifold M; positions are points of M and not amenable to linear algebra.
An observable Q taking values in some fixed linear space V is encoded as a linear map (V⊗H: Q |H) for which Q(u)·g(v) and Q(v)·g(u) are mutually conjugate for each u, v in H (when V is {scalars} this is equivalent to g(Q(u),v) = g(u,Q(v)) but, more generally, Q's outputs aren't in H for g to accept as inputs). Eigenvalues of Q are the v in V for which there is some non-zero h in H with Q(h) = v×h; in such a case, h is an eigenvector of Q with eigenvalue v. It is possible to construct a basis b of H, whose members are all eigenvectors, that unit-diagonalises g. If the dual basis of dual(H) is p, with each p(i)·b(j) = 1 if i = j, else 0, then sum(: b(i)×p(i) ←i :) is the identity on H; if conjugation of scalars is * then each *∘p(i) is an antilinear ({scalars}: |H) and sum(: p(i)×(*∘p(i)) ←i :) = g, so that each p(i) is in fact g(b(i)). Let (V: e :) give the eigenvalues of the b(i), so Q(b(i)) = e(i)×b(i) for each i; then Q = sum(: e(i)×b(i)×p(i) ←i :).
For each distinct output v of e we have ({v}: e |) as the set of basis-indices of basis members with v as eigenvalue; for other v in V, this set is empty. Define, for each v in V, h(v) = sum(: b(i)×p(i) ←i :({v}:e|)); this is the identity on the sub-space of H on which Q has v as eigenvalue; it is, equally, the projection mapping from H to this sub-space. This defines h as a function ({linear (H:|H)}: |V); when v is not an output of e, h(v) is zero. Each h(v) is idempotent; composing it with itself yields itself. Any two distinct outputs of h have zero composite. Thus the outputs of h all commute with one another; indeed, all sums of outputs of h commute with one another. By grouping the i in (:b|) according to equality of e(i), we can re-write the identity as sum(: b(i)×p(i) ←i :) = sum(: h(v) ←v |e) = sum(h), i.e. h's (non-zero) outputs constitute a partition of the identity on H. We can likewise re-state Q = sum(: e×b×p :) as Q = sum(: v×h(v) ←v |e), whence it is easy to show that, modulo the tensor permutation operators needed to give meaning to h(v)∘Q, the composite of each h(v) with Q is, either way round, v×h(v); and, in particular, the same both ways round, hence each output of h commutes with Q. So the (non-zero) outputs of h provide a partition of the identity into commuting projectors that commute with our observable.
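The properties claimed for h can be checked exactly on a small case. The sketch below assumes a scalar-valued observable on a three-dimensional real H, with a matrix and eigenvectors chosen by me so that one eigenvalue is repeated; the eigenvectors are orthogonal but left un-normalised, so that exact fractions suffice, with the dual basis p absorbing the normalisation. None of the names comes from a library.

```python
from fractions import Fraction

N = 3
# Hypothetical observable: swaps the first two coordinates, fixes the third.
Q = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 1]]

# Orthogonal eigenvectors b(i) with eigenvalues e(i); eigenvalue 1 is repeated.
b = [[1, 1, 0], [0, 0, 1], [1, -1, 0]]
e = [1, 1, -1]

def dual(vec):
    """Dual-basis covector of an orthogonal real basis member: p(i).b(j) = [i == j]."""
    norm2 = sum(x * x for x in vec)
    return [Fraction(x, norm2) for x in vec]

p = [dual(vec) for vec in b]

def outer(ket, bra):
    return [[ket[i] * bra[j] for j in range(N)] for i in range(N)]

def add(a, c):
    return [[a[i][j] + c[i][j] for j in range(N)] for i in range(N)]

def scale(k, a):
    return [[k * a[i][j] for j in range(N)] for i in range(N)]

def matmul(a, c):
    return [[sum(a[i][k] * c[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

ZERO = [[0] * N for _ in range(N)]
IDENTITY = [[1 if i == j else 0 for j in range(N)] for i in range(N)]

def h(v):
    """Sum of b(i) x p(i) over the indices i whose eigenvalue e(i) is v."""
    total = ZERO
    for i in range(N):
        if e[i] == v:
            total = add(total, outer(b[i], p[i]))
    return total

print(add(h(1), h(-1)) == IDENTITY)               # partition of the identity
print(matmul(h(1), h(-1)) == ZERO)                # distinct outputs compose to zero
print(add(h(1), scale(-1, h(-1))) == Q)           # Q = sum of v x h(v)
print(matmul(h(1), Q) == matmul(Q, h(1)) == h(1)) # h(1) commutes with Q
```

Here h(2), for a value that is not an eigenvalue, comes out as the zero matrix, as the text requires.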
Each member of S is represented as a linear combination of our basis, u = sum(: s(i).b(i) ←i :) with sum(: s(i).*(s(i)) ←i :) = g(u,u) = 1. The expected value of Q in this state is then sum(: s(i).*(s(i)).e(i) ←i :) = Q(u)·g(u) and the probability that an observation of Q will yield a value in some set E ⊂ V is sum(: s(i).*(s(i)) ←i :(E:e|)). If we define U = u×g(u) in H⊗dual(H) = {linear (H:|H)} then its trace is just g(u,u) = 1; the trace of Q∘U is Q(u)·g(u), the expected value of Q; and the trace of U·sum(: h(v) ←v :E) is the probability of Q's value being in E. As g(u,u) = 1, U∘U = u×g(u)·u×g(u) = u×1.g(u) = U, so U is idempotent. As g·U = g(u)×*∘g(u) is conjugate-symmetric, U is hermitian (equivalently: each coefficient of a b(i)×p(i) in U is s(i).*(s(i)), hence real). U is thus the unit-trace hermitian projector mentioned above.
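For readers who prefer index notation, the three traces can be written out as follows. This is only a transcription under my own conventions: s_i stands for s(i), an overline for the conjugation *, and Q is taken to be scalar-valued so that no tensor permutations are needed. It uses nothing beyond the antilinearity of g, the relation p(j) = g(b(j)) and the duality p(j)·b(i) = 1 if i = j, else 0.

```latex
\begin{align*}
u &= \sum_i s_i\, b_i, \qquad
g(u) = \sum_j \overline{s_j}\, p_j, \qquad
U = u \times g(u) = \sum_{i,j} s_i\, \overline{s_j}\; b_i \times p_j, \\
\operatorname{trace}(U)
  &= \sum_{i,j} s_i\, \overline{s_j}\; p_j \cdot b_i
   = \sum_i |s_i|^2 = g(u,u) = 1, \\
\operatorname{trace}(Q \circ U)
  &= \sum_{i,j} s_i\, \overline{s_j}\; e_i\; p_j \cdot b_i
   = \sum_i |s_i|^2\, e_i, \\
\operatorname{trace}\Bigl(U \cdot \sum_{v \in E} h(v)\Bigr)
  &= \sum_{i \,:\, e_i \in E} |s_i|^2 .
\end{align*}
```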
If we don't know the actual state in S, only a probability measure μ on S for which state the system is in (this is a classical probability, rather than a quantum one) we'll get μ(: Q(u)·g(u) ←u :S) as the resulting expected value of Q, which is the trace of Q's composite after U = μ(: u×g(u) ←u :S) and we can use this probabilistically blurred U exactly as we used the specific U above to obtain probabilities of Q taking values in particular sub-sets of V; while the sense in which μ is a probability measure is expressed by trace(U) = 1 exactly as before. (It is not clear that it is meaningful, in a quantum context, to include such a classical probability distribution in our discussion; but, if it is, this is what we get; and it fits perfectly sensibly with the foregoing.) The only difference is that, in this case, there is no strong reason to expect U to be idempotent; but it is a unit-trace hermitian (H:|H) which captures all the useful information about our state.
So we express the state of our system not as a member u of H's unit sphere for g but as a unit-trace hermitian (H:U|H) = u×g(u); many members of S may be expressed by the same U, but only U is actually relevant to determining the values of observables (i.e. the s(i) are not observable, but the s(i).*(s(i)) are, at least in principle).
Each subset, E, of V yields sum(: h(v) ←v :E) as the projector that
identifies the sub-space of H spanned by eigenvectors of Q having eigenvalues
in E; in effect, h serves as the density for a probability measure
(albeit with idempotent (H:|H) values as outputs, instead of scalar ones)
whose integral over E is the sum just given. Let P be this probability
measure; the above sum is then P(: 1 ←v |E), a.k.a. P({1}:|E), and Q =
sum(: v×h(v) ←v |V) is simply P(V), the integral of the identity (:
v ←v |V). Composing with the unit-trace hermitian U that represents a
state of the system, trace(U·P({1}:|E)) is the probability of observing
Q in E when the prior state was represented by this U; and
trace(U·P(V)) is the expected value of Q when in this state.
So now let us consider position on the smooth manifold, M. Unlike Q, its values do not fall in a vector space. However, we can still represent the states of our system by members of H and, from such a representation, obtain a unit-trace hermitian (H:U|H) that encodes everything interesting. The only change we need is that, now, we replace (:h|V) and the measure P it induces on V by a measure P on M. For each measurable subset E of M, P({1}:|E) is an idempotent (H:|H) projecting onto the subspace of H consisting of states for which the particle is in E; for two such sub-sets, P({1}:|E)·P({1}:|F) = P({1}:|E∩F), the corresponding projector for the intersection of E and F. For any function (V:Q|M) from M to a linear space V, we can use this measure to obtain P(Q) linear (V⊗H:|H) which we can contract with the unit-trace hermitian U encoding a state to obtain the expected value of Q in that state; in particular, as before, trace(P({1}:|E)·U) gives the probability of observing, when previously in a state encoded by U, the particle's position to be in E.
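A finite toy model shows how little of this depends on the points being vectors. The sketch assumes, purely for illustration, that M is replaced by four labelled points (strings, on which no arithmetic is possible), that H has one position eigenvector per point, and that every subset is measurable; a genuine smooth manifold would need proper measure theory. The names `projector`, `state_operator` and `bearing` are hypothetical.

```python
# Assumed stand-in for M: four labelled points with no arithmetic on them.
POINTS = ("north", "east", "south", "west")
N = len(POINTS)

def projector(subset):
    """P({1}:|E): projects onto the states whose particle is in `subset`."""
    return [[1 if i == j and POINTS[i] in subset else 0
             for j in range(N)] for i in range(N)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def trace(a):
    return sum(a[i][i] for i in range(N))

def state_operator(amplitudes):
    """Unit-trace hermitian U = ket x bra built from position amplitudes."""
    ket = [complex(amplitudes.get(m, 0)) for m in POINTS]
    return [[ket[i] * ket[j].conjugate() for j in range(N)] for i in range(N)]

# A state spread over two of the points: 0.36 + 0.64 = 1.
U = state_operator({"north": 0.6, "east": 0.8j})

def probability(subset):
    """trace(P({1}:|E).U): chance of finding the particle in `subset`."""
    return trace(matmul(projector(subset), U)).real

def expectation(f):
    """Integrate a real-valued f over M against the measure, contracted with U."""
    return sum(f(m) * probability({m}) for m in POINTS)

def bearing(m):
    """Hypothetical chart on {north, east}, extended by zero elsewhere."""
    return {"north": 0.0, "east": 90.0}.get(m, 0.0)

E, F = {"north", "east"}, {"east", "south"}
print(matmul(projector(E), projector(F)) == projector(E & F))
print(probability(E), probability({"south", "west"}), expectation(bearing))
```

The projectors multiply as intersections, the whole of M gives the identity and the empty set gives zero; only `expectation` needs the values of f to live in a linear space, and the points themselves never do.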
A chart is a function from a neighbourhood in M to a vector space; we can
extend it to a mapping from the whole of M to that vector space, e.g. by
mapping all of M outside the neighbourhood to the vector space's origin; as
long as the result is reasonably well-behaved (it need not be continuous), the
result shall be integrable, yielding a vector representation of position in
M, albeit one only even remotely meaningful when the system is in a state
with negligible probability of the particle lying outside the neighbourhood
covered by the chart. This is the position that orthodoxy has always
effectively used.
Next we must, naturally, consider the momentum observable. Here, again, we have problems on a smooth manifold: although momentum is a vector quantity, the quantity in question is a tangent vector to our manifold, so has meaning only at each point; there is no intrinsic way to compare, much less combine (e.g. average), its values at different positions. We can, of course, use a chart to reduce positions to vectors and, thus, momenta likewise; this is inevitably what orthodoxy does, so we need to match orthodoxy up to this point and then see what we can have, in the manifold's terms, that looks enough like it to be meaningful.
Momentum is, in quantum mechanics, identified with a wave co-vector that
corresponds to a gradient of phase, hence also identified with the
differential operator (specifically, among all the possible differential
operators on M, the one that annihilates the space-time metric), D. There is
no intrinsic M-ness in U, to which to apply D, but the typical observation is
now of form trace(P(something)·U), where the something is at least roughly of
form sum(: x(i)×b(i)×p(i) ←i :) for some mutually dual pair
of bases (H:b:) and (dual(H):p:), with each x(i) vectorial in some sense,
e.g. a mapping from M to some fixed linear space, which may contain M-ness on
which D can act; and this may couple, via our bases, with U's decomposition
via these bases. Archetypically, we use a basis of position eigenvectors, so
that i ranges over M, imposing an M-ness on U and giving it the form U =
integral(: s(m).b(m)×p(m).*(s(m)) ←m |M); then the momentum
operator maps this to (an imaginary scalar multiple of) the trace of
D(u)×g(u) = integral(: D(s(m))×b(m)×p(m).*(s(m)) ←m |M),
so we need P(something) to act on U as something that looks like this.