]>
Statistical physics is the orthodox formalism for describing thermodynamics, the study of heat and temperature. It's statistical because it normally concerns itself with systems with quantities of matter made up of large numbers of particles (atoms, molecules, ions or whatever). This matter may be a plasma, a gas, a liquid or a solid.
For a general quantum system, given a
collection J of observables of that system and the values ({reals}: Val :J)
taken by those observables, there is a family of states of the system which
are most probable
. For any simultaneous eigenbasis of the observables
in J, these most probable states are superpositions of that eigenbasis
differing only in the phases of their components, with respect to that
basis. The amplitudes of their
components can be inferred
from
I shall ignore, for now, the question of whether h (or, failing that, R(Val)) must necessarily be unique.
Macroscopic systems, including nearly everything with which we normally interact, are made up of many particles of comparatively few kinds. For example, a box of air contains a mixture of gasses, roughly four fifths Nitrogen, one fifth Oxygen, with a smattering of impurities (notably the greenhouse gasses carbon dioxide and water vapour). Even fairly rare impurities, in so far as they're detectable, are present in quantities great enough to imply (at least) millions of particles of the given impurity. It thus makes sense to describe our macroscopic system as being composed of many sub-systems, each corresponding to one thing (molecule, atom or other irreducible component) present in the system, and to decompose the analysis in terms of the different kinds of thing present. It is thus convenient to consider, first, the case of a system whose parts are all of the same kind; although each part may be in a different state, each is of the same kind as the others; any state one of them could be in is a state any of the others could be in just as readilly. Such a system is described as homogeneous: all its parts are of the same kind.
If our system is made up of many identical instances of a simple sub-system, we can hope to benefit by describing the system in terms of a factorisation into many identical copies of the sub-system. If the sub-system's possible state vectors are spanned by some collection C of state vectors (e.g. when C is a basis of the states of the sub-system), we should be able to obtain a basis of the system's state space whose state vectors are of form join(f) with f a list of members of C and join being some operator which combines these. In particular, this will also enable us to define linear maps from states of the whole system by specifying their values at each join(C:f|m) with m natural and extending linearly from what this implies.
For observables of our sub-system, such as perhaps energy, we expect the observable of the whole system to simply add up the observables' values in the separate sub-systems. Thus, for observable X, we expect X(join(C:f|N)) to be sum(: σ(f(i),X(f(i))) ←i |N).join(f), with σ the inner product in the sub-system's state space; for a state c of the sub-system, σ(c,X(c)) is X's expected value in state c; which is X's eigenvalue if c is an eigenstate of X.
In particular, it is worthy of note that this is just the behaviour we would get if join were a multiplicative combiner, like the tensor product, with observables behaving as Leibniz operators, so that, for any scalar-valued operator, X(join(C:f|N)) = sum(: join(at(i,X,f)) ←i |N) with at(i,X,f) being the list (C:|N) which has X(f(i)) at position i and f(j) at each position j other than j = i. (For tensor-valued operators (U⊗C:X|C) we need to permute the U-factor in the tensor product join(at(i,X,f)) out to the front of the product, so that all terms have the same tensor rank (or, more particularly, get the X-induced extra rank U in the same position among their ranks), which makes the formula messier to write; but the behaviour works out similar to the scalar case, as if we had simply taken co-ordinates in U as distinct observables and then combined them after applying each.) When each entry in f is an eigenvector of X, this will produce the expected result given above, with join(f) being an eigenvector of X whose eigenvalue is the sum of the eigenvalues of f's entries.
Our sub-systems are all identical, so the order of items in f should have no impact on the physical meaning of join(f); if s is a permutation of (:f|), then join(f&on;s) needs to be join(f) times a phase, otherwise its expected values for assorted observables will be different. Furthermore, that phase must depend only on s, implying that there should be some mapping ({phases}: φ |{permutations}) for which φ(s).join(f&on;s) is independent of (m|s|m), for given (C:f|m). As noted above, we can define linear maps by their actions on join(f) inputs; each (: φ(s).join(f&on;s) ← join(f) :) must then extend linearly to the identity on state vectors.
Thus if we compose φ(s).join(f&on;s) ←join(f) after the same for some other permutation r, we obtain φ(s).φ(r).join(f&on;r&on;s) which must clearly be equal to φ(r&on;s).join(f&on;r&on;s), so we can infer that φ is a group homomorphism from permutations, under composition, to phases, under multiplication.
Now, like join, the bulk action of the tensor product operator, bulk(×, [a, b, …, z]) = a×b×…×z, suffices to construct, out of bases of the spaces whose vectors it combines, a basis of the compound space; in particular, as for join, this enables us to extend linearly from values given for bulk(×, (C:f|N)) to the span of these seed inputs; so we can define a linear map by extension from
Because φ is a group homomorphism, composing any permutation's φ(r).bulk(×, f&on;r) ← bulk(×, f) with the given linear map merely rearranges the terms in the sum, leaving the same linear map. Thus
presents itself as an ideal candidate for join: it's a multiplicative
combinator so, as noted above, it suffices that observables act as Leibniz
operators on the resulting products. Since, in fact, this permuting product
combinator has all the right properties, we can use it as a model of join,
whose sole meaning is within our quantum-mechanical model, so we lose no
generality in actually using it as join. It remains only to determine what
φ should be. As it happens, the fact that it's a group
homomorphism leaves us with only
two possibilities: either φ is the trivial homomorphism (: +1 ←s
:{permutations}) or it is the signature
homomorphism, sign; each
permutation that simply swaps two (distinct) inputs has signature −1 and
all permutations can be expressed as composites of such swaps. These two
possibilities for φ, in turn, generate two possibilities for join; the
first builds the completely symmetrised tensor product; while sign builds the
completely antisymmetrised tensor product.
The symmetric option allows us arbitrary lists (C:f:) as inputs to join; the antisymmetric one, however, will map to zero any (C:f:) whose outputs aren't distinct (from this we may infer that it also maps to zero any (C:f:) whose outputs aren't linearly independent). This in effect means that, if φ is sign, no two sub-systems can be in the same state (indeed, no sub-system may be in a state expressible as a linear superposition of the states of the others); the number of times any given member of C appears in any f which join doesn't map to zero is in {0,1}. As it happens, particles whose spin is an odd multiple of ℏ/2 (half Dirac's constant) obey exactly this rule – no two of them can be in the same state; this is called the Pauli exclusion principle. In memory of Enrico Fermi – who, with Paul Dirac, first studied the thermodynamics of such particles – these are known as fermions. Particles whose spin is a whole multiple of ℏ (so an even multiple of ℏ/2) use the symmetric form of join, allowing arbitrarily many of them to be in any given state; these are called bosons after Satyendra Nath Bose – who, with Albert Einstein, first studied their thermodynamics.
If we consider the identity on span(C) applied as a Leibniz operator on
the span of products of lists with entries in C, we obtain an observable on
these products for which each join(C:f|m) has eigenvalue m; this, then, counts
the number of sub-systems; call this observable N. Likewise, for each state c
of the sub-system, there's a formal observable c×σ(c) which
measures the probability of a sub-system being observed to be in state c; when
applied as a Leibniz operator, n(c), to states of the composite system, this
sums this probability over all sub-systems, yielding the expected number of
sub-systems observed in the given state. If C is a basis that
unit-diagonalises σ and c is in C, then any list (C:f:) has, as its
eigenvalue for n(c), the number of members ({c}:f|) has i.e. the number of
sub-systems in state c when the whole system is in state join(f). Of course,
because we have a quantum-mechanical system, we may have it in a state which
is a super-position of eigenstates of these number operators; this is then apt
to not be an eigenstate of them; which is why n(c) only determines
an expected value
(i.e. average) for each, in the usual way.
The outputs of (: join :{lists (C::)}) span the states of our system. Given the above Leibniz rule reading of the action of observables on these: we can, when C is a basis that unit-diagonalises σ, treat each c in C as being an eigenstate, with eigenvalue 1, of both n(c) and N; and an eigenstate with eigenvalue 0 of all the other n(e) for e in C. Thus, as long as C is an eigenbasis of all the observables in our collection J of observeds, it is indeed consistent to entertain N and the n(c) as members of J. On this same hypothesis, define a mapping F = (: (: σ(c,X(c)) ←c :C) ←X :{observables}), which extracts the eigenvalues for the individual single sub-system states; and, using S for the metric (induced by σ) on the composite space span(: join :{lists (C::)}), define count = (: (: S(join(f), n(c,join(f))) ←c :C) ←f :{lists (C::)}), for which count(f,c) is just the number of members ({c}:f|) has. For fermions, we only consider those f for which ({0,1}: count(f) |C); for bosons, any count(f,c) can take any natural value. Let Q = {0,1} if our sub-systems are fermionic; but Q = {naturals} if they are bosonic.
We have Z = ({observables}: exp(h(J)) ←h :{distributions on J}) and Y = log&on;trace&on;Z; with trace being a summation over all members of some orthonormal basis of the space of states of the compound system. Expressing any distribution h on J in terms of the ({reals}: H :J) mapping for which h = H@sum, summation weighted by H, we have Z(h) = exp(sum(: X.H(X) ←X :J)). This is exp applied to an operator diagonal with respect to our basis {lists (C:f:): (Q: count(f) |C)}; hence Z(h) is likewise diagonal, with each diagonal entry being (the usual scalar) exp applied to the corresponding diagonal entry of log(Z(h)). Thus the diagonal entry of Z(h) for basis member join(f) is
We want the trace of Z(h), which means summing this last over non-zero values for join(f) for lists (C:f:); this is like summing over all lists (C::) except that join treats any two lists with the same entries, merely permuted, as equivalent (so our sum only counts one join(f) for possibly many f). The result (once we discard lists that join maps to zero) is the same as taking a sum in which each count(f,c) is allowed to range independently over the whole of Q. Thus:
Now, for given v, sum(: power(j, v) ←j |Q) is just 1+v for fermions or 1+v+v.v+v.v.v+… = 1/(1−v) for bosons. Either way, the sum's log is ±log(1±v), with ± being + for fermions, − for bosons. Thus we can infer
So now let's look at Y', once more interpreting h via some H for which h = H@sum = (: sum(: f(X).H(X) ←X :J) ←f :), so that
and we're selecting an h which makes this equal Val, i.e. for which, for each X in J,
In particular, we can insert X = n(c) for some c in C; F(n(c),c) = 1 and F(n(c),e) = 0 for all other e in C; so we can infer Val(n(c)) = 1/(exp(−h(F,c))±1) as the expected number of sub-systems in state c. Likewise, each c in C has F(N,c) = 1 so Val(N) = sum(: 1/(exp&on;h(−F)±1) |C). In particular, for each observable X, we obtain
as we might naïvely have expected; each observable's value in our actual state is the sum, over sub-system states, of the observable's expected value for the given state times the number of sub-systems in that state. Now, from h we are to compute R(Val) = Z(h) / trace(Z(h)), which is diagonal in our chosen basis. From our value for Y(h), we can compute
so that the diagonal entry of Z(h)/trace(Z(h)) for basis state join(f) is
and to do much more than this, we need some description of the possible states, and actual observables, of our system. (However, generally one of our observables is energy, E in J, and the corresponding H(E) turns out to be 1/(k.T), where T is the temperature and k is Boltzmann's constant, k = 13.805e-24 Joules per Kelvin.)
In a system composed of a mixture of many kinds of sub-system – e.g. a mixture of many distinct kinds of molecule – we can apply the analysis above to each distinct kind of sub-system and then combine the results. Given that the above reduces description of the particles of each type to describing products, averaged (with suitable sign) over all permutations, of states of one particle, it makes sense to describe a heterogeneous system's state as a (tensor) product of terms, each of which is a state of all the particles of one kind. The various observables are then Leibniz operators, as before; when acting on products of eigenstates, they simply scale the product by the sum of the eigenvalues of the various states making up the product.
For each species, A, of sub-system we get a number operator, N(A), whose expected value (and, in eigenstates, eigenvalue) is the number of sub-systems of that type. These formal observables correspond (via Avogadro's number per mole) to the amount of each substance present, which is commonly an actually observed property of the system, so these number operators are naturally significant. We also get a mapping (: C |{species}), for which each C(A) is a basis of the states of a single sub-system of kind A. As before, for each c in any C(A), we get a number operator n(c) and each F(X) becomes a mapping ({reals}: |union(C)), wherein union(C) is implicitly a disjoint union (since distinct species yield distinct sets of states, hence bases; even if all their other parameters are equal, a in C(A) and e in C(B), with A and B distinct, describe sub-systems of distinct kinds). We can partition {species} into {bosonic species} and {fermionic species} and introduce D = union(:C:{fermionic species}) and E = union(:C:{bosonic species}), again implicitly using disjoint union. This yields:
If every exp(h(F,c)) is small, Y(h) is well approximated by sum(: exp(h(F)) |union(C)), but this seems unlikely to be realistic.
Written by Eddy.