There are close parallels between the mathematical expressions for the thermodynamic entropy, usually denoted by S, of a physical system in the statistical thermodynamics established by Ludwig Boltzmann and J. Willard Gibbs in the 1870s, and the information-theoretic entropy, usually expressed as H, of Claude Shannon and Ralph Hartley, developed in the 1940s.
This article explores what links there are between the two concepts, and how far they can be regarded as connected.
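The defining expression for entropy in the theory of information established by Claude E. Shannon in 1948 is of the form:
$$H = -\sum_i p_i \log_b p_i,$$
where p_i is the probability of message i and b is the base of the logarithm.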
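If all the microstates are equiprobable (a microcanonical ensemble), the statistical thermodynamic entropy (the Gibbs entropy, $S = -k \sum_i p_i \ln p_i$) reduces to the form on Boltzmann's tombstone,
$$S = k \ln W,$$
where W is the number of microstates.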
If all the messages are equiprobable, the information entropy reduces to the Hartley entropy
$$H = \log_b |M|,$$
where |M| is the number of possible messages.
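As a minimal sketch of this reduction, the following Python computes the Shannon entropy -sum(p_i log_b p_i) of a discrete distribution and compares the equiprobable case with a skewed one. The function name shannon_entropy and both example distributions are illustrative assumptions, not taken from the article.

```python
import math

def shannon_entropy(probs, base=2.0):
    """Shannon entropy -sum(p * log_b(p)); terms with p == 0 contribute 0."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Equiprobable case: 8 messages, each with probability 1/8.
uniform = [1 / 8] * 8
h_uniform = shannon_entropy(uniform)   # equals the Hartley entropy log2(8)

# A non-uniform distribution over the same 8 messages has lower entropy.
skewed = [0.5, 0.25, 0.125, 0.0625, 0.015625, 0.015625, 0.015625, 0.015625]
h_skewed = shannon_entropy(skewed)
```

With base 2 the uniform case gives log2(8) = 3 bits, while the skewed distribution gives 2 bits, illustrating that the equiprobable distribution maximises the entropy for a fixed number of messages.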
The logarithm in the thermodynamic definition is the natural logarithm. It can be shown that the Gibbs entropy formula, with the natural logarithm, reproduces all of the properties of the macroscopic classical thermodynamics of Clausius. (See article: entropy (statistical views)).
The logarithm can also be taken to the natural base in the case of information entropy. This is equivalent to choosing to measure information in nats. One nat is about 1.44 bits.
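The change of units can be checked numerically. The sketch below, with hypothetical helper names and an arbitrary example distribution, computes the same entropy in nats and in bits and confirms that the two differ by the factor 1/ln 2.

```python
import math

NATS_TO_BITS = 1 / math.log(2)   # one nat expressed in bits, about 1.4427

def entropy_nats(probs):
    """Entropy using the natural logarithm (units: nats)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def entropy_bits(probs):
    """Entropy using the base-2 logarithm (units: bits)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

probs = [0.5, 0.3, 0.2]          # arbitrary example distribution
h_nats = entropy_nats(probs)
h_bits = entropy_bits(probs)

# Converting the natural-log result to bits recovers the base-2 result.
converted = h_nats * NATS_TO_BITS
```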
The presence of Boltzmann's constant k in the thermodynamic definitions is a historical accident, reflecting the conventional units of temperature. It is there to make sure that the statistical definition of thermodynamic entropy matches the classical entropy of Clausius, thermodynamically conjugate to temperature, so that one can write the first law of thermodynamics as
$$dU = T\,dS - p\,dV.$$
The most obvious extension of the Shannon entropy is the differential entropy,
$$h[f] = -\int f(x) \log f(x)\, dx.$$
But it turns out that this is not in general a good measure of uncertainty or information. For example, the differential entropy can be negative; also it is not invariant under continuous co-ordinate transformations.
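Both defects can be seen in a short worked example. The following assumes a uniform density on an interval [0, a] and the natural logarithm, with h[f] = -∫ f(x) ln f(x) dx denoting the differential entropy; the choice of density is purely illustrative.

```latex
% Uniform density on [0, a]: f(x) = 1/a for 0 <= x <= a
h[f] = -\int_0^a \frac{1}{a} \ln\frac{1}{a}\, dx = \ln a
% For a = 1/2 this gives h[f] = -\ln 2 < 0.
% Rescaling y = c x (c > 0) gives a uniform density on [0, ca]:
h[f_Y] = \ln(ca) = h[f] + \ln c
```

The value is negative whenever a < 1, and a mere change of units (the factor c) shifts it by ln c.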
More useful for the continuous case is the relative entropy of a distribution, defined as the Kullback–Leibler divergence from the distribution to a reference measure m(x),
$$D_{\mathrm{KL}}(p \,\|\, m) = \int p(x) \log \frac{p(x)}{m(x)}\, dx.$$
The relative entropy carries over directly from discrete to continuous distributions, and is invariant under co-ordinate reparameterisations. For an application of relative entropy in a quantum information theory setting, see e.g. *.
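The invariance can be sketched in a few lines, assuming a smooth invertible change of variables y = g(x) and writing p_X, m_X for the density and the reference measure in the original co-ordinates (notation introduced here for illustration).

```latex
% Relative entropy of p with respect to a reference measure m:
D(p \,\|\, m) = \int p_X(x) \ln\frac{p_X(x)}{m_X(x)}\, dx
% Change variables y = g(x), with g smooth and invertible:
p_Y(y) = p_X(x)\left|\frac{dx}{dy}\right|, \qquad
m_Y(y) = m_X(x)\left|\frac{dx}{dy}\right|
% The Jacobian cancels in the ratio, and p_Y(y)\,dy = p_X(x)\,dx, so
\int p_Y(y) \ln\frac{p_Y(y)}{m_Y(y)}\, dy
  = \int p_X(x) \ln\frac{p_X(x)}{m_X(x)}\, dx
```

The differential entropy lacks the reference measure in the denominator, which is why its Jacobian factor does not cancel.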
Despite all that, there is an important difference between the two quantities. The information entropy H can be calculated for any probability distribution (if the "message" is taken to be that the event i, which had probability p_i, occurred, out of the space of the events possible). But the thermodynamic entropy S refers specifically to thermodynamic probabilities p_i.
Furthermore, the thermodynamic entropy S is dominated by the different arrangements of the system, and in particular its energy, that are possible on a molecular scale. In comparison, the information entropy of any macroscopic events is so small as to be completely irrelevant.
However, a connection can be made between the two, if the probabilities in question are the thermodynamic probabilities p_i: the (reduced) Gibbs entropy σ can then be seen as simply the amount of Shannon information needed to define the detailed microscopic state of the system, given its macroscopic description. Or, in the words of G. N. Lewis writing about chemical entropy in 1930, "Gain in entropy always means loss of information, and nothing more".
Furthermore, the prescription to find the equilibrium distributions of statistical mechanics, such as the Boltzmann distribution, by maximising the Gibbs entropy subject to appropriate constraints (the Gibbs algorithm), can now be seen as something not unique to thermodynamics, but as a principle of general relevance in all sorts of statistical inference, whenever it is desired to find a maximally uninformative probability distribution, subject to certain constraints on the behaviour of its averages. (These perspectives are explored further in the article Maximum entropy thermodynamics).
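A small numerical sketch can illustrate the Gibbs algorithm. It assumes a hypothetical three-level system and a hypothetical constraint on the mean energy, finds the Boltzmann distribution that meets the constraint by bisection on the inverse temperature beta, and compares its entropy with that of another distribution satisfying the same constraint. All names and numbers are illustrative assumptions.

```python
import math

def boltzmann(energies, beta):
    """Boltzmann distribution with p_i proportional to exp(-beta * E_i)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)                      # partition function
    return [w / z for w in weights]

def mean_energy(energies, probs):
    return sum(e * p for e, p in zip(energies, probs))

def entropy_nats(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

def solve_beta(energies, target, lo=-50.0, hi=50.0, iters=200):
    """Bisection on beta; the mean energy decreases monotonically with beta."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mean_energy(energies, boltzmann(energies, mid)) > target:
            lo = mid      # mean too high, so a larger beta is needed
        else:
            hi = mid
    return (lo + hi) / 2

energies = [0.0, 1.0, 2.0]     # hypothetical three-level system
target = 0.6                   # hypothetical constraint on the mean energy

beta = solve_beta(energies, target)
p_maxent = boltzmann(energies, beta)

# Another distribution meeting the same constraints (sums to 1, mean 0.6).
p_other = [0.7, 0.0, 0.3]

s_maxent = entropy_nats(p_maxent)
s_other = entropy_nats(p_other)
```

Here s_maxent exceeds s_other: among distributions with the prescribed mean energy, the Boltzmann form is the least informative one.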
A neat physical thought-experiment demonstrating how just the possession of information might in principle have thermodynamic consequences was established in 1929 by Szilard, in a refinement of the famous Maxwell's demon scenario.
Consider Maxwell's set-up, but with only a single gas particle in a box. If the supernatural demon knows which half of the box the particle is in, it can close a shutter between the two halves of the box, close a piston unopposed into the empty half of the box, and then extract kT ln 2 joules of useful work if the shutter is opened again. The particle can then be left to isothermally expand back to its original equilibrium occupied volume. In just the right circumstances therefore, the possession of a single bit of Shannon information (a single bit of negentropy in Brillouin's term) really does correspond to a reduction in physical entropy, which theoretically can indeed be parlayed into useful physical work.
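The amount of work available from the isothermal expansion can be worked out as follows, assuming the single particle behaves as a one-particle ideal gas (pV = kT) in contact with a heat bath at temperature T, and expands from half the box volume to the full volume V.

```latex
% One particle treated as an ideal gas at temperature T: p V = k T
W = \int_{V/2}^{V} p \, dV' = \int_{V/2}^{V} \frac{kT}{V'} \, dV' = kT \ln 2
% Corresponding entropy change of the particle
\Delta S = \frac{W}{T} = k \ln 2
```

The entropy change k ln 2 is exactly one bit expressed in thermodynamic units.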
In fact one can generalise: any information that has a physical representation must somehow be embedded in the statistical mechanical degrees of freedom of a physical system.
Thus, Rolf Landauer argued in 1961, if one were to imagine starting with those degrees of freedom in a thermalised state, there would be a real reduction in thermodynamic entropy if they were then re-set to a known state. This can only be achieved under information-preserving microscopically deterministic dynamics if the uncertainty is somehow dumped somewhere else, i.e. if the entropy of the environment (or the non information-bearing degrees of freedom) is increased by at least an equivalent amount, as required by the Second Law, by gaining an appropriate quantity of heat: specifically kT ln 2 of heat for every 1 bit of randomness erased.
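To give a sense of scale, the sketch below evaluates kT ln 2 numerically. The temperature of 300 K and the gigabyte example are assumptions chosen for illustration, and the function name is hypothetical.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K (exact SI value)

def landauer_limit_joules(bits, temperature_kelvin):
    """Minimum heat, k T ln 2 per bit, released when erasing `bits` bits."""
    return bits * K_B * temperature_kelvin * math.log(2)

per_bit = landauer_limit_joules(1, 300.0)          # roughly 2.87e-21 J
per_gigabyte = landauer_limit_joules(8e9, 300.0)   # 10^9 bytes of 8 bits each
```

Even for a gigabyte the bound is of order 1e-11 J, many orders of magnitude below what practical hardware dissipates, which is why the limit is a matter of principle rather than of present-day engineering.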
On the other hand, Landauer argued, there is no thermodynamic objection to a logically reversible operation potentially being achieved in a physically reversible way in the system. It is only logically irreversible operations — for example, the erasing of a bit to a known state, or the merging of two computation paths — which must be accompanied by a corresponding entropy increase.
Applied to the Maxwell's demon/Szilard engine scenario, this suggests that it might be possible to "read" the state of the particle into a computing apparatus with no entropy cost; but only if the apparatus has already been SET into a known state, rather than being in a thermalised state of uncertainty. To SET (or RESET) the apparatus into this state will cost all the entropy that can be saved by knowing the state of Szilard's particle.
The relation between information entropy and thermodynamic entropy has become common currency in physics. Thus Stephen Hawking often speaks of the thermodynamic entropy of black holes in terms of their information content; and it is not surprising that computers must obey the same physical laws that steam engines do, even though they are radically different devices.
Whether black holes destroy information is a question on which Hawking famously conceded a bet. See Black hole entropy and Black hole information paradox.
Hirschman showed in 1957 that Heisenberg's uncertainty principle can be expressed as a particular lower bound on the sum of the entropies of the observable probability distributions of a particle's position and momentum, when they are expressed in Planck units. (One could speak of the "joint entropy" of these distributions by considering them independent, but since they are not jointly observable, they cannot be considered as a joint distribution.)
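The text above does not state the bound explicitly. In the sharp form usually quoted, with units in which ħ = 1 and entropies in nats, it reads h(x) + h(p) ≥ ln(eπ). As an illustrative check under those assumptions, a Gaussian wave packet of minimum uncertainty attains this value:

```latex
% Gaussian wave packet, units with \hbar = 1, so \sigma_x \sigma_p = 1/2
h(x) = \tfrac{1}{2}\ln(2\pi e\,\sigma_x^2), \qquad
h(p) = \tfrac{1}{2}\ln(2\pi e\,\sigma_p^2)
% Adding the two differential entropies:
h(x) + h(p) = \ln(2\pi e\,\sigma_x\sigma_p) = \ln(e\pi)
```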
The fluctuation theorem provides a mathematical justification of the second law of thermodynamics under these principles, and precisely defines the limitations of the applicability of that law to the microscopic realm of individual particle movements.
The article "Conceptual inadequacy of the Shannon information in quantum measurement", published in 2001 by Anton Zeilinger [http://www.quantum.univie.ac.at/zeilinger/] and Caslav Brukner, synthesized and developed these remarks. The so-called Zeilinger's principle suggests that the quantization observed in QM could be bound to information quantization (one cannot observe less than one bit, and what is not observed is by definition "random").
But these claims remain highly controversial. For a detailed discussion of the applicability of the Shannon information in quantum mechanics and an argument that Zeilinger's principle cannot explain quantization, see Timpson 2003 * and Mana 2004 [http://arxiv.org/abs/quant-ph/0302049].
For a tutorial on quantum information see *.
This article is licensed under the GNU Free Documentation License.
It uses material from the article "Entropy in thermodynamics and information theory".