Maximum entropy thermodynamics

Last updated

In physics, maximum entropy thermodynamics (colloquially, MaxEnt thermodynamics) views equilibrium thermodynamics and statistical mechanics as inference processes. More specifically, MaxEnt applies inference techniques rooted in Shannon information theory, Bayesian probability, and the principle of maximum entropy. These techniques are relevant to any situation requiring prediction from incomplete or insufficient data (e.g., image reconstruction, signal processing, spectral analysis, and inverse problems). MaxEnt thermodynamics began with two papers by Edwin T. Jaynes published in the 1957 Physical Review. [1] [2]

Contents

Maximum Shannon entropy

Central to the MaxEnt thesis is the principle of maximum entropy. It demands as given some partly specified model and some specified data related to the model. It selects a preferred probability distribution to represent the model. The given data state "testable information" [3] [4] about the probability distribution, for example particular expectation values, but are not in themselves sufficient to uniquely determine it. The principle states that one should prefer the distribution which maximizes the Shannon information entropy,

This is known as the Gibbs algorithm, having been introduced by J. Willard Gibbs in 1878, to set up statistical ensembles to predict the properties of thermodynamic systems at equilibrium. It is the cornerstone of the statistical mechanical analysis of the thermodynamic properties of equilibrium systems (see partition function).

A direct connection is thus made between the equilibrium thermodynamic entropy STh, a state function of pressure, volume, temperature, etc., and the information entropy for the predicted distribution with maximum uncertainty conditioned only on the expectation values of those variables:

kB, the Boltzmann constant, has no fundamental physical significance here, but is necessary to retain consistency with the previous historical definition of entropy by Clausius (1865) (see Boltzmann constant ).

However, the MaxEnt school argue that the MaxEnt approach is a general technique of statistical inference, with applications far beyond this. It can therefore also be used to predict a distribution for "trajectories" Γ "over a period of time" by maximising:

This "information entropy" does not necessarily have a simple correspondence with thermodynamic entropy. But it can be used to predict features of nonequilibrium thermodynamic systems as they evolve over time.

For non-equilibrium scenarios, in an approximation that assumes local thermodynamic equilibrium, with the maximum entropy approach, the Onsager reciprocal relations and the Green–Kubo relations fall out directly. The approach also creates a theoretical framework for the study of some very special cases of far-from-equilibrium scenarios, making the derivation of the entropy production fluctuation theorem straightforward. For non-equilibrium processes, as is so for macroscopic descriptions, a general definition of entropy for microscopic statistical mechanical accounts is also lacking.

Technical note: For the reasons discussed in the article differential entropy, the simple definition of Shannon entropy ceases to be directly applicable for random variables with continuous probability distribution functions. Instead the appropriate quantity to maximize is the "relative information entropy",

Hc is the negative of the Kullback–Leibler divergence, or discrimination information, of m(x) from p(x), where m(x) is a prior invariant measure for the variable(s). The relative entropy Hc is always less than zero, and can be thought of as (the negative of) the number of bits of uncertainty lost by fixing on p(x) rather than m(x). Unlike the Shannon entropy, the relative entropy Hc has the advantage of remaining finite and well-defined for continuous x, and invariant under 1-to-1 coordinate transformations. The two expressions coincide for discrete probability distributions, if one can make the assumption that m(xi) is uniform – i.e. the principle of equal a-priori probability, which underlies statistical thermodynamics.

Philosophical implications

Adherents to the MaxEnt viewpoint take a clear position on some of the conceptual/philosophical questions in thermodynamics. This position is sketched below.

The nature of the probabilities in statistical mechanics

Jaynes (1985, [5] 2003, [6] et passim) discussed the concept of probability. According to the MaxEnt viewpoint, the probabilities in statistical mechanics are determined jointly by two factors: by respectively specified particular models for the underlying state space (e.g. Liouvillian phase space); and by respectively specified particular partial descriptions of the system (the macroscopic description of the system used to constrain the MaxEnt probability assignment). The probabilities are objective in the sense that, given these inputs, a uniquely defined probability distribution will result, the same for every rational investigator, independent of the subjectivity or arbitrary opinion of particular persons. The probabilities are epistemic in the sense that they are defined in terms of specified data and derived from those data by definite and objective rules of inference, the same for every rational investigator. [7] Here the word epistemic, which refers to objective and impersonal scientific knowledge, the same for every rational investigator, is used in the sense that contrasts it with opiniative, which refers to the subjective or arbitrary beliefs of particular persons; this contrast was used by Plato and Aristotle, and stands reliable today.

Jaynes also used the word 'subjective' in this context because others have used it in this context. He accepted that in a sense, a state of knowledge has a subjective aspect, simply because it refers to thought, which is a mental process. But he emphasized that the principle of maximum entropy refers only to thought which is rational and objective, independent of the personality of the thinker. In general, from a philosophical viewpoint, the words 'subjective' and 'objective' are not contradictory; often an entity has both subjective and objective aspects. Jaynes explicitly rejected the criticism of some writers that, just because one can say that thought has a subjective aspect, thought is automatically non-objective. He explicitly rejected subjectivity as a basis for scientific reasoning, the epistemology of science; he required that scientific reasoning have a fully and strictly objective basis. [8] Nevertheless, critics continue to attack Jaynes, alleging that his ideas are "subjective". One writer even goes so far as to label Jaynes' approach as "ultrasubjectivist", [9] and to mention "the panic that the term subjectivism created amongst physicists". [10]

The probabilities represent both the degree of knowledge and lack of information in the data and the model used in the analyst's macroscopic description of the system, and also what those data say about the nature of the underlying reality.

The fitness of the probabilities depends on whether the constraints of the specified macroscopic model are a sufficiently accurate and/or complete description of the system to capture all of the experimentally reproducible behavior. This cannot be guaranteed, a priori. For this reason MaxEnt proponents also call the method predictive statistical mechanics. The predictions can fail. But if they do, this is informative, because it signals the presence of new constraints needed to capture reproducible behavior in the system, which had not been taken into account.

Is entropy "real"?

The thermodynamic entropy (at equilibrium) is a function of the state variables of the model description. It is therefore as "real" as the other variables in the model description. If the model constraints in the probability assignment are a "good" description, containing all the information needed to predict reproducible experimental results, then that includes all of the results one could predict using the formulae involving entropy from classical thermodynamics. To that extent, the MaxEnt STh is as "real" as the entropy in classical thermodynamics.

Of course, in reality there is only one real state of the system. The entropy is not a direct function of that state. It is a function of the real state only through the (subjectively chosen) macroscopic model description.

Is ergodic theory relevant?

The Gibbsian ensemble idealizes the notion of repeating an experiment again and again on different systems, not again and again on the same system. So long-term time averages and the ergodic hypothesis, despite the intense interest in them in the first part of the twentieth century, strictly speaking are not relevant to the probability assignment for the state one might find the system in.

However, this changes if there is additional knowledge that the system is being prepared in a particular way some time before the measurement. One must then consider whether this gives further information which is still relevant at the time of measurement. The question of how 'rapidly mixing' different properties of the system are then becomes very much of interest. Information about some degrees of freedom of the combined system may become unusable very quickly; information about other properties of the system may go on being relevant for a considerable time.

If nothing else, the medium and long-run time correlation properties of the system are interesting subjects for experimentation in themselves. Failure to accurately predict them is a good indicator that relevant macroscopically determinable physics may be missing from the model.

The second law

According to Liouville's theorem for Hamiltonian dynamics, the hyper-volume of a cloud of points in phase space remains constant as the system evolves. Therefore, the information entropy must also remain constant, if we condition on the original information, and then follow each of those microstates forward in time:

However, as time evolves, that initial information we had becomes less directly accessible. Instead of being easily summarizable in the macroscopic description of the system, it increasingly relates to very subtle correlations between the positions and momenta of individual molecules. (Compare to Boltzmann's H-theorem.) Equivalently, it means that the probability distribution for the whole system, in 6N-dimensional phase space, becomes increasingly irregular, spreading out into long thin fingers rather than the initial tightly defined volume of possibilities.

Classical thermodynamics is built on the assumption that entropy is a state function of the macroscopic variables—i.e., that none of the history of the system matters, so that it can all be ignored.

The extended, wispy, evolved probability distribution, which still has the initial Shannon entropy STh(1), should reproduce the expectation values of the observed macroscopic variables at time t2. However it will no longer necessarily be a maximum entropy distribution for that new macroscopic description. On the other hand, the new thermodynamic entropy STh(2) assuredly will measure the maximum entropy distribution, by construction. Therefore, we expect:

At an abstract level, this result implies that some of the information we originally had about the system has become "no longer useful" at a macroscopic level. At the level of the 6N-dimensional probability distribution, this result represents coarse graining—i.e., information loss by smoothing out very fine-scale detail.

Caveats with the argument

Some caveats should be considered with the above.

1. Like all statistical mechanical results according to the MaxEnt school, this increase in thermodynamic entropy is only a prediction. It assumes in particular that the initial macroscopic description contains all of the information relevant to predicting the later macroscopic state. This may not be the case, for example if the initial description fails to reflect some aspect of the preparation of the system which later becomes relevant. In that case the "failure" of a MaxEnt prediction tells us that there is something more which is relevant that we may have overlooked in the physics of the system.

It is also sometimes suggested that quantum measurement, especially in the decoherence interpretation, may give an apparently unexpected reduction in entropy per this argument, as it appears to involve macroscopic information becoming available which was previously inaccessible. (However, the entropy accounting of quantum measurement is tricky, because to get full decoherence one may be assuming an infinite environment, with an infinite entropy).

2. The argument so far has glossed over the question of fluctuations. It has also implicitly assumed that the uncertainty predicted at time t1 for the variables at time t2 will be much smaller than the measurement error. But if the measurements do meaningfully update our knowledge of the system, our uncertainty as to its state is reduced, giving a new SI(2) which is less than SI(1). (Note that if we allow ourselves the abilities of Laplace's demon, the consequences of this new information can also be mapped backwards, so our uncertainty about the dynamical state at time t1 is now also reduced from SI(1) to SI(2)).

We know that STh(2) > SI(2); but we can now no longer be certain that it is greater than STh(1) = SI(1). This then leaves open the possibility for fluctuations in STh. The thermodynamic entropy may go "down" as well as up. A more sophisticated analysis is given by the entropy Fluctuation Theorem, which can be established as a consequence of the time-dependent MaxEnt picture.

3. As just indicated, the MaxEnt inference runs equally well in reverse. So given a particular final state, we can ask, what can we "retrodict" to improve our knowledge about earlier states? However the Second Law argument above also runs in reverse: given macroscopic information at time t2, we should expect it too to become less useful. The two procedures are time-symmetric. But now the information will become less and less useful at earlier and earlier times. (Compare with Loschmidt's paradox.) The MaxEnt inference would predict that the most probable origin of a currently low-entropy state would be as a spontaneous fluctuation from an earlier high entropy state. But this conflicts with what we know to have happened, namely that entropy has been increasing steadily, even back in the past.

The MaxEnt proponents' response to this would be that such a systematic failing in the prediction of a MaxEnt inference is a "good" thing. [11] It means that there is thus clear evidence that some important physical information has been missed in the specification the problem. If it is correct that the dynamics "are" time-symmetric, it appears that we need to put in by hand a prior probability that initial configurations with a low thermodynamic entropy are more likely than initial configurations with a high thermodynamic entropy. This cannot be explained by the immediate dynamics. Quite possibly, it arises as a reflection of the evident time-asymmetric evolution of the universe on a cosmological scale (see arrow of time).

Criticisms

The Maximum Entropy thermodynamics has some important opposition, in part because of the relative paucity of published results from the MaxEnt school, especially with regard to new testable predictions far-from-equilibrium. [12]

The theory has also been criticized in the grounds of internal consistency. For instance, Radu Balescu provides a strong criticism of the MaxEnt School and of Jaynes' work. Balescu states that Jaynes' and coworkers theory is based on a non-transitive evolution law that produces ambiguous results. Although some difficulties of the theory can be cured, the theory "lacks a solid foundation" and "has not led to any new concrete result". [13]

Though the maximum entropy approach is based directly on informational entropy, it is applicable to physics only when there is a clear physical definition of entropy. There is no clear unique general physical definition of entropy for non-equilibrium systems, which are general physical systems considered during a process rather than thermodynamic systems in their own internal states of thermodynamic equilibrium. [14] It follows that the maximum entropy approach will not be applicable to non-equilibrium systems until there is found a clear physical definition of entropy. This problem is related to the fact that heat may be transferred from a hotter to a colder physical system even when local thermodynamic equilibrium does not hold so that neither system has a well defined temperature. Classical entropy is defined for a system in its own internal state of thermodynamic equilibrium, which is defined by state variables, with no non-zero fluxes, so that flux variables do not appear as state variables. But for a strongly non-equilibrium system, during a process, the state variables must include non-zero flux variables. Classical physical definitions of entropy do not cover this case, especially when the fluxes are large enough to destroy local thermodynamic equilibrium. In other words, for entropy for non-equilibrium systems in general, the definition will need at least to involve specification of the process including non-zero fluxes, beyond the classical static thermodynamic state variables. The 'entropy' that is maximized needs to be defined suitably for the problem at hand. If an inappropriate 'entropy' is maximized, a wrong result is likely. In principle, maximum entropy thermodynamics does not refer narrowly and only to classical thermodynamic entropy. It is about informational entropy applied to physics, explicitly depending on the data used to formulate the problem at hand. According to Attard, for physical problems analyzed by strongly non-equilibrium thermodynamics, several physically distinct kinds of entropy need to be considered, including what he calls second entropy. Attard writes: "Maximizing the second entropy over the microstates in the given initial macrostate gives the most likely target macrostate.". [15] The physically defined second entropy can also be considered from an informational viewpoint.

See also

Related Research Articles

<span class="mw-page-title-main">Entropy</span> Property of a thermodynamic system

Entropy is a scientific concept, as well as a measurable physical property, that is most commonly associated with a state of disorder, randomness, or uncertainty. The term and the concept are used in diverse fields, from classical thermodynamics, where it was first recognized, to the microscopic description of nature in statistical physics, and to the principles of information theory. It has found far-ranging applications in chemistry and physics, in biological systems and their relation to life, in cosmology, economics, sociology, weather science, climate change, and information systems including the transmission of information in telecommunication.

In physics, statistical mechanics is a mathematical framework that applies statistical methods and probability theory to large assemblies of microscopic entities. It does not assume or postulate any natural laws, but explains the macroscopic behavior of nature from the behavior of such ensembles.

<span class="mw-page-title-main">Thermodynamics</span> Physics of heat, work, and temperature

Thermodynamics is a branch of physics that deals with heat, work, and temperature, and their relation to energy, entropy, and the physical properties of matter and radiation. The behavior of these quantities is governed by the four laws of thermodynamics which convey a quantitative description using measurable macroscopic physical quantities, but may be explained in terms of microscopic constituents by statistical mechanics. Thermodynamics applies to a wide variety of topics in science and engineering, especially physical chemistry, biochemistry, chemical engineering and mechanical engineering, but also in other complex fields such as meteorology.

In physics, specifically statistical mechanics, an ensemble is an idealization consisting of a large number of virtual copies of a system, considered all at once, each of which represents a possible state that the real system might be in. In other words, a statistical ensemble is a set of systems of particles used in statistical mechanics to describe a single system. The concept of an ensemble was introduced by J. Willard Gibbs in 1902.

<span class="mw-page-title-main">Second law of thermodynamics</span> Physical law for entropy and heat

The second law of thermodynamics is a physical law based on universal experience concerning heat and energy interconversions. One simple statement of the law is that heat always moves from hotter objects to colder objects, unless energy in some form is supplied to reverse the direction of heat flow. Another definition is: "Not all heat energy can be converted into work in a cyclic process."

Thermodynamic equilibrium is an axiomatic concept of thermodynamics. It is an internal state of a single thermodynamic system, or a relation between several thermodynamic systems connected by more or less permeable or impermeable walls. In thermodynamic equilibrium, there are no net macroscopic flows of matter nor of energy within a system or between systems. In a system that is in its own state of internal thermodynamic equilibrium, no macroscopic change occurs.

The fluctuation theorem (FT), which originated from statistical mechanics, deals with the relative probability that the entropy of a system which is currently away from thermodynamic equilibrium will increase or decrease over a given amount of time. While the second law of thermodynamics predicts that the entropy of an isolated system should tend to increase until it reaches equilibrium, it became apparent after the discovery of statistical mechanics that the second law is only a statistical one, suggesting that there should always be some nonzero probability that the entropy of an isolated system might spontaneously decrease; the fluctuation theorem precisely quantifies this probability.

In classical statistical mechanics, the H-theorem, introduced by Ludwig Boltzmann in 1872, describes the tendency to decrease in the quantity H in a nearly-ideal gas of molecules. As this quantity H was meant to represent the entropy of thermodynamics, the H-theorem was an early demonstration of the power of statistical mechanics as it claimed to derive the second law of thermodynamics—a statement about fundamentally irreversible processes—from reversible microscopic mechanics. It is thought to prove the second law of thermodynamics, albeit under the assumption of low-entropy initial conditions.

<span class="mw-page-title-main">Non-equilibrium thermodynamics</span> Branch of thermodynamics

Non-equilibrium thermodynamics is a branch of thermodynamics that deals with physical systems that are not in thermodynamic equilibrium but can be described in terms of macroscopic quantities that represent an extrapolation of the variables used to specify the system in thermodynamic equilibrium. Non-equilibrium thermodynamics is concerned with transport processes and with the rates of chemical reactions.

In physics, Loschmidt's paradox, also known as the reversibility paradox, irreversibility paradox, or Umkehreinwand, is the objection that it should not be possible to deduce an irreversible process from time-symmetric dynamics. This puts the time reversal symmetry of (almost) all known low-level fundamental physical processes at odds with any attempt to infer from them the second law of thermodynamics which describes the behaviour of macroscopic systems. Both of these are well-accepted principles in physics, with sound observational and theoretical support, yet they seem to be in conflict, hence the paradox.

<span class="mw-page-title-main">Laws of thermodynamics</span> Observational basis of thermodynamics

The laws of thermodynamics are a set of scientific laws which define a group of physical quantities, such as temperature, energy, and entropy, that characterize thermodynamic systems in thermodynamic equilibrium. The laws also use various parameters for thermodynamic processes, such as thermodynamic work and heat, and establish relationships between them. They state empirical facts that form a basis of precluding the possibility of certain phenomena, such as perpetual motion. In addition to their use in thermodynamics, they are important fundamental laws of physics in general and are applicable in other natural sciences.

In statistical mechanics, a canonical ensemble is the statistical ensemble that represents the possible states of a mechanical system in thermal equilibrium with a heat bath at a fixed temperature. The system can exchange energy with the heat bath, so that the states of the system will differ in total energy.

In statistical mechanics, the thermodynamic limit or macroscopic limit, of a system is the limit for a large number N of particles where the volume is taken to grow in proportion with the number of particles. The thermodynamic limit is defined as the limit of a system with a large volume, with the particle density held fixed.

<span class="mw-page-title-main">Gibbs algorithm</span> Criterion for choosing a probability distribution in statistical mechanics

In statistical mechanics, the Gibbs algorithm, introduced by J. Willard Gibbs in 1902, is a criterion for choosing a probability distribution for the statistical ensemble of microstates of a thermodynamic system by minimizing the average log probability

<span class="mw-page-title-main">Microstate (statistical mechanics)</span> Specific microscopic configuration of a thermodynamic system

In statistical mechanics, a microstate is a specific configuration of a system that describes the precise positions and momenta of all the individual particles or components that make up the system. Each microstate has a certain probability of occurring during the course of the system's thermal fluctuations.

The mathematical expressions for thermodynamic entropy in the statistical thermodynamics formulation established by Ludwig Boltzmann and J. Willard Gibbs in the 1870s are similar to the information entropy by Claude Shannon and Ralph Hartley, developed in the 1940s.

The concept entropy was first developed by German physicist Rudolf Clausius in the mid-nineteenth century as a thermodynamic property that predicts that certain spontaneous processes are irreversible or impossible. In statistical mechanics, entropy is formulated as a statistical property using probability theory. The statistical entropy perspective was introduced in 1870 by Austrian physicist Ludwig Boltzmann, who established a new field of physics that provided the descriptive linkage between the macroscopic observation of nature and the microscopic view based on the rigorous treatment of large ensembles of microstates that constitute thermodynamic systems.

<span class="mw-page-title-main">Temperature</span> Physical quantity that expresses hot and cold

Temperature is a physical quantity that expresses quantitatively the perceptions of hotness and coldness. Temperature is measured with a thermometer.

Energy dissipation and entropy production extremal principles are ideas developed within non-equilibrium thermodynamics that attempt to predict the likely steady states and dynamical structures that a physical system might show. The search for extremum principles for non-equilibrium thermodynamics follows their successful use in other branches of physics. According to Kondepudi (2008), and to Grandy (2008), there is no general rule that provides an extremum principle that governs the evolution of a far-from-equilibrium system to a steady state. According to Glansdorff and Prigogine, irreversible processes usually are not governed by global extremal principles because description of their evolution requires differential equations which are not self-adjoint, but local extremal principles can be used for local solutions. Lebon Jou and Casas-Vásquez (2008) state that "In non-equilibrium ... it is generally not possible to construct thermodynamic potentials depending on the whole set of variables". Šilhavý (1997) offers the opinion that "... the extremum principles of thermodynamics ... do not have any counterpart for [non-equilibrium] steady states ." It follows that any general extremal principle for a non-equilibrium problem will need to refer in some detail to the constraints that are specific for the structure of the system considered in the problem.

<span class="mw-page-title-main">Thermal fluctuations</span> Random temperature-influenced deviations of particles from their average state

In statistical mechanics, thermal fluctuations are random deviations of an atomic system from its average state, that occur in a system at equilibrium. All thermal fluctuations become larger and more frequent as the temperature increases, and likewise they decrease as temperature approaches absolute zero.

References

  1. Jaynes, E.T. (1957). "Information theory and statistical mechanics" (PDF). Physical Review. 106 (4): 620–630. Bibcode:1957PhRv..106..620J. doi:10.1103/PhysRev.106.620. S2CID   17870175.
  2. (1957). "Information theory and statistical mechanics II" (PDF). Physical Review. 108 (2): 171–190. Bibcode:1957PhRv..108..171J. doi:10.1103/PhysRev.108.171.
  3. Jaynes, E.T. (1968), p. 229.
  4. Jaynes, E.T. (1979), pp. 30, 31, 40.
  5. Jaynes, E.T. (1985).
  6. Jaynes, E.T. (2003).
  7. Jaynes, E.T. (1979), p. 28.
  8. Jaynes, E.T. (1968), p. 228.
  9. Guttmann, Y.M. (1999), pp. 28, 36, 38, 57, 61.
  10. Guttmann, Y.M. (1999), p. 29.
  11. Jaynes, E.T. (1979).
  12. Kleidon, A., Lorenz, R.D. (2005).
  13. Balescu, R. (1997).
  14. Lieb, E.H., Yngvason, J. (2003). The entropy of classical thermodynamics, Chapter 8 of Greven, A., Keller, G., Warnecke (editors) (2003). Entropy, Princeton University Press, Princeton NJ, ISBN   0-691-11338-6, page 190.
  15. Attard, P. (2012). Non-Equilibrium Thermodynamics and Statistical Mechanics: Foundations and Applications, Oxford University Press, Oxford UK, ISBN   978-0-19-966276-0, p. 161.

Bibliography of cited references

Further reading