## Module #4 – Information, Entropy, Thermodynamics, and Computing


**Module #4 – Information, Entropy, Thermodynamics, and Computing**

A Grand Synthesis

**Information and Entropy**

A Unified Foundation for both Physics and Computation

**What is information?**

• Information, most generally, is simply that which distinguishes one thing from another.
• It is the identity of a thing itself.
• Part or all of an identification, or a description of the thing.
• But we must take care to distinguish between the following:
• A body of info.: The complete identity of a thing.
• A piece of info.: An incomplete description, part of a thing’s identity.
• A pattern of info.: Many separate pieces of information contained in separate things may have identical patterns, or content.
• When those pieces are perfectly correlated, we may call them copies of each other.
• An amount of info.: A quantification of how large a given body or piece of information is. Measured in logarithmic units.
• A subject of info.: The thing that is identified or described by a given body or piece of information. May be abstract, mathematical, or physical.
• An embodiment of info.: A physical subject of information.
• We can say that a body, piece, or pattern of information is contained or embodied in its embodiment.
• A meaning of info.: A semantic interpretation, tying the pattern to meaningful/useful characteristics or properties of the thing described.
• A representation of info.: An encoding of some information within some other (possibly larger) piece of info. contained in something else.

**Information Concept Map**

[Figure: concept map. Nodes: a thing (which may be a physical thing), a body of information, a piece of information, a pattern of information, an amount of information, and another piece or body of information. Edge labels: "completely describes, is embodied by" (body to thing), "describes, is contained in" (piece to thing), "is a part of" (piece to body), "instance of" (body or piece to pattern), "quantified by" (to amount), "may be represented by" (to another piece or body of information).]

**What is knowledge?**

A physical entity A can be said to know a piece (or body) of information I about a thing T, and that piece is considered part of A’s knowledge K, if and only if:
• A has ready, immediate access to a physical system S that contains some physical information P which A can observe and that includes a representation of I.
• E.g., S may be part of A’s brain, wallet card, or laptop.
• A can readily and immediately decode P’s representation of I and manipulate it into an explicit form.
• A understands the meaning of I, i.e., how it relates to meaningful properties of T, and can apply I purposefully.

**Physical Information**

• Physical information is simply information that is contained in a physical system.
• We may speak of a body, piece, pattern, amount, subject, embodiment, meaning, or representation of physical information, as with information in general.
• Note that all information that we can manipulate ultimately must be (or be represented by) physical information!
• In our quantum-mechanical universe, there are two very different categories of physical information:
• Quantum information is all info. embodied in the quantum state of a physical system. It can’t all be measured or copied!
• Classical information is just a piece of info. that picks out a particular basis state, once a basis is already given.

**Amount of Information**

• An amount of information can be conveniently quantified as a logarithmic quantity.
• This measures the number of independent, fixed-capacity physical systems needed to encode the information.
• Logarithmically defined values are inherently dimensional (not dimensionless, i.e., pure-number) quantities.
• The pure-number result must be paired with a unit that is associated with the base of the logarithm that was used: log a = (log_b a) log-b-units = (log_c a) log-c-units, so that log-c-unit / log-b-unit = log_b c.
• The log-2-unit is called the bit, the log-10-unit the decade, the log-16-unit the nibble, and the log-256-unit the byte.
• The log-e-unit (widely used in physics) is called the nat.
• The nat is also known as Boltzmann’s constant k_B (e.g., in Joules/K),
• a.k.a. the ideal gas constant R (may be expressed in kcal/mol/K).

**Forms of Information**

• Many alternative mathematical forms may be used to represent patterns of information about a thing.
• Some important examples we will visit:
• A string of text describing some or all properties or characteristics possessed by the thing.
• A set or ensemble of alternative possible states (or consistent, complete descriptions) of the thing.
• A probability distribution or probability density function over a set of possible states of the thing.
• A quantum state vector, i.e., a wavefunction giving a complex-valued amplitude for each possible quantum state of the thing.
• A mixed state (a probability distribution over orthogonal states).
• (Some string theorists suggest octonions may be needed!)

**Confusing Terminology Alert**

• Be aware that in the following discussion I will often shift around quickly, as needed, between the following related concepts:
• A subsystem B of a given abstract system A.
• A state space S of all possible states of B.
• A state variable X (a statistical “random variable”) of A representing the state of subsystem B within its state space S.
• A set T ⊆ S of some of the possible states of B.
• A statistical “event” E that the subsystem state is one of those in the set T.
• A specific state s ∈ S of the subsystem.
• A value x = s of the random variable X indicating that the specific state is s.

**Preview: Some Symbology**

• U: Unknown information
• K: Known information
• I: total Information (of any kind)
• N: iNcompressible and/or (Non-)uNcomputable iNformation
• S: physical Entropy

**Unknown Info. Content of a Set**

• A.k.a. amount of unknown information content.
• The amount of information required to specify or pick out an element of the set, assuming that its members are all equally likely to be selected.
• An assumption we will see how to justify later.
• The unknown information content U(S) associated with a set S is defined as U(S) :≡ log |S|.
• Since U(S) is defined logarithmically, it always comes with attached logarithmic units such as bits, nats, decades, etc.
• E.g., the set {a, b, c, d} has an unknown information content of 2 bits.

**Probability and Improbability**

• I assume you already know a bit about probability theory!
• Given any probability ℘ ∈ (0,1], the associated improbability ℑ(℘) is defined as 1/℘.
• There is a “1 in ℑ(℘)” chance of an event occurring which has probability ℘.
• E.g., a probability of 0.01 implies an improbability of 100, i.e., a “1 in 100” chance of the event.
• We can naturally extend this to also define the improbability ℑ(E) of an event E having probability ℘(E) by: ℑ(E) :≡ ℑ(℘(E)).

**Information Gain from an Event**

• We define the information gain GI(E) from an event E having improbability ℑ(E) as: GI(E) :≡ log ℑ(E) = log 1/℘(E) = −log ℘(E).
• Why? Consider the following argument:
• Imagine picking event E from a set S which has |S| = ℑ(E) equally likely members.
• Then, E’s improbability of being picked is ℑ(E),
• while the unknown information content of S was U(S) = log |S| = log ℑ(E).
• Thus, log ℑ(E) unknown information must have become known when we found out that E was actually picked.

**Unknown Information Content (Entropy) of a Probability Distribution**

• Given a probability distribution ℘:S→[0,1], define the unknown information content of ℘ as the expected information gain over all the singleton events E = {s} ⊆ S.
• It is therefore the average information needed to pick out a single element.
• The formula below for the entropy of a probability distribution was known to the thermodynamicists Boltzmann and Gibbs in the 1800s!
• Claude Shannon rediscovered/rederived it many decades later.

$U(\wp) = \sum_{s \in S} \wp(s) \log \Im(\wp(s)) = -\sum_{s \in S} \wp(s) \log \wp(s)$ (note the − sign).

**Information Content of a Physical System**

• The (total amount of) information content I(A) of an abstract physical system A is the unknown information content of the mathematical object D used to define A.
• If D is (or implies) only a set S of (assumed equiprobable) states, then we have: I(A) = U(S) = log |S|.
• If D implies a probability distribution ℘:S over a set S (of distinguishable states), then: I(A) = U(℘:S) = −∑_i ℘_i log ℘_i.
• We would expect to gain I(A) information if we measured A (using basis set S) to find its exact actual state s ∈ S.
• We say that amount I(A) of information is contained in A.
• Note that the information content depends on how broad (how abstract) the system’s description D is!

**Information Capacity & Entropy**

• The information capacity of a system is also the amount of information about the actual state of the system that we do not know, given only the system’s definition.
• It is the amount of physical information that we can say is in the state of the system.
• It is the amount of uncertainty we have about the state of the system, if we know only the system’s definition.
• It is also the quantity that is traditionally known as the (maximum) entropy S of the system.
• Entropy was originally defined as the ratio of heat to temperature.
• The importance of this quantity in thermodynamics (the observed fact that it never decreases) was first noticed by Rudolf Clausius in 1850.
• Today we know that entropy is, physically, really nothing other than (unknown, incompressible) information!

**Known vs. Unknown Information**

• We, as modelers, define what we mean by “the system” in question using some abstract description D.
• This implies some information content I(A) for the abstract system A described by D.
• But we will often wish to model a scenario in which some entity E (perhaps ourselves) has more knowledge about the system A than is implied by its definition.
• E.g., scenarios in which E has prepared A more specifically, or has measured some of its properties.
• Such an E will generally have a more specific description of A and thus would quote a lower resulting I(A) or entropy.
• We can capture this by distinguishing the information in A that is known by E from that which is unknown.
• Let us now see how to do this a little more formally.

**Subsystems (More Generally)**

• For a system A defined by a state set S, any partition P of S into subsets can be considered a subsystem B of A.
• The subsets in the partition P can be considered the “states” of the subsystem B.

[Figure: a state set S partitioned in two different ways, labeled “One subsystem of A” and “Another subsystem of A”.]

• In this example, the product of the two partitions forms a partition of S into singleton sets. We say that this is a complete set of subsystems of A.
• In this example, the two subsystems are also independent.

**Pieces of Information**

• For an abstract system A defined by a state set S, any subset T ⊆ S is a possible piece of information about A.
• Namely, it is the information “The actual state of A is some member of this set T.”
• For an abstract system A defined by a probability distribution ℘:S, any probability distribution ℘′:S such that ℘=0 → ℘′=0 and U(℘′) < U(℘) is another possible piece of information about A.
• That is, any distribution that is consistent with and more informative than A’s very definition.

**Known Physical Information**

• Within any universe (closed physical system) W described by distribution ℘, we say entity E (a subsystem of W) knows a piece P of the physical information contained in system A (another subsystem of W) iff ℘ implies a correlation between the state of E and the state of A, and this correlation is meaningfully accessible to E.
• Let us now see how to make this definition more precise.

[Figure: the universe W, containing the entity (knower) E and the physical system A, linked by a correlation.]

**What is a correlation, anyway?**

• A concept from statistics:
• Two abstract systems A and B are correlated or interdependent when the entropy of the combined system S(AB) is less than S(A)+S(B).
• I.e., something is known about the combined state of AB that cannot be represented as knowledge about the state of either A or B by itself.
• E.g., A and B each have 2 possible states, 0 and 1.
• They each have 1 bit of entropy.
• But we might also know that A=B, so the entropy of AB is 1 bit, not 2. (States 00 and 11.)

**Marginal Probability**

• Given a joint probability distribution ℘_XY over a sample space S that is a Cartesian product S = X × Y, we define the projection of ℘_XY onto X, or the marginal probability of X (under the distribution ℘_XY), written ℘_X, as ℘_X(x ∈ X) = ∑_{y∈Y} ℘_XY(x,y).
• Similarly define the marginal probability of Y.
• We may often just write ℘(x) or ℘_x to mean ℘_X(x).

[Figure: the sample space S = X×Y, with the slice X=x projected onto ℘_x.]

**Conditional Probability**

• Given a distribution ℘_XY : X × Y, we define the conditional probability of X given Y (under ℘_XY), written ℘_X|Y, as the relative probability of XY versus Y. That is, ℘_X|Y(x,y) :≡ ℘_XY(x,y) / ℘_Y(y), and similarly for ℘_Y|X.
• We may also write ℘(x|y), or ℘_|y(x), or even just ℘_x|y to mean ℘_X|Y(x,y).
• Bayes’ rule is the observation that with this definition, ℘_x|y = ℘_y|x ℘_x / ℘_y.
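The definitions of correlation, marginal probability, conditional probability, and Bayes’ rule can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names and the second joint distribution (`joint_generic`) are hypothetical, chosen only for illustration. It reproduces the A=B example (1 bit each, 1 bit jointly) and verifies Bayes’ rule on one pair of outcomes.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits: -sum(p * log2(p)), skipping zero-probability states."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def marginal(joint, axis):
    """Marginal distribution of one variable: sum the joint over the other one."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[axis]] = out.get(outcome[axis], 0.0) + p
    return out

def cond_x_given_y(joint, x, y):
    """P(X = x | Y = y) = P(x, y) / P(y)."""
    return joint.get((x, y), 0.0) / marginal(joint, 1)[y]

def cond_y_given_x(joint, x, y):
    """P(Y = y | X = x) = P(x, y) / P(x)."""
    return joint.get((x, y), 0.0) / marginal(joint, 0)[x]

# The slide's example: two bits known to be equal (states 00 and 11 only).
joint_equal = {(0, 0): 0.5, (1, 1): 0.5}
s_a = entropy_bits(marginal(joint_equal, 0).values())
s_b = entropy_bits(marginal(joint_equal, 1).values())
s_ab = entropy_bits(joint_equal.values())

# A hypothetical, less trivial joint distribution for the Bayes' rule check.
joint_generic = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
p_x = marginal(joint_generic, 0)
p_y = marginal(joint_generic, 1)
lhs = cond_x_given_y(joint_generic, 0, 0)
rhs = cond_y_given_x(joint_generic, 0, 0) * p_x[0] / p_y[0]

print("S(A), S(B), S(AB) in bits:", s_a, s_b, s_ab)
print("Bayes check:", lhs, rhs)
```

Because S(AB) comes out smaller than S(A)+S(B) for `joint_equal`, the two bits are correlated in the sense defined above.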
[Figure: the sample space S = X×Y, with the slice X=x (probability ℘_x), the slice Y=y (probability ℘_y), and their intersection (probability ℘_x,y).]

**Mutual Probability**

• Given a distribution ℘_XY : X × Y as above, define the mutual probability ratio ℛ_X:Y(x,y), or just ℛ_x:y, as ℘_xy / (℘_x ℘_y).
• It represents the factor by which the prob. of either outcome (X = x or Y = y) gets boosted when we learn the other.
• Notice that ℛ_x:y = ℘_x|y / ℘_x = ℘_y|x / ℘_y; that is, it is the relative probability of x|y versus x, or y|x versus y.
• If the two variables represent independent subsystems, then the mutual probability ratio is always 1.
• No change in one distribution from measuring the other.
• WARNING: Some authors define something they call “mutual probability” as the reciprocal of the definition given here.
• This seems somewhat inappropriate, given the name.
• In my definition, if the mutual probability ratio is greater than 1, then the probability of x increases when we learn y.
• In theirs, the opposite is true.
• The traditional definition should perhaps instead be called the mutual improbability ratio.
• Mutual improbability ratio: ℛ_ℑ,x:y = ℑ_xy / (ℑ_x ℑ_y) = ℘_x ℘_y / ℘_xy.

**Marginal, Conditional, Mutual Entropies**

• For each of the derived probabilities defined previously, we can define a corresponding informational quantity.
• Joint probability ℘_XY → Joint entropy S(XY) = S(℘_XY)
• Marginal probability ℘_X → Marginal entropy S(X) = S(℘_X)
• Conditional probability ℘_X|Y → Conditional entropy S(X|Y) = Ex_y[S(℘_|y)]
• Mutual probability ratio ℛ_X:Y → Mutual information I(X:Y) = Ex_x,y[log ℛ_x:y]
• Expected reduction in entropy of X from finding out Y.

**More on Mutual Information**

• Demonstration that the reduction in entropy of one variable given the other is the same as the expected log of the mutual probability ratio ℛ_x:y:

$I(X{:}Y) = \sum_{x,y} \wp_{xy} \log \frac{\wp_{x|y}}{\wp_x} = -\sum_{x} \wp_x \log \wp_x + \sum_{x,y} \wp_{xy} \log \wp_{x|y} = S(X) - S(X|Y)$

**Known Information, More Formally**

• For a system defined by probability distribution ℘ that includes two subsystems A, B with respective state variables X, Y having mutual information I_℘(X:Y):
• The total information content of B is I(B) = U(℘_Y).
• The amount of information in B that is known by A is K_A(B) = I_℘(X:Y).
• The amount of information in B that is unknown by A is U_A(B) = U(℘_Y) − K_A(B) = S(Y) − I(X:Y) = S(Y|X).
• The amount of entropy in B from A’s perspective is S_A(B) = U_A(B) = S(Y|X).
• These definitions are based on all the correlations that are present between A and B according to our global knowledge ℘.
• However, a real entity A may not know, understand, or be able to utilize all the correlations that are actually present between him and B.
• Therefore, generally more of B’s physical information will be effectively entropy, from A’s perspective, than is implied by this definition.
• We will explore some corrections to this definition later.
• Later, we will also see how to sensibly extend this definition to the quantum context.

**Maximum Entropy vs. Entropy**

• Total information content I = Maximum entropy S_max = logarithm of # states consistent with system’s definition.
• Unknown information U_A = Entropy S_A (as seen by observer A).
• Known information K_A = I − U_A = S_max − S_A (as seen by observer A).
• Unknown information U_B = Entropy S_B (as seen by observer B).

**A Simple Example**

• A spin is a type of simple quantum system having only 2 distinguishable states.
• In the z basis, the basis states are called “up” (↑) and “down” (↓).
• In the example shown in the figure, we have a compound system composed of 3 spins.
• It has 8 distinguishable states.
• Suppose we know that the 4 crossed-out states have 0 amplitude (0 probability), due to prior preparation or measurement of the system.
• Then the system contains one bit of known information in spin #2, and two bits of entropy in spins #1 & #3.

[Figure: the 8 basis states of the 3-spin system, with 4 of them crossed out.]

**Entropy, as seen from the Inside**

• One problem with our previous definition of knowledge-dependent entropy based on mutual information is that it is only well-defined for an ensemble or probability distribution of observer states, not for a single observer state.
• However, as observers, we always find ourselves in a particular state, not in an ensemble!
• Can we obtain an alternative definition of entropy that works for (and can be used by) observers who are in individual states also?
• While still obeying the 2nd law of thermodynamics?
• Zurek proposed that entropy S should be defined to include not only unknown information U, but also incompressible information N.
• By definition, incompressible information (even if it is known) cannot be reduced, therefore the validity of the 2nd law can be maintained.
• Zurek proposed using a quantity called Kolmogorov complexity to measure the amount of incompressible information.
• This is the size of the shortest program that computes the information, which is intractable to find!
• However, we can instead use effective (practical) incompressibility, from the point of view of a particular observer, to yield a definition of the effective entropy for that observer, for all practical purposes.

**Two Views of Entropy**

• Global view: Probability distribution, from “outside”, of observer+observee “system” leads to “expected” entropy of B as seen by A, and total system entropy.
• Local view: Entropy of B according to A’s specific knowledge of it, plus incompressible size of A’s representation of that knowledge, yields total entropy associated with B, from A’s perspective.

[Figure, global view: entity (knower) A and the physical system B under a joint distribution ℘_AB, with mutual information I_A:B. Conditional entropy S_B|A = expected entropy of B, from A’s perspective. Joint distribution ℘_AB → total entropy S(AB).]

[Figure, local view: entity (knower) A and physical system B. U(℘_B) = amount of unknown info in B, from A’s perspective, under the single “actual” distribution ℘_B over states of B. N_B = amount of incompressible info. about B represented within A. S_A(B) = U(℘_B) + N_B.]

**Example Comparing the Two Views**

• Example:
• Suppose object B contains 1,000 randomly generated bits of information. (Initial entropy: S_B = 1,000 b.)
• Suppose observer A reversibly measures and stores (within itself) a copy of one-fourth (250 b) of the information in B.
• Global view:
• The total information content of B is I(B) = 1,000 b.
• The mutual information I_A:B = 250 b. (Shared by both systems.)
• B’s entropy conditioned on A: S(B|A) = I(B) − I(A:B) = 750 b.
• Total entropy of joint distribution S(AB) = 1,000 b.
• Local view:
• A’s specific new dist. over B implies entropy S(℘_B) = 750 b of unknown info.
• A also contains I_A = 250 b of known but incompressible information about B.
• There is a total of S_A(B) = 750 b + 250 b = 1,000 b of unknown or incompressible information (entropy) still in the combined system.
• 750 b of this info is only “in” B, whereas 250 b of it is shared between A+B.

[Figure: observer A holding 250 b of incompressible information about B; system B holding 750 b unknown by A and 250 b known by A.]

**Objective Entropy?**

• In all of this, we have defined entropy as a somewhat subjective or relative quantity:
• Entropy of a subsystem depends on an observer’s state of knowledge about that subsystem, such as a probability distribution.
• Wait a minute… Doesn’t physics have a more objective, observer-independent definition of entropy?
• Only insofar as there are “preferred” states of knowledge that are most readily achieved in the lab.
• E.g., knowing of a gas only its chemical composition, temperature, pressure, volume, and number of molecules.
• Since such knowledge is practically difficult to improve upon using present-day macroscale tools, it serves as a uniform standard.
• However, in nanoscale systems, a significant fraction of the physical information that is present in one subsystem is subject to being known, or not, by another subsystem (depending on design).
• How a nanosystem is designed & how we deal with information recorded at the nanoscale may vastly affect how much of the nanosystem’s internal physical information effectively is or is not entropy (for practical purposes).

**Conservation of Information**

• Theorem: The total physical information capacity (maximum entropy) of any closed, constant-volume physical system (with a fixed definition) is unchanging in time.
• This follows from quantum calculations yielding definite, fixed numbers of distinguishable states for all systems of given size and total energy.
• We will learn about these bounds later.
• Before we can do this, let us first see how to properly define entropy for quantum systems.

**Some Categories of Information**

Relative to any given entity, we can make the following distinctions (among others).

A particular piece of information may be:
• Known vs. Unknown
• Known information vs. entropy
• Accessible vs. Inaccessible
• Measurable vs. unmeasurable
• Controllable vs. uncontrollable
• Stable vs. Unstable
• Against degradation to entropy
• Correlated vs. Uncorrelated
• Also, the fact of the correlation can be known or unknown
• The details of the correlation can be known or unknown
• The details can be easy or difficult to discover
• Wanted vs. Unwanted
• Entropy is usually unwanted
• Except when you’re chilly!
• Information may often be unwanted, too
• E.g., if it’s in the way, and not useful

A particular pattern of information may be:
• Standard vs. Nonstandard
• With respect to some given coding convention
• Compressible vs. Incompressible
• Either absolutely, or effectively
• Zurek’s definition of entropy: unknown or incompressible info.

We will be using these various distinctions throughout the later material…

**Quantum Information**

Generalizing classical information theory concepts to fit quantum reality

**Density Operators**

• For any given state |ψ⟩, the probabilities of all the basis states s_i are determined by a Hermitian operator or matrix ρ (called the density matrix): $\rho = |\psi\rangle\langle\psi|$, with elements $\rho_{i,j} = c_i c_j^*$, where $c_i$ is the amplitude of basis state s_i in |ψ⟩.
• Note that the diagonal elements ρ_i,i are just the probabilities of the basis states i.
• The off-diagonal elements are called “coherences”.
• They describe the entanglements that exist between basis states.
• The density matrix describes the state |ψ⟩ exactly!
• It (redundantly) expresses all of the quantum info. in |ψ⟩.

**Mixed States**

• Suppose the only thing one knows about the true state of a system is that it is chosen from a statistical ensemble or mixture of state vectors v_i (called “pure” states), each with a derived density matrix ρ_i and a probability ℘_i.
• In such a situation, in which one’s knowledge about the true state is expressed as a probability distribution over pure states, we say the system is “in” a mixed state.
• Such a situation turns out to be completely described, for all physical purposes, by simply the expectation value (weighted average) of the v_i’s density matrices: $\rho = \sum_i \wp_i \rho_i$.
• Note: Even if there were uncountably many v_i going into the calculation, the situation remains fully described by O(n²) complex numbers, where n is the number of basis states!

**Von Neumann Entropy**

• Suppose our probability distribution over states comes from the diagonal of a density matrix ρ.
• But we will generally also have additional information about the state hidden in the coherences.
• The off-diagonal elements of the density matrix.
• The Shannon entropy of the distribution along the diagonal will generally depend on the basis used to index the matrix.
• However, any density matrix can be (unitarily) rotated into another basis in which it is perfectly diagonal!
• This means all its off-diagonal elements are zero.
• The Shannon entropy of the diagonal distribution is always minimized in the diagonal basis, and so this minimum is selected as being the true basis-independent entropy of the mixed quantum state ρ.
• It is called the von Neumann entropy.

**V.N. entropy, more formally**

$S(\rho) = -\mathrm{Tr}(\rho \log \rho)$ (Shannon S), or in thermodynamic units $S(\rho) = -k_B \, \mathrm{Tr}(\rho \ln \rho)$ (Boltzmann S).

• The trace Tr M just means the sum of M’s diagonal elements.
• The ln of a matrix M just denotes the inverse function to e^M. See the logm[] function in Matlab.
• The exponential e^M of a matrix M is defined via the Taylor-series expansion ∑_{i≥0} M^i / i!.

**Quantum Information & Subsystems**

• A density matrix for a particular subsystem may be obtained by “tracing out” the other subsystems.
• This means summing over state indices for all systems not selected.
• This process discards information about any quantum correlations that may be present between the subsystems!
• Entropies of the density matrices so obtained will generally sum to more than that of the original system. (Even if the original state was pure!)
• Keeping this in mind, we may make these definitions:
• The unconditioned or marginal quantum entropy S(A) of subsystem A is the entropy of the reduced density matrix ρ_A.
• The conditioned quantum entropy S(A|B) = S(AB) − S(B).
• Note: this may be negative! (In contrast to the classical case.)
• The quantum mutual information I(A:B) = S(A) + S(B) − S(AB).
• As in the classical case, this measures the amount of quantum information that is shared between the subsystems.
• Each subsystem “knows” this much information about the other.

**Tensors and Index Notation**

• A tensor is nothing but a generalized matrix that may have more than one row and/or column index. It can also be defined recursively as a matrix of tensors.
• Tensor signature: An (r,c) tensor has r row indices and c column indices.
• Convention: Row indices are shown as subscripts, and column indices as superscripts.
• Tensor product: An (l,k) tensor T times an (n,m) tensor U is an (l+n,k+m) tensor V formed from all products of an element of T times an element of U.
• Tensor trace: The trace of an (r,c) tensor T with respect to index #k (where 1 ≤ k ≤ min(r,c)) is given by contracting (summing over) the kth row index together with the kth column index: $\mathrm{Tr}_k T = \sum_{i \in I} T\big|_{r_k = c_k = i}$ (I is the set of legal values of indices r_k and c_k).

[Figure: example of a (2,2) tensor T in which all 4 indices take on values from the set {0,1}.]

**Quantum Information Example**

• Consider the state v_AB = (|00⟩ + |11⟩)/√2 of compound system AB.
• Let ρ_AB = v v†.
• Note that the reduced density matrices ρ_A = ρ_B are fully classical: $\rho_A = \rho_B = \tfrac{1}{2}\left(|0\rangle\langle 0| + |1\rangle\langle 1|\right)$.
• Let’s look at the quantum entropies:
• The joint entropy S(AB) = S(ρ_AB) = 0 bits. (Because v_AB is a pure state.)
• The unconditioned entropy of subsystem A is S(A) = S(ρ_A) = 1 bit.
• The entropy of A conditioned on B is S(A|B) = S(AB) − S(B) = −1 bit!
• The mutual information between them is I(A:B) = S(A) + S(B) − S(AB) = 2 bits!

[Figure: the density matrix ρ_AB, with rows and columns labeled |00⟩, |01⟩, |10⟩, |11⟩.]

**Quantum vs. Classical Mutual Info.**

• 2 classical bit-systems have a mutual information of at most one bit.
• This occurs if they are perfectly correlated, e.g., {00, 11}.
• Each bit considered by itself appears to have 1 bit of entropy.
• But taken together, there is “really” only 1 bit of entropy shared between them.
• A measurement of either extracts that one bit of entropy,
• and leaves it in the form of 1 bit of incompressible information (to the measurer).
• The real joint entropy is 1 bit less than the “apparent” total entropy.
• Thus, the mutual information is 1 bit.
• 2 quantum bit-systems (qubits) can have a mutual info. of two bits!
• This occurs in maximally entangled states, such as |00⟩ + |11⟩.
• Again, each qubit considered by itself appears to have 1 bit of entropy.
• But taken together, there is no entropy in this pure state.
• A measurement of either qubit leaves us with no entropy, rather than 1 bit!
• If done right… see next slide.
• The real joint entropy is thus 2 bits less than the “apparent” total entropy.
• Thus the mutual information is (by definition) 2 bits.
• Both of the “apparent” bits of entropy vanish if either qubit is measured.
• Used in a communication tech. called quantum superdense coding.
• 1 qubit’s worth of prior entanglement between two parties can be used to pass 2 bits of classical information between them using only 1 qubit!

**Why the Difference?**

• Entity A hasn’t yet measured B and C, which (A knows) are initially correlated with each other, quantumly or classically.
• A has measured B and is now correlated with both B and C.
• A can use his new knowledge to uncompute (compress away) the bits from both B and C, restoring them to a standard state.

[Figure: the states of the three systems, in the order ABC, at each of the three steps above, shown side by side for the classical and quantum cases.]

• Quantum case: Knowing he is in state |0⟩ + |1⟩, A can unitarily rotate himself back to state |0⟩. Look ma, no entropy!
• Classical case: A, being in a mixed state, still holds a bit of information that is either unknown (external view) or incompressible (A’s internal view), and thus is entropy, and can never go away (by the 2nd law of thermo.).

**Proving the 2nd law of thermodynamics**

• Closed systems evolve via unitary transforms U_t1→t2.
• Unitary transforms just change the basis, so they do not change the system’s “true” (von Neumann) entropy.
• Theorem: Entropy is constant in all closed systems undergoing an exactly known unitary evolution.
• However, if U_t1→t2 is ever at all uncertain, or we disregard some of our information about the state, we get a mixture of possible resulting states, with provably ≥ effective entropy.
• Theorem (2nd law of thermodynamics): Entropy may increase but never decreases in closed systems.
• It can increase if the system undergoes interactions whose details are not completely known, or if the observer discards some of his knowledge.

**Maxwell’s Demon**

• A longstanding “paradox” in thermodynamics:
• Why exactly can’t you beat the 2nd law, reducing the entropy of a system via measurements?
• There were many attempted resolutions, all with flaws, until…
• Bennett at IBM (’82) noted…
• The information resulting from the measurement must be disposed of somewhere…
• The entropy is still present in the demon’s memory, until he expels it into the environment.
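The entangled-pair numbers quoted in the Quantum Information Example (S(A) = 1 bit, S(A|B) = −1 bit, I(A:B) = 2 bits) can be reproduced with a few lines of arithmetic. The sketch below is an illustration added here, not code from the slides: all function names are hypothetical, it handles only two-qubit pure states, and it takes S(AB) = 0 as given because the joint state is pure. The reduced density matrices come from tracing out the other qubit, and their eigenvalues from the closed form for a 2×2 Hermitian matrix.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits of a list of eigenvalues / probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 1e-12)

def reduced_density_matrices(c):
    """For a two-qubit pure state with amplitudes c[a][b], return (rho_A, rho_B).

    Tracing out B: rho_A[i][j] = sum_k c[i][k] * conj(c[j][k]).
    Tracing out A: rho_B[i][j] = sum_k c[k][i] * conj(c[k][j]).
    """
    rho_a = [[sum(c[i][k] * c[j][k].conjugate() for k in range(2))
              for j in range(2)] for i in range(2)]
    rho_b = [[sum(c[k][i] * c[k][j].conjugate() for k in range(2))
              for j in range(2)] for i in range(2)]
    return rho_a, rho_b

def eigenvalues_2x2_hermitian(m):
    """Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    trace = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return [(trace + disc) / 2.0, (trace - disc) / 2.0]

# Normalized state (|00> + |11>) / sqrt(2), stored as amplitudes c[a][b].
r = 1.0 / math.sqrt(2.0)
bell = [[complex(r), 0j], [0j, complex(r)]]

rho_a, rho_b = reduced_density_matrices(bell)
s_a = entropy_bits(eigenvalues_2x2_hermitian(rho_a))
s_b = entropy_bits(eigenvalues_2x2_hermitian(rho_b))
s_ab = 0.0  # assumption taken from the text: a pure joint state has zero entropy

s_a_given_b = s_ab - s_b          # conditional quantum entropy S(A|B)
mutual_info = s_a + s_b - s_ab    # quantum mutual information I(A:B)

print("S(A) =", s_a, "S(B) =", s_b)
print("S(A|B) =", s_a_given_b, "I(A:B) =", mutual_info)
```

Replacing `bell` with a product state such as `[[1+0j, 0j], [0j, 0j]]` (the state |00⟩) should give zero for every quantity, which is one way to see that the negative conditional entropy is a signature of entanglement.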