Shannon's entropy is relative to the receiver. Two distinct receivers with different knowledge and expectations about the information source would calculate different probabilities for each message.

To illustrate this, think of two observers, A and B, watching a coin being tossed, A knows that the coin is not fair, B does not.

For A, the probability of observing heads is not 0.5, the coin is not
fair. For B the probability is 0.5. The difference in the expectations
from each observers results in them calculating different
probabilities for each **message**, i.e., the result of a coin toss.

The entropy quantifies how surprised is an observer to see a particular message:

Observer | Probability of the message | Entropy |
---|---|---|

A | \(P_A(Heads) \neq 0.5\) | \(H(Heads\vert A)=-log_2(P_A)\) |

B | \(P_B(Heads) = 0.5\) | \(H(Heads\vert B)=-log_2(P_B)\) |