
Paradoxes and self-references

One of the most celebrated paradoxes in set theory is Russell's paradox. The story behind the paradox and its subsequent developments is rather interesting.

Consider a set of the kind S = {a, b, S}. It seems somewhat unusual because S is a set that contains itself as a member. If we expand it, we get S = {a, b, {a, b, {a, b, ...}}}, leading to an infinite membership chain.

Suppose we want to express the class of sets that don't have this "foundationless" property. Let us call it the set of all "proper" sets, that is, sets that don't contain themselves. We can express this set of sets as: X = {x | x is not a member of x}

Now this raises the question of whether X is a member of itself. If X is a member of itself, then by the definition of X (the set of all sets that don't contain themselves), X should not be a member of itself. If X is not a member of itself, then by the very same definition, X should be a member of itself.
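The paradox can even be felt computationally. Here is a loose sketch in Python (my own analogy, not anything from Russell): model "x is a member of x" as "the predicate p holds of itself". Asking whether the Russell-style predicate applies to itself then never settles on an answer:

```python
# A predicate on predicates, loosely mirroring X = {x | x is not a member of x}.
# "Membership of x in x" is modeled as "the predicate p holds of itself".
def proper(p) -> bool:
    """True when the predicate p does NOT hold of itself."""
    return not p(p)

# Asking whether `proper` is itself proper chases its own tail:
# proper(proper) = not proper(proper) = not not proper(proper) = ...
try:
    proper(proper)
except RecursionError:
    print("no consistent answer: the evaluation never bottoms out")
```

The contradiction shows up as non-termination: the definition keeps negating itself forever, so Python's recursion limit is the only thing that stops it.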

This is the famous Russell's paradox. Bertrand Russell was so flustered by this paradox that he went to great lengths to create an axiomatic basis for set theory that prevents the formation of such paradoxical situations. Incidentally, Russell's paradox also dealt a major blow to the grand idea, pursued by Frege and later championed in Hilbert's programme, of founding all of mathematics on set theory. Russell showed that unrestricted set comprehension is inconsistent: some parts of set theory itself cannot be represented using naive set theory, let alone all of mathematics.

Russell and Whitehead then embarked on a major project that introduced the theory of "types", finely intertwined with set theory. This they published as a three-volume series called "Principia Mathematica." Set theory could not be described without the theory of types, and vice versa.

Essentially, the gist of this is as follows. A type is basically a property (a predicate-logic statement, for instance), and an instance is an element having that property. A set now becomes a collection of instances sharing a given property. Hence if we have a property isPrimeNumber(x), then the set Primes = {x | x ∈ Z, isPrimeNumber(x)} is the set of all integers for which isPrimeNumber(x) holds.
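This property-then-set view maps quite directly onto a set comprehension in Python. A minimal sketch (restricting Z to a finite range so the set is computable; the function name `is_prime` is just my rendering of isPrimeNumber):

```python
def is_prime(n: int) -> bool:
    """The predicate isPrimeNumber(x): True when n is a prime number."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

# The set-builder {x | x in Z, isPrimeNumber(x)}, restricted to 0..29:
primes = {x for x in range(30) if is_prime(x)}
print(sorted(primes))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The predicate (type) exists independently of any particular set; the set is what you get when you collect the instances satisfying it over some domain.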

For most purposes, we can refer to a set and its type interchangeably. But the subtle difference is that while we can define a property P(x): "x does not contain x" without any problem, we cannot define the corresponding set X = {x | P(x)} without getting into a paradox.

In the axiomatic system of Principia Mathematica, a set X can only contain elements belonging to a type lower than that of X itself. This leads to a strict hierarchy among sets, where every set is made up of lower-level elements, and so on. This brings forth the question: what is the lowest of them all? For sets to exist at all, we then need "atomic" types, below which no more sets are defined.

This idea appears in set theory as the "Axiom of Foundation", which prohibits infinite membership chains and decrees that every chain of memberships ought to end somewhere.
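For finite sets named symbolically, well-foundedness is just the absence of cycles when we follow the membership relation. A small sketch (the graph encoding and function name are my own illustration, not standard notation):

```python
# A finite "membership graph": each set name maps to the names of its members.
# The Axiom of Foundation demands that following membership never loops back.
def is_well_founded(membership: dict) -> bool:
    """True when no membership chain cycles back on itself (DFS cycle check)."""
    visiting, done = set(), set()

    def visit(s):
        if s in done:
            return True
        if s in visiting:
            return False          # found a membership cycle
        visiting.add(s)
        ok = all(visit(m) for m in membership.get(s, ()))
        visiting.discard(s)
        if ok:
            done.add(s)
        return ok

    return all(visit(s) for s in membership)

print(is_well_founded({"S": ["a", "b"], "a": [], "b": []}))       # True
print(is_well_founded({"S": ["a", "b", "S"], "a": [], "b": []}))  # False
```

The second example is exactly the S = {a, b, S} from the opening of this post: the chain S → S → S → ... never ends, so Foundation rules it out.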

However, the Foundation Axiom is a little overzealous in its restrictions for preventing Russell's-paradox-like situations. Self-reference is not necessarily a paradox.

Suppose we mathematically represent the semantics of each sentence by the set of all nouns in it. Then the first sentence of the previous paragraph would be something like: {Foundation Axiom, restrictions, Russell's paradox}.

Now suppose we want to represent the following sentence in set-theoretic form: "This sentence is written in English." We see that there are two nouns here: English and "this sentence". Unfortunately, "this sentence" cannot be an element of itself according to the Foundation Axiom. Similarly, we can't represent a sentence talking about the universe in general, since the universe contains the sentence itself.

There are several situations where self-reference is not only consistent but also necessary. A container class in object-oriented programming, for example, can contain objects of any class, including the container class itself or even its superclasses.
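The container example is easy to make concrete. A toy sketch in Python (the class `Container` is purely illustrative):

```python
class Container:
    """A toy container that can hold objects of any class, itself included."""
    def __init__(self):
        self.items = []

    def add(self, obj):
        self.items.append(obj)

box = Container()
box.add(42)
box.add(box)                # perfectly legal: the container contains itself
print(box in box.items)     # True
```

Nothing here is contradictory; the self-reference is a plain, well-defined cycle in memory, which is exactly the kind of structure the Foundation Axiom forbids sets from having.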

Much later, mathematicians saw the need to relax the Axiom of Foundation, and so was born the theory of non-well-founded sets, or "hypersets." Hypersets are based on the Anti-Foundation Axiom, which allows self-references as long as the cycle does not evaluate to a contradiction. A paradox occurs only when there is a cyclic reference evaluating to a contradiction.

Hence the set X = {x | x is a member of x} is not paradoxical, while X = {x | x is not a member of x} is.

It is only that the former set is "non-well-founded." It is like the story of a boy who, when asked where Leela Palace is, replies that it is in front of Manipal Hospital, and then gives directions to Manipal Hospital as "opposite Leela Palace." The sentences are mutually consistent, but by themselves they carry no information about how to get to either Leela Palace or Manipal Hospital.
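The boy's directions form a perfectly legal cyclic data structure. A minimal sketch (the dictionaries and the `"opposite"` key are my own encoding of the story):

```python
# Two mutually referential "directions", mirroring the Leela Palace story.
# Each landmark is described only in terms of the other -- consistent, but
# carrying no independent information about where anything actually is.
leela_palace = {"name": "Leela Palace"}
manipal_hospital = {"name": "Manipal Hospital"}
leela_palace["opposite"] = manipal_hospital
manipal_hospital["opposite"] = leela_palace

# The cycle is well-defined: following it twice brings you back home.
print(leela_palace["opposite"]["opposite"] is leela_palace)  # True
```

Consistency without grounding: the structure never "bottoms out" in an absolute location, yet it never contradicts itself either, which is precisely the situation the Anti-Foundation Axiom admits.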

Such mathematical structures, even though they may appear useless, have some important applications and are very powerful constructs for modelling some classes of natural phenomena. But more on them in some other post, later. :-)

