Skip to main content

Paradoxes and self references

One of the most celebrated paradoxes in set theory is the Russel's paradox. The story behind the paradox and subsequent developments is rather interesting.

Consider a set of the kind S = {a, b, S}. It seems somewhat unusual because S is a set in which S itself is a member. If we expand it, we get S = {a, b, {a, b, {a, b ....}}} leading to an infinite membership chain.

Suppose we want to express the class of sets that don't have this "foundationless" property. Let us call this set as the set of all "proper" sets, that is, sets that don't contain themselves. We can express this set of sets as: X = {x | x is not a member of x}

Now this begs the question whether X is a member of itself. If X is a member of itself, then by the definition of X (set of all sets that don't contain themselves), X should not be a member of itself. If X is not a member of itself, then by the definition of X (set of all sets that don't contain themselves), X should be a member of itself.

This is the famous Russel's paradox. Bertrand Russel was so flustered by this paradox, he went to great lengths to create an axiomatic basis for set theory that prevents the formation of such paradoxical situations. Incidentally, Russel's paradox also dealt a major blow to David Hilbert's grand idea of representing all of mathematics using set theory. Russel showed that it is not possible to represent some parts of set theory itself using set theory, let alone all of mathematics.

Russel and Whitehead then embarked on a major project that introduced the theory of "types" that was finely intertwined with set theory. This they published as a three volume series called "Principia Mathematica." Set theory could not be described without theory of types and vice versa.

Essentially, the gist of this is as follows. A type is basically a property (a predicate logic statement for instance) and an instance is an element having that property. A set, now becomes a collection of instances sharing a given property. Hence if we have a property isPrimeNumber(x), then the set Primes = {x | x \in Z, isPrimeNumber(x)} is the set of all integers for which isPrimeNumber(x) holds.

In most purposes, we can refer to a set and its type interchangeably. But the subtle difference is that, while we can define a property P(x): x does not contain x without any problem, we cannot define the corresponding set X = {x | P(x)} without getting into a paradox.

In the axiomatic system of Principia Mathematica, a set X can only contain elements belonging to a type lower than that of X itself. This leads to a strict hierarchy among sets, where every set is made up of lower level elements and so on. This brings forth the question, what is the lowest of them all? In order to make sets exist, we will then need to have "atomic" types, below which there are no more sets defined.

This axiom is called the "Axiom of Foundation" in set theory, which prohibits an infinite membership chain and decrees that sets ought to end somewhere.

However, the Foundation Axiom is a little over zealous in its restrictions in order to prevent Russel's paradox like situations. Self reference is not necessarily a paradox.

Suppose we mathematically represent the semantics of each sentence with the set of all nouns in them. Then the first sentence of the previous paragraph would be something like: {Foundation Axiom, restrictions, Russel's paradox}.

Now suppose we want to represent the following sentence in set-theoretic form: This sentence is written in English. We see that there are two nouns here, English and "this sentence". Unfortunately, "this sentence" cannot be an element of "this sentence" according to the Foundation Axiom. Similarly, we can't represent a sentence talking about the universe in general.

There are several situations where self reference is not only consistent, but also necessary. A container class in object oriented programming for example, can contain objects of any class including the container class or even its super classes.

Much later, mathematicians saw the need to relax the Axiom of Foundation and so was born the theory of non-wellfounded sets or "Hypersets." Hypersets are based on the Anti-foundation Axiom that allows self references as long as the cycle does not contain a contradiction. A paradox occurs only when there is a cyclic reference evaluating to a contradiction.

Hence the set X = {x | x contains x} is not paradoxical, while X = {x | x does not contain x} is paradoxical.

It is only that the former set is "non-wellfounded". It is like the story where, a boy who is asked where Leela Palace is, replies that it is in front of Manipal Hospital, and gives directions to Manipal Hospital as "opposite Leela Palace." The sentences are mutually consistent, but by themselves, they don't have any information about how to get there -- either to Leela Palace or to Manipal Hospital.

Such mathematical structures, even though they appear as if they are useless, have some important applications and are very powerful constructs to model some classes of natural phenomena. But more on them in some other post, later. Smiley

Comments

Popular posts from this blog

Co-occurrence and Correlation

In one of our projects, we encountered this dilemma where we had to nitpick on (the probability of) co-occurrence of a pair of events and correlation between the pair of events. Here is my attempt at disambiguating between the two. Looking forward to any pokes at loopholes in my argument. Consider two events e1 and e2 that have a temporal signature. For instance, they could be login events of two users on a computer system across time. Let us also assume that time is organized as discrete units of constant duration each (say one hour). We want to now compare the login behaviour of e1 and e2 over time. We need to find out whether e1 and e2 are taking place independently or are they correlated. Do they tend to occur together (i.e. co-occur) or do they take place independent of one another? This is where terminologies are freely used and things start getting a bit confusing. So to clear the confusion, we need to define our terms more precisely. Co-occurrence is simply the probability that

Meta-modeling heuristics

(Acknowledgment: This post has inputs from Sanket of First principles fame.) When teaching about formal axiomatic systems, I'm usually asked a question something like, "But, how do you find the axioms in the first place?" What are axioms, you ask? To take a few steps back, reasoning processes based on logic and deduction are set in an "axiomatic" context. Axioms are the ground truths, based on which we set out to prove or refute theorems within the system. For example, "Given any two points, there can be only one line that passes through both of them" is an axiom of Eucledian geometry. When proving theorems within an axiomatic system, we don't question the truth of the axioms themselves. As long as the set of axioms are consistent among themselves (i.e. don't contradict one another) it is fine. Axioms are either self-evident or well-known truths, or somethings that are assumed to be true for the context. But then, the main question becomes ver