
Meta-modeling heuristics

(Acknowledgment: This post has inputs from Sanket of First principles fame.)

When teaching about formal axiomatic systems, I'm usually asked a question something like, "But, how do you find the axioms in the first place?"

What are axioms, you ask?

To take a few steps back, reasoning processes based on logic and deduction are set in an "axiomatic" context. Axioms are the ground truths on which we set out to prove or refute theorems within the system. For example, "Given any two distinct points, there is exactly one line that passes through both of them" is an axiom of Euclidean geometry.

When proving theorems within an axiomatic system, we don't question the truth of the axioms themselves. As long as the set of axioms is consistent (i.e. the axioms don't contradict one another), it is fine. Axioms are either self-evident or well-known truths, or statements that are assumed to be true for the context.
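For a toy illustration of what "consistent" means, here is a minimal sketch that treats axioms as propositional statements over a handful of true/false variables and brute-forces a search for an assignment that satisfies all of them. The function name `find_model` and the example axioms are made up for illustration; real axiomatic systems are far richer than this.

```python
from itertools import product

def find_model(axioms, variables):
    """Return one truth assignment satisfying every axiom, or None if none exists.

    Hypothetical helper: each axiom is a function from an assignment
    (dict of variable name -> bool) to bool.
    """
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(axiom(assignment) for axiom in axioms):
            return assignment
    return None

# "p or q" together with "not p" can both hold, so this set is consistent.
consistent_axioms = [lambda v: v["p"] or v["q"], lambda v: not v["p"]]

# "p" together with "not p" can never both hold, so this set is inconsistent.
inconsistent_axioms = [lambda v: v["p"], lambda v: not v["p"]]

model = find_model(consistent_axioms, ["p", "q"])
no_model = find_model(inconsistent_axioms, ["p"])
```

A set of axioms is consistent exactly when at least one such assignment exists; the search says nothing about whether the axioms are true of the real world.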

But then, the main question becomes very relevant. Can we assume just about anything as an axiom, as long as it does not contradict any other axiom? Or should there be some element of objective truth in the axioms?

Technically, we can go with the belief that truth is subjective and we can assume whatever we want as axioms. But then, this is a self-contradicting statement when used in general: to assert "all truth is subjective" as a general rule is itself to claim an objective truth. Perhaps one can define a wholly abstract system based on an alternate but consistent set of physical laws and prove that (say) pigs can fly. But other than being seen as "intellectual -er- stimulation", such results are generally not accorded a great deal of importance. ;-)

Good axiomatic systems are based on axioms that are grounded in reality and encapsulate some element of objective truth. So the question of "how do we find axioms" reduces to "how do we recognize objective truth when we see it?"

The succinct answer is: "wish we knew". This is the question that has perhaps defined humankind's quest throughout history. And I certainly won't claim to have an answer for that!

But even then, it helps to have some rules of thumb or heuristics that let us be reasonably confident that our assumptions are grounded in reality. So what follows are some rules of thumb that I have found useful at some time or the other. They are by no means exhaustive, and by themselves they don't guarantee that they will lead you to true statements. Nevertheless, it is better to have some rules of thumb than nothing at all. So here goes:

Principle of invariance
Any characteristic of a system that is invariant, i.e. that stays constant while other things change, is a good candidate for a ground truth. Remember how sailors long ago used the North Star as the basis for designing their navigation rules?

Principle of least bias (maximum entropy)
If you have insufficient information about an external entity that may impact your system in one or more ways, choose the least biased assumption about what that impact will be. For instance, suppose you are constructing a tall building and need to protect it from heavy winds. If you know the direction from which strong winds typically blow, then you can construct the building so that it has sharp edges only in that direction, letting the wind cut through it. However, if strong winds are erratic in that region and it is impossible to predict the direction from which they will blow, design your building on the assumption that strong winds can blow from any possible direction.

The least-bias principle is also known as the principle of maximum entropy because, in information-theoretic terms, the least biased assumption is the probability distribution with the highest entropy: the one that leaves the most uncertainty about the unknown variable and so assumes nothing beyond what is actually known.
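To make the entropy connection concrete, here is a small sketch using the wind example. It assumes, purely for illustration, that the wind can come from one of eight compass directions, and compares the Shannon entropy of "any direction is equally likely" with that of a made-up distribution that favours one prevailing direction.

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Assumption for illustration: eight possible wind directions.
# No knowledge of the wind: every direction is equally likely.
uniform = [1 / 8] * 8

# Hypothetical prior knowledge: one prevailing direction dominates.
prevailing = [0.65] + [0.05] * 7

uniform_entropy = shannon_entropy(uniform)        # 3 bits, the maximum for 8 outcomes
prevailing_entropy = shannon_entropy(prevailing)  # strictly less than 3 bits
```

The uniform distribution has the highest entropy of any distribution over eight outcomes, which is why "design for wind from any direction" is the least biased choice when nothing else is known.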

Principle of conservation (Symmetry)
When you have mapped out the inputs and outputs of your system, typically everything else should be conserved. Otherwise there is something like a "memory leak" happening somewhere that may eventually make your system appear to behave strangely.

Every debit should have a corresponding credit (if money was not dispensed out of the system) and every buy should have a corresponding sell. So, don't believe an axiom like, "if a lot of people are buying shares from others, then the market is bullish." If a lot of people are buying shares from others, it also means that a lot of people are selling shares.
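The buy/sell symmetry can be checked mechanically. Below is a minimal sketch with an invented list of trades, each recorded as (buyer, seller, quantity); the names and numbers are hypothetical. Because every trade adds to one party exactly what it takes from another, the net change across all parties must be zero.

```python
from collections import defaultdict

def net_positions(trades):
    """Return each party's net change in shares from (buyer, seller, quantity) trades."""
    net = defaultdict(int)
    for buyer, seller, quantity in trades:
        net[buyer] += quantity   # the buyer gains the shares...
        net[seller] -= quantity  # ...that the seller gives up
    return dict(net)

# Hypothetical trades for illustration.
trades = [("alice", "bob", 100), ("carol", "bob", 50), ("alice", "dave", 25)]

positions = net_positions(trades)
total_bought = sum(q for q in positions.values() if q > 0)
total_sold = -sum(q for q in positions.values() if q < 0)

# Conservation: shares bought always equal shares sold.
assert total_bought == total_sold
assert sum(positions.values()) == 0
```

If a check like this ever fails on real data, shares have been created or destroyed somewhere, which is exactly the kind of "leak" worth investigating.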

Principle of minimum description length (Occam's razor)
If the same phenomenon can be described in more than one way, choose the simplest possible description. An earlier debate some of us had on elevator design is a motivating example. Occam's razor can also be seen as the principle of irreducibility of the axioms. If the axioms can be reduced to something smaller without changing their meaning, then we probably should be using the smaller set of axioms.

Extensibility: This is the dual of the above characteristic. The more facts you can explain using a concept, the better it is as an axiom. Extensibility can also be termed the principle of generality. If there are two or more theories for a phenomenon, and one of them explains far more than the other (correctly, of course), then that is the better one. Quantum mechanics is seen as something that supersedes Newtonian mechanics because quantum mechanics is general enough to explain mechanics at the sub-atomic level as well as (in principle) at macro levels, unlike classical mechanics, which describes phenomena only at macro levels.

Universality: Many a time, similar phenomena occur in (seemingly) unrelated contexts. Such phenomena are very likely to point to some deeper underlying principle. The power-law distribution is an example of a characteristic that is found in several disparate systems like degree distribution on the web, population distribution across cities, distribution of the sizes of our blood vessels, etc. It is no wonder that the power law has aroused a lot of curiosity and there are various theories (preferential attachment, utility maximization under saturation) about the truth underlying this phenomenon.
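As a rough illustration of one of the theories mentioned above, here is a minimal sketch of preferential attachment. It assumes a deliberately simple variant: the network starts with two linked nodes, and each new node links to one existing node chosen with probability proportional to that node's current degree. The function name and parameters are made up for this sketch.

```python
import random

def preferential_attachment(num_nodes, seed=0):
    """Grow a network one node at a time; return the list of node degrees."""
    rng = random.Random(seed)
    degrees = [1, 1]   # start with two nodes joined by a single edge
    # Each node appears in `endpoints` once per edge it touches, so a uniform
    # draw from this list picks a node with probability proportional to degree.
    endpoints = [0, 1]
    for new_node in range(2, num_nodes):
        target = rng.choice(endpoints)
        degrees[target] += 1
        degrees.append(1)
        endpoints.extend([target, new_node])
    return degrees

degrees = preferential_attachment(2000)
leaves = sum(1 for d in degrees if d == 1)
print("most connected node:", max(degrees), "links")
print("nodes with a single link:", leaves, "of", len(degrees))
```

Most nodes end up with a single link while a few early nodes collect many, which is the heavy-tailed, "rich get richer" shape associated with power laws. This is only a sketch of one proposed mechanism, not a claim about why any particular real system follows a power law.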

