Wednesday, 22 March 2017

Learning Fallacy II

In 'Learning Fallacy' (LF) we began the reflection on the model / data duality:
a. what kind of data is relevant for my problem - i.e. am I not typically biased towards over-reducing / localising the problem?
b. how specific is my model - that is, am I not typically biased towards overfitting, i.e. under-symmetrising my model?

We call the corresponding heuristics:
a. Data Expansion
b. Large Symmetry
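The overfitting bias in (b) can be made concrete. In the sketch below (my own illustration, not from the post; all numbers are assumed), the underlying law is a straight line, and a model with too many free parameters "under-symmetrises": it breaks the simple structure of the law in order to chase the noise of the training sample, fitting the sample better and the world worse.

```python
# Illustrative sketch: overfitting as under-symmetrising.
# True law: a straight line. A degree-9 polynomial has enough freedom
# to memorise the noisy training points, at the cost of generalisation.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
x_test = np.linspace(0.0, 1.0, 100)
true_law = lambda x: 2.0 * x + 1.0
y_train = true_law(x_train) + rng.normal(0.0, 0.2, x_train.size)  # noisy sample
y_test = true_law(x_test)                                         # the actual law

for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```

The degree-9 fit wins on the training points and loses against the law itself: exactly the bias question (b) asks us to check.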

The data expansion heuristic is not a more-data-is-better tale. Here we are talking about one's perspective on 'reality' [recall that your reality is a work-in-progress...]: what kind of implicit (over-simplifying) hypothesis am I making by leaving out seemingly irrelevant data?

Ex a: We already mentioned the RFIM hypothesis in Finance, which is precisely considered to be relevant more broadly across the social domain.
The transdisciplinary (or not) paradigm is one manifestation of the Data Expansion problem: a typical example is given in 'Natural language / neuro economics II'.
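For readers who have not met the RFIM (Random Field Ising Model) picture of collective decisions: each agent combines a private opinion, a global incentive, and imitation of the average choice. The sketch below is my own minimal illustration of that mechanism (all parameter values are assumptions, not from the post); its point is that with strong enough imitation, a tiny global incentive tips the whole population one way or the other.

```python
# Minimal RFIM-style imitation dynamics (illustrative parameters).
# Agent i chooses s_i = sign(F + f_i + J*m):
#   F  = global incentive, f_i = private opinion (random field),
#   J  = imitation strength, m = average choice of the population.
import random

random.seed(1)
N = 1000
J = 2.0  # imitation strength, chosen well above the critical value (assumed)
fields = [random.gauss(0.0, 1.0) for _ in range(N)]  # idiosyncratic opinions

def equilibrium(F, J, fields, steps=200):
    """Iterate the choices to a self-consistent average opinion m."""
    m = 0.0
    for _ in range(steps):
        s = [1 if F + f + J * m > 0 else -1 for f in fields]
        m = sum(s) / len(s)
    return m

for F in (-0.1, 0.1):
    print(f"incentive F={F:+.1f} -> average choice m={equilibrium(F, J, fields):+.2f}")
```

A small shift in F flips m from strongly negative to strongly positive: the herding discontinuity that makes the model attractive for finance and, more broadly, for social data.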

Ex b: recall the paradoxical behavior of Markowitz in LF: the Portfolio theoretician adopted a fully symmetrised approach for his own skin-in-the-game private financial strategy...
'Learning as categorification IV' proposes simple examples where symmetries are explicitly declared to be the essential part of the problem.
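The "fully symmetrised approach" here is the 1/N rule: Markowitz reportedly split his own retirement savings equally between asset classes instead of running his mean-variance optimiser. The contrast can be sketched as follows (asset names, expected returns and covariances below are made-up illustrative numbers):

```python
# Mean-variance weights vs the fully symmetric 1/N rule (illustrative data).
import numpy as np

assets = ["stocks", "bonds"]
mu = np.array([0.08, 0.03])      # assumed expected returns
cov = np.array([[0.040, 0.004],  # assumed covariance matrix
                [0.004, 0.010]])

# Unconstrained mean-variance weights, normalised: w proportional to inv(cov) @ mu.
# These depend on every (noisy) estimate in mu and cov.
w_mv = np.linalg.solve(cov, mu)
w_mv = w_mv / w_mv.sum()

# The 1/N portfolio: invariant under any permutation of the assets,
# hence indifferent to the estimates entirely.
w_sym = np.full(len(assets), 1.0 / len(assets))

print("mean-variance:", dict(zip(assets, w_mv.round(3))))
print("symmetric 1/N:", dict(zip(assets, w_sym)))
```

The optimiser's weights move with every re-estimated parameter; the symmetric portfolio declares the permutation symmetry to be the essential structure and ignores the estimates altogether, which is precisely the Large Symmetry heuristic at work.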
