1. Lin & Tegmark (LT): "Why does deep and cheap learning work so well?"
a. A categorical approach: p7 Fig 3, p12 Table I
b. Proposes a cheap / deep duality
i. Cheap: 'simple polynomials which are sparse, symmetric and/or low-order play a special role in physics' + Sec. II.D: low polynomial order, locality, symmetry
ii. Deep: 'One of the most striking features of the physical world is its hierarchical structure.'
c. Engages with a set of papers centered on the renormalization group (RG) and deep linear neural networks
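To make the 'cheap' side concrete, here is a minimal sketch of the kind of low-order polynomial a small network can reproduce: the product of two numbers, approximated with four neurons via a Taylor expansion of the nonlinearity around 0. The choice of softplus, the function names and the `scale` parameter are assumptions made for this illustration and are not taken from the notes above.

```python
import math

def softplus(x: float) -> float:
    # Smooth nonlinearity whose second derivative at 0 is nonzero (it equals 1/4).
    return math.log1p(math.exp(x))

SIGMA_PP0 = 0.25  # second derivative of softplus at 0

def approx_product(u: float, v: float, scale: float = 0.01) -> float:
    """Approximate u * v with four softplus neurons.

    The inputs are shrunk by `scale` so that the second-order Taylor
    expansion around 0 dominates; the result is rescaled by 1 / scale**2.
    """
    a, b = scale * u, scale * v
    # Even combination: constant and linear terms cancel, leaving 4 * sigma''(0) * a * b
    # plus higher-order corrections.
    s = softplus(a + b) + softplus(-a - b) - softplus(a - b) - softplus(-a + b)
    return s / (4.0 * SIGMA_PP0 * scale ** 2)

if __name__ == "__main__":
    for u, v in [(1.0, 2.0), (-3.0, 0.5), (2.0, 2.0)]:
        print(u, v, approx_product(u, v))
```

The approximation error shrinks as `scale` gets smaller, until floating-point cancellation takes over; the point is only that a low-order polynomial costs a fixed, small number of neurons.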
2. Symptomatically, the examples given are not of the self-learning type of DL; they are concatenations / compositions of 'symmetries', in the sense of sparsity (cf. Remark 3)
3. We have a category of sparse graphs, SpGr, the categories Phys (physics) and ImCl (image classification), and functors
Phys → SpGr
ImCl → SpGr
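One possible reading of these two arrows, at the level of objects only, is sketched below: a physical system with local interactions and a convolutional layer with a small kernel are both sent to the graph of "which variables interact directly", and both graphs turn out sparse. The helper names and the two toy examples (a nearest-neighbour chain, a 1D convolution) are hypothetical choices for illustration; the notes do not specify how the functors act.

```python
def chain_interaction_graph(n: int) -> set:
    """Toy 'Phys' object: a 1D chain where site i couples only to site i + 1."""
    return {(i, i + 1) for i in range(n - 1)}

def conv1d_interaction_graph(n: int, kernel_size: int = 3) -> set:
    """Toy 'ImCl' object: two input positions are linked if some window
    of width `kernel_size` contains both of them."""
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, min(n, i + kernel_size))
    }

def density(edges: set, n: int) -> float:
    """Fraction of all possible undirected edges on n nodes that are present."""
    return len(edges) / (n * (n - 1) / 2)

if __name__ == "__main__":
    n = 100
    print("chain:", density(chain_interaction_graph(n), n))
    print("conv :", density(conv1d_interaction_graph(n), n))
```

In both cases the number of edges grows linearly with n, while the complete graph grows quadratically; this is the sense of 'sparse' assumed in the sketch.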
4. Remark 1: the old Kantian question: is this 'special role' subjective or objective?
Do we really discover low-dimensional symmetries, or do we discover what we are able to discover?
a. Fundamental Ockham / generalization bias: cf. "against Vapnik"
b. Computational stress (cf. "μεταφορά", 2): the 'vicious' circle of learning:
New symmetry → more data → new symmetry → ...
5. Remark 2: How is symmetry learned?
a. Laborans: over time [not on a particular dataset]
b. Various fields bring to light large classes of symmetry: fundamental physics, algebraic geometry, biology (cf. PPI), AI, information systems, cognition, engineering, ... Cf. "reading Building Machines That Learn and Think Like People" (RB)
6. Remark 3: the heuristics-symmetries point of view:
a. To learn is to build a catalog à la Polya (cf. RB) of good heuristics, that is to say, good symmetries.
b. Distributivity [Bengio] ↔ sparsity [Bach] ↔ heuristic / symmetries
c. Deep is not a mysterious second / 'dual' dimension of learning: it is just another symmetry: recursivity / sequentiality
d. There is an equivalence between sequential learning and hierarchical learning, via a 'rotation' time ↔ space (depth)
e. Ultimately, the question is to see the notion of symmetry as much more general than these classical variants (groups, cf. "SGII") or "reductive" ones (distributivity / sparsity): category theory seems an interesting attempt in this direction. See also the heuristics towers in RB
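The time ↔ depth 'rotation' of point d can be illustrated with a minimal sketch: iterating one update rule over T time steps gives the same result as passing the input through a depth-T stack of layers that share the same weights. The tanh update, the weight values and the function names are assumptions chosen for the example; this covers only the forward computation, not learning.

```python
import math

def layer(W, h):
    """One update: h -> tanh(W h), with W given as a list of rows."""
    return [math.tanh(sum(w * x for w, x in zip(row, h))) for row in W]

def run_sequential(W, x0, steps):
    """Sequential view: apply the same update once per time step."""
    h = x0
    for _ in range(steps):
        h = layer(W, h)
    return h

def run_hierarchical(layers, x0):
    """Hierarchical view: feed the input through a stack of layers, one per depth level."""
    h = x0
    for W in layers:
        h = layer(W, h)
    return h

W = [[0.5, -0.2], [0.1, 0.3]]  # arbitrary illustrative weights
x0 = [1.0, -1.0]
T = 5

out_time = run_sequential(W, x0, T)
out_depth = run_hierarchical([W] * T, x0)  # depth-T stack with shared weights
```

Here weight sharing across depth plays the role of the symmetry: dropping it gives a general deep network, and keeping it gives a recurrence unrolled in space.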