Jul 25, 2018

My thoughts about Abstract Reasoning in Neural Networks

After reading this:


a.   The requirements for reasoning in general are essentially:
  • ability to generalize, 
  • ability to conceptualize/visualize, 
  • ability to make logical deduction/induction, 
  • ability to question, 
  • ability to reverse a line of logical thinking, 
  • ability to filter away information/ideas,  
  • ability to create new ideas/imagine new construction, 
  • ability to zoom in and pay attention/focus on a particular idea/theme, 
  • ability to compose (group ideas logically together to generate a new thought path), etc.
  • ability to consider different pathways of thought - e.g. through Bayes-based branching or Markovian chaining - to generate new ideas.

b.   How to "REPRESENT" all of the above:

1.   graphical model - this will enable us to connect ideas/concepts.   If the graph is directed, then "logical reasoning", involving temporal step-by-step reasoning, can be implemented.   Graphical models also allow us to group concepts/ideas together via direct links.
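
A minimal sketch of this (the concepts and links below are made up for illustration): a directed graph over concepts supports step-by-step deduction by following edges.

    # A toy directed graph of concepts: an edge A -> B means "A leads to B".
    concept_graph = {
        "rain": ["wet ground"],
        "wet ground": ["slippery road"],
        "slippery road": ["drive slowly"],
    }

    def deduce(start, graph):
        # Follow directed links step by step, collecting the chain of conclusions.
        chain, current = [start], start
        while graph.get(current):
            current = graph[current][0]   # take the first outgoing link
            chain.append(current)
        return chain

    print(deduce("rain", concept_graph))
    # ['rain', 'wet ground', 'slippery road', 'drive slowly']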

2.   vector model - this will enable easy numerical representation of ideas/concepts, and we can easily group "similar ideas" together by virtue of the cosine distance rule.   When used with convolution, neighboring vectors can be combined into higher-level features.
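
For example (the "idea" vectors below are made up for illustration), the cosine rule groups similar ideas together:

    import numpy as np

    def cosine_similarity(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    # Toy "idea" vectors.
    cat = np.array([0.9, 0.1, 0.0])
    dog = np.array([0.8, 0.2, 0.1])
    car = np.array([0.0, 0.1, 0.9])

    print(cosine_similarity(cat, dog))   # high: similar ideas group together
    print(cosine_similarity(cat, car))   # low: dissimilar ideas stay apart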

3.   language model - every "idea" can be directly mapped to its alphabetical construction.   A language model offers a "symbolic" way of representing tasks/actions/reasoning - contrast this with the vectorial approach, where similarity is numerical rather than symbolic.
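
A tiny illustration (the sentence and vocabulary are made up): symbols are discrete and compared by exact match, unlike vectors compared by a distance metric.

    # Symbolic representation: an idea maps directly to its token construction.
    idea = "pour water into the cup"
    tokens = idea.split()                              # ['pour', 'water', 'into', 'the', 'cup']
    vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
    symbol_ids = [vocab[tok] for tok in tokens]
    print(symbol_ids)                                  # discrete symbols, matched exactly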

4.   LSTM/convolution/memory retention - this has the property of a "memory effect" - previous information can be filtered and retained.
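
A minimal sketch using PyTorch (sizes are arbitrary): the LSTM's hidden and cell states carry forward what its gates chose to retain from earlier steps.

    import torch
    import torch.nn as nn

    lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
    x = torch.randn(1, 5, 8)              # a sequence of 5 steps, 8 features each
    out, (h, c) = lstm(x)                 # h, c summarize what was filtered/retained
    print(out.shape, h.shape, c.shape)    # (1, 5, 16), (1, 1, 16), (1, 1, 16)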

5.   Image model - every pixel and/or its neighborhood is an approximation to an idea/picture etc.
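
For instance (a random toy image), a pixel's neighborhood can be treated as a local "idea":

    import numpy as np

    img = np.random.rand(28, 28)          # a toy grayscale image
    patch = img[10:13, 10:13]             # a pixel's 3x3 neighborhood as a local "idea"
    print(patch.shape)                    # (3, 3)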

6.   Sequence model:   this refers to the seq2seq or encoder-decoder construction.
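
A minimal encoder-decoder sketch in PyTorch (dimensions and inputs are arbitrary): the encoder compresses the source sequence into a hidden state, which conditions the decoder.

    import torch
    import torch.nn as nn

    class Seq2Seq(nn.Module):
        def __init__(self, dim=16):
            super().__init__()
            self.encoder = nn.GRU(dim, dim, batch_first=True)
            self.decoder = nn.GRU(dim, dim, batch_first=True)
            self.out = nn.Linear(dim, dim)

        def forward(self, src, tgt):
            _, h = self.encoder(src)           # summarize the source sequence
            dec_out, _ = self.decoder(tgt, h)  # decode conditioned on that summary
            return self.out(dec_out)

    model = Seq2Seq()
    y = model(torch.randn(1, 7, 16), torch.randn(1, 5, 16))
    print(y.shape)    # torch.Size([1, 5, 16])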

7.   other more exotic representation:   

Siamese network - it is much easier to represent exceptions, anomalies, or differences than to store the construction of every possible instance.
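
A minimal sketch (layer sizes are arbitrary): the same encoder is applied to both inputs, and only the distance between the two embeddings is compared - no need to store a model of every possible instance.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Siamese(nn.Module):
        def __init__(self, dim=32):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 16))

        def forward(self, a, b):
            # Shared weights: both inputs pass through the same encoder.
            za, zb = self.encoder(a), self.encoder(b)
            return F.pairwise_distance(za, zb)   # small = "same", large = "anomaly/difference"

    net = Siamese()
    d = net(torch.randn(4, 32), torch.randn(4, 32))
    print(d.shape)   # torch.Size([4])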

grouped linkages/connections activation:   a group of direct linkages may be activated/deactivated as and when necessary, so when one layer of activation takes effect, the other layer gets deactivated automatically (to prevent electrical overflow).   The two sets of linkages may also connect different sets of nodes - but essentially, when one set is on, it forces the other off.
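
A minimal sketch of this mutual-exclusion idea (the hard gate below is a made-up illustration, not a standard layer): one gate value decides which group of connections is active, and switching one on forces the other off.

    import torch
    import torch.nn as nn

    class GatedBranches(nn.Module):
        def __init__(self, dim=8):
            super().__init__()
            self.branch_a = nn.Linear(dim, dim)   # one group of linkages
            self.branch_b = nn.Linear(dim, dim)   # a competing group of linkages
            self.gate = nn.Linear(dim, 1)

        def forward(self, x):
            use_a = (torch.sigmoid(self.gate(x)) > 0.5).float()   # hard on/off switch
            # When branch A is on, branch B is forced off, and vice versa.
            return use_a * self.branch_a(x) + (1 - use_a) * self.branch_b(x)

    out = GatedBranches()(torch.randn(3, 8))
    print(out.shape)   # torch.Size([3, 8])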

For more variations in network see:

https://www.asimovinstitute.org/author/fjodorvanveen/

c.   Generally, MLPs are universal approximators:




So the differentiation between the different neural architectures/representations may not mean much; what matters is the rate of convergence.   It also means that multiple representations exist to achieve the same end/target.
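
A small illustration with scikit-learn (the target function sin(x) is arbitrary): a modest MLP approximates it well, and a different architecture could reach roughly the same fit, just at a different rate of convergence.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    X = np.linspace(-3, 3, 500).reshape(-1, 1)
    y = np.sin(X).ravel()

    mlp = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
    mlp.fit(X, y)
    print(mlp.score(X, y))   # close to 1.0: the MLP approximates sin(x) well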

d.   Gradient descent, cost function:   based on the learning-to-learn paradigm (meta-learning, lifelong learning), it should be possible to self-optimize the cost function automatically.
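
A very simplified sketch of that idea (the two-term loss and the data below are made up): the cost function has its own learnable parameter, which is adjusted so that the resulting model update does better on held-out data.

    import torch

    w = torch.randn(3, requires_grad=True)            # model parameters
    alpha = torch.tensor(0.5, requires_grad=True)     # learnable parameter of the cost function

    x, y = torch.randn(20, 3), torch.randn(20)            # training data (random stand-ins)
    x_val, y_val = torch.randn(20, 3), torch.randn(20)     # held-out data

    for _ in range(50):
        # Inner step: train the model under the current, parameterized cost function.
        err = x @ w - y
        loss = alpha * (err ** 2).mean() + (1 - alpha) * err.abs().mean()
        g_w, = torch.autograd.grad(loss, w, create_graph=True)
        w_new = w - 0.1 * g_w
        # Outer step: adjust the cost function so the updated model improves on held-out data.
        val_loss = ((x_val @ w_new - y_val) ** 2).mean()
        g_alpha, = torch.autograd.grad(val_loss, alpha)
        with torch.no_grad():
            alpha -= 0.01 * g_alpha
            w.copy_(w_new)

    print(float(alpha))   # the cost function's own parameter has been "self-optimized"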

Optimal Transport:

https://arxiv.org/abs/1009.3856 

http://cedricvillani.org/wp-content/uploads/2012/08/preprint-1.pdf

http://otnm.lakecomoschool.org/

Lifelong Learning:

https://arxiv.org/pdf/1801.02808.pdf 

https://pub.ist.ac.at/~chl/talks/lampert-mpiis2017.pdf

http://ecmlpkdd2017.ijs.si/papers/paperID164.pdf

https://www.seas.upenn.edu/~eeaton/papers/slides-Ruvolo2013Active.pdf

e.   Meta Learning (Learning to Learn):


  • optimizers learning, 
  • initialization learning, 
  • metric learning, 
  • few-shot learning (see the sketch after this list), 
  • dataset partitioning/demarcation, 
  • learning the differences among datasets, algorithm/scheduling learning (learning how to organize tasks for learning), 
  • objects learning (identifying all physical vs abstract objects), 
  • abstract vs non-abstract learning, etc., 
  • topology of neural networks learning (number of layers, number of nodes, connections and feedback/feedforward attributes, triggering frequencies etc).
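
A minimal sketch of the metric-learning / few-shot items above (the embedding network and data are random stand-ins): classify a query by comparing its embedding to class prototypes built from a handful of support examples.

    import torch
    import torch.nn as nn

    embed = nn.Linear(10, 4)                       # stand-in embedding network
    support = torch.randn(2, 5, 10)                # 2 classes, 5 examples each ("few shot")
    query = torch.randn(1, 10)

    prototypes = embed(support).mean(dim=1)        # one prototype embedding per class
    dists = torch.cdist(embed(query), prototypes)  # metric: distance to each prototype
    print(dists.argmin(dim=1))                     # predicted class = nearest prototype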


https://chatbotslife.com/why-meta-learning-is-crucial-for-further-advances-of-artificial-intelligence-c2df55959adf

https://medium.com/intuitionmachine/machines-that-search-for-deep-learning-architectures-c88ae0afb6c8

https://towardsdatascience.com/whats-new-in-deep-learning-research-understanding-meta-learning-91fef1295660

http://bair.berkeley.edu/blog/2017/07/18/learning-to-learn/

