Jun 2, 2018

A random walk into ideas for Machine Learning optimization

Just a summary of terms used in different contexts; details are not covered.

Step sizing, modelling, approximation, statistical randomized vibration/motion, stability vs. convergence rate, correlation/anti-correlation, peer pressure/influence/interaction.
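
As a toy illustration (my own sketch, not from any particular source), gradient descent on f(x) = x^2 shows the step-sizing trade-off between stability and convergence rate: too small is slow, too large diverges.

    # Hedged sketch: step size vs. stability on f(x) = x^2 (gradient 2x).
    def gradient_descent(step, x=1.0, iters=20):
        for _ in range(iters):
            x = x - step * 2 * x  # one gradient step
        return x

    for step in (0.1, 0.5, 0.9, 1.1):
        print(step, gradient_descent(step))
    # 0.1 converges slowly, 0.5 lands on 0 in one step,
    # 0.9 oscillates towards 0, and 1.1 diverges.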

Minibatching, Boltzmann distribution, sampling-based optimization, entropy optimization, Monte Carlo approximation.
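
One way these terms connect (a hedged sketch, not a definitive recipe): simulated annealing samples candidate moves and accepts worse ones with Boltzmann probability exp(-delta/T), making it a Monte Carlo, sampling-based optimizer with a controlled amount of random vibration.

    import math, random

    # Sketch: accept uphill moves with Boltzmann probability exp(-delta/T).
    def anneal(f, x, temp=1.0, cooling=0.95, steps=200):
        for _ in range(steps):
            candidate = x + random.uniform(-0.5, 0.5)  # random perturbation
            delta = f(candidate) - f(x)
            if delta < 0 or random.random() < math.exp(-delta / temp):
                x = candidate
            temp *= cooling  # cooling reduces the random vibration over time
        return x

    print(anneal(lambda x: (x - 3) ** 2, x=0.0))  # typically lands near 3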

Feature identification/approximation, attention mechanism.

How does an attention mechanism/strategy help to find the optimal path towards a solution?
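
For intuition, here is a minimal scaled dot-product attention sketch (the standard textbook formulation, nothing specific to this post): the softmax weights concentrate computation on the keys most relevant to the query, which is one way attention narrows the search towards a solution.

    import numpy as np

    # Sketch: scaled dot-product attention over a handful of key/value pairs.
    def attention(query, keys, values):
        scores = keys @ query / np.sqrt(query.size)  # similarity to each key
        w = np.exp(scores - scores.max())            # stable softmax
        w /= w.sum()
        return w, w @ values                         # weights, weighted sum

    keys = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    values = np.array([10.0, 20.0, 30.0])
    w, out = attention(np.array([1.0, 0.0]), keys, values)
    print(w, out)  # the two keys matching the query get the most weight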

How can "divide and conquer" and "task-list breakdown" heuristics help in optimization?

Is the chance of finding the optimal solution proportional to the number of combinations that have been attempted or generated? Is there any correlation?

How can changes be generated incrementally, so that each generated candidate can be evaluated against the optimization criterion?
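
One simple answer is local search. A hedged hill-climbing sketch: each candidate is a small perturbation of the current best and is kept only if the optimization criterion improves.

    import random

    # Sketch: incremental candidate generation + criterion evaluation.
    def hill_climb(score, candidate, steps=500):
        best = candidate
        for _ in range(steps):
            neighbour = [x + random.gauss(0, 0.1) for x in best]  # small change
            if score(neighbour) > score(best):                    # evaluate
                best = neighbour
        return best

    # Maximize -(x - 1)^2 - (y + 2)^2; the optimum is at (1, -2).
    print(hill_climb(lambda p: -(p[0] - 1) ** 2 - (p[1] + 2) ** 2, [0.0, 0.0]))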

How does applying an optimization heuristic early, versus applying it late, affect progress towards optimal solutions?

Using statistics + variance minimization for optimization:

Bayesian approximation + statistics
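
A minimal sketch of the Bayesian angle (my own illustrative example): a Beta-Bernoulli posterior update, where accumulating evidence shrinks the posterior variance, which is one statistical reading of "variance minimization".

    # Sketch: Beta(alpha, beta) posterior after observing successes/failures.
    def beta_posterior(successes, failures, alpha=1.0, beta=1.0):
        a, b = alpha + successes, beta + failures
        mean = a / (a + b)
        var = a * b / ((a + b) ** 2 * (a + b + 1))
        return mean, var

    print(beta_posterior(3, 1))      # few observations: high variance
    print(beta_posterior(300, 100))  # many observations: low variance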

Reversibility of decomposition and dimensionality transformation: orthogonality and inverse operations, recovery of subfeatures, features of features, higher-abstraction features of the data, etc.
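
A small sketch, assuming the orthogonality point refers to invertible linear feature transforms: projecting data onto an orthogonal basis is exactly reversible, since the inverse of an orthogonal matrix is its transpose.

    import numpy as np

    # Sketch: an orthogonal decomposition is reversible.
    rng = np.random.default_rng(0)
    data = rng.normal(size=(5, 3))

    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal basis
    features = data @ q         # decompose into the new feature basis
    recovered = features @ q.T  # the inverse operation is just the transpose
    print(np.allclose(recovered, data))  # True: fully recoverable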

Weighted sum and activation function. Reversibility of operations.
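
A minimal sketch of a single neuron (the standard construction): a weighted sum followed by a sigmoid. Because the sigmoid is invertible, the pre-activation can be recovered with the logit, which is one sense in which the operation is reversible.

    import math

    # Sketch: weighted sum + sigmoid activation, and its inverse.
    def neuron(inputs, weights, bias):
        z = sum(x * w for x, w in zip(inputs, weights)) + bias  # weighted sum
        return 1.0 / (1.0 + math.exp(-z))                       # sigmoid

    def inverse_sigmoid(y):
        return math.log(y / (1.0 - y))  # logit recovers the pre-activation

    y = neuron([1.0, 2.0], [0.5, -0.25], bias=0.1)
    print(y, inverse_sigmoid(y))  # the inverse returns the weighted sum 0.1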

Reinforcement learning: using incrementally acquired knowledge to veer towards the optimal solutions.
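
As a hedged sketch of that idea, an epsilon-greedy bandit: each reward incrementally updates a value estimate, and those estimates steer later choices towards the best arm.

    import random

    # Sketch: incremental value estimates veer choices towards the best arm.
    def bandit(true_means, steps=2000, epsilon=0.1):
        values = [0.0] * len(true_means)
        counts = [0] * len(true_means)
        for _ in range(steps):
            if random.random() < epsilon:
                arm = random.randrange(len(true_means))  # explore
            else:
                arm = values.index(max(values))          # exploit
            reward = random.gauss(true_means[arm], 1.0)
            counts[arm] += 1
            values[arm] += (reward - values[arm]) / counts[arm]  # running mean
        return values

    print(bandit([0.1, 0.5, 0.9]))  # the best arm's estimate ends near 0.9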

Greedy layer-wise optimization (see the references below).
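
A loose sketch of the layer-wise idea from the papers linked below, with each "layer" replaced by a closed-form linear autoencoder (truncated SVD) purely for brevity; the cited papers train nonlinear layers, but the greedy structure is the same: fit one layer at a time, each on the previous layer's output.

    import numpy as np

    # Sketch: greedy layer-wise fitting, one (linear) layer at a time.
    def greedy_layerwise(data, layer_sizes):
        layers, activations = [], data
        for size in layer_sizes:
            centered = activations - activations.mean(axis=0)
            _, _, vt = np.linalg.svd(centered, full_matrices=False)
            weights = vt[:size].T             # fit this layer greedily
            layers.append(weights)
            activations = centered @ weights  # feed codes to the next layer
        return layers

    rng = np.random.default_rng(0)
    stack = greedy_layerwise(rng.normal(size=(100, 8)), [4, 2])
    print([w.shape for w in stack])  # [(8, 4), (4, 2)]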

Adaptive strategies. Momentum or energy-based optimization.
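
A sketch of the classic momentum update (the standard heavy-ball form, nothing specific to this post): the velocity term carries "energy" across steps, smoothing the trajectory towards the minimum.

    # Sketch: heavy-ball momentum on f(x) = x^2 (gradient 2x).
    def momentum_descent(grad, x=5.0, lr=0.1, beta=0.9, steps=200):
        velocity = 0.0
        for _ in range(steps):
            velocity = beta * velocity - lr * grad(x)  # accumulate momentum
            x = x + velocity
        return x

    print(momentum_descent(lambda x: 2 * x))  # approaches the minimum at 0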

Crowd/popularity-based optimization.
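
One concrete reading of crowd/popularity-based optimization is particle swarm optimization. A minimal hedged sketch: each particle is pulled both towards its own best find and towards the crowd's most popular (best) find.

    import random

    # Sketch: a 1-D particle swarm; "peer pressure" pulls towards the best.
    def swarm(f, n=20, steps=100):
        pos = [random.uniform(-10, 10) for _ in range(n)]
        vel = [0.0] * n
        best = pos[:]             # each particle's personal best
        gbest = min(best, key=f)  # the crowd's best
        for _ in range(steps):
            for i in range(n):
                vel[i] = (0.7 * vel[i]
                          + 1.5 * random.random() * (best[i] - pos[i])
                          + 1.5 * random.random() * (gbest - pos[i]))
                pos[i] += vel[i]
                if f(pos[i]) < f(best[i]):
                    best[i] = pos[i]
            gbest = min(best, key=f)
        return gbest

    print(swarm(lambda x: (x - 2) ** 2))  # converges near 2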

Heuristic search: optimizing the strategies used by the optimization heuristics themselves, i.e., meta-optimization.

References:
www.cs.cmu.edu/~bhiksha/courses/deeplearning/Fall.2016/notes/hefny_Greedy.pdf
https://papers.nips.cc/paper/3048-greedy-layer-wise-training-of-deep-networks.pdf
https://www.youtube.com/watch?v=noRNcDbqtVY&t=2363s
https://arxiv.org/pdf/1603.06160.pdf