Jul 8, 2019

BERT/MTL/MASS/ELMo

BERT:

https://arxiv.org/pdf/1810.04805.pdf

http://jalammar.github.io/illustrated-bert/
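A minimal sketch of BERT's masked-LM data preparation as described in the paper (15% of tokens selected; of those, 80% become [MASK], 10% a random token, 10% left unchanged). This is illustrative only, not the authors' code; the function name and vocab argument are my own.

```python
import random

def make_mlm_example(tokens, vocab, mask_prob=0.15, rng=None):
    """Build a BERT-style masked-LM pair: inputs with some tokens
    corrupted, and labels holding the original token at each
    selected position (None elsewhere)."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    inputs = ["[CLS]"] + list(tokens) + ["[SEP]"]
    labels = [None] * len(inputs)
    for i in range(1, len(inputs) - 1):  # never mask [CLS]/[SEP]
        if rng.random() < mask_prob:
            labels[i] = inputs[i]
            r = rng.random()
            if r < 0.8:
                inputs[i] = "[MASK]"          # 80%: replace with [MASK]
            elif r < 0.9:
                inputs[i] = rng.choice(vocab)  # 10%: random token
            # else 10%: keep the original token
    return inputs, labels
```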



Sentencepiece/Wordpiece:

https://github.com/google/sentencepiece

https://www.reddit.com/r/MachineLearning/comments/axkmi0/d_why_is_code_or_libraries_for_wordpiece/

https://stackoverflow.com/questions/55382596/how-is-wordpiece-tokenization-helpful-to-effectively-deal-with-rare-words-proble/55416944

https://arxiv.org/pdf/1609.08144.pdf
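The core of WordPiece tokenization (per the GNMT paper and the Stack Overflow thread above) is a greedy longest-match-first split of a word into subwords from a fixed vocabulary, with "##" marking continuation pieces. A minimal sketch, assuming a plain set of subword strings as the vocab (not the actual SentencePiece API):

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first subword split in the WordPiece style:
    at each position, take the longest vocab entry that matches;
    continuation pieces carry a '##' prefix."""
    pieces, start = [], 0
    while start < len(word):
        end, cur = len(word), None
        while start < end:
            sub = word[start:end]
            if start > 0:
                sub = "##" + sub
            if sub in vocab:
                cur = sub
                break
            end -= 1
        if cur is None:       # no subword matches: whole word is unknown
            return [unk]
        pieces.append(cur)
        start = end
    return pieces
```

This is how rare words avoid a hard OOV: "unaffable" with a vocab containing "un", "##aff", "##able" splits into those three known pieces instead of mapping to [UNK].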

ERNIE:

http://research.baidu.com/Blog/index-view?id=113

MTL:

https://arxiv.org/pdf/1901.11504.pdf

MASS:

https://www.microsoft.com/en-us/research/blog/introducing-mass-a-pre-training-method-that-outperforms-bert-and-gpt-in-sequence-to-sequence-language-generation-tasks/

https://www.microsoft.com/en-us/research/publication/mass-masked-sequence-to-sequence-pre-training-for-language-generation/
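MASS's pre-training objective, reduced to its data shape: mask a contiguous span on the encoder side and have the decoder predict exactly that span. A hedged sketch (function name and signature are mine, not from the paper's code):

```python
def mass_mask(tokens, start, length, mask="[MASK]"):
    """MASS-style seq2seq pre-training pair: the encoder sees the
    sentence with one contiguous span replaced by [MASK] tokens;
    the decoder's target is the masked-out span itself."""
    enc_input = tokens[:start] + [mask] * length + tokens[start + length:]
    dec_target = tokens[start:start + length]
    return enc_input, dec_target
```

Masking a contiguous fragment (rather than scattered tokens as in BERT) is what makes the objective fit encoder-decoder generation tasks.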

TRANSFORMER:

https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/ 

https://jalammar.github.io/illustrated-transformer/
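The Transformer's central operation, scaled dot-product attention, fits in a few lines: softmax(QKᵀ/√d_k)V. A self-contained single-head sketch on plain Python lists (no framework), illustrative only:

```python
import math

def attention(Q, K, V):
    """Scaled dot-product attention for one head.
    Q, K, V: lists of vectors (lists of floats).
    Returns one output vector per query: softmax(QK^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # similarity of this query to every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        # numerically stable softmax over the scores
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

With identical keys the weights are uniform, so the output is just the mean of the values, which is a quick sanity check.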
