Jul 8, 2019

BERT/MTL/MASS/ELMo

BERT:

https://arxiv.org/pdf/1810.04805.pdf

http://jalammar.github.io/illustrated-bert/
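A minimal sketch of BERT's masked-LM data preparation as described in the paper (15% of tokens selected; of those, 80% become [MASK], 10% a random token, 10% left unchanged). This is illustrative only, not the authors' code; the function name and vocab argument are my own.

```python
import random

def make_mlm_example(tokens, vocab, mask_prob=0.15, rng=None):
    """Build a BERT-style masked-LM pair: inputs with some tokens
    corrupted, and labels holding the original token at each
    selected position (None elsewhere)."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    inputs = ["[CLS]"] + list(tokens) + ["[SEP]"]
    labels = [None] * len(inputs)
    for i in range(1, len(inputs) - 1):  # never mask [CLS]/[SEP]
        if rng.random() < mask_prob:
            labels[i] = inputs[i]
            r = rng.random()
            if r < 0.8:
                inputs[i] = "[MASK]"          # 80%: replace with [MASK]
            elif r < 0.9:
                inputs[i] = rng.choice(vocab)  # 10%: random token
            # else 10%: keep the original token
    return inputs, labels
```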



Sentencepiece/Wordpiece:

https://github.com/google/sentencepiece

https://www.reddit.com/r/MachineLearning/comments/axkmi0/d_why_is_code_or_libraries_for_wordpiece/

https://stackoverflow.com/questions/55382596/how-is-wordpiece-tokenization-helpful-to-effectively-deal-with-rare-words-proble/55416944

https://arxiv.org/pdf/1609.08144.pdf
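The core of WordPiece tokenization (per the GNMT paper and the Stack Overflow thread above) is a greedy longest-match-first split of a word into subwords from a fixed vocabulary, with "##" marking continuation pieces. A minimal sketch, assuming a plain set of subword strings as the vocab (not the actual SentencePiece API):

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first subword split in the WordPiece style:
    at each position, take the longest vocab entry that matches;
    continuation pieces carry a '##' prefix."""
    pieces, start = [], 0
    while start < len(word):
        end, cur = len(word), None
        while start < end:
            sub = word[start:end]
            if start > 0:
                sub = "##" + sub
            if sub in vocab:
                cur = sub
                break
            end -= 1
        if cur is None:       # no subword matches: whole word is unknown
            return [unk]
        pieces.append(cur)
        start = end
    return pieces
```

This is how rare words avoid a hard OOV: "unaffable" with a vocab containing "un", "##aff", "##able" splits into those three known pieces instead of mapping to [UNK].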

ERNIE:

http://research.baidu.com/Blog/index-view?id=113

MTL:

https://arxiv.org/pdf/1901.11504.pdf

MASS:

https://www.microsoft.com/en-us/research/blog/introducing-mass-a-pre-training-method-that-outperforms-bert-and-gpt-in-sequence-to-sequence-language-generation-tasks/

https://www.microsoft.com/en-us/research/publication/mass-masked-sequence-to-sequence-pre-training-for-language-generation/
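MASS's pre-training objective, reduced to its data shape: mask a contiguous span on the encoder side and have the decoder predict exactly that span. A hedged sketch (function name and signature are mine, not from the paper's code):

```python
def mass_mask(tokens, start, length, mask="[MASK]"):
    """MASS-style seq2seq pre-training pair: the encoder sees the
    sentence with one contiguous span replaced by [MASK] tokens;
    the decoder's target is the masked-out span itself."""
    enc_input = tokens[:start] + [mask] * length + tokens[start + length:]
    dec_target = tokens[start:start + length]
    return enc_input, dec_target
```

Masking a contiguous fragment (rather than scattered tokens as in BERT) is what makes the objective fit encoder-decoder generation tasks.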

TRANSFORMER:

https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/ 

https://jalammar.github.io/illustrated-transformer/
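The Transformer's central operation, scaled dot-product attention, fits in a few lines: softmax(QKᵀ/√d_k)V. A self-contained single-head sketch on plain Python lists (no framework), illustrative only:

```python
import math

def attention(Q, K, V):
    """Scaled dot-product attention for one head.
    Q, K, V: lists of vectors (lists of floats).
    Returns one output vector per query: softmax(QK^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # similarity of this query to every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        # numerically stable softmax over the scores
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

With identical keys the weights are uniform, so the output is just the mean of the values, which is a quick sanity check.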
