Review — Word2Vec: Efficient Estimation of Word Representations in Vector Space

Word2Vec: Using CBOW or Skip-Gram to Convert Words into Meaningful Vectors, i.e., Word Representation Learning

Word2Vec: CBOW and Skip-Gram

Outline

1. CBOW (Continuous Bag-of-Words) Model
2. Skip-Gram Model
3. Experimental Results

1. CBOW (Continuous Bag-of-Words) Model

CBOW Model (Figure from https://devopedia.org/word2vec)
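
To make the architecture concrete, here is a minimal NumPy sketch of a CBOW forward pass: the context word vectors are averaged in a shared projection layer, and a linear output layer predicts the center word. The tiny sizes, random initialization, and full softmax are illustrative assumptions; the paper uses hierarchical softmax over much larger vocabularies.

```python
import numpy as np

# Tiny illustrative sizes; the paper trains 300-1000 dimensional vectors
# over vocabularies of up to one million words.
V, D = 10, 8                                 # vocabulary size, vector dimension
rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, D))    # input (context) word vectors
W_out = rng.normal(scale=0.1, size=(D, V))   # output (prediction) weights

def cbow_forward(context_ids, W_in, W_out):
    """Average the context word vectors (order is ignored, hence
    'bag of words') and predict the center word from the average."""
    h = W_in[context_ids].mean(axis=0)       # shared projection layer
    scores = h @ W_out                       # one score per vocabulary word
    e = np.exp(scores - scores.max())        # full softmax for clarity;
    return e / e.sum()                       # the paper uses hierarchical softmax

# Predict the center word from two words before and two words after it.
probs = cbow_forward(np.array([1, 2, 4, 5]), W_in, W_out)
print(probs.argmax())                        # index of the predicted center word
```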

2. Skip-Gram Model

Skip-Gram Model (Figure from https://devopedia.org/word2vec)
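
For contrast, a minimal sketch of the skip-gram forward pass under the same illustrative assumptions (toy sizes, full softmax instead of the paper's hierarchical softmax): the center word's vector alone produces a distribution that serves as the prediction for every word inside the sampled context window.

```python
import numpy as np

V, D = 10, 8                                 # same tiny illustrative sizes
rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, D))    # center word vectors
W_out = rng.normal(scale=0.1, size=(D, V))   # output weights

def skipgram_forward(center_id, W_in, W_out):
    """Project the center word alone; the resulting distribution is the
    prediction for every word inside the sampled context window."""
    h = W_in[center_id]
    scores = h @ W_out
    e = np.exp(scores - scores.max())        # again, full softmax for clarity
    return e / e.sum()

# Training sums the loss of this one distribution against each context word;
# the paper samples the window size so that nearby words are weighted more.
probs = skipgram_forward(3, W_in, W_out)
print(probs.argmax())
```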

3. Experimental Results

3.1. Evaluation

Examples of five types of semantic and nine types of syntactic questions in the Semantic-Syntactic Word Relationship test set
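
The test set is scored with the vector offset method the paper describes: for a question "a is to b as c is to ?", compute vector(b) - vector(a) + vector(c) and take the nearest word by cosine similarity; the question counts as correct only on an exact match. A small sketch, where the function name and the assumed inputs (L2-normalized vectors, a word-to-index dict) are illustrative:

```python
import numpy as np

def answer_analogy(a, b, c, vectors, vocab):
    """Answer 'a is to b as c is to ?' with the vector offset method.

    vectors: (V, D) array of L2-normalized word vectors,
    vocab:   dict mapping each word to its row in `vectors`.
    """
    x = vectors[vocab[b]] - vectors[vocab[a]] + vectors[vocab[c]]
    x /= np.linalg.norm(x)
    sims = vectors @ x                       # cosine similarity to every word
    for w in (a, b, c):                      # the question words are excluded
        sims[vocab[w]] = -np.inf
    idx_to_word = {i: w for w, i in vocab.items()}
    return idx_to_word[int(sims.argmax())]

# e.g. answer_analogy('big', 'biggest', 'small', vectors, vocab) -> 'smallest';
# a question is counted as correct only if the top word matches exactly.
```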

3.2. Dataset

3.3. CBOW

Accuracy on subset of the Semantic-Syntactic Word Relationship test set, using word vectors from the CBOW architecture with limited vocabulary

3.4. SOTA Comparison

Comparison of architectures using models trained on the same data, with 640-dimensional word vectors with limited vocabulary
Comparison of publicly available word vectors on the Semantic-Syntactic Word Relationship test set with full vocabulary

3.5. Data vs. Epochs

Comparison of models trained for three epochs on the same data and models trained for one epoch

3.6. Large Scale Parallel Training of Models

Comparison of models trained using the DistBelief distributed framework

3.7. Microsoft Research Sentence Completion Challenge

Comparison and combination of models on the Microsoft Sentence Completion Challenge
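
The skip-gram entry in this comparison scores each candidate word by how well it predicts the surrounding words of the sentence. A rough sketch of such a scoring rule, assuming trained matrices W_in and W_out as in the earlier sketches (the exact scoring details here are an assumption):

```python
import numpy as np

def completion_score(candidate_id, context_ids, W_in, W_out):
    """Sum of skip-gram log-probabilities of the surrounding words,
    conditioned on the candidate word filling the blank.
    W_in: (V, D) input vectors, W_out: (D, V) output weights."""
    scores = W_in[candidate_id] @ W_out
    # Numerically stable log-softmax over the vocabulary.
    log_probs = scores - (scores.max() + np.log(np.exp(scores - scores.max()).sum()))
    return log_probs[context_ids].sum()

# The candidate with the highest score is chosen; the paper's best result
# comes from combining these skip-gram scores with RNNLM scores.
```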

3.8. Examples of the Learned Relationships

Examples of the word pair relationships, using the best word vectors
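
These relationships can be reproduced with any word2vec implementation. A usage sketch with gensim (the gensim 4.x API and the toy corpus are assumptions here), following the paper's Paris - France + Italy ≈ Rome example:

```python
from gensim.models import Word2Vec

# Toy corpus purely for illustration; the paper's vectors are trained on
# a Google News corpus of about 6B tokens.
sentences = [["paris", "is", "the", "capital", "of", "france"],
             ["rome", "is", "the", "capital", "of", "italy"]]

# sg=1 selects skip-gram (sg=0 would select CBOW).
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)

# Paris - France + Italy should land near Rome, given vectors trained on
# far more data than this toy corpus.
print(model.wv.most_similar(positive=["paris", "italy"],
                            negative=["france"], topn=3))
```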

Reference

[2013 ICLR Workshop] [Word2Vec]
Efficient Estimation of Word Representations in Vector Space
