StayLongNight / word2vec Public

forked from tmikolov/word2vec

Automatically exported from code.google.com/p/word2vec

License

Apache-2.0 license

Folders and files

LICENSE
README.txt
compute-accuracy.c
demo-analogy.sh
demo-classes.sh
demo-phrase-accuracy.sh
demo-phrases.sh
demo-train-big-model-v1.sh
demo-word-accuracy.sh
demo-word.sh
distance.c
makefile
questions-phrases.txt
questions-words.txt
word-analogy.c
word2phrase.c
word2vec.c

Tools for computing distributed representation of words
-------------------------------------------------------
We provide an implementation of the Continuous Bag-of-Words (CBOW) and Skip-gram (SG) models, as well as several demo scripts.
Given a text corpus, the word2vec tool learns a vector for every word in the vocabulary using the Continuous
Bag-of-Words or the Skip-Gram neural network architecture. The user should specify the following:
 - desired vector dimensionality
 - the size of the context window for either the Skip-Gram or the Continuous Bag-of-Words model
 - training algorithm: hierarchical softmax and/or negative sampling
 - threshold for downsampling the frequent words 
 - number of threads to use
 - the format of the output word vector file (text or binary)
Usually, the other hyper-parameters such as the learning rate do not need to be tuned for different training sets. 
The script demo-word.sh downloads a small (100MB) text corpus from the web, and trains a small word vector model. After the training
is finished, the user can interactively explore the similarity of the words.
More information about the scripts is provided at https://code.google.com/p/word2vec/

