1.1.1 Python Layer
Core engine for the “Trends” tool.
Tokenizes jsexpr. TODO: document this.
Tokens that do not satisfy should_use_token for this language are discarded.
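A minimal sketch of this tokenize-and-filter step, under stated assumptions: the function names, signatures, and the toy pipeline below are illustrative placeholders, not the module's actual API. A toy callable stands in for a real spacy.language.Language so the sketch runs without downloading a model.

```python
from types import SimpleNamespace
from typing import Callable, Iterable, Iterator

def tokenize(nlp: Callable, text: str, should_use: Callable) -> Iterator[str]:
    """Hypothetical sketch: run `text` through the spaCy pipeline `nlp`
    and yield the lemma of each token that passes the `should_use` filter."""
    for token in nlp(text):
        if should_use(token):
            yield token.lemma_

# Toy pipeline standing in for spacy.language.Language: capitalized words
# are tagged as nouns, everything else as numerals (purely for the demo).
def toy_nlp(text: str) -> Iterable:
    return [
        SimpleNamespace(lemma_=w.lower(), pos_="NOUN" if w[0].isupper() else "NUM")
        for w in text.split()
    ]

lemmas = list(tokenize(toy_nlp, "Paris 1889 Eiffel", lambda t: t.pos_ == "NOUN"))
print(lemmas)  # ['paris', 'eiffel']
```

With a real pipeline, `nlp` would be a loaded spaCy model and the filter would be should_use_token as described below.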
The purpose of the “text” field is to provide an example of an actual use of the word, as the lemma FIXME; but some words (e.g. “DuFay”) shouldn’t be. (Also, some lemmas are strange, like “whatev”.)
Recognizes tokens that should be included in counting, with respect to the given spacy.language.Language instance.
Some kinds of tokens that are excluded:
Part-of-speech tags considered “boring”. Currently, every tag except "NOUN" (common noun) and "PROPN" (proper noun) is considered “boring”; this notably excludes "NUM" (numeral) and "SYM" (symbol).
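The part-of-speech filter can be sketched as follows; this is an assumption-laden illustration, not the module's actual implementation. The predicate only needs a spaCy-style pos_ attribute, so lightweight stand-in objects work here without loading a model.

```python
from types import SimpleNamespace

# Hypothetical sketch: every part-of-speech tag except NOUN (common noun)
# and PROPN (proper noun) is treated as "boring" and excluded.
NOT_BORING = {"NOUN", "PROPN"}

def should_use_token(token) -> bool:
    """Return True if `token` should be included in counting.

    `token` only needs a spaCy Token-style `pos_` attribute,
    so a stand-in object suffices for this sketch.
    """
    return token.pos_ in NOT_BORING

# Stand-ins illustrating the rule (no spaCy model required):
print(should_use_token(SimpleNamespace(pos_="NOUN")))  # True: common noun kept
print(should_use_token(SimpleNamespace(pos_="NUM")))   # False: numeral dropped
```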