Processing Textual Data
Mathematica provides flexible capabilities for processing large volumes of textual data. Most often, data represented as a string is converted to lists or other expressions, which can then be manipulated with Mathematica's symbolic language constructs.
Import — import data from files or the web
FindList — search files for records containing particular strings
StringSplit — split a string into words, sentences, etc.
Sort — sort into alphabetical order
Tally — tally numbers of identical strings
Nearest — find the closest-matching string from a list
Hash — find a hash code using a variety of schemes
WordData — find semantic, grammatical, morphological etc. properties of words
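As a rough illustration of how several of the functions listed above fit together, the following sketch splits a string into words, tallies them, sorts them, and looks up the closest match to a misspelled word. The sample sentence and the variable names (text, words, counts) are made up for this example; in practice the text would more likely come from Import.

```wolfram
(* Hypothetical sample text; real data might instead come from Import *)
text = "the quick brown fox jumps over the lazy dog the fox";

(* Split the string into a list of words on whitespace *)
words = StringSplit[text];

(* Count how many times each distinct word occurs, as {word, count} pairs *)
counts = Tally[words];

(* Order the pairs so that the most frequent words come first *)
SortBy[counts, -Last[#] &]

(* List the distinct words in alphabetical order *)
Sort[DeleteDuplicates[words]]

(* Find the word closest to a misspelled query; for strings, Nearest
   measures closeness by edit distance *)
Nearest[DeleteDuplicates[words], "quikc"]

(* Compute a hash code for the whole string using the SHA-256 scheme *)
Hash[text, "SHA256"]
```

Converting the string to a list early is the main design choice here: once the words are list elements, general-purpose list functions such as Tally, Sort and SortBy apply directly.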