[line 33]
This class represents a text sample to be parsed.
This class separates the analysis of a text sample from the primary Text_LanguageDetect class. After a new profile has been built, the data can be retrieved using the accessor functions.
This class is intended to be used by the Text_LanguageDetect class, not end-users.
$_compile_trigram = false
[line 75]
Whether the parser should compile trigrams
$_compile_unicode = false
[line 68]
Whether the parser should compile the Unicode ranges
$_string =
[line 40]
The piece of text being parsed
$_trigrams = array()
[line 47]
Stores the trigram frequencies of the sample
$_trigram_pad_start = false
[line 82]
Whether the trigram parser should pad the beginning of the string
$_trigram_ranks = array()
[line 54]
Stores the trigram ranks of the sample
$_unicode_blocks = array()
[line 61]
Stores the Unicode blocks of the sample
$_unicode_skip_symbols = true
[line 89]
Whether the Unicode parser should skip non-alphabetic ASCII characters
Text_LanguageDetect_Parser (Constructor) [line 108]
void Text_LanguageDetect_Parser(
string
$string)
PHP 4 constructor for backwards compatibility.
Parameters:
string
$string
—
string to be parsed
__construct (Constructor) [line 96]
Text_LanguageDetect_Parser __construct(
string
$string)
Constructor
Overrides
Text_LanguageDetect::__construct() (Constructor)
Parameters:
string
$string
—
string to be parsed
analyze [line 220]
Executes the parsing operation
Before calling this method, call the set*() functions to set options and the prepare*() functions to select what kind of data to compute.
Afterwards, the get*() functions can be used to access the compiled information.
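The call order described above can be sketched as follows. This is a minimal illustration, not code from the package: the helper name sketch_profile_sample() is hypothetical, and it assumes $parser is an already constructed Text_LanguageDetect_Parser (or any object exposing the same methods documented on this page).

```php
<?php
// Hypothetical helper, not part of Text_LanguageDetect.
// Assumes $parser exposes the methods documented on this page.
function sketch_profile_sample($parser)
{
    // 1. Select what to compute (prepare*).
    $parser->prepareTrigram();
    $parser->prepareUnicode();

    // 2. Set options (set*).
    $parser->setPadStart(true);
    $parser->setUnicodeSkipSymbols(true);

    // 3. Run the parse.
    $parser->analyze();

    // 4. Read the results back through the accessors (get*).
    return array(
        'ranks'  => $parser->getTrigramRanks(),
        'blocks' => $parser->getUnicodeBlocks(),
    );
}
```

With the package loaded, a call would look something like sketch_profile_sample(new Text_LanguageDetect_Parser($text)). How the class file is loaded depends on the installation and is not shown here.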
getTrigramFreqs [line 194]
Returns the trigram frequency table
Only used in testing to make sure the parser is working
- Return: Trigram frequencies in the text sample
- Access: public
getTrigramRanks [line 182]
Returns the trigram ranks for the text sample
- Return: Trigram ranks in the text sample
- Access: public
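To make the difference between trigram frequencies and trigram ranks concrete, here is a small self-contained sketch. It is not the parser's implementation: both function names are hypothetical, the text is treated as single-byte ASCII, whitespace runs are collapsed to one space, and ties in frequency are broken alphabetically. The real parser may normalize and order differently.

```php
<?php
// Hypothetical helpers for illustration only; not part of Text_LanguageDetect.

// Counts every 3-character window in $text. Assumes single-byte (ASCII) input.
function sketch_trigram_freqs($text, $padStart = false)
{
    // Assumption: lowercase and collapse whitespace runs to a single space.
    $text = strtolower(trim(preg_replace('/\s+/', ' ', $text)));
    if ($padStart) {
        // Padding the start lets the first letter appear in a leading trigram.
        $text = ' ' . $text;
    }
    $freqs = array();
    $len = strlen($text);
    for ($i = 0; $i + 3 <= $len; $i++) {
        $tri = substr($text, $i, 3);
        $freqs[$tri] = isset($freqs[$tri]) ? $freqs[$tri] + 1 : 1;
    }
    return $freqs;
}

// Converts a frequency table into ranks: 0 is the most frequent trigram.
function sketch_trigram_ranks($freqs)
{
    $keys = array_keys($freqs);
    usort($keys, function ($a, $b) use ($freqs) {
        if ($freqs[$a] !== $freqs[$b]) {
            return $freqs[$b] - $freqs[$a];       // higher count first
        }
        return strcmp((string) $a, (string) $b);  // assumed tie-break
    });
    $ranks = array();
    foreach ($keys as $rank => $tri) {
        $ranks[$tri] = $rank;
    }
    return $ranks;
}
```

For the sample "banana" the windows are ban, ana, nan, ana, so "ana" has frequency 2 and takes rank 0 in this sketch. With $padStart set to true, the extra window " ba" is counted as well.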
getUnicodeBlocks [line 204]
array getUnicodeBlocks(
)
Returns the array of Unicode blocks
- Return: Unicode blocks in the text sample
- Access: public
prepareTrigram [line 136]
void prepareTrigram(
[bool
$bool = true])
Turn on/off trigram counting
Parameters:
bool
$bool
—
true for on, false for off
prepareUnicode [line 148]
void prepareUnicode(
[bool
$bool = true])
Turn on/off Unicode block counting
Parameters:
bool
$bool
—
true for on, false for off
setPadStart [line 160]
void setPadStart(
[bool
$bool = true])
Turn on/off padding the beginning of the sample string
Parameters:
bool
$bool
—
true for on, false for off
setUnicodeSkipSymbols [line 172]
void setUnicodeSkipSymbols(
[bool
$bool = true])
Whether the Unicode block counter should skip non-alphabetic ASCII characters
Parameters:
bool
$bool
—
true for on, false for off
validateString [line 120]
bool validateString(
string
$str)
Returns true if a string is suitable for parsing
- Return: true if acceptable, false if not
- Access: public
Parameters:
string
$str
—
input string to test