crm114-discuss

From: Tobias S. <tob...@fr...> - 2008-01-04 15:41:52
I read the paper "The Spam-Filtering Accuracy Plateau at 99.9% Accuracy and
How to Get Past It." and I have a question about the following part:
 
"In this experiment, we used superincreasing weights as determined by the
formula 
Weight = 2^2N
Thus, for features containing 1, 2, 3, 4, and 5 words, the weights of those
features would be 1, 4, 16, 64, and 256 respectively."
 
What does the variable N in the weighting formula stand for?
 
From: Paolo <oo...@us...> - 2008-01-07 20:58:37
On Fri, Jan 04, 2008 at 04:41:40PM +0100, Tobias Schneider wrote:
> Weight = 2^2N 
> 
> Thus, for features containing 1, 2, 3, 4, and 5 words, the weights of
> those features would be 1, 4, 16, 64, and 256 respectively."
> 
> 
> What does the variable N in the weighting formula stand for?
I think you get the answer in the following slide:
(3) the 2^2N weighting means that weights were 
 1, 4, 16, 64, 256, ... 
for the span lengths of 1, 2, 3, 4, 5 ... words 
Thus N stands for the number of words in the N-gram.
HTH
-- 
 paolo
 
 GPG/PGP id:0x1D5A11A4 - 04FC 8EB9 51A1 5158 1425 BC12 EA57 3382 1D5A 11A4
 - 9/11: the outrageous deception and ongoing coverup: http://911review.org -
From: Tobias S. <tob...@fr...> - 2008-01-07 22:21:31
Thanks for your response.
But if I have, for example, two words (N=2) and put that into the formula, the
resulting weight is 16 (2^(2*2)) and not 4.
Where is my mistake?
-----Original Message-----
From: crm...@li...
[mailto:crm...@li...] On Behalf Of Paolo
Sent: Monday, January 07, 2008 9:58 PM
To: crm...@li...
Subject: Re: [Crm114-discuss] Question about the weighting formula in
the plateau paper
On Fri, Jan 04, 2008 at 04:41:40PM +0100, Tobias Schneider wrote:
> Weight = 2^2N
>
> Thus, for features containing 1, 2, 3, 4, and 5 words, the weights of
> those features would be 1, 4, 16, 64, and 256 respectively."
>
>
> What does the variable N in the weighting formula stand for?
I think you get the answer in the following slide:
(3) the 2^2N weighting means that weights were
 1, 4, 16, 64, 256, ...
for the span lengths of 1, 2, 3, 4, 5 ... words
Thus N stands for the number of words in the N-gram.
HTH
-- 
 paolo

 GPG/PGP id:0x1D5A11A4 - 04FC 8EB9 51A1 5158 1425 BC12 EA57 3382 1D5A 11A4
 - 9/11: the outrageous deception and ongoing coverup: http://911review.org -
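
The thread as archived here ends without an answer to the follow-up question. As a quick arithmetic check (not a statement of what the paper or CRM114 actually implements), the sketch below compares the formula read literally as 2^(2N) with an assumed shifted reading, 2^(2(N-1)), against the quoted weight list. The function names are hypothetical and chosen only for this illustration.

```python
# Hypothetical helpers for illustration; neither comes from CRM114 or the paper.

def weight_literal(n: int) -> int:
    """Weight when the formula is read literally as 2^(2N)."""
    return 2 ** (2 * n)


def weight_shifted(n: int) -> int:
    """Assumed reading 2^(2(N-1)); picked only because it fits the quoted list."""
    return 2 ** (2 * (n - 1))


# Weights quoted in the thread for features of 1, 2, 3, 4, and 5 words.
quoted = [1, 4, 16, 64, 256]

literal = [weight_literal(n) for n in range(1, 6)]
shifted = [weight_shifted(n) for n in range(1, 6)]

print("literal 2^(2N):    ", literal)
print("shifted 2^(2(N-1)):", shifted)
print("shifted matches quoted list:", shifted == quoted)
```

Under the literal reading, N=2 gives 16, which is the value computed in the follow-up message. The quoted list is reproduced only if the exponent uses N-1, so the list and the formula as written appear to differ by one step in N. Which of the two the paper intends is not settled by the messages above.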