crm114-general

From: Jon D. <jon...@al...> - 2012-11-24 11:15:34
Attachments: mailfilter.cf
Hello,
Given a sample mail in file '1':
	MFOPTS="-u $HOME/mail/"
	MAILFILTER="$HOME/mail/mailreaver.crm"
	$MAILFILTER $MFOPTS < 1 > 2
File '2' now contains the header
	X-CRM114-Status: UNSURE ( 6.88 )
The mail is ham, so I train it:
	$MAILFILTER $MFOPTS --good < 2 > 3
File '3' contains
	X-CRM114-Action: LEARNED AND CACHED GOOD
Great. Now, in my setup, I want to re-inject this trained message into
my mail. Until now I have just saved it to INBOX, but what I would
really like is for my procmail recipes to file it accordingly, so I'd
like to pipe it to e.g. sendmail to be redelivered. I invoke mailreaver
via procmail, so redelivery means the trained mail gets reclassified.
Whenever I try this:
	$MAILFILTER $MFOPTS < 3 > 4
	grep -i ^x-crm 4
	X-CRM114-Action: LEARNED AND CACHED GOOD
	X-CRM114-Version: 20090807-BlameThorstenAndJenny ( TRE 0.8.0 (BSD) ) MR-B9CF6B05 
	X-CRM114-CacheID: sfid-20121123_140912_163398_EE104312 
	X-CRM114-Status: UNSURE ( 6.75 )
	X-CRM114-Notice: Please train this message. 
So despite being learned and cached good, it is classified as UNSURE
again. I've tried deleting the X-CRM114-* headers before injecting
back to procmail and I've tried leaving them in. I'm aware the 'cached'
bit is no use since the redelivered mail will have additional headers.
I don't want to rely on a cache hit, I'd like the classifier to have
trained itself such that the mail is considered good.
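To compare the verdict before and after training, the status header can be pulled out of each output file. This is a minimal sketch assuming a POSIX awk; the function name `crm_status` is hypothetical and not part of CRM114:

```shell
# crm_status: print the value of the first X-CRM114-Status header on stdin.
# Only the header block is inspected; scanning stops at the first blank line.
crm_status() {
    awk '
        /^$/ { exit }                          # blank line: end of headers
        tolower($0) ~ /^x-crm114-status:/ {
            sub(/^[^:]*:[ \t]*/, "")           # drop the field name
            print
            exit
        }
    '
}

# Usage, with the files from the steps above:
#   crm_status < 2     # verdict before training
#   crm_status < 4     # verdict after training and reclassification
```

Matching the field name case-insensitively mirrors the `grep -i` used above.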
This is a short mail, about 40 lines including headers, which is not
atypical for a personal email to me from my friends.
Should I expect this to work? Alternatively, I can resolve this by splitting
my procmailrc up into separate files and constructing a concatenation of them
such that I avoid running crm114 over the learned files again, but it will be a
bit of a pain and perhaps brittle.
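A possibly less brittle variant of that workaround would be to guard the classifier call itself, skipping it when the message already carries the X-CRM114-Action header written by training. The following is only a sketch under that assumption: `already_learned` and `filter_once` are hypothetical helper names, MAILFILTER and MFOPTS are the variables defined earlier, and the guard trusts a header that a sender could forge, so it would only be safe on a path that handles locally re-injected mail.

```shell
# already_learned: succeed if the header block on stdin has an
# X-CRM114-Action header whose value starts with LEARNED.
already_learned() {
    awk '
        /^$/ { exit }                                   # end of headers
        tolower($0) ~ /^x-crm114-action:[ \t]*learned/ { found = 1; exit }
        END { exit !found }                             # 0 if found, 1 if not
    '
}

# filter_once: pass an already-trained message through unchanged,
# otherwise run the classifier on it as usual.
filter_once() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"                          # stdin is needed twice, so spool it
    if already_learned < "$tmp"; then
        cat "$tmp"
    else
        "$MAILFILTER" $MFOPTS < "$tmp"    # MFOPTS left unquoted to split options
    fi
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage: filter_once < 3 > 4
```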
As per the header above, I'm running version 20090807-6 (Debian package),
although for some reason my mailreaver.crm file differs from that in the
package. It's possible that it is from an earlier version:
	$ md5sum /home/jon/mail/mailreaver.crm /usr/share/crm114/mailreaver.crm
	09041a4b975f952432dfbcd1ba7152fe /home/jon/mail/mailreaver.crm
	c1b16573137df4d9a01f8162c971308a /usr/share/crm114/mailreaver.crm
Looking at a diff, I see only whitespace changes.
Here's my bucket data:
	$ cssutil -b -r spam.css
	
	 Sparse spectra file spam.css statistics: 
	
	 Total available buckets : 1048577 
	 Total buckets in use : 760830 
	 Total in-use zero-count buckets : 0 
	 Total buckets with value >= max : 0 
	 Total hashed datums in file : 1048948
	 Documents learned : 3080 
	 Features learned : 441932 
	 Average datums per bucket : 1.38
	 Maximum length of overflow chain : 417 
	 Average length of overflow chain : 6.33 
	 Average packing density : 0.73
	
	$ cssutil -b -r nonspam.css
	
	 Sparse spectra file nonspam.css statistics: 
	
	 Total available buckets : 1048577 
	 Total buckets in use : 897183 
	 Total in-use zero-count buckets : 20 
	 Total buckets with value >= max : 0 
	 Total hashed datums in file : 1125202
	 Documents learned : 2916 
	 Features learned : 625412 
	 Average datums per bucket : 1.25
	 Maximum length of overflow chain : 551 
	 Average length of overflow chain : 11.70 
	 Average packing density : 0.86
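The derived figures in both reports appear to follow from the raw counts: packing density matches buckets in use divided by total buckets, and average datums per bucket matches hashed datums divided by buckets in use. That relationship is inferred from the numbers above, not taken from the cssutil documentation. A quick check, where `css_ratios` is a hypothetical helper:

```shell
# css_ratios TOTAL IN_USE DATUMS
# Prints "packing_density datums_per_bucket", each rounded to two decimals.
css_ratios() {
    awk -v total="$1" -v used="$2" -v datums="$3" \
        'BEGIN { printf "%.2f %.2f\n", used / total, datums / used }'
}

css_ratios 1048577 760830 1048948    # spam.css    → 0.73 1.38
css_ratios 1048577 897183 1125202    # nonspam.css → 0.86 1.25
```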
I've attached mailfilter.cf should it be of use.
Thanks for any advice!