antispam:learning

Configure the spam filter learning settings.

warden --task=antispam:learning
Option Value Default Description
--use_learner <1|0> 1 Whether to use any machine-learning classifiers with SpamAssassin, such as the default "BAYES_*" rules. Setting this to 0 will disable use of any and all human-trained classifiers.
--use_bayes <1|0> 1 Whether to use the naive-Bayesian-style classifier built into SpamAssassin. This is a master on/off switch for all Bayes-related operations.
--use_bayes_rules <1|0> 1 Whether to use rules using the naive-Bayesian-style classifier built into SpamAssassin. This allows you to disable the rules while leaving auto and manual learning enabled.
--bayes_auto_learn <1|0> 1 Whether SpamAssassin should automatically feed high-scoring mails (or low-scoring mails, for non-spam) into its learning systems. The only learning system supported currently is a naive-Bayesian-style classifier.
--bayes_auto_learn_on_error <1|0> 1 With bayes_auto_learn_on_error off, autolearning is performed even if the Bayes classifier already agrees with the new classification (i.e. it yielded BAYES_00 for a message now being learned as ham, or BAYES_99 for one being learned as spam). This is the traditional setting; the default was chosen to retain backwards compatibility. With bayes_auto_learn_on_error on, autolearning is performed only when the Bayes classifier had a different opinion from what the autolearner is now trying to teach it (i.e. it made an error in judgement). This strategy may or may not produce better future classifications, but it usually works very well, while also preventing unnecessary overlearning and slowing database growth.
--bayes_auto_learn_threshold_nonspam <int> 0.1 The score a mail must fall below to be fed into SpamAssassin's learning systems automatically as a non-spam message.
--bayes_auto_learn_threshold_spam <int> 12 The score a mail must exceed to be fed into SpamAssassin's learning systems automatically as a spam message. Note: SpamAssassin requires at least 3 points from the header and 3 points from the body to auto-learn as spam. Therefore, the minimum working value for this option is 6.
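--bayes_min_spam_num <digit> 200 The number of spam messages that must be learned before the Bayes system activates. The Bayes system stays inactive below this number so that its results are accurate.
--bayes_min_ham_num <digit> 200 The number of ham messages that must be learned before the Bayes system activates. The Bayes system stays inactive below this number so that its results are accurate.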
--bayes_learn_during_report <1|0> 1 The Bayes system will, by default, learn any reported messages (spamassassin -r) as spam. If you do not want this to happen, set this option to 0.
--bayes_use_hapaxes <1|0> 1 Should the Bayesian classifier use hapaxes (words/tokens that occur only once) when classifying? This produces significantly better hit-rates.
--bayes_expiry_max_db_size <digit> 1000000 What should be the maximum size of the Bayes tokens database? When expiry occurs, the Bayes system will keep either 75% of the maximum value, or 100,000 tokens, whichever has a larger value.
--bayes_auto_expire <1|0> 1 If enabled, the Bayes system will try to automatically expire old tokens from the database. Auto-expiry occurs when the number of tokens in the database surpasses the bayes_expiry_max_db_size value. If a bayes datastore backend does not implement individual key/value expirations, the setting is silently ignored.
--bayes_store_module <mysql|redis> mysql The storage backend to use for Bayes data. The default backend is MySQL. Using Redis will give the highest performance.
--bayes_sql_dsn <string> DBI:MariaDB:database=danami_warden;host=localhost Connection parameters for the MySQL or Redis server. When using Redis, the server with port number is required, while the password and database options are optional. The database number refers to the Redis database to use (0-15, with 0 being the default).
--bayes_token_ttl <string> 120d Controls token expiry (ttl value in SECONDS, sent as-is to Redis). The default value is 120 days. Expiry is done internally in Redis using *_ttl settings.
--bayes_seen_ttl <string> 8d Controls "seen" expiry (ttl value in SECONDS, sent as-is to Redis). This is used so that Amavis can avoid re-learning a message it has already seen. The default value is 8 days. Expiry is done internally in Redis using *_ttl settings.
--default <yes> Reset all settings to their default values.
--default_option <option> Reset a specific setting to its default value.
--reload <yes> Reload the service after saving settings.
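
The numeric rules in the table can be checked with plain shell arithmetic. The sketch below assumes only what the descriptions state: expiry keeps the larger of 75% of --bayes_expiry_max_db_size or 100,000 tokens, and the TTL defaults are 120 and 8 days. The variable names are illustrative and are not part of warden.

```shell
# Retention after expiry: the larger of 75% of the maximum or 100,000 tokens.
max_db_size=1000000                    # default --bayes_expiry_max_db_size
keep=$(( max_db_size * 75 / 100 ))     # 75% of the maximum
floor=100000                           # lower bound from the description
if [ "$keep" -lt "$floor" ]; then
    keep=$floor
fi
echo "tokens kept after expiry: $keep"

# TTL defaults expressed in seconds (1 day = 86400 seconds).
token_ttl_seconds=$(( 120 * 86400 ))   # 120d
seen_ttl_seconds=$(( 8 * 86400 ))      # 8d
echo "bayes_token_ttl: $token_ttl_seconds s, bayes_seen_ttl: $seen_ttl_seconds s"
```

With the defaults, expiry keeps 750,000 tokens, and the two TTLs correspond to 10,368,000 and 691,200 seconds.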

Examples

// disable bayes_auto_learn
warden --task=antispam:learning --bayes_auto_learn=0 --reload=yes

// reset bayes_auto_learn to its default value
warden --task=antispam:learning --default_option=bayes_auto_learn --reload=yes

// reset all settings to their default values
warden --task=antispam:learning --default=yes --reload=yes
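
Switching the Bayes store to Redis combines --bayes_store_module with --bayes_sql_dsn. The sketch below is a dry run that only prints the command. The DSN string is an assumption: it follows the server=host:port;password=...;database=N form used by SpamAssassin's Redis Bayes store, and the host, port and database number are placeholders. Confirm that Warden accepts this format before relying on it.

```shell
# Assumed DSN format (SpamAssassin Redis style); host, port and database are placeholders.
dsn='server=127.0.0.1:6379;database=2'

# The DSN is wrapped in single quotes inside the command because an unquoted ';'
# would be read by the shell as a command separator.
cmd="warden --task=antispam:learning --bayes_store_module=redis --bayes_sql_dsn='$dsn' --reload=yes"

# Print the command for review instead of running it.
echo "$cmd"
```

The same quoting applies to the MySQL default value, which also contains a semicolon.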