Learning Settings

Use learner - use_learner
Whether to use any machine-learning classifiers with SpamAssassin, such as the default "BAYES_*" rules. Setting this to 0 will disable use of any and all human-trained classifiers.
Default: 1

Use bayes - use_bayes
Whether to use the naive-Bayesian-style classifier built into SpamAssassin. This is a master on/off switch for all Bayes-related operations.
Default: 1

Use bayes rules - use_bayes_rules
Whether to use the rules that rely on the naive-Bayesian-style classifier built into SpamAssassin. This allows you to disable the rules while leaving auto and manual learning enabled.
Default: 1

Bayes auto learn - bayes_auto_learn
Whether SpamAssassin should automatically feed high-scoring mails (or low-scoring mails, for non-spam) into its learning systems. The only learning system currently supported is a naive-Bayesian-style classifier.
Default: 1

Bayes auto learn threshold nonspam - bayes_auto_learn_threshold_nonspam
The score below which a mail must fall to be fed automatically into SpamAssassin's learning systems as a non-spam message.
Default: 0.1

Bayes auto learn threshold spam - bayes_auto_learn_threshold_spam
The score above which a mail must score to be fed automatically into SpamAssassin's learning systems as a spam message. Note: SpamAssassin requires at least 3 points from the header and 3 points from the body to auto-learn as spam. Therefore, the minimum working value for this option is 6.
Default: 12
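
The interaction of the two thresholds with the header/body requirement can be sketched in Python as follows. This is a simplified illustration, not SpamAssassin's actual code: the function name and its arguments are hypothetical, the strictness of the comparisons at the exact threshold value is an assumption, and the real auto-learner may apply further conditions not described here.

```python
def autolearn_decision(score, header_points, body_points,
                       threshold_nonspam=0.1, threshold_spam=12.0):
    """Return "spam", "ham", or None (do not learn) for one message.

    Hypothetical helper; the defaults mirror the documented option defaults.
    """
    # Spam: the score must be above the spam threshold AND the header and
    # the body must each contribute at least 3 points.
    if score > threshold_spam and header_points >= 3 and body_points >= 3:
        return "spam"
    # Non-spam: the score must be below the non-spam threshold.
    if score < threshold_nonspam:
        return "ham"
    # Anything in between is not auto-learned.
    return None
```

Because 3 header points plus 3 body points are always required, a spam threshold set below 6 has no additional effect in this sketch.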

Bayes min spam num - bayes_min_spam_num
To ensure accuracy, the Bayes system does not activate until at least this many spam messages have been learned.
Default: 200

Bayes min ham num - bayes_min_ham_num
To ensure accuracy, the Bayes system does not activate until at least this many ham (non-spam) messages have been learned.
Default: 200
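
As a rough illustration of how bayes_min_spam_num and bayes_min_ham_num together gate the classifier, the check can be sketched in Python as below. The function name and arguments are hypothetical, and the sketch assumes that both counts must be reached before Bayes activates.

```python
def bayes_is_active(spam_learned, ham_learned,
                    min_spam_num=200, min_ham_num=200):
    """Return True once enough spam AND enough ham have been learned.

    Hypothetical helper; the defaults mirror the documented option defaults.
    """
    return spam_learned >= min_spam_num and ham_learned >= min_ham_num
```

For example, a database with 500 learned spam but only 50 learned ham would still be inactive under the defaults.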

Bayes learn during report - bayes_learn_during_report
The Bayes system will, by default, learn any reported messages (spamassassin -r) as spam. If you do not want this to happen, set this option to 0.
Default: 1

Bayes use hapaxes - bayes_use_hapaxes
Should the Bayesian classifier use hapaxes (words/tokens that occur only once) when classifying? This produces significantly better hit-rates.
Default: 1

Bayes journal max size - bayes_journal_max_size
SpamAssassin will opportunistically sync the journal and the database. It will do so once a day, but will sync more often if the journal file size goes above this setting, in bytes. If set to 0, opportunistic syncing will not occur.
Default: 102400
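
The sync conditions described above might be modelled in Python like this. It is only a sketch of the documented behaviour: the function name, its arguments, and the use of elapsed seconds to represent "once a day" are assumptions made for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def should_sync_journal(journal_bytes, seconds_since_last_sync,
                        journal_max_size=102400):
    """Return True when an opportunistic journal sync would be due.

    Hypothetical helper; the default mirrors bayes_journal_max_size.
    """
    # A setting of 0 turns opportunistic syncing off entirely.
    if journal_max_size == 0:
        return False
    # Sync at least once a day, or sooner if the journal has grown too large.
    return (seconds_since_last_sync >= SECONDS_PER_DAY
            or journal_bytes > journal_max_size)
```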

Bayes expiry max db size - bayes_expiry_max_db_size
The maximum size of the Bayes tokens database, in tokens. When expiry occurs, the Bayes system will keep either 75% of the maximum value or 100,000 tokens, whichever is larger. 150,000 tokens is roughly equivalent to an 8 MB database file.
Default: 150000
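
The retention rule is simple arithmetic, shown here as a small Python sketch. The function name is hypothetical; only the 75% and 100,000-token figures come from the description above.

```python
def expiry_keep_target(max_db_size=150000):
    """Return how many tokens are kept when expiry runs.

    Hypothetical helper; the default mirrors bayes_expiry_max_db_size.
    """
    # Keep 75% of the configured maximum, but never fewer than 100,000 tokens.
    return max(max_db_size * 3 // 4, 100000)
```

With the default of 150,000 this keeps 112,500 tokens. For any maximum below about 133,334, the 100,000-token floor applies instead.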

Bayes auto expire - bayes_auto_expire
If enabled, the Bayes system will try to automatically expire old tokens from the database. Auto-expiry occurs when the number of tokens in the database surpasses the bayes_expiry_max_db_size value. If a Bayes datastore backend does not implement individual key/value expirations, the setting is silently ignored.
Default: 1

Bayes learn to journal - bayes_learn_to_journal
If this option is set, whenever SpamAssassin does Bayes learning, it will put the information into the journal instead of directly into the database. This lowers contention for locking the database to execute an update, but will also cause more access to the journal and cause a delay before the updates are actually committed to the Bayes database.
Default: 0