Learning Settings

Use learner - use_learner
Whether to use any machine-learning classifiers with SpamAssassin, such as the default "BAYES_*" rules. Setting this to 0 will disable use of any and all human-trained classifiers.
Default: 1

Use bayes - use_bayes
Whether to use the naive-Bayesian-style classifier built into SpamAssassin. This is a master on/off switch for all Bayes-related operations.
Default: 1

Use bayes rules - use_bayes_rules
Whether to use rules using the naive-Bayesian-style classifier built into SpamAssassin. This allows you to disable the rules while leaving auto and manual learning enabled.
Default: 1

Bayes auto learn - bayes_auto_learn
Whether SpamAssassin should automatically feed high-scoring mails (or low-scoring mails, for non-spam) into its learning systems. The only learning system supported currently is a naive-Bayesian-style classifier.
Default: 1

Bayes auto learn on error - bayes_auto_learn_on_error
With bayes_auto_learn_on_error off, autolearning is performed even if the Bayes classifier already agrees with the new classification (i.e. it yielded BAYES_00 for a message now being learned as ham, or BAYES_99 for one being learned as spam). This is the traditional behavior, which stock SpamAssassin keeps as its default for backwards compatibility. With bayes_auto_learn_on_error on, autolearning is performed only when the Bayes classifier had a different opinion from what the autolearner is now trying to teach it (i.e. it made an error in judgement). This strategy is not guaranteed to produce better future classifications, but it usually works very well, and it also prevents unnecessary overlearning and slows database growth.
Default: 1
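
The decision described above can be sketched as a small shell function. This is only an illustration of the documented behavior, not SpamAssassin's actual code; the function name should_autolearn and its arguments are hypothetical.

```shell
# Hypothetical helper, not part of SpamAssassin or Warden.
# Arguments: on_error (0 or 1), the Bayes rule that fired, and the class
# the autolearner wants to teach (ham or spam).
should_autolearn() {
    local on_error="$1" bayes_rule="$2" target="$3"

    # Setting off: always learn, even when Bayes already agrees.
    if [ "$on_error" -eq 0 ]; then
        echo "learn"
        return 0
    fi

    # Setting on: skip when Bayes already agrees with the new classification.
    case "$target:$bayes_rule" in
        ham:BAYES_00|spam:BAYES_99) echo "skip" ;;
        *)                          echo "learn" ;;
    esac
}

should_autolearn 1 BAYES_99 spam   # Bayes already agrees, prints "skip"
should_autolearn 1 BAYES_50 spam   # Bayes disagreed, prints "learn"
```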

Bayes auto learn threshold nonspam - bayes_auto_learn_threshold_nonspam
The score a mail must fall below to be fed into SpamAssassin's learning systems automatically as a non-spam message.
Default: 0.1

Bayes auto learn threshold spam - bayes_auto_learn_threshold_spam
The score a mail must exceed to be fed into SpamAssassin's learning systems automatically as a spam message. Note: SpamAssassin requires at least 3 points from the header and 3 points from the body to auto-learn as spam. Therefore, the minimum working value for this option is 6.
Default: 12
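
As a rough sketch of how the two thresholds and the 3-point header/body rule combine, the following shell function classifies a message from its scores. The name autolearn_decision and its argument order are hypothetical, and the handling of a score exactly equal to a threshold is an assumption based on the wording "below" and "above".

```shell
# Hypothetical helper, not part of SpamAssassin or Warden.
# Arguments: total score, header points, body points,
# then optionally the nonspam and spam thresholds (defaults 0.1 and 12).
autolearn_decision() {
    local score="$1" header_points="$2" body_points="$3"
    local ham_threshold="${4:-0.1}" spam_threshold="${5:-12}"

    # awk is used because the shell only does integer arithmetic.
    awk -v s="$score" -v h="$header_points" -v b="$body_points" \
        -v lo="$ham_threshold" -v hi="$spam_threshold" 'BEGIN {
            if (s + 0 < lo + 0) { print "ham" }
            else if (s + 0 > hi + 0 && h + 0 >= 3 && b + 0 >= 3) { print "spam" }
            else { print "none" }
        }'
}

autolearn_decision -1.5 0 0    # prints "ham"
autolearn_decision 15 4 11     # prints "spam"
autolearn_decision 15 1 14     # prints "none": fewer than 3 header points
```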

Bayes min spam num - bayes_min_spam_num
To be accurate, the Bayes system does not activate until at least this many spam messages have been learned.
Default: 200

Bayes min ham num - bayes_min_ham_num
To be accurate, the Bayes system does not activate until at least this many ham messages have been learned.
Default: 200
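
Both minimums must be met before Bayes results are used. A minimal sketch of that check follows; the function bayes_is_active is hypothetical, and the learned counts would have to come from your own database (for example the nspam and nham values reported by sa-learn --dump magic).

```shell
# Hypothetical helper, not part of SpamAssassin or Warden.
# Arguments: learned spam count, learned ham count,
# then optionally bayes_min_spam_num and bayes_min_ham_num (default 200).
bayes_is_active() {
    local nspam="$1" nham="$2"
    local min_spam="${3:-200}" min_ham="${4:-200}"

    if [ "$nspam" -ge "$min_spam" ] && [ "$nham" -ge "$min_ham" ]; then
        echo "active"
    else
        echo "inactive"
    fi
}

bayes_is_active 250 150   # prints "inactive": not enough ham yet
bayes_is_active 250 200   # prints "active"
```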

Bayes learn during report - bayes_learn_during_report
The Bayes system will, by default, learn any reported messages (spamassassin -r) as spam. If you do not want this to happen, set this option to 0.
Default: 1

Bayes use hapaxes - bayes_use_hapaxes
Should the Bayesian classifier use hapaxes (words/tokens that occur only once) when classifying? This produces significantly better hit-rates.
Default: 1

Bayes journal max size - bayes_journal_max_size
SpamAssassin will opportunistically sync the journal and the database. It will do so once a day, but will sync more often if the journal file size goes above this setting, in bytes. If set to 0, opportunistic syncing will not occur.
Default: 102400

Bayes expiry max db size - bayes_expiry_max_db_size
The maximum size of the Bayes token database, in tokens. When expiry occurs, the Bayes system keeps either 75% of the maximum value or 100,000 tokens, whichever is larger. 150,000 tokens is roughly equivalent to an 8 MB database file.
Default: 150000
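
The retention rule is simple arithmetic. The sketch below works it through for a given maximum; the function name expiry_keep_tokens is hypothetical and only mirrors the rule stated above.

```shell
# Hypothetical helper, not part of SpamAssassin or Warden.
# Prints how many tokens are kept after an expiry run:
# the larger of 75% of bayes_expiry_max_db_size and 100,000.
expiry_keep_tokens() {
    local max_db_size="$1"
    local three_quarters=$(( max_db_size * 75 / 100 ))
    local floor=100000

    if [ "$three_quarters" -gt "$floor" ]; then
        echo "$three_quarters"
    else
        echo "$floor"
    fi
}

expiry_keep_tokens 150000   # prints 112500 (75% of the default)
expiry_keep_tokens 120000   # prints 100000 (75% would be 90000, so the floor wins)
```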

Bayes auto expire - bayes_auto_expire
If enabled, the Bayes system will try to automatically expire old tokens from the database. Auto-expiry occurs when the number of tokens in the database surpasses the bayes_expiry_max_db_size value. If a bayes datastore backend does not implement individual key/value expirations, the setting is silently ignored.
Default: 1

Bayes learn to journal - bayes_learn_to_journal
If this option is set, whenever SpamAssassin does Bayes learning, it will put the information into the journal instead of directly into the database. This lowers contention for locking the database to execute an update, but will also cause more access to the journal and cause a delay before the updates are actually committed to the Bayes database.
Default: 0

Storage Settings

For high-volume servers, we recommend switching from the default MySQL storage backend to the Redis backend for Bayes data. This allows spam training and Bayes lookups to run in memory, which makes them extremely fast.

Bayes store module - bayes_store_module
The storage backend to use for Bayes data. The default backend is MySQL. Using Redis will give the highest performance.
Default: Mail::SpamAssassin::BayesStore::MySQL

Bayes SQL DSN - bayes_sql_dsn
Connection parameters for the MySQL or Redis server. RHEL/CentOS/CloudLinux 8 uses DBI:MariaDB, while all others use DBI:mysql.
Default: DBI:mysql:database=danami_warden;host=localhost
Default RHEL/CentOS/CloudLinux 8: DBI:MariaDB:database=danami_warden;host=localhost

Bayes token TTL - bayes_token_ttl
Controls token expiry (ttl value in SECONDS, sent as-is to Redis). The default value is 21 days. Expiry is done internally in Redis using *_ttl settings.
Default: 21d

Bayes seen TTL - bayes_seen_ttl
Controls "seen" expiry (ttl value in SECONDS, sent as-is to Redis). The default value is 8 days. Expiry is done internally in Redis using *_ttl settings.
Default: 8d
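
The defaults are written with a day suffix, while Redis works with seconds. The sketch below converts such a value to seconds so the two can be compared. The function ttl_to_seconds is hypothetical, and the set of suffixes it accepts (s, m, h, d, w) is an assumption rather than a statement of what SpamAssassin parses.

```shell
# Hypothetical helper, not part of SpamAssassin or Warden.
# Converts a value such as 21d or 3600 into a number of seconds.
ttl_to_seconds() {
    local value="$1"
    local number="${value%[smhdw]}"   # strip one trailing unit letter, if any
    local unit="${value#"$number"}"   # whatever was stripped

    # Reject anything that is not a plain non-negative integer.
    case "$number" in
        ''|*[!0-9]*) echo "invalid TTL: $value" >&2; return 1 ;;
    esac

    case "$unit" in
        ''|s) echo "$number" ;;
        m)    echo $(( number * 60 )) ;;
        h)    echo $(( number * 3600 )) ;;
        d)    echo $(( number * 86400 )) ;;
        w)    echo $(( number * 604800 )) ;;
    esac
}

ttl_to_seconds 21d   # prints 1814400
ttl_to_seconds 8d    # prints 691200
```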

Installing Redis

It is up to you to secure your Redis installation. You should make sure that the Redis port 6379 is not exposed to the Internet. We recommend binding Redis to the IPv4 loopback interface 127.0.0.1 and setting a password for it.

RHEL/CentOS/CloudLinux

yum install redis
systemctl enable redis --now 

Debian/Ubuntu

apt-get install redis-server
systemctl enable redis-server --now 
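
After installation, the loopback binding and password recommended above can be checked with a short script. This is a sketch only: the function check_redis_conf is hypothetical, it relies on the standard Redis bind and requirepass directives, and the location of redis.conf is an assumption that varies by distribution (commonly /etc/redis.conf or /etc/redis/redis.conf).

```shell
# Hypothetical helper, not part of Redis or Warden.
# Argument: path to a redis.conf file.
check_redis_conf() {
    local conf="$1"

    # The first address on the bind line should be the IPv4 loopback.
    if ! grep -Eq '^bind[[:space:]]+127\.0\.0\.1' "$conf"; then
        echo "not bound to 127.0.0.1"
        return 1
    fi

    # A non-empty requirepass directive should be present and uncommented.
    if ! grep -Eq '^requirepass[[:space:]]+[^[:space:]]+' "$conf"; then
        echo "no password set"
        return 1
    fi

    echo "ok"
}

# Example (path is an assumption, adjust for your system):
# check_redis_conf /etc/redis/redis.conf
```

If you change either directive, restart the Redis service for it to take effect.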

Bayes Storage Connectors

Bayes storage connection information is stored in the /etc/mail/spamassassin/local.cf file. If the connection entries are missing from your config, re-saving the page at Warden -> Settings -> Learning Settings will add them again.

Example MySQL Connector:

bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn DBI:mysql:database=danami_warden;host=localhost
bayes_sql_username danami_warden
bayes_sql_password XXXXXXXXX
bayes_sql_override_username amavis

Example Redis Connector:

The server with port number is required, while the password and database options are optional. The database number refers to the Redis database to use (0-15), with 0 being the default.

bayes_store_module Mail::SpamAssassin::BayesStore::Redis
bayes_sql_dsn server=127.0.0.1:6379;password=foo;database=0
bayes_token_ttl 21d
bayes_seen_ttl 8d
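
To make the structure of the Redis DSN concrete, the sketch below splits it on semicolons and looks up one field, falling back to a default when an optional field is absent. The function redis_dsn_field is hypothetical and is not how SpamAssassin itself parses the value.

```shell
# Hypothetical helper, not part of SpamAssassin or Warden.
# Arguments: the DSN string, the key to look up, an optional default.
redis_dsn_field() {
    local dsn="$1" key="$2" default="${3:-}"
    local pair
    local IFS=';'          # split the DSN into key=value pairs

    for pair in $dsn; do
        case "$pair" in
            "$key="*)
                echo "${pair#*=}"   # everything after the first "="
                return 0
                ;;
        esac
    done

    echo "$default"
}

dsn='server=127.0.0.1:6379;password=foo;database=0'
redis_dsn_field "$dsn" server                          # prints 127.0.0.1:6379
redis_dsn_field 'server=127.0.0.1:6379' database 0     # prints 0 (the default)
```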