amavisd-new documentation bits and pieces

The most recent version of this document is available at http://www.ijs.si/software/amavisd/amavisd-new-docs.html

Performing mail checks

The following checks on mail are available: virus checks, banned names and types checks, header checks, and spam checks.

Although checks are presently not performed in parallel, it is best to consider the order of their evaluation unspecified (unknown). Besides possible future parallel implementation, another reason is the caching of results, where subsequent mail with the same contents may benefit from earlier checks if validity of these check results has not yet expired -- so a check result may be instantly available, regardless of whether it has been asked for or not.

Using configuration variables @bypass_virus_checks_maps, @bypass_banned_checks_maps, @bypass_header_checks_maps and @bypass_spam_checks_maps, each recipient (or an administrator on their behalf) may suggest that certain tests are not needed, primarily for performance reasons. Although the @bypass_*_checks_maps pertain to individual recipients, a mail check is an operation done on the whole message, regardless of the number of recipients and their individual preferences. A suggestion by some of the recipients that a certain check is not needed (is to be bypassed) does not guarantee that the test will not be performed.

Similarly, the (hard) blacklisting or whitelisting of a sender address may make running a spam check unnecessary, but it does not guarantee that the spam check result will not be available for subsequent decisions.

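There are two primary reasons why a check result may still be available despite the bypass hint or a sender being black- or whitelisted: the result may have been cached from an earlier mail with the same contents, or the check may have been performed on behalf of another recipient of the same message.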

The amavisd-new program is allowed to skip some check for performance reasons if all recipients agree that a check is not necessary (that it may be bypassed), or if the outcome of a check to be skipped could not influence further mail processing and delivery/non-delivery of the message (as is the case of a sender being black- or whitelisted regarding spam check).

For example, spam checks may be skipped if it is already known that a mail is infected. This is an implementation and optimization issue, and no guarantee is given about the interdependency of checks. A future version may use a different strategy of performing checks (e.g. some checks may be performed in parallel), as long as the change does not affect the final outcome.
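The skip rule described above can be modelled in a few lines of Python. This is only an illustrative sketch, not code from amavisd-new: the function name and its parameters are hypothetical, and the result says only that skipping is permitted, never that it will happen.

```python
def may_skip_check(bypass_votes, outcome_irrelevant=False):
    """Return True if a whole-message check MAY be skipped (hypothetical helper).

    bypass_votes       -- one boolean per recipient; True means that the
                          recipient's bypass map says the check is not needed
    outcome_irrelevant -- True if the result could not influence further
                          processing or delivery (e.g. the sender is black-
                          or whitelisted, as far as the spam check goes)
    """
    # Every recipient must agree; a single dissenting recipient keeps the check.
    all_agree = len(bypass_votes) > 0 and all(bypass_votes)
    return outcome_irrelevant or all_agree
```

With two recipients of which only one asks for a bypass, may_skip_check([True, False]) is False, so the check is still performed for the whole message.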

Acting on mail checks results

Based on the outcome of mail checks performed during mail analysis or cached from previous mail with the same contents, and based on global settings and individual recipient preferences, the program now decides what action to perform next. As described in the previous section, not all results of checks are necessarily known (e.g. if all recipients voted for some check to be bypassed). For the purpose of deciding further actions, unknown results of a check are considered equivalent to negative (false) results, i.e. skipped virus check is treated the same as non-infected mail, bypassed spam check is equivalent to low spam score (ham).

The following decisions are made at this stage:

and regarding mail delivery and/or sender (non)delivery notifications:

For the purpose of deciding on these actions, a mail is classified based on all available check results. It is quite possible that more than one check result is positive (e.g. virus and banned and bad header, or spam and bad header, or virus and spam), yet a mail is considered to be in only one category. The logic is currently hard-wired into the program and cannot be influenced by configuration variables. The following order is used; the first condition met decides the outcome:

  1. a virus is detected: mail is considered infected;
  2. contains banned name or type: mail is considered banned;
  3. spam level is above kill level for at least one recipient, or a sender is blacklisted: mail is considered spam;
  4. bad (invalid) headers: mail is considered as having a bad header.

This decision order explains why amavisd-new is not free to skip (to optimize away) virus checks if the presence of a banned name or a bad header is already known or can easily be determined. The order was chosen with the intention that a more informative or a stronger assertion is the one to base further mail delivery on, and to be quoted in notifications and in the log. Even at the expense of possibly longer processing time, it is more important to declare a mail infected than to complain about a bad header, a banned executable or spammy contents.
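The fixed classification order can be sketched in Python as follows. This is a model of the rule as described here, not the program's actual code; the function name, the argument names and the category strings are assumptions chosen for illustration. Unknown results (None) are treated as negative, as explained earlier.

```python
def classify(infected=None, banned=None, spam_kill=None,
             blacklisted=None, bad_header=None):
    """Return the single category of a mail (hypothetical helper).

    Each argument is True, False or None; None means the check result
    is unknown (skipped or bypassed) and counts as negative.
    """
    if infected:                  # 1. a virus is detected
        return "infected"
    if banned:                    # 2. banned name or type
        return "banned"
    if spam_kill or blacklisted:  # 3. kill level reached for some recipient,
        return "spam"             #    or the sender is blacklisted
    if bad_header:                # 4. bad (invalid) headers
        return "bad-header"
    return "clean"
```

A mail that is both infected and has a bad header is classified as infected, which is why the virus check cannot be optimized away merely because a bad header was already found.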

The determined mail category now governs further action. Administrators are notified if enabled for the category, mail is quarantined if quarantining is enabled for the category, and recipients are notified if enabled for the category.

Next a mail delivery is attempted. A decision to deliver depends on mail category and on global and individual recipient preferences. The global setting $final_*_destiny=D_PASS or a per-recipient setting @*_lovers_maps ensures mail delivery for the corresponding mail category even if mail would otherwise be blocked for being infected or banned or spam or having a bad header.

A mail that is decided to be passed to an individual recipient undergoes some simple header editing which happens on-the-fly during mail forwarding. Certain mail header fields may be inserted or removed, or an existing header field (e.g. Subject) may be modified. This header editing may be different for each recipient even in multi-recipient messages. If necessary, a multi-recipient mail is split into more than one forwarding transaction, grouping (clustering) recipients with same settings into one SMTP transaction.
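The clustering of recipients can be pictured with the following Python sketch. It assumes, purely for illustration, that the per-recipient header edits are available as a dictionary of header field names and values; the function name and data layout are hypothetical and not taken from amavisd-new.

```python
from collections import defaultdict

def cluster_recipients(edits_by_recipient):
    """Group recipients that need identical header edits (hypothetical helper).

    edits_by_recipient maps a recipient address to a dict of
    header field name -> value to be inserted or modified.
    Returns a list of recipient lists, one per forwarding transaction.
    """
    groups = defaultdict(list)
    for rcpt, edits in edits_by_recipient.items():
        # Identical edits produce the same key regardless of dict ordering.
        key = tuple(sorted(edits.items()))
        groups[key].append(rcpt)
    return list(groups.values())
```

Three recipients of which two need the same edits would thus be served by two transactions instead of three.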

What happens next depends on the decisions to forward or to block mail for each recipient, and on the global setting for the mail category ($final_*_destiny=D_BOUNCE or D_REJECT). In case of D_BOUNCE, the sender (non)delivery notification is now prepared, and the MTA receives a 2xx status (success). In case of D_REJECT, the MTA receives a 5xx (reject) status, and preparing sender notifications is thus delegated to the MTA (not recommended in a post-queue or dual-MTA content filtering setup).

Even in cases of mail non-delivery when a (non-)delivery status notification (DSN) for the sender should have been prepared and sent, there are certain exceptions where the DSN is suppressed, which makes mail effectively lost as far as the sender and the recipient are concerned (but quarantining is not affected):

tag, tag2 and kill levels

When SpamAssassin is called upon to analyze a mail message, it returns a spam score (spam level, hits), which is a numeric representation of spamminess. The higher the number, the more spammy the message is considered. Small numbers near zero or negative indicate a clean message, colloquially called ham. Spam score is a characteristic of the whole message, and does not depend on recipient preferences. SpamAssassin is called only once for each message regardless of the number of recipients.

To determine further course of action, amavisd-new compares the spam score to three numeric values: tag level, tag2 level and kill level. These values may be different for each recipient, and further actions may be different for each recipient. If necessary, the mail forwarding is split into more than one transaction to cater for different recipient preferences.

tag level
if spam score is at or above tag level, spam-related header fields (X-Spam-Status, X-Spam-Level) are inserted for local recipients; undefined (unknown) spam score is interpreted as lower than any spam score;
tag2 level
if spam score is at or above tag2 level, spam-related header fields (X-Spam-Status, X-Spam-Level, X-Spam-Flag and X-Spam-Report) are inserted for local recipients, and X-Spam-Flag and X-Spam-Status bear a YES; also recipient address extension (if enabled) is tacked onto recipient address for local recipients; for these actions to have any effect, mail must be allowed to be delivered to a recipient;
kill level
if spam score is at or above kill level, mail is blocked; and sender receives a nondelivery notification unless spam score exceeds dsn cutoff level.
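The comparison against the three levels can be sketched in Python like this. It is a simplified model under stated assumptions: the function name is hypothetical, the tag level of 2.0 used in the note below is an arbitrary illustration value, and the follow-up actions (header insertion, address extension, blocking) are only named, not performed.

```python
def levels_reached(score, tag_level, tag2_level, kill_level):
    """Return the names of the levels a spam score reaches for one recipient.

    score is None when the spam score is unknown; an unknown score is
    treated as lower than any level, so nothing is reached.
    """
    if score is None:
        return []
    levels = (("tag", tag_level), ("tag2", tag2_level), ("kill", kill_level))
    # "at or above" means the comparison is inclusive.
    return [name for name, level in levels if score >= level]
```

With a tag2 level of 6.7 and a kill level of 15, a score of 13.365 reaches tag and tag2 but not kill, so the mail would be marked and passed rather than blocked. Since the levels may differ per recipient, the function would be evaluated once for each recipient with that recipient's values.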

The general idea is that kill level is what controls the main actions as far as the MTA and amavisd-new are concerned (regardless of what the recipient's MUA later does with the mail).

Reaching kill level for at least one recipient controls the following:

On the other hand the tag2 level just adds some mark to the passed mail (only for local recipients), which recipient or his MUA may decide to act on or not. Specifically:

For mail below kill level, if a recipient (or his MUA) decides to discard a message based on tag2 marking, there is no way to retrieve it later from a quarantine, the sender is never notified, spam administrator is never notified. As far as the MTA and amavisd-new are concerned, the message was successfully delivered. Whatever MUA does with the mail is entirely the responsibility and jurisdiction of the recipient and his LDA and MUA.

Quarantine

Mail quarantining is attempted when enabled for a given contents category, which usually includes infected, or banned, or spam mail with score for at least one of its recipients at or above his kill level. It is also possible to enable quarantining of clean messages for archiving or troubleshooting purposes. The *quarantine_to for each recipient (when nonempty), along with a corresponding global *_quarantine_method, determines where the quarantine location should be.

quarantine_method

The *_quarantine_method can be considered a static and a site-wide setting, generally controlling a format and location of the quarantine on the system. The *quarantine_to can be considered a dynamic part of the quarantine location, possibly affected by per-recipient settings and the type of malware (contents category). It serves to fully specify the final location, e.g. a file or a mailbox.

Depending on mail contents category (type of malware), the following variables specify the quarantine method: $virus_quarantine_method, $spam_quarantine_method, $banned_files_quarantine_method, and $bad_header_quarantine_method. One way to globally disable quarantine is to specify undef or an empty string as a value of these variables. A nonempty string should follow a syntax:

The local:, bsmtp: and sql: methods are the usual methods for quarantining. The smtp: or lmtp: methods are only useful for quarantining if quarantine location is some dedicated mailbox instead of a local file or directory. The smtp:, lmtp: and pipe: methods are more often used for forwarding and notifications, and only rarely for quarantining. The following features became available with version 2.5.0: the lmtp: method, support for IPv6, and specifying a Unix socket to a smtp: or lmtp: method.

When quarantine method starts with local:, the rest of the string is a filename-template, which serves to specify a file name to store a quarantined message. The template may contain placeholders which are composed of a percent character, followed by exactly one character. The following expansions are recognized:

If a filename-template ends in .gz, the resulting file will be gzip-compressed.

quarantine_to

Depending on the method specified (local/bsmtp/smtp/sql) a per-recipient setting *quarantine_to adopts different semantics and syntax, possibly modified by the configuration variable $QUARANTINEDIR.

method | quarantine_to | $QUARANTINEDIR | effect
anything | empty or undef | anything | not quarantined
empty or undef | anything | anything | not quarantined
local: | pseudo-alias mapped through %local_delivery_aliases | directory | stored as an individual file below the directory $QUARANTINEDIR, file name comes from the template specified in the *_quarantine_method; if a template file name ends in .gz the message will be gzip-compressed
local: | pseudo-alias mapped through %local_delivery_aliases | filename of a mailbox | appended to a file $QUARANTINEDIR in mbox format
local: | pseudo-alias mapped through %local_delivery_aliases | empty or undef | not quarantined
local: | e-mail address containing '@'-sign | anything | sent via SMTP to a mailer for storage, uses $notify_method to specify how to deliver to MTA; much like a newer 'smtp:' entry below
smtp: | e-mail address | anything | sent via SMTP to a mailer for storage, uses the specified IP address and port, or a Unix socket for delivery; formerly a 'local:' method was used for this purpose
lmtp: | e-mail address | anything | sent via LMTP to a mailer for storage, uses the specified IP address and port, or a Unix socket for delivery
bsmtp: | anything (nonempty) | anything | stored in a file specified in the *_quarantine_method in BSMTP format (if file name is absolute, i.e. starts with a "/")
bsmtp: | anything (nonempty) | directory | stored in a file specified in the *_quarantine_method in BSMTP format (file name relative to $QUARANTINEDIR)
sql: | anything (nonempty) | anything | stored into SQL database specified by @storage_sql_dsn

The *quarantine_to is currently quite limited in functionality; it is often used only to turn off quarantining for some user or local subdomain. The reason for this limited functionality is the more vulnerable nature of this value: it may come from SQL or LDAP lookups, where careless access controls to these databases might permit users to enter any value in the *quarantine_to field. This is why it is not allowed to control the directory or the exact file name of the quarantine file. This may be somewhat relaxed in the future.

In common setups the quarantine location (e.g. a directory or a dedicated mailbox) is the same for all recipients. If at least one recipient specifies a nonempty *quarantine_to specifying this location, the message is quarantined (stored) there once, regardless of the number of recipients.

The general algorithm is: the *quarantine_to value associated with each recipient is looked up. Empty or undef values are ignored and duplicates are discarded. A mail to be quarantined is then stored/sent to each unique location remaining on the list.
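The general algorithm amounts to an order-preserving de-duplication, which might be sketched in Python as follows. The function name and the dictionary layout are hypothetical; the actual lookup of each recipient's *quarantine_to value is assumed to have been done already.

```python
def quarantine_locations(quarantine_to_by_recipient):
    """Return the unique, nonempty quarantine locations (hypothetical helper).

    quarantine_to_by_recipient maps each recipient address to the
    *quarantine_to value found for it ('' or None when quarantining
    is turned off for that recipient).
    """
    # Drop empty/undef values first, then remove duplicates while
    # keeping the order in which locations were first seen.
    nonempty = [v for v in quarantine_to_by_recipient.values() if v]
    return list(dict.fromkeys(nonempty))
```

A message with five recipients, three of which share one location and two of which have quarantining turned off, is therefore stored exactly once.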

The "bsmtp:" quarantine method is somewhat special in that the quarantine file location is entirely determined by the *_quarantine_method setting, and the value of a per-recipient *quarantine_to setting does not influence the quarantine location, as long as this value is nonempty.

When using the "bsmtp:" quarantine method and versions of amavisd-new earlier than 2.2.0, the *_quarantine_to was completely ignored, which made it impossible to turn off quarantining selectively for certain users by specifying an empty or undef value. Since 2.2.0, an empty *_quarantine_to turns off quarantine for a recipient regardless of the quarantine method. A nonempty string in *_quarantine_to (the exact value is ignored) must now be used even with "bsmtp:" to enable quarantining.

Releasing from a quarantine

The utility amavisd-release tells the amavisd daemon to fetch a mail from a local quarantine, and send it to MTA through its regular channels ($notify_method), bypassing re-checking.

By default it connects to socket /var/amavis/amavisd.sock, on which amavisd should be listening for AM.PDP protocol, but one can use inet socket instead of a Unix socket if there is a need to run amavisd-release from a remote host.

In the amavisd.conf the following should be added:

$unix_socketname = "$MYHOME/amavisd.sock";  # listen on Unix socket

# alternatively (less common):
# $inet_socket_port = [10024, 9998];  # listen on listed inet tcp ports

# apply policy bank AM.PDP-SOCK on a Unix socket:
#  (note that this precludes the use of old amavis-milter
#   helper program (with sendmail) on the same socket)
$interface_policy{'SOCK'} = 'AM.PDP-SOCK';

# apply policy bank AM.PDP-INET to some inet tcp socket, e.g. tcp port 9998:
$interface_policy{'9998'} = 'AM.PDP-INET';

$policy_bank{'AM.PDP-SOCK'} = {
  protocol => 'AM.PDP',  # select Amavis policy delegation protocol
  auth_required_release => 0,  # don't require secret_id for amavisd-release
};
$policy_bank{'AM.PDP-INET'} = {
  protocol => 'AM.PDP',  # select Amavis policy delegation protocol
  inet_acl => [qw( 127.0.0.1 [::1] )],  # restrict access to these IP addresses
# auth_required_release => 0,  # don't require secret_id for amavisd-release
};

Setting of $auth_required_release decides whether the requestor needs to specify secret_id in addition to mail_id to authorize a mail release. The secret_id is stored in SQL table msgs when logging to SQL is enabled, otherwise this information is not accessible.

Note that turning off $auth_required_release check is safe as long as access to the socket is restricted, like with file protections on a Unix socket, or restricted with inet_acl to specific IP addresses. Enabling or disabling $auth_required_release is a management / setup decision and convenience.

To release a mail message an exact quarantine location from a log file should be specified as an argument to amavisd-release, e.g.:

amavis[29297]: (29297-01-6) Blocked SPAM,
  ... <xxx> -> <yyy>,
  quarantine: spam/U/UM3XM3XDbN52.gz,
  Message-ID:<...>, mail_id: UM3XM3XDbN52, Hits: 13.365,

$ amavisd-release spam/U/UM3XM3XDbN52.gz
250 2.6.0 Ok, id=rel-UM3XM3XDbN52,
  from MTA([193.2.4.66]:10025): 250 2.0.0 Ok: queued as F137717B88B

The amavisd-release utility also accepts mail_id from STDIN if releasing more than one message in one go is more convenient:

$ amavisd-release -
spam/U/UM3XM3XDbN52.gz
spam/g/gnwKVFKiuey3.gz
spam/X/Xpkj9mLLBHTR.gz

Redirecting malware to a different mailbox -- plus addressing

Amavisd-new can tag passed malware by appending an address extension to a recipient address. An address extension is usually a short string (such as 'spam') appended to the local part of the recipient address, delimited from it by a single character delimiter, often a '+' (or sometimes a '-'). This is why address extensions are also known as "plus addressing". Examples of such mail addresses belonging to user jim@example.com are: jim+spam@example.com, jim+cooking@example.com, jim+health@example.com, jim+postfix@example.com.

Most mailers (MTA), including Postfix and sendmail, have some provision to put address extensions to good use. Similarly, local delivery agents (LDA) such as Cyrus or LDAs that come with MTA, can be configured to recognize and make use of address extensions.

The most common application for address extensions is to provide additional information to LDA to store mail into a separate mail folder. Users may for example choose to use this feature to let LDA automatically file messages from mailing lists to a dedicated subfolder, or to file spam to a spam folder, just by letting LDA simply and quickly examine the envelope recipient address, without having to parse mail header or having to configure and run filters such as procmail or Sieve.

Mailers (MTA and LDA) usually first attempt to examine (to check for validity, to look up in virtual or aliases maps) the full unmodified recipient address. If the attempt is unsuccessful, they strip away the extension part and try again. This way the presence of an unknown address extension is simply ignored. For example, a delivery for jim+health@example.com would deliver the mail to Jim's main inbox if he hasn't provided a subfolder health in his mailbox.

For this fallback to work (to ignore unknown extensions), it is important that all components that need to deal with address extensions (MTA, LDA, content filters) have the same notion of the delimiter in use on the system. For Postfix the configuration option is recipient_delimiter=+ (see also propagate_unmatched_extensions), for amavisd-new the option is $recipient_delimiter='+'; for Cyrus the delimiter is hardcoded as '+', see Cyrus IMAP FAQ -> plus addressing.
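The fallback behaviour can be illustrated with a small Python model. It is not how any particular MTA or LDA implements the lookup; the function name is hypothetical and the set of known addresses stands in for whatever virtual or aliases maps a mailer consults.

```python
def resolve_recipient(address, known, delimiter="+"):
    """Return the address a mail would be delivered to, or None.

    known     -- set of addresses the (imaginary) mailer recognizes
    delimiter -- must match the delimiter configured everywhere else
    """
    if address in known:              # first try the full, unmodified address
        return address
    local, _, domain = address.rpartition("@")
    base, sep, _ext = local.partition(delimiter)
    if sep:                           # an extension is present: strip it, retry
        stripped = f"{base}@{domain}"
        if stripped in known:
            return stripped
    return None
```

If the components disagree on the delimiter, the second lookup never strips the extension and the fallback silently stops working: resolve_recipient("jim+health@example.com", {"jim@example.com"}, delimiter="-") returns None.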

The amavisd-new configuration options for adding address extensions are @addr_extension_virus_maps, @addr_extension_spam_maps, @addr_extension_banned_maps, @addr_extension_bad_header_maps. The configuration must also ensure the malware mail is to be delivered, otherwise there is nothing to tack an address extension on -- either by setting kill level sufficiently high, or by declaring spam lovers, or by $final_spam_destiny=D_PASS; an example:

$recipient_delimiter = '+';
@addr_extension_spam_maps = ('spam');
$sa_tag2_level_deflt = 6.7 ;    # score above which spam extension is added
$sa_kill_level_deflt = 15;      # block higher score entirely
$final_spam_destiny=D_DISCARD;  # junk all above kill level

or provide extension string more selectively for certain users or subdomains:

@addr_extension_spam_maps = (
  { '.sub1.example.com' => 'spam',     # an entire subdomain
    'user1@example.com' => 'spam',     # a particular user
    'user2@example.com' => 'malware',  # another user wants a different ext.
    '.'                 => '' }  # all the rest do not receive an extension
);

If one is considering using a quarantine mechanism but wants per-user (or perhaps per-subdomain) quarantines, this is not such a good idea, because quarantined files are not supposed to be directly visible to or handled by recipients: to protect the privacy of the sender, some header pre-processing must be performed on a quarantined file before handing it over to a recipient.

The cleanest way to achieve per-user quarantine which may be directly accessible and/or manipulated by recipients is to turn on adding address extensions, and configure MTA and/or LDA to store such mail wherever necessary, either to a user's dedicated subfolder, or perhaps to some centralized dedicated set of malware mailboxes (per-user or perhaps per-subdomain).

If it is desired to reroute extension-tagged mail to some mailbox away from the usual LDA, the virtual alias mapping by MTA is the tool for the job. With Postfix, a pcre-based virtual map can specify for example:

/^(.*)\+spam@([^@]*)\.example\.com$/   spam-$2-box@example.com

which will collect all spam into one mailbox for each subdomain.
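To see what the pattern above matches, the same regular expression can be tried out in Python. This merely exercises the expression; it says nothing about how Postfix evaluates its tables, and the function name is hypothetical. Python writes the back-reference as group(2) where the Postfix map uses $2.

```python
import re

# Same pattern as in the virtual map entry above.
PATTERN = re.compile(r"^(.*)\+spam@([^@]*)\.example\.com$")

def reroute(address):
    """Return the per-subdomain spam mailbox, or the address unchanged."""
    m = PATTERN.match(address)
    if m is None:
        return address                # no entry matched: nothing is rewritten
    return f"spam-{m.group(2)}-box@example.com"
```

For instance, reroute("jim+spam@sub1.example.com") gives spam-sub1-box@example.com, whereas an address directly in example.com (with no subdomain) does not match, because the pattern requires a dot before example.com.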

For the Postfix local(8) LDA, a presence of a file $HOME/.forward+spam can redirect mail for user+spam to some dedicated file. For the Postfix virtual(8) LDA, a virtual_mailbox_maps may contain entries like:

user1         mbxfile1
user1+spam    mbxspamfile1
user2         mbxfile2
user2+spam    mbxspamfile2

Hard black- and whitelisting senders regarding spam

The blacklisting and the whitelisting are ways of telling that we already know that a message is spam or is ham (non-spam) just by examining the envelope sender address and comparing it to lists of known spammers or to lists of known legitimate senders of ham. It is a quick check, potentially saving us the trouble of examining the mail contents. It has a big drawback however in that the sender mail address can be (and often is) faked and there is no guarantee that the claimed sender address represents the actual sender.

The sender address is usually faked for spam messages, so whitelisting some sender address is of questionable value, and often lets in far more spam than it does good by approving legitimate mail. For a reliable way of permitting certain sending clients to send spammy mail see policy banks.

Blacklisting however is still useful: a spammer has no desire to pretend to be some blacklisted sending address when he can choose any other address. A genuine sender that is intentionally blacklisted can only avoid being blocked by falsifying his address (joining spammers in his methods) and sending non-spammy mail, the latter being our objective anyway. Although amavisd-new does provide blacklisting, it is functionally equivalent but more effective to blacklist senders at the MTA, preventing such mail from even entering the mail system.

It should be emphasized that whitelisting (and blacklisting) only affects spam checks. It has no influence on other checks such as virus, banned or header checks. Infected mail from whitelisted sender would still be blocked if our policy is to block viruses.

Another point to bear in mind is that the sender address examined is the one from the SMTP protocol, exactly as provided by the MTA to amavisd-new. It is known as the envelope sender address or return path. This address does not necessarily match the mail author's address from the mail header (From:) or the sender's address from the header (Sender:). This is most obvious with mail from mailing lists, where the envelope sender address is usually the address of a mailing list management service, while the author's address (From:) is the address of a person sending the message. Using the envelope sender address in most cases makes it easier to black- or whitelist mail from mailing lists, compared to guessing a sender address by parsing the mail header.

To avoid surprises, whitelisted sender suppresses inserting/editing the tag2-level header fields (X-Spam-*, Subject), appending spam address extension, and quarantining, even if we know the message is spam (e.g. because the spam check result on the same mail contents has been cached from some earlier mail or known from check on behalf of another recipient).

For mail from blacklisted senders, the effect is as if the spam level were artificially pushed high, resulting in 'X-Spam-Flag: YES', a high 'X-Spam-Level' bar and other usual reactions to spam, including possible rejection. If the message nevertheless still passes (e.g. for spam loving recipients), it is tagged as BLACKLISTED in the 'X-Spam-Status' header field, but the reported spam value and set of tests in this report header field are not adjusted (if available from SpamAssassin, which may or may not have been called).

If all recipients of a message either white- or blacklist the sender, amavisd is free to skip spam scanning (calling the SpamAssassin), saving on time. There is no guarantee however that spam scanning will actually and always be skipped.

The following variables (lists of lookup tables) are available, with the semantics and syntax as specified in README.lookups: @whitelist_sender_maps, @blacklist_sender_maps, which implement global policy applicable to all recipients. Similarly there are $per_recip_blacklist_sender_lookup_tables and $per_recip_whitelist_sender_lookup_tables, which make it possible for each recipient or subdomain to specify its own set of black- or whitelisted senders. The per-recipient tables take precedence over global tables.

For SQL lookups, amavisd-new will first look up the recipient in table users in order of descending priority, e.g. user@sub.domain.org, user, @.sub.domain.org, @.domain.org, @.org, and @. (which can be considered a catchall). Each matching recipient record may have a list of senders associated (through join on field users.id and wblist.rid). The sender address is then looked up in the associated list of senders (wblist) in order of descending priority, e.g. sender@sub.example.com, @.sub.example.com, @.example.com, @.com, and @. . This search stops at the first matching sender record with a non-NULL field wblist.wb. The value of a field wblist.wb from the matched record determines if the sender is considered whitelisted ('W'), blacklisted ('B') or neutral (' ') for this recipient.

The neutral value is there just as a way to explicitly stop the search, which may be used by a recipient to overrule site-wide or static white- or blacklisting defaults for some specific sender, and to explicitly neither whitelist nor blacklist the sender, letting the normal spam check determine the spamminess of a mail.

For recipient user@sub.domain.org and sender sender@sub.example.com the following search is performed:

user@sub.domain.org
  sender@sub.example.com @.sub.example.com @.example.com @.com @.

user
  sender@sub.example.com @.sub.example.com @.example.com @.com @.

@.sub.domain.org
  sender@sub.example.com @.sub.example.com @.example.com @.com @.

@.domain.org
  sender@sub.example.com @.sub.example.com @.example.com @.com @.

@.org
  sender@sub.example.com @.sub.example.com @.example.com @.com @.

@.
  sender@sub.example.com @.sub.example.com @.example.com @.com @.
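The search order listed above can be reproduced with a short Python model. It is a simplified sketch, not the SQL that amavisd-new issues: the function names are hypothetical, and the users and wblist tables are collapsed into one nested dictionary (recipient key to sender key to wb value) purely for illustration.

```python
def address_keys(address, with_local_part):
    """Lookup keys for an address in order of descending priority."""
    local, _, domain = address.partition("@")
    labels = domain.split(".")
    keys = [address]
    if with_local_part:               # only recipients are also tried as 'user'
        keys.append(local)
    keys += ["@." + ".".join(labels[i:]) for i in range(len(labels))]
    keys.append("@.")                 # catchall
    return keys

def hard_wb(recipient, sender, wblist):
    """Return 'W', 'B', ' ' (neutral) or None when nothing matches."""
    for rkey in address_keys(recipient, True):
        senders = wblist.get(rkey, {})
        for skey in address_keys(sender, False):
            wb = senders.get(skey)
            if wb is not None:        # first non-NULL match ends the search
                return wb
    return None
```

With a site-wide entry {'@.': {'@.example.com': 'B'}} and a personal entry {'user@sub.domain.org': {'sender@sub.example.com': ' '}}, the personal neutral value is found first, so that recipient overrules the site-wide blacklisting of the sender.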

Soft black- and whitelisting senders regarding spam -- @score_sender_maps

Instead of hard black- or whitelisting a sender address (unconditionally considering mail spam or ham solely based on sender address regardless of mail contents), a more gentle approach is to add score points (penalties) to the spam score for mail from certain senders or sending domains. Positive points lean towards blacklisting, negative towards whitelisting. This is much like adding SpamAssassin rules or using its white/blacklisting, except that here only envelope sender addresses are considered (not addresses in a mail header), and that score points can be assigned per-recipient (or per-domain or globally), and that the assigned penalties are customarily much lower than the default SpamAssassin white/blacklisting score.

The table structure of @score_sender_maps is similar to $per_recip_blacklist_sender_lookup_tables i.e. the first level key is recipient address, pointing to by-sender lookup tables. The essential difference is that scores from all matching by-recipient lookups (not just the first that matches) are summed to give the final score boost. That means that both the site and domain administrators, as well as the recipient can have a say on the final score.

For SQL lookups, the mechanism is much like the one described for hard black- or whitelisting, with the following differences:

Namely, amavisd will look up the recipient, e.g. user@sub.domain.org, user, @.sub.domain.org, @.domain.org, @.org, and @. . Since the search will not stop at the first recipient match, the search order in this case is unimportant, although it is actually the same descending-priority order as with hard b/w listing. Each matching recipient record may have a list of senders associated (through join on field users.id and wblist.rid). The sender address is then looked up in the associated list of senders (wblist) in order of descending priority, e.g. sender@sub.example.com, @.sub.example.com, @.example.com, @.com, and @. . This search stops at the first matching sender record with a non-NULL field wblist.wb, but this does not terminate the outer recipients search. Numeric values of a field wblist.wb from matched records are summed up across all matching recipients tables, and the result is added to the spam score as produced by SpamAssassin.

Unlike static tables, where hard and soft w/b-listing use separate tables, the SQL-based hard and soft w/b-listing uses the same SQL tables and the same field wblist.wb. Mixing the 'W', 'B' with numeric values is somewhat frowned upon, but is supported to facilitate transition. The search goes like described above as long as only numeric field values are encountered, summing up the values and adding the accumulated sum to the final score. If a non-numeric value of field wblist.wb is encountered during this search, its value (W or B or space) is interpreted as described for hard w/b listing, and the search stops at this point.
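The summing of soft scores across matching recipient records can be sketched in Python as follows. As before, this is a simplified model with hypothetical names and a nested dictionary standing in for the SQL tables. It covers only numeric wblist.wb values; the mixed case with 'W', 'B' or space is deliberately left out.

```python
def address_keys(address, with_local_part):
    """Lookup keys for an address in order of descending priority."""
    local, _, domain = address.partition("@")
    labels = domain.split(".")
    keys = [address] + ([local] if with_local_part else [])
    keys += ["@." + ".".join(labels[i:]) for i in range(len(labels))]
    return keys + ["@."]

def score_boost(recipient, sender, wblist):
    """Sum the first matching sender score of every matching recipient key."""
    total = 0.0
    for rkey in address_keys(recipient, True):    # outer search never stops early
        senders = wblist.get(rkey, {})
        for skey in address_keys(sender, False):
            points = senders.get(skey)
            if points is not None:                # first sender match for this
                total += points                   # recipient record is counted,
                break                             # then on to the next record
    return total
```

If the recipient assigns -3.0 to the sender's domain, the domain administrator assigns 1.5 to the exact sender and the site catchall adds 0.5, the boost added to the SpamAssassin score is -1.0.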

Configuration variables

The behaviour of the amavisd-new is controlled by a set of configuration variables, which are just normal module-global Perl variables (in package Amavis::Conf). At daemon startup time these variables are first assigned an initial value (often just an undefined value, the undef). The default values of configuration variables are documented in file amavisd.conf-defaults, which lists all configuration variables.

Next a configuration file amavisd.conf (or other file as specified by option -c) is read and interpreted by the Perl interpreter itself. The amavisd.conf is just a normal Perl program, and can in principle do whatever and however it pleases, but its main purpose is to assign values to configuration variables.

After execution of amavisd.conf is done, the daemon may correct some configuration variable values (mainly to maintain backwards compatibility with earlier versions of the configuration file), and may assign a default value to certain variables which are still undefined -- these variables and their default values are marked "after-defaults" in the documentation file amavisd.conf-defaults. The main reason for the existence of the "after-defaults" concept is that some default values depend on other configuration variables and cannot be computed before the amavisd.conf is finished. To force such variables to an off/false/disabled state, one needs to assign some false but defined value to them, such as '' (an empty string) or a 0 for booleans.

Perl variables always start with a character $, @ or % to indicate a type of variable. This leading character is part of the variable name for all practical purposes.

$ (dollar character)
indicates a scalar variable (a string, a number, a reference)
@ (at sign)
indicates an array variable (a list)
% (percent character)
indicates an associative array (also known as hash), which maps keys to values

A couple of Perl syntactical elements deserve mention at this point, as they are often used in the amavisd.conf configuration file.

"...", a double-quoted string
is a string; variables within are evaluated, e.g. "$MYHOME/tmp"
'...', a single-quoted string
is a string; variables within are not evaluated, the $ and @ lose their special meaning, e.g. 'user@example.com'
(...)
is a list of comma-separated expressions, e.g. (1,2,"test"); a list is normally assigned to an array variable
qw(string)
is an operator that interprets its argument as a single string, splits it on whitespace into words, and returns a list of words (strings); it is a convenience to avoid some typing, e.g. qw(user@example.com .example.net .org) is exactly equivalent to ('user@example.com', '.example.net', '.org')
[...]
is a reference to an anonymous list of comma-separated expressions, e.g. [1,2,"test"]; (note: a reference is a scalar)
{...}
is a reference to an anonymous associative array, e.g. {'alfa'=>1, 'beta'=>99, 'other'=>'test'}; (note: a reference is a scalar)
\variable
is a reference to a variable, e.g. \$virus_admin, \@mynetworks, \%whitelist_sender; (note: a reference is a scalar)
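
The elements listed above can be tried out in a standalone script. The value of $MYHOME and the variable names are illustrative only.

```perl
use strict;
use warnings;

my $MYHOME = '/var/amavis';            # illustrative value
my $tmpdir = "$MYHOME/tmp";            # double quotes: $MYHOME is evaluated
my $addr   = 'user@example.com';       # single quotes: @ stays literal
my @list   = (1, 2, "test");           # a list assigned to an array
my @words  = qw(user@example.com .example.net .org);
my $aref   = [1, 2, "test"];           # reference to an anonymous list
my $href   = { 'alfa' => 1, 'beta' => 99, 'other' => 'test' };
my $wref   = \@words;                  # reference to a named array

print "$tmpdir\n";                     # the interpolated path
print scalar(@words), "\n";            # number of words produced by qw
print "$aref->[2] $href->{'beta'} $wref->[0]\n";   # access through references
```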

Historically amavisd-new accessed all configuration variables directly with their name, e.g. %spam_lovers, @spam_lovers_acl, $spam_lovers_re. Later it became apparent that certain groups of variables (lookups) are always used together in the same way, so new array variables like @spam_lovers_maps were introduced. The program now never accesses old lookup table variables directly, but always through higher level lists. The solution is fully backwards compatible, as the default value for the new lists references the old variables, e.g.:

@spam_lovers_maps = (\%spam_lovers, \@spam_lovers_acl, \$spam_lovers_re);

The administrator is free to modify or replace the lists in variables like @spam_lovers_maps, perhaps rearranging the order or dropping all references to legacy variables and replacing them with other variables, often anonymous arrays/lists or anonymous associative arrays (hashes), or constants, which can serve as a convenient catchall default value when used last in the list.
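
As a sketch of such a replacement, the following amavisd.conf fragment mixes anonymous lookup tables, one legacy variable, and a constant as the last element. The addresses and domains are placeholders, and the particular choice and order of tables is only an example.

```perl
# amavisd.conf fragment; addresses and domains are placeholders
@spam_lovers_maps = (
  { 'postmaster@example.com' => 1 },  # anonymous hash instead of %spam_lovers
  [ '.lists.example.com' ],           # anonymous list instead of @spam_lovers_acl
  \$spam_lovers_re,                   # a legacy variable may still be referenced
  0,                                  # constant as catchall: not a spam lover
);
```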

Since amavisd-new version 2.0, there is one further generalization step in the way the program accesses configuration variables. More than a hundred configuration variables which control amavisd-new operation on a by-message level (as opposed to by-recipient and truly global settings) are now grouped in an associative array called a policy bank. These configuration variables are no longer accessed directly by their variable name by the program, but always through a currently installed policy bank. The administrator is free to modify the policy bank, normally by providing replacement policy banks and specifying under what conditions the replacement policy bank is to be automatically installed.

Policy banks

Policy banks hold sets of configuration variables controlling most of per-message settings, including: static lookup tables, IP interface access rules, forwarding address, log level, templates, administrator addresses, spam trigger levels, quarantine rules, lists of anti-virus scanner entries (or just a subset), banned names rules, defang settings, etc. The whole set of these settings may be replaced with another predefined set based on incoming port number, making it possible for one amavisd daemon to cope with more diverse needs of served user communities which could so far only be implemented by running more than one instance of the amavisd daemon, each with its own configuration file.

This mechanism opens new possibilities for the future: in principle policy banks could be swapped not only based on port number or SMTP client IP address, but on any characteristics pertaining to a mail message as a whole (not specific to each of its recipients), or to characteristics of a connection from a mailer (e.g. the interface address or protocol).

Until a better mechanism is available, a policy bank named 'MYNETS' has special semantics: this policy bank is loaded (if it exists) whenever MTA supplies an SMTP client's IP address (through the Postfix XFORWARD extension to the SMTP protocol, or via the new AM.PDP protocol) and that address matches the @mynetworks list (actually: the list referenced by the 'mynetworks_maps' key in the currently installed policy bank).

An associative array %interface_policy is a current mechanism of assigning a policy bank to an incoming TCP port number (port must be in the list @$inet_socket_port, otherwise amavisd will not listen on that port). Whenever a connection from MTA is received, first a built-in policy bank with an empty name -- the $policy_bank{''} gets loaded, which brings in all the global/legacy settings. Then it is overlaid by whatever configuration settings are in the bank named in the $interface_policy{$port} if any, and finally the policy bank named 'MYNETS' (i.e. settings from $policy_bank{'MYNETS'}) is overlaid if such policy bank exists and the SMTP client IP address is known (by XFORWARD SMTP extension command from MTA) and it matches the current mynetworks_maps.

When a new policy bank is overlaid over an existing set of configuration variables, the variables not present in the new policy bank retain their value. This makes it possible to specify new policy banks which carry only a minimal set of settings that need to be changed.
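
The overlay rule behaves like merging one Perl hash over another. The following standalone sketch is not amavisd-new's own code: the helper overlay_bank is hypothetical, and the destiny value is written as a plain string because the D_* constants are not available outside amavisd.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the overlay rule; amavisd-new's own
# code is not reproduced here.
sub overlay_bank {
  my ($current, $new) = @_;
  return { %$current, %$new };   # keys in $new win, all others are kept
}

my $current = {
  log_level            => 0,
  spam_kill_level_maps => [6.9],
  final_spam_destiny   => 'D_BOUNCE',   # placeholder string, see lead-in
};
my $minimal = { log_level => 2 };       # carries only what must change

my $effective = overlay_bank($current, $minimal);
print join(', ', sort keys %$effective), "\n";
print "log_level=$effective->{log_level}\n";
```

Only log_level changes; the two settings absent from the minimal bank retain their previous values.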

The built-in policy bank (with empty name) is predefined, and includes references to most other variables (the dynamic config variables), which are accessed only indirectly through the currently installed policy bank. Overlaying a policy bank with another policy bank may bring in references to entirely different variables, possibly unnamed, and may remove references to legacy variables if it so chooses.

Configuration variables are referenced from a policy bank (which is implemented as a perl associative array, i.e. a hash) by keys of the same name, e.g. { log_level => \$log_level, inet_acl => \@inet_acl, ...}. For scalars one level of indirection is allowed, e.g. a policy bank { log_level => \$log_level }; $log_level=2; is equivalent to { log_level => $log_level } or to { log_level => 2 }, but in the first example with an indirect reference, the $log_level may be assigned to even _after_ the policy bank has already been formed.
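
The difference between storing a reference and storing a value shows up when the variable is assigned after the bank has been formed. In this standalone sketch the accessor bank_value is hypothetical; it merely follows one level of scalar indirection in the way the paragraph above describes.

```perl
use strict;
use warnings;

my $log_level;                                     # still undefined
my $bank_by_ref   = { log_level => \$log_level };  # stores a reference
my $bank_by_value = { log_level => $log_level };   # copies undef right now

$log_level = 2;          # assigned after both banks were formed

# Hypothetical accessor: follows one level of scalar indirection
sub bank_value {
  my ($bank, $key) = @_;
  my $v = $bank->{$key};
  return ref $v eq 'SCALAR' ? $$v : $v;
}

printf "by reference: %s\n", bank_value($bank_by_ref,   'log_level') // 'undef';
printf "by value:     %s\n", bank_value($bank_by_value, 'log_level') // 'undef';
```

The bank holding a reference sees the later assignment, while the bank holding a copied value still holds undef.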

A word of caution: the syntax of entries within a policy bank hash is slightly different from assignments to configuration variables. This is because entries within policy bank are not assignments, but key=>value pairs as in any Perl associative array. And these pairs are delimited by commas, unlike statements, which are delimited by semicolons. Value is separated from its key by '=>' (or by a comma), whereas the assignment operator is '='. Keys of a policy bank are without leading $ or @ or %, unlike variable names. Values of an associative array can only be scalars (e.g. strings or numbers or references to arrays or references to associative array).

Compare:
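
A minimal sketch of the contrast, with illustrative values (the bank name 'EXAMPLE' is made up for this comparison):

```perl
# assignments to configuration variables:
# statements end with semicolons, names carry a leading $ or @
$log_level = 2;
@inet_acl  = qw( 127.0.0.1 );

# entries in a policy bank:
# key => value pairs separated by commas, keys carry no $ or @,
# and a list must be given as a reference because values are scalars
$policy_bank{'EXAMPLE'} = {
  log_level => 2,
  inet_acl  => [ qw( 127.0.0.1 ) ],
};
```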

And a final note: Perl can detect and report typing mistakes in variable names, but a mistyped key is just some unused associative array entry lurking in a hash, never used and never reported as mistyped/useless.
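
The following standalone script illustrates the difference, assuming the code is compiled under "use strict" (without it Perl would not report the misspelled variable either). The misspelling log_levle is deliberate.

```perl
use strict;
use warnings;

# A misspelled variable name is caught at compile time under "use strict".
my $compiled = eval q{
  use strict;
  my $log_level = 1;
  $log_levle = 5;        # typo: no such variable was declared
  1;
};
print "variable typo: ", ($compiled ? "accepted" : "rejected"), "\n";

# A misspelled hash key is just another key; nothing complains.
my $bank = { log_levle => 5 };       # typo goes unnoticed
print "key typo: ",
  (exists $bank->{log_level} ? "log_level is set" : "log_level is not set"),
  "\n";
```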

Putting policy banks to good use -- examples

The sender address can be faked, so comparing envelope sender address to @local_domains_maps or some other lookup table to base some important decisions on would not be trustworthy. The only reliable information is the recipient's e-mail address and information about client SMTP session, such as the IP address of the sending SMTP client and the server port number or the interface address. Such information can be made available by MTA to amavisd-new through a feeding protocol (e.g. XFORWARD extension or via AM.PDP), or separate MTA paths can be set up for mail that needs to be treated differently, such as internally originating and externally originating mail, or perhaps separating authenticated mail from the rest.

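Amavisd-new has two ways of receiving such extra information from MTA: explicitly, through attributes passed in the feeding protocol (the XFORWARD SMTP extension or AM.PDP), such as the original SMTP client's IP address; or implicitly, through the TCP port on which the MTA chooses to feed the message, which is how separate MTA paths are told apart.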

The following examples illustrate several ways of distinguishing between different mail origins. For most common purposes the only distinction that really matters is separating internally originating mail from the rest, and for this purpose the use of policy bank MYNETS and a sufficiently recent version of Postfix supporting XFORWARD suffices -- the complication with multiple ports and multiple interfaces is needed only for more demanding sites which prefer maximum flexibility.

Example 1

As stated earlier, a policy bank named 'MYNETS' is loaded (if it exists) whenever MTA supplies an original SMTP client's IP address (e.g. via the Postfix XFORWARD extension) and that address matches the @mynetworks list. This covers most common needs to distinguish internally-originating mail from the rest, and allows them to be treated differently, as illustrated by the following example:

$policy_bank{'MYNETS'} = {  # mail originating from @mynetworks
  virus_admin_maps => ["security\@$mydomain"], # alert of infected local hosts
  spam_admin_maps  => ["abuse\@$mydomain"],    # alert of internal spam
  spam_kill_level_maps => [7.0],  # slightly more permissive spam kill level
  spam_dsn_cutoff_level_maps => [15],
  banned_filename_maps => [
    new_RE(
    # block double extensions in names:
      qr'\.[^./]*\.(exe|vbs|pif|scr|bat|cmd|com|cpl|dll)\.?$'i,
    # allow any name or type (except viruses) within an archive:
      [ qr'^\.(Z|gz|bz2|rpm|cpio|tar|zip|rar|arc|arj|zoo)$' => 0],
    # blocks MS executable file(1) types, unless allowed above:
      qr'^\.(exe-ms)$',
    ),
  ],
};

Example 2

In the following example some of the external mail is coming in via fetchmail, the rest of the externally originating mail is coming in via normal SMTP at tcp port 25, and all internally originating mail is coming to MTA via mail submission port 587 reserved for that purpose, or via dedicated IP address accessible only from inside, or through a Postfix pickup service. We'll use Postfix in this example, although it does not rely on any particular Postfix capability that wouldn't be available in any general purpose MTA in some form or another.

Only the specifics of this setup are described here. Missing bits like the MTA re-entry port 10025 and other options are described in README.postfix and are assumed here. Specifying additional smtpd restrictions and options may be desired, and is omitted here for brevity.

To let amavisd-new be able to distinguish between all four mail entry routes, we let amavisd listen on four TCP ports (the fifth is for good measure, to be used in the next example): $inet_socket_port = [10040,10041,10042,10043,10044]; (any unused non-privileged TCP ports can be used)

In Postfix configuration file master.cf we attach different content_filter options to each of the Postfix services receiving mail. We'll assume the MTA host has two IP addresses 192.0.2.1 and 192.0.2.2 assigned (IP aliases or separate physical interfaces), which makes it easier to distinguish between internally originating mail and the rest even if XFORWARD can not be used (older Postfix versions or some other MTA):

# regular incoming mail, originating from anywhere (usually from outside)
# the MX record (or backup mailers) should point to this IP address
192.0.2.1:smtp inet  n  -  n  -  -  smtpd
  -o content_filter=amavisfeed:[127.0.0.1]:10040

# incoming mail from fetchmail, considered externally originating
# (add 'smtphost localhost/2345' to the poll section in .fetchmailrc)
127.0.0.1:2345 inet  n  -  n  -  -  smtpd
  -o content_filter=amavisfeed:[127.0.0.1]:10041
  -o smtpd_client_restrictions=permit_mynetworks,reject
  -o mynetworks=127.0.0.0/8

# IP address to be used by internal hosts for mail submission
192.0.2.2:smtp inet  n  -  n  -  -  smtpd
  -o content_filter=amavisfeed:[127.0.0.1]:10042
  -o smtpd_client_restrictions=permit_mynetworks,reject

# or, tcp port 587 to be used by internal hosts for mail submission
submission inet  n  -  n  -  -  smtpd
  -o content_filter=amavisfeed:[127.0.0.1]:10042
  -o smtpd_client_restrictions=permit_mynetworks,reject

# locally originating mail submitted on this host through a sendmail binary
pickup     fifo  n  -  n  60  1  pickup
  -o content_filter=amavisfeed:[127.0.0.1]:10043

A global option content_filter in file main.cf could provide a convenient default, only services that need a different setting would then need to override it.

Now let's make up names for policy banks which will cover all four cases. We'll pick names EXT, EXT-FM, INT, INT-HOST for policy banks. The amavisd needs to be told to load corresponding policy when a request comes in on each of the listening ports:

  $interface_policy{'10040'} = 'EXT';
  $interface_policy{'10041'} = 'EXT-FM';
  $interface_policy{'10042'} = 'INT';
  $interface_policy{'10043'} = 'INT-HOST';
  $interface_policy{'10044'} = 'AUTH';  # to be used in the next example

Next we'll prepare each policy and specify there the options which should be different from global options. Note that the following policies serve mostly as an example and to provide ideas -- they should not be considered a recommendation. For example:

# regular incoming mail, originating from anywhere (usually from outside)
$policy_bank{'EXT'} = {
  # just use global settings, no special overrides
};

# incoming mail from fetchmail, considered externally originating
$policy_bank{'EXT-FM'} = {
  log_level => 2,
    # no bounces for spam, not even for score below spam_dsn_cutoff_level_maps:
  final_spam_destiny => D_DISCARD,
};

# locally originating mail guaranteed to be from inside
$policy_bank{'INT'} = {
    # enable/redirect admin notifications for locally originating malware:
  virus_admin_maps => ["virusalert\@$mydomain"],
  spam_admin_maps  => ["virusalert\@$mydomain"],
    # be slightly more permissive on spam levels for mail from our hosts:
  spam_kill_level_maps => [7.0],
  spam_dsn_cutoff_level_maps => [15],
  final_virus_destiny => D_BOUNCE,  # (unless in viruses_that_fake_sender_maps)
  final_spam_destiny  => D_BOUNCE,  # (unless above spam_dsn_cutoff_level_maps)
  bypass_banned_checks_maps => [ 1 ],  # allow sending any file type or name
    # provide customized sender notifications for spam from our users:
  notify_spam_sender_templ => read_text("$MYHOME/notify_spam_sender.txt"),
};

# mail locally submitted on the host on which MTA runs
$policy_bank{'INT-HOST'} = {
    # NOTE: this is just an example; ignoring internally generated spam
    # may not be such a good idea, consider zombified infected local PCs
  bypass_spam_checks_maps   => [ 1 ],
  bypass_banned_checks_maps => [ 1 ],
  final_spam_destiny   => D_PASS,
  final_banned_destiny => D_PASS,
};

# authenticated mail (used by the next example)
$policy_bank{'AUTH'} = {
    # enable admin notifications for malware originating from our users:
  virus_admin_maps => ["virusalert\@$mydomain"],
  spam_admin_maps  => ["virusalert\@$mydomain"],
    # be slightly more permissive on spam levels for mail from our users:
  spam_kill_level_maps => [7.0],
  spam_dsn_cutoff_level_maps => [15],
  bypass_banned_checks_maps => [ 1 ],  # allow sending any file type or name
  final_bad_header_destiny => D_BOUNCE,  # block invalid headers
};

If not all four cases need to be distinguished, the same policy bank name (or none at all) can be assigned to more than one port. Also the MTA configuration can use the same amavisd port for more than one of its incoming services if there is no need for different settings.

Example 3

Besides setting different content_filter options for different Postfix services, one may use the option FILTER in Postfix lookup tables, as described in Postfix man pages access(5) and header_checks(5), to specify different content_filter settings based on various conditions, such as sender domain name or IP address, mail header fields, etc.

Consider the next example, which uses the FILTER settings to distinguish between internally originating mail, authenticated external mail, and the rest.

# global default:
content_filter=amavisfeed:[127.0.0.1]:10044

# note that permit_mynetworks only checks for key presence and ignores rhs
mynetworks = cidr:/etc/postfix/mynetworks-filter.cidr

smtpd_sender_restrictions =
  ... the usual rejects if any ...
  check_client_access cidr:/etc/postfix/mynetworks-filter.cidr
  permit_mynetworks
  permit_sasl_authenticated
  permit_tls_clientcerts
  check_sender_access regexp:/etc/postfix/filter-catchall.regexp

The check_client_access cidr:/etc/postfix/mynetworks-filter.cidr precedes the permit_mynetworks (which uses the same cidr table, but ignores the right-hand side), and it serves to override the global content_filter setting by the use of FILTER for each of the networks (presumably internal) listed in mynetworks-filter.cidr. The final effect is that mail matching networks listed in mynetworks-filter.cidr will be sent for content filtering to tcp port 10042 (the FILTER setting in access map), authenticated non-local mail will be sent for content filtering to port 10044 (the global setting), while all the rest will be sent to port 10040 (as specified in catchall filter). If there are any other overrides in master.cf like in the previous example, they take precedence over the global settings, but the FILTER rules take the ultimate precedence.

/etc/postfix/mynetworks-filter.cidr :

127.0.0.0/8    FILTER amavisfeed:[127.0.0.1]:10042
10.0.0.0/8     FILTER amavisfeed:[127.0.0.1]:10042
172.16.0.0/12  FILTER amavisfeed:[127.0.0.1]:10042
192.168.0.0/16 FILTER amavisfeed:[127.0.0.1]:10042

/etc/postfix/filter-catchall.regexp:

/^/            FILTER amavisfeed:[127.0.0.1]:10040

Note that in place of the last catchall entry: check_sender_access regexp:/etc/postfix/filter-catchall.regexp one would be tempted to do: check_sender_access static:FILTER amavisfeed:[127.0.0.1]:10040, but unfortunately spaces are not allowed within an option value in master.cf, so we have to resort to a lookup table.

$max_requests

Amavisd-new runs under process control of Net::Server. This is a pre-forked environment where $max_servers child processes are constantly kept alive and ready to accept new tasks (mail messages to be checked). Each amavisd child process is able to handle several tasks in a row, which helps to reduce startup (fork) costs. In case of SMTP or LMTP protocol, each session may consist of several SMTP/LMTP transactions. Each SMTP/LMTP transaction is counted as one task, regardless of whether it came in from the same SMTP/LMTP client in a multi-transaction session, or as separate sessions, possibly from different SMTP/LMTP clients.

A configuration variable $max_requests (default value 20) controls the approximate number of tasks each child process is willing to handle. After that the child process terminates and Net::Server provides a new child process to take its place.

The exact value of $max_requests is not critical. There are two opposing needs, and some in-between value should be chosen.

On the low side, the number should not be too small in order for the startup cost to be averaged out / sufficiently diluted over an entire child lifetime. A value above 5 or 10 meets this goal in most amavisd-new configurations.

On the high side, the value depends on the amavisd-new configuration. The amavisd daemon itself is conservative in its use of dynamically allocated memory and does not load mail into memory, but keeps mail being processed and its components on files. Similarly, most of the called external virus scanners and decoders are rational in their use of memory (a notable exception was Archive::Tar which was used if a pax or cpio command was not available, but is no longer supported). Unfortunately this is not true for Perl module Mail::SpamAssassin, which expects to have an entire decoded mail in memory in order to be able to run its large set of rules on it in reasonable time. This is a design decision of SpamAssassin.

When amavisd-new is not configured to use SpamAssassin, the value of $max_requests can be quite high without any known or expected problems. For general sanity reasons, an upper limit could be 100 for example, although anything above 20 or so would not bring measurable benefit to the maximum sustained mail throughput.

When amavisd-new is configured to use SpamAssassin however, the slurping of entire mail in memory and decoding it may have implications, depending on the $sa_mail_body_size_limit value, on the maximum mail size allowed at the MTA (e.g. Postfix setting for message_size_limit) and on the mail compression factor. Even though the allocated memory is reclaimed by Perl after mail processing, and is reused for subsequent processing, the process virtual memory footprint never shrinks, it can only expand as needed.

The $sa_mail_body_size_limit sets a limit on the mail size beyond which SpamAssassin is not called, so it cannot contribute to memory usage much beyond this limit, times a small factor (2-5?, due to multiple internal representations of a message). If the $sa_mail_body_size_limit is large, and MTA mail size is not limited, or if mail has a huge mail header, the memory footprint can become noticeable. For the rest of its lifetime the child process that processed the mail stays at its high virtual memory size. If this happens frequently, host resources may become scarce. Limiting the number of tasks is very much desirable in this case.
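
A rough back-of-the-envelope calculation of this contribution can be done as follows. All numbers are assumptions chosen for illustration (a 512 KiB size limit, four child processes, and the 2 to 5 range of the "small factor" mentioned above); the result covers only the message text held by SpamAssassin, not the base footprint of a child process.

```perl
use strict;
use warnings;

# All numbers are assumptions for illustration only.
my $sa_mail_body_size_limit = 512 * 1024;   # bytes passed to SpamAssassin
my @factor      = (2, 5);                   # range of the "small factor"
my $max_servers = 4;                        # number of child processes

for my $f (@factor) {
  my $per_child = $sa_mail_body_size_limit * $f;
  printf "factor %d: %.1f MiB per child, %.1f MiB for %d children\n",
    $f, $per_child / 2**20, $per_child * $max_servers / 2**20, $max_servers;
}
```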

The default value of 20 for $max_requests was chosen as a good compromise between averaging out the startup costs and not wasting too many resources on hosts with a high message size limit and SpamAssassin enabled.

In a setup with Postfix where its lmtp client is chosen to feed amavisd-new, this client tries to keep the LMTP session open and submit several mail messages in multiple transactions. With recent Postfix versions its SMTP client is capable of, and willing to use, multi-transaction sessions as well, although it seems to be less persistent than the LMTP client.

According to SMTP and LMTP protocol specifications, dropping the session on the server side is considered rude and should be used only as a last resort. In order to respect the $max_requests setting (which is not strictly enforced by amavisd, and is considered an advisory value), the client side should preferably be configured with a comparable limit. Starting with amavisd-new-2.2.0 the amavisd daemon is more strict in enforcing the limit and drops the SMTP or LMTP session after $max_requests is exceeded by one. This was a recommendation from the Postfix community, as the option of reducing Postfix max_use setting is considered less appropriate.

Nevertheless, Postfix doesn't take session dropping lightly: it backs off for a while after a content filter forcibly drops the session, which is undesired. Better behaviour is achieved when Postfix voluntarily terminates an SMTP session before amavisd would reach its $max_requests limit. This can be achieved by applying max_use to the Postfix smtp service feeding a content filter (typically this entry in master.cf is named 'amavisfeed').

Setting up DKIM mail signing and verification

The DKIM standard (RFC 4871) states the following, which applies to its predecessor DomainKeys (historical: RFC 4870) as well:

DomainKeys Identified Mail (DKIM) defines a mechanism by which email messages can be cryptographically signed, permitting a signing domain to claim responsibility for the introduction of a message into the mail stream. Message recipients can verify the signature by querying the signer's domain directly to retrieve the appropriate public key, and thereby confirm that the message was attested to by a party in possession of the private key for the signing domain.

The DomainKeys specification was a primary source from which the DomainKeys Identified Mail [DKIM] specification has been derived. The purpose in submitting the RFC 4870 document is as an historical reference for deployed implementations written prior to the DKIM specification.

The main advantage of DKIM signing to sending domains is that it allows recipients to reliably validate mail origin for purposes of whitelisting on spam checks and whitelisting reception of otherwise banned mail contents. By signing outbound mail you give your correspondents a chance to distinguish between your genuine mail, and fraud or spam mail which may happen to carry your domain name as a sender address. Signing outbound mail is a kind gesture towards recipients, making it much easier for them to treat your mail as important or desirable if they choose so.

The main advantage of DKIM signature verification to recipients is that it allows them to reliably distinguish genuine mail originating from a claimed sending domain from other (possibly faked) mail. It makes signature-based whitelisting a reliable mechanism. It also makes it possible to recognize and automatically discard fake mail claiming to be from domains which are known to always sign their outbound mail and to always send mail directly. Coupled with reputation schemes (mostly manual/static at present, or dynamic in the future), it makes it possible to assign score points (positive or negative) based on merit and past experience with each signing domain. A valid signature also offers non-repudiation: a domain which signed a message cannot disclaim message origin, which gives the recipient a strong argument when reporting abuse to the signing domain.

For the impatient - signing from scratch

Here is a quick spartan setup of DKIM signing and DKIM/DK verification by amavisd for the impatient, without much explanation, assuming all originating mail comes from internal networks (not from authenticated roaming clients), only one domain needs signing, using default signature tags, no milters are in use and no mailing list manager needs signing. No changes in Postfix configuration are necessary for this simple setup. For more information and more complex setups please see sections further on.

Generate a signing key:

  $ amavisd genrsa /var/db/dkim/example-foo.key.pem

add to amavisd.conf:

  $enable_dkim_verification = 1;
  $enable_dkim_signing = 1;
  dkim_key('example.com', 'foo', '/var/db/dkim/example-foo.key.pem');
  @dkim_signature_options_bysender_maps = (
    { '.' => { ttl => 21*24*3600, c => 'relaxed/simple' } } );
  @mynetworks = qw(0.0.0.0/8 127.0.0.0/8 10.0.0.0/8 172.16.0.0/12
                   192.168.0.0/16);  # list your internal networks

run:

  $ amavisd showkeys

add the public key (as displayed) to your DNS zone, increment the SOA serial number and reload DNS; then test signing and a published key:

  $ amavisd testkeys

if all went well:

  $ amavisd reload

For the impatient - replacing signing by dkim-milter with signing by amavisd

For sites already signing their mail by dkim-milter, most of the work of preparing signing keys and publishing public keys in DNS has already been done. All that needs to be done is to declare these signing keys in amavisd.conf and turn on $enable_dkim_signing.

To facilitate transition of DKIM signing from dkim-milter to amavisd-new, a new command-line tool is available with amavisd-new-2.6.2 (the extra utility code is not loaded during normal operation), taking a file name as its argument, e.g.:

  $ amavisd convert_keysfile /var/db/dkim/keysfile.txt

and writing to stdout a set of lines that may be directly included into the amavisd.conf configuration file, matching semantics of a dkim-filter keys file. It can be useful during transition, or for those who prefer to specify signing keys and sender-to-key mappings as a file in a syntax compatible with options -K -k of dkim-filter, and can live with limitations of such syntax. See dkim-filter(8) man page for details on the syntax.

The produced output consists of signing key declarations (calls to a procedure dkim_key), where each call normally corresponds to exactly one DNS resource record publishing a corresponding DKIM public key. When necessary output also produces an assignment to a list of lookup tables @dkim_signature_options_bysender_maps, which supplies non-default mappings of sender domains to signing keys, e.g. when third-party signatures are desired.
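
For orientation, the two kinds of statements look roughly as follows when written by hand. This is not captured output of convert_keysfile: the domains, selector and file names are placeholders, and the use of the d option to request a third-party signature is an assumption to be checked against the release notes of the installed version.

```perl
# amavisd.conf fragment; domains, selector and paths are placeholders

# one dkim_key call per published DNS resource record
dkim_key('example.com', 'sel1', '/var/db/dkim/example-com.key.pem');
dkim_key('example.org', 'sel1', '/var/db/dkim/example-org.key.pem');

# non-default mapping of a sender domain to a signing domain (assumption:
# an explicit 'd' selects the signing domain, here a third-party signature)
@dkim_signature_options_bysender_maps = ( {
  'example.net' => { d => 'example.com' },
} );
```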

Implementation and mail flow

Signing of originating mail (or mail being redistributed by our domain), and verifying signatures of incoming mail are two tasks that can be performed by the same program, or they can be performed by separate entities. Traditionally with sendmail, both tasks are performed by one milter, which may be easier to maintain, but has certain disadvantages.

Verifying signatures should be performed early, before any local mail transformations get a chance of invalidating a signature, e.g. by performing MIME conversions to quoted-printable, by fixing a syntactically invalid mail header section, by reformatting or reordering some header fields (some MTAs do it frivolously), by modifying/inserting/removing certain header fields, or by a local mailing list modifying mail text, e.g. by appending footers.

Signing outgoing mail should be performed late, after mail sanitation, after conversion to 7-bit characters (to avoid later uncontrollable changes by a relaying or receiving MTA), and after editing of the header section by a content filter. The same applies to local mailing lists, which may be rewriting messages, requiring them to be re-signed by the domain hosting a mailing list, just before being sent out.

Starting with amavisd-new version 2.6.0, DKIM signing can be performed directly by amavisd (using a Perl module Mail::DKIM, which is the same module as used by DKIMproxy and by SpamAssassin). Signing directly by amavisd reduces setup complexity compared to using a milter or DKIMproxy, and avoids additional data transfers. Regarding mail flow through the system there are similarities between signing in amavisd and signing by dkim-milter, which is why the diagram below shows both possibilities.

For verification there are three choices: either amavisd itself can do it by calling Mail::DKIM directly, or a SpamAssassin plugin can do it by calling the same Perl module, or a milter in verification-only mode can be invoked by an incoming Postfix smtpd service.

An advantage of invoking signature verification by amavisd is that all mail is checked for signatures, regardless of whether SpamAssassin is called or not. Typically messages beyond a certain size are not passed to SpamAssassin, and neither are infected messages or identified bounces. Amavisd also offers loading of policy banks based on valid DKIM/DK signatures (e.g. allowing some domains to send in otherwise banned files, or whitelisting on spam), offers to add score points based on signing domain reputation, and adds an Authentication-Results header field (like dkim-milter does).
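
Loading a policy bank on a valid signature might be configured along the following lines. This is a sketch under assumptions: the variable name @author_to_policy_bank_maps is taken from memory of the amavisd-new 2.6 release notes and should be verified against the installed version, the bank name 'DKIM-TRUSTED' is made up, and example.com is a placeholder domain.

```perl
# amavisd.conf fragment; a sketch, verify names against your version

# assumption: maps an author domain with a valid signature to a bank name
@author_to_policy_bank_maps = ( {
  'example.com'  => 'DKIM-TRUSTED',
  '.example.com' => 'DKIM-TRUSTED',   # subdomains as well
} );

# hypothetical bank: let validly signed mail from these domains carry
# otherwise banned files and skip spam checks
$policy_bank{'DKIM-TRUSTED'} = {
  bypass_banned_checks_maps => [ 1 ],
  bypass_spam_checks_maps   => [ 1 ],
};
```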

Invoking signature verification by SpamAssassin has an advantage that DKIM-based or DomainKeys-based whitelisting or scoring can be used, but has a disadvantage that possibly not all mail is checked (e.g. large mail and infected mail may be exempt from spam checks). Performing the same signature validation task twice (by amavisd and by SA) may seem wasteful, but in practice it is not too bad: thanks to DNS server caching a network lookup for a public signing key is only done once, and as SpamAssassin does not receive large mail for processing, its signature verification is very quick: few milliseconds for non-signed mail, and of the order of a tenth of a second for signed mail.

Invoking signature verification by calling a milter from incoming smtpd service has an advantage that it has the best chance of seeing mail in its pristine form (before canonical and virtual mapping or masquerading by MTA, regardless of their settings). Because it is poorly integrated with the rest of the chain (e.g. with SpamAssassin rules and amavisd policy banks), and because it adds one extra data transfer, it is mainly still useful as a way to double-check the correctness of DKIM validation by having two independent implementations in use, each inserting its independently derived Authentication-Results header field into passed mail.

To sign as late as possible with a dkim-milter, the signing milter can be invoked by a Postfix smtpd service which is receiving content-checked mail from a content filter such as amavisd-new. As this second-stage smtpd service does not reliably know how a given message came into a mail system and whether it is supposed to be signed or not, a clean solution is to provide two (or more) parallel paths through the MTA and through a content filter, one used for mail that is eligible for being signed (originating mail), the other for all the rest. This same dual-path approach through amavisd is beneficial for signing by amavisd too, for the same reason: it gives the code that chooses a signature a reliable source of information on mail origin:

              +------+
              |verify|          (verify)
              +--+---+              | (by amavisd and/or SA)
                ^^^ milter          |
incoming:       |||             +---v-------+
  MX ---->  25 smtpd ---> 10024 >           >---> 10025 smtpd -->
                 ||             |           |
  SASL -->  25 smtpd \          |  amavisd  | (notifications)
submission        |   +->       |           >--->_
  mynets->  25 smtpd ---> 10026 >ORIGINATING>---> 10027 smtpd -->
submission            +->       +-------^---+            |
       --> 587 smtpd /  :               |                v milter
                       (convert         |             +------+
                       to 7-bit)      (sign)          | sign |
                                                      +------+

There are other benefits to providing two parallel paths: a content filter may be configured to apply different rules and settings to mail that is known to originate from our users. Some suggestions: apply less strict banning rules, enable spam administrator notifications for internally originating spam and viruses, make SpamAssassin rules conditional on the amavisd-new policy banks loaded, etc.

Configuring multiple mail paths in Postfix

Here is one way of configuring Postfix to provide two paths through a content filter. Locally submitted or authenticated mail will go to port 10026 of the content filter and will be signed on its way out (either by amavisd or by a signing milter). All other (incoming) mail will be directed to port 10024 for normal content filtering, and will not be eligible for signing.

main.cf:

  # on re-queueing of a message smtpd_*_restrictions do not apply,
  # so we'd better provide a safe default for a content_filter,
  # even at an expense of later flipping the choice twice
  # (which adds a bit to log clutter, but never mind)
  #
  content_filter = amavisfeed:[127.0.0.1]:10024

  # each triggered FILTER deposits its argument into a
  # content_filter setting, the last deposited value applies
  #
  smtpd_sender_restrictions =
    check_sender_access regexp:/etc/postfix/tag_as_originating.re
    permit_mynetworks
    permit_sasl_authenticated
    permit_tls_clientcerts
    check_sender_access regexp:/etc/postfix/tag_as_foreign.re

  # Make sure to assign FILTER tags in restrictions which
  # are only invoked once per message, e.g. client or sender
  # restrictions, but NOT on smtpd_recipient_restrictions,
  # as a message may have multiple recipients, so multiple
  # passes through FILTER tag assignments can yield a
  # surprising (and incorrect) result.

/etc/postfix/tag_as_originating.re:

  /^/  FILTER amavisfeed:[127.0.0.1]:10026

/etc/postfix/tag_as_foreign.re:

  /^/  FILTER amavisfeed:[127.0.0.1]:10024
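The combined effect of the default content_filter, the two access tables and the permit_* restrictions can be traced with a small model. The following Python sketch is only an illustration of the evaluation order described in the comments above, not Postfix code; the function name and its boolean parameters are hypothetical.

```python
# Toy model of how the smtpd_sender_restrictions shown above end up
# choosing a content filter. Not Postfix code: all names are hypothetical.

DEFAULT_FILTER = "amavisfeed:[127.0.0.1]:10024"      # content_filter in main.cf
ORIGINATING_FILTER = "amavisfeed:[127.0.0.1]:10026"  # tag_as_originating.re
FOREIGN_FILTER = "amavisfeed:[127.0.0.1]:10024"      # tag_as_foreign.re


def choose_content_filter(in_mynetworks=False, sasl_authenticated=False,
                          tls_client_cert=False):
    """Return the content filter in effect after the restrictions ran."""
    content_filter = DEFAULT_FILTER
    # 1. tag_as_originating.re matches every sender (/^/), so its FILTER
    #    argument is deposited first.
    content_filter = ORIGINATING_FILTER
    # 2. A matching permit_* ends evaluation of this restriction list,
    #    so the originating tag stays in effect.
    if in_mynetworks or sasl_authenticated or tls_client_cert:
        return content_filter
    # 3. Otherwise tag_as_foreign.re is reached, and the last deposited
    #    FILTER value applies.
    content_filter = FOREIGN_FILTER
    return content_filter
```

In this model an authenticated client, e.g. choose_content_filter(sasl_authenticated=True), ends up on the port 10026 path, while a client matching none of the permit_* conditions is flipped back to port 10024.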

In master.cf set up two listening smtpd services for receiving filtered mail from amavisd (as per README.postfix), one on tcp port 10025 (for inbound mail) and the other on port 10027 (for originating mail). If a signing milter is in use it will be attached to the smtpd service on port 10027 only. If no milters are in use and signing is done by amavisd, both smtpd services can have exactly the same settings, and in fact only one suffices, in which case redirecting $forward_method and $notify_method to 'smtp:[127.0.0.1]:10027' in the later example can be disregarded.

Configuring multiple mail paths in amavisd

In amavisd.conf two parallel paths need to be provided, one receiving on port 10024 and forwarding to 10025, the other receiving on port 10026 and forwarding to 10027.

  $inet_socket_port = [10024,10026];  # listen on two ports

The 10024>10025 path will be controlled by a default policy bank, the other (10026>10027), dedicated to mail intended to be signed, will use a policy bank (arbitrarily) named ORIGINATING:

  $forward_method = 'smtp:[127.0.0.1]:10025';  # MTA with non-signing service
  $notify_method  = 'smtp:[127.0.0.1]:10027';  # MTA with signing service

  # switch policy bank to 'ORIGINATING' for mail received on port 10026:
  $interface_policy{'10026'} = 'ORIGINATING';

  $policy_bank{'ORIGINATING'} = {  # mail originating from our users
    originating => 1,  # indicates client is ours, allows signing
    #
    # force MTA to convert mail to 7-bit before DKIM signing
    # to avoid later conversions which could destroy signature:
    smtpd_discard_ehlo_keywords => ['8BITMIME'],
    #
    # forward to a smtpd service providing DKIM signing service
    # (if using a signing milter instead of signing by amavisd):
    forward_method => 'smtp:[127.0.0.1]:10027',
    #
    # other special treatment of locally originating mail,
    # just some suggestions here:
    spam_admin_maps  => ["spamalert\@$mydomain"],  # warn of spam from us
    virus_admin_maps => ["virusalert\@$mydomain"],
    banned_filename_maps => ['ALT-RULES'],         # more relaxed rules
    spam_quarantine_cutoff_level_maps => undef,    # quarantine all spam
    spam_dsn_cutoff_level_maps => undef,
    spam_dsn_cutoff_level_bysender_maps => # bounce to local senders only
      [ { lc(".$mydomain") => undef,  '.' => 15 } ],
  };
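How a listening port selects a policy bank, and how that bank overrides only the settings it names, can be modelled in a few lines. The Python sketch below is a simplified illustration and not amavisd code; the dictionaries mirror only a handful of the settings above, and the function name is made up.

```python
# Toy model: pick a policy bank by listening port and overlay it on the
# global settings. Not amavisd code; names are illustrative only.

GLOBAL_SETTINGS = {
    "forward_method": "smtp:[127.0.0.1]:10025",
    "notify_method": "smtp:[127.0.0.1]:10027",
    "originating": False,
}

# counterpart of $interface_policy{'10026'} = 'ORIGINATING'
INTERFACE_POLICY = {10026: "ORIGINATING"}

# counterpart of $policy_bank{'ORIGINATING'} (abridged)
POLICY_BANKS = {
    "ORIGINATING": {
        "originating": True,
        "forward_method": "smtp:[127.0.0.1]:10027",
    },
}


def effective_settings(port):
    """Return the settings that apply to mail received on a given port."""
    settings = dict(GLOBAL_SETTINGS)            # start from the defaults
    bank_name = INTERFACE_POLICY.get(port)
    if bank_name is not None:
        settings.update(POLICY_BANKS[bank_name])  # bank overrides defaults
    return settings
```

Mail arriving on port 10024 keeps the default forward_method (port 10025), while mail arriving on port 10026 is marked as originating and is forwarded to port 10027; settings not named in the bank, such as notify_method, are left untouched.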

The smtpd_discard_ehlo_keywords=>['8BITMIME'] serves to persuade Postfix to convert mail to 7-bit quoted-printable before submitting it to content filtering and signing. Avoiding 8-bit characters in mail body makes signatures less susceptible to breaking by some relaying or receiving MTA over which we have no control. The same effect (making Postfix convert outgoing mail to 7-bit before DKIM signing) could be achieved by a Postfix setting smtp_discard_ehlo_keywords=8bitmime on an smtp service feeding mail-to-be-signed to amavisd, but this would require setting up two such services, one with the option and one without.

Note that 8-bit to 7-bit conversion may break an S/MIME or PGP signature, so if users sign their mail with S/MIME or PGP it may not be desirable to let Postfix do the conversion, and it may be acceptable to take the risk that a remote MTA will clobber signatures if it decides to convert the mail text to 7-bit QP. The only reliable solution in this case is to configure MUA clients to stick to 7-bit characters/encodings before generating S/MIME or PGP signatures.
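Why a conversion to quoted-printable invalidates a signature that was computed over the 8-bit form can be seen from a small Python sketch. It uses only the standard quopri and hashlib modules; the sample text is made up, and a plain SHA-256 digest stands in for the body hash of a real signature (an assumption for illustration, no canonicalization is applied).

```python
import hashlib
import quopri

# A made-up 8-bit body: the accented letters are bytes above 0x7F in UTF-8.
body_8bit = "Un café, s'il vous plaît.\n".encode("utf-8")

# Quoted-printable encoding replaces each such byte with an =XX escape.
body_7bit = quopri.encodestring(body_8bit)


def is_7bit(data):
    """True if every byte of data fits into 7 bits."""
    return all(byte < 0x80 for byte in data)


# Digests of the body before and after the conversion; a signature made
# over one form cannot match the other form.
digest_before = hashlib.sha256(body_8bit).hexdigest()
digest_after = hashlib.sha256(body_7bit).hexdigest()
```

The text decodes back to the same characters, but the transmitted bytes differ, so the two digests differ as well. Converting before signing means a later relay has no reason to re-encode the body.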

The following text from the Postfix documentation file MILTER_README should be disregarded, because amavisd is 8-bit clean and we do want Postfix to convert to 7-bit on the signing path but not on the other path:

  Content filters may break domain key etc. signatures. If you use an SMTP-based content filter, then you should add a line to master.cf with "-o disable_mime_output_conversion=yes", as described in the advanced content filter example.

While testing how the configured system plays with some mailing lists (such as postfix-users or SpamAssassin users list), one has to keep in mind that amavisd-new caches spam checking results of recently seen message bodies: a mail going out to a mailing list is not yet signed as it reaches a content filter, but the SpamAssassin verdict is remembered at that point (claiming the message is not signed). When this message with unchanged body comes back from a mailing list, this time signed in the header section by our domain, the signature should prove correct, yet the cached result from a minute ago still claims the message is not signed. If this is of concern, one can turn off caching of spam checking results for ham by setting: $spam_check_negative_ttl = 0;
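The caching effect described above can be modelled with a toy cache keyed by a digest of the message body. This Python sketch is not how amavisd implements its cache; the class, its methods and the choice of SHA-256 are assumptions made for illustration. Only the role of the TTL follows the text: a value of 0 means ham verdicts are never remembered.

```python
import hashlib
import time


class SpamResultCache:
    """Toy cache of spam verdicts keyed by a digest of the message body."""

    def __init__(self, negative_ttl):
        self.negative_ttl = negative_ttl   # seconds to remember ham verdicts
        self._entries = {}                 # body digest -> (verdict, expiry)

    @staticmethod
    def _key(body):
        return hashlib.sha256(body).hexdigest()

    def store_ham(self, body, verdict, now=None):
        """Remember a verdict for a ham body, unless caching is disabled."""
        now = time.time() if now is None else now
        if self.negative_ttl > 0:          # a TTL of 0 disables caching
            self._entries[self._key(body)] = (verdict, now + self.negative_ttl)

    def lookup(self, body, now=None):
        """Return a still-valid cached verdict, or None."""
        now = time.time() if now is None else now
        entry = self._entries.get(self._key(body))
        if entry is not None and now < entry[1]:
            return entry[0]
        return None
```

With a positive TTL, the verdict stored for the outgoing (not yet signed) message is returned again when the same body comes back from the list; with a TTL of 0 the lookup finds nothing and the returning message is checked afresh.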

While on the topic of providing multiple paths through amavisd, when one has to deal with a mailing list manager (e.g. Mailman) in the same setup, and re-signing of its fan-out mail is desired, it may be useful to add a third path through amavisd, this one stripped down to bare bones, providing only DKIM signing and nothing else (no virus or spam checks, no decoding), as these checks were already done once on mail before it reached a mailing list manager. Here is one possibility, accepting mail on port 10028 and sending it to 10025:

  $inet_socket_port = [10024,10026,10028];

  $interface_policy{'10028'} = 'NOCHECKS';

  $policy_bank{'NOCHECKS'} = {  # no checks, just DKIM signing
    originating => 1,  # allows signing
    forward_method => 'smtp:[127.0.0.1]:10025',
    smtpd_greeting_banner =>
      '${helo-name} ${protocol} ${product} NOCHECKS service ready',
    mynetworks_maps => [],  # avoids loading MYNETS policy unnecessarily
    os_fingerprint_method => undef,
    penpals_bonus_score => undef,
    bounce_killer_score => 0,
    bypass_decode_parts => 1,
    bypass_header_checks_maps => [1],
    bypass_virus_checks_maps  => [1],
    bypass_spam_checks_maps   => [1],
    bypass_banned_checks_maps => [1],
    spam_lovers_maps          => [1],
    banned_files_lovers_maps  => [1],
    archive_quarantine_to_maps => [],
    remove_existing_x_scanned_headers => undef,
    remove_existing_spam_headers => undef,
    signed_header_fields => { 'Sender' => 1 },
  };

Hooking-in dkim-milter (optional)

This section can be ignored when all DKIM signing and verification is to be done by amavisd, and dkim-milter will not be used. It is mainly provided for compatibility reasons, retaining the old documentation section.

Let's begin by starting a dkim milter in two instances, one dedicated to signing, the other to verification. For security reasons all milters should run under a dedicated username, certainly not as root, not as user amavis and not as user postfix or mail:

verifying:

  dkim-filter -u dkfilter -b v \
    -l -p inet:4443@127.0.0.1 -P /var/run/dkim-filter-v.pid

signing:

  dkim-filter -u dkimfilter -b s -m ORIGINATING \
    -c relaxed/simple -S rsa-sha1 \
    -d example.com -s myselector -k /var/db/dkim/mykey.pem \
    -l -p inet:4445@127.0.0.1 -P /var/run/dkim-filter-s.pid

Generating a pair of public and private keys and publishing the public key in DNS is described in the dkim milter documentation and also in the DKIM RFC document.

We are not specifying option -i to milters; the default of -i 127.0.0.1 suits our setup just fine, as mail to be signed is coming from a content filter, usually on a loopback interface from the IP address 127.0.0.1.

Now we can tie the verifying milter to a Postfix smtpd service listening for incoming mail:

master.cf:

  smtp inet n - n - 300 smtpd
    -o milter_default_action=accept
    -o milter_macro_daemon_name=MTA
    -o smtpd_milters=inet:127.0.0.1:4443

and tie the signing milter to a Postfix smtpd service that is receiving checked mail from amavisd, intended to be signed:

master.cf:

  # mail return from a content filter (non-signing)
  10025 inet n - n - - smtpd
    -o content_filter=
    ... (other options, mail not to be signed) ...

  # mail from our users returning from a content filter (DKIM signing)
  10027 inet n - n - - smtpd
    -o content_filter=
    ... (other options, mail intended to be signed) ...
    -o milter_default_action=accept
    -o milter_macro_daemon_name=ORIGINATING
    -o smtpd_milters=inet:127.0.0.1:4445

As a sidenote, attaching milters to sendmail would use the same order of invocations: signature verifying milter first, content filters next, and signing milter last, for example:

  dnl Verifiers:
  INPUT_MAIL_FILTER(`dkim-filter-v', `S=inet:4443@127.0.0.1, T=R:2m')

  dnl Content filter:
  INPUT_MAIL_FILTER(`amavisd-milter',
    `S=unix:/var/amavis/amavisd-milter.sock, F=T, T=S:10m;R:10m;E:10m')

  dnl Signers:
  INPUT_MAIL_FILTER(`dkim-filter-s', `S=inet:4445@127.0.0.1, T=R:2m')

Setting up DKIM signature verification in amavisd

Starting with 2.6.0, verification of DKIM signatures (and historical DomainKeys signatures) is provided directly by amavisd (not only by the SpamAssassin DKIM plugin). The required version of the Perl module Mail::DKIM is 0.31 or later, but 0.33 or later is recommended. Signature verification is sufficiently fast that there is no need for concern about extra processing load (see the TIMING breakdown in your log at log level 2). To turn on DKIM (and historical DomainKeys) signature verification, please add the following line to amavisd.conf (if not already there):

  $enable_dkim_verification = 1;

Benefits:

Currently the ADSP (RFC 5617, Author Domain Signing Practices, formerly SSP) is not implemented by amavisd, but is implemented in the SpamAssassin's plugin DKIM as of version 3.3.0.

Setting up DKIM signing in amavisd

A recommended version of a perl module Mail::DKIM is 0.33 or later when signing.

1. Generate one or more keys to be used for signing, and enable signing code by adding the following line to amavisd.conf (if not already there):

  $enable_dkim_signing = 1;  # loads DKIM signing code

Signing keys must be made available to amavisd, each private key in a separate file in PEM format. Customarily such keys would be generated and kept in a dedicated directory such as /var/db/dkim or /var/lib/dkim, preferably owned by root.

Private keys can be generated by an 'openssl genrsa' command (see RFC 4871 Appendix C), or by an amavisd equivalent. Commonly one key per signing domain or one key per signing host is used, but other choices are possible. If such keys were already prepared for some other DKIM-signing solution, they can be reused by amavisd.

  # amavisd genrsa /var/db/dkim/a.key.pem
  # amavisd genrsa /var/db/dkim/b.key.pem 786
  # amavisd genrsa /var/db/dkim/sel-example-com.key.pem
  # amavisd genrsa /var/db/dkim/g-guest-ex-com.key.pem
  # amavisd genrsa /var/db/dkim/notif-mail.key.pem 512

Amavisd already ensures the generated files are only readable by the owner, but a manual procedure may require explicitly setting file permissions. Private keys must be protected from unauthorized access; only the signing software such as amavisd should have access. Amavisd loads these files on startup before dropping privileges, so if amavisd is started as root it is not necessary for these key files to be readable by the uid under which amavisd is running.
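When key files come from a manual procedure, their permissions can be checked and tightened with a few lines of Python. This is a minimal sketch using only the standard os and stat modules; the helper names are made up, and the path passed in is whatever key file is being prepared.

```python
import os
import stat


def restrict_key_file(path):
    """Make a private key file readable and writable by its owner only."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)   # mode 0600


def is_owner_only(path):
    """True if neither group nor others have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

A check such as is_owner_only('/var/db/dkim/a.key.pem') can be run before starting amavisd; the sketch does not change file ownership, which is assumed to be set up separately.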

2. Add commands to amavisd.conf to load private keys, associate them with signing domains and selectors, and describe constraints (tags) to be published with public keys.

Calls to dkim_key() load all available private keys and supply their public key RR constraints. Arguments are a domain, a selector, a key (a file name of a private key in PEM format), followed by optional attributes/constraints (tags, represented here as Perl hash key/value pairs) which are allowed by RFC 4871 in a public key resource record (v, g, h, k, n, s, t), of which only g, h, k, s and t are considered to be constraints limiting the choice of a signing key. A command 'amavisd showkeys' can be used for displaying corresponding public keys in a format directly suitable for inclusion into DNS zone files.

For example:

#        signing domain  selector     private key              options
#        -------------   --------     ----------------------   ----------
dkim_key('example.org', 'abc',       '/var/db/dkim/a.key.pem');
dkim_key('example.org', 'yyy',       '/var/db/dkim/b.key.pem', t=>'s');
dkim_key('example.org', 'zzz',       '/var/db/dkim/b.key.pem', h=>'sha256');
dkim_key('example.com', 'sel-2008',  '/var/db/dkim/sel-example-com.key.pem',
         t=>'s:y', g=>'*', k=>'rsa', h=>'sha256:sha1', s=>'email',
         n=>'testing; 1, 2');
dkim_key('guest.example.com', 'g',     '/var/db/dkim/g-guest-ex-com.key.pem');
dkim_key('mail.example.com',  'notif', '/var/db/dkim/notif-mail.key.pem');

A selector paired with a domain name uniquely identifies a key, both for a signer as well as for a recipient. There may be multiple keys for each domain as long as each one has its own selector.

A selector along with a domain name will be used by a receiving mailer in assembling a DNS query (selector._domainkey.signingdomain) to fetch a public key from a signing domain's DNS server when verifying signature validity.
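The DNS name queried by a verifier follows directly from the selector and the signing domain. A minimal Python sketch (the function name is made up):

```python
def dkim_query_name(selector, signing_domain):
    """Build the DNS name under which a DKIM public key is published."""
    return f"{selector}._domainkey.{signing_domain}"
```

For the first key declared in the example above, dkim_query_name('abc', 'example.org') gives the name abc._domainkey.example.org, whose TXT record holds the public key.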

A selector paired with a domain name will also be used by a signing amavisd when choosing a key applicable to signing, meeting constraints on its public key (tags, RFC 4871 section 3.6) as given by optional arguments. Optional arguments serve as site documentation, may help amavisd choose between multiple choices (ruling out keys with incompatible tags), and supply additional information for step 3.

For a list of options (tags) see RFC 4871 section 3.6. Amavisd does not check the syntax of tag values, except for performing qp-section encoding of a tag 'n'. Note the Perl syntax of key/value pairs, e.g. t => 's:y' will end up as "t=s:y", and n => 'testing; 1, 2' will end up encoded as "n=testing=3B 1, 2".
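The mapping from Perl key/value pairs to a tag list can be imitated in Python. The sketch below is not the amavisd implementation: the function names are made up, and the encoding rule (keep spaces and printable ASCII, escape ';', '=' and everything else as =XX) is a simplified reading of the qp-section encoding chosen to reproduce the example above.

```python
def qp_section_encode(text):
    """Escape characters that may not appear literally in a notes tag."""
    out = []
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if char == " " or (0x21 <= byte <= 0x7E and char not in ";="):
            out.append(char)               # safe character, keep as is
        else:
            out.append(f"={byte:02X}")     # escape as =XX (uppercase hex)
    return "".join(out)


def format_key_tags(**tags):
    """Join tag=value pairs the way they appear in a public key record."""
    parts = []
    for name, value in tags.items():
        if name == "n":                    # only the notes tag is encoded
            value = qp_section_encode(value)
        parts.append(f"{name}={value}")
    return "; ".join(parts)
```

Here format_key_tags(t='s:y') yields t=s:y, and format_key_tags(n='testing; 1, 2') yields n=testing=3B 1, 2, matching the two examples in the text.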

3. Prepare and publish public keys.

Public keys can be extracted from generated key files (which contain both a private and a public key). To publish public keys they need to be edited into a format suitable for inclusion in a DNS server's zone file for each signing domain, either by following a procedure in RFC 4871 Appendix C, or if step 2 was completed, by asking amavisd to do so:

  # amavisd showkeys

or more selectively, e.g.:

  # amavisd showkeys  .org example.com

This step is not needed if public keys were already prepared and published earlier for some other DKIM-signing solution.

4. Edit zone files in master DNS server(s) for each signing domain, adding the just prepared TXT resource records, not forgetting to bump up the serial number in a SOA record. Optionally add a TXT record with ADSP information (formerly SSP) if a default Author Domain Signing Practices is not appropriate. Then reload zone(s) or restart DNS server(s).

5. Test published public keys.

Similar to 'showkeys', a 'testkeys' command walks through available signing keys (as declared by calls to dkim_key), generates test messages each signed with one key, and validates them by fetching a corresponding public key from a DNS server.

  # amavisd testkeys

or more selectively, e.g.:

  # amavisd testkeys  .org example.com

(btw, if testkeys fails and you believe your DNS is correctly serving your DKIM public keys, you may need to upgrade Perl module Mail-DKIM to version 0.33)

6. Restart amavisd, watch the log at log level 2, searching for " dkim: ".

Note that signing could be started (amavisd reload) right after completing step 2, but mail recipients would not be able to verify validity of signatures until public keys are made available by a signing domain through its DNS. Recipients are supposed to treat mail with signatures which fail verification exactly the same as mail with no signatures, so there is usually no harm done with a premature start of signing, but there is no benefit either.

7. Optional: to override default values for signature tags, one may specify by-sender signature tags through @dkim_signature_options_bysender_maps.

@dkim_signature_options_bysender_maps maps author/sender addresses or domains to signature tags/requirements. Possible signature tags according to RFC 4871 are: (v), a, (b), (bh), c, d, (h), i, l, q, s, (t), x, z; of which the following are determined automatically: v, b, bh, h, t (tag h is controlled by %signed_header_fields). Currently ignored tags are l and z. Instead of an absolute expiration time (tag x) one may use a pseudo tag 'ttl' to specify a relative expiration time in seconds, which is converted to an absolute expiration time prior to signing: x = t + ttl. A built-in default is provided for each tag if no better match is found.

For example:

@dkim_signature_options_bysender_maps = ( {
  'postmaster@mail.example.com' => { a => 'rsa-sha1', ttl =>  7*24*3600 },
  'spam-reporter@example.com'   => { a => 'rsa-sha1', ttl =>  7*24*3600 },
  'mail.example.com'            => { a => 'rsa-sha1', ttl => 10*24*3600 },
  # explicit 'd' forces a third-party signature on foreign (hosted) domains
  'ggg.example.net'             => { d => 'guest.example.com' },
  '.example.com'                => { d => 'example.com' },
  # catchall defaults
  '.' => { a => 'rsa-sha256', c => 'relaxed/simple', ttl => 30*24*3600 },
  # 'd' defaults to a domain of an author/sender address,
  # 's' defaults to whatever selector is offered by a matching key
} );
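The pseudo tag 'ttl' is simply added to the signing time to obtain the absolute expiration time (x = t + ttl). As a Python sketch, with a made-up function name:

```python
import time


def expiration_from_ttl(ttl, signing_time=None):
    """Convert a relative 'ttl' in seconds to an absolute expiration time."""
    if signing_time is None:
        signing_time = int(time.time())    # t: the moment of signing
    return signing_time + ttl              # x = t + ttl
```

A ttl of 7*24*3600, as used for postmaster@mail.example.com above, thus makes a signature expire one week after it was created.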

The result of a by-sender lookup into @dkim_signature_options_bysender_maps is a hash (a set) of DKIM signing requirements (tags), i.e. canonicalization method, hashing algorithm, domain, identity, selector and expiration time. All matching entries can participate in the result: for each tag individually the first setting (the most specific) is chosen from all matching entries.

Resulting tags are then used to choose the most appropriate signing key from a set of keys as declared by calls to dkim_key. The main selection criterion is a match on tags d (domain) and s (selector), but other signature requirements must also meet the constraints of a public key (e.g. subdomain matching flag, granularity, hashing algorithm, key type). If a lookup does not find a signing key which meets requirements, no signing takes place. Also, only mail with the 'originating' flag is eligible for signing.

A lookup is based on either the From header field, the Sender header field, the Resent-From and Resent-Sender header fields, or on a mail_from address from the envelope, whichever yields a useful result first. Note that neither the Sender header field, nor the Resent-* header fields, nor a mail_from address has any special meaning in the standard (RFC 4871). This results either in an author signature (i.e. a first-party signature, when based on a From header field), or in a third-party signature (when the signing domain does not match the From, regardless of which other header field it was based on, or whether it was forced through a 'd' tag).
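The rule that each tag takes its first setting from all matching entries, most specific first, can be sketched in Python using the example map above. This is a simplified model, not the amavisd lookup code: the order of candidate keys (full address, exact domain, dotted parent domains, then the catchall '.') is an assumption, and the address is assumed to contain a domain part.

```python
DAY = 24 * 3600

# Python counterpart of the example @dkim_signature_options_bysender_maps
OPTIONS_BY_SENDER = {
    "postmaster@mail.example.com": {"a": "rsa-sha1", "ttl": 7 * DAY},
    "spam-reporter@example.com": {"a": "rsa-sha1", "ttl": 7 * DAY},
    "mail.example.com": {"a": "rsa-sha1", "ttl": 10 * DAY},
    "ggg.example.net": {"d": "guest.example.com"},
    ".example.com": {"d": "example.com"},
    ".": {"a": "rsa-sha256", "c": "relaxed/simple", "ttl": 30 * DAY},
}


def lookup_keys(address):
    """List candidate map keys for an address, most specific first."""
    address = address.lower()
    domain = address.rsplit("@", 1)[1]
    keys = [address, domain]
    labels = domain.split(".")
    # e.g. ".mail.example.com", ".example.com", ".com"
    keys += ["." + ".".join(labels[i:]) for i in range(len(labels))]
    keys.append(".")                        # catchall entry
    return keys


def signature_options(address, table=OPTIONS_BY_SENDER):
    """Merge tags from all matching entries; the first setting wins."""
    result = {}
    for key in lookup_keys(address):
        for tag, value in table.get(key, {}).items():
            result.setdefault(tag, value)   # keep the more specific value
    return result
```

For postmaster@mail.example.com this model takes a and ttl from the entry for the full address, d from '.example.com' and c from the catchall, so the less specific ttl values of 10 and 30 days never come into play.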

An associative array %signed_header_fields controls which header fields are to be signed. By default it contains a standard (RFC 4871) set of header field names, augmented by some additional header field names considered appropriate at the time of a release (RFC 4021, RFC 3834). The 'Sender' header field is excluded, however, because it is frequently replaced by a mailing list: as RFC 2822 allows only one such header field, the original one is dropped, which invalidates a signature. The 'To' and 'Cc' header fields are also excluded from the default set because sendmail mailers are known to gratuitously reformat the address lists, invalidating a signature.

The default set of header fields to be signed can be controlled by setting %signed_header_fields elements to true (to sign) or to false (not to sign). Keys must be in lowercase, e.g.:

  $signed_header_fields{'received'} = 0;  # turn off signing of Received
  $signed_header_fields{'sender'} = 1;    # turn on signing of Sender
  $signed_header_fields{'to'} = 1;        # turn on signing of To
  $signed_header_fields{'cc'} = 1;        # turn on signing of Cc
  $signed_header_fields{lc('X-MySpecialFlag')} = 1;

Putting DKIM verification to good use in SpamAssassin

In SpamAssassin all that is necessary is to add (or uncomment) a line in any of the .pre files (e.g. in local.pre, or in init.pre and v320.pre):

  loadplugin Mail::SpamAssassin::Plugin::DKIM

Perl module Mail::DKIM needs to be installed. Note that Mail::DKIM starting with version 0.20 also recognizes DomainKeys signatures, so that Plugin::DomainKeys is not needed any longer, and in fact its underlying module is not supported any longer. It is advisable to stick to the most recent version of Mail::DKIM, at least 0.32.

The following SpamAssassin rules (in local.cf) work quite well.

  score DKIM_VERIFIED -0.1
  score DKIM_SIGNED    0

  # don't waste time on fetching ASP record, hardly anyone publishes it
  score DKIM_POLICY_SIGNALL  0
  score DKIM_POLICY_SIGNSOME 0
  score DKIM_POLICY_TESTING  0

  # DKIM-based whitelisting of domains with good reputation:
  score USER_IN_DKIM_WHITELIST -8.0

  whitelist_from_dkim  *@ebay.com
  whitelist_from_dkim  *@*.ebay.com
  whitelist_from_dkim  *@ebay.co.uk
  whitelist_from_dkim  *@*.ebay.co.uk
  whitelist_from_dkim  *@ebay.at
  whitelist_from_dkim  *@ebay.ca
  whitelist_from_dkim  *@ebay.de
  whitelist_from_dkim  *@ebay.fr
  whitelist_from_dkim  *@*.paypal.com
  whitelist_from_dkim  *@paypal.com
  whitelist_from_dkim  *@*                paypal.com
  whitelist_from_dkim  *@*.paypal.be

  whitelist_from_dkim  *@cern.ch
  whitelist_from_dkim  *@amazon.com
  whitelist_from_dkim  *@springer.delivery.net
  whitelist_from_dkim  *@cisco.com
  whitelist_from_dkim  *@alert.bankofamerica.com
  whitelist_from_dkim  *@bankofamerica.com
  whitelist_from_dkim  *@cnn.com
  whitelist_from_dkim  *@*.cnn.com
  whitelist_from_dkim  *@skype.net
  whitelist_from_dkim  service@youtube.com
  whitelist_from_dkim  *@welcome.skype.com
  whitelist_from_dkim  *@cc.yahoo-inc.com  yahoo-inc.com
  whitelist_from_dkim  *@cc.yahoo-inc.com
  whitelist_from_dkim  rcapotenoy@yahoo.com
  whitelist_from_dkim  googlealerts-noreply@google.com

  # DKIM-based whitelisting of domains with less than perfect
  # reputation can be given fewer negative score points:
  score USER_IN_DEF_DKIM_WL -1.5
  def_whitelist_from_dkim   *@google.com
  def_whitelist_from_dkim   *@googlemail.com
  def_whitelist_from_dkim   *@*  googlegroups.com
  def_whitelist_from_dkim   *@*  yahoogroups.com
  def_whitelist_from_dkim   *@*  yahoogroups.co.uk
  def_whitelist_from_dkim   *@*  yahoogroupes.fr
  def_whitelist_from_dkim   *@yousendit.com
  def_whitelist_from_dkim   *@meetup.com
  def_whitelist_from_dkim   dailyhoroscope@astrology.com

  # reduce default scores, which are being abused
  score ENV_AND_HDR_DKIM_MATCH -0.1
  score ENV_AND_HDR_SPF_MATCH  -0.5

Another suggestion: penalize mail which claims to be from PayPal, eBay, Yahoo or Gmail but was not signed by their official mailers:

  header   __ML1        Precedence =~ m{\b(list|bulk)\b}i
  header   __ML2        exists:List-Id
  header   __ML3        exists:List-Post
  header   __ML4        exists:Mailing-List
  header   __ML5        Return-Path:addr =~ m{^([^\@]+-(request|bounces|admin|owner)|owner-[^\@]+)(\@|\z)}mi
  meta     __VIA_ML     __ML1 || __ML2 || __ML3 || __ML4 || __ML5
  describe __VIA_ML     Mail from a mailing list

  header   __AUTH_YAHOO1  From:addr =~ m{[\@.]yahoo\.com$}mi
  header   __AUTH_YAHOO2  From:addr =~ m{\@yahoo\.com\.(ar|au|br|cn|hk|mx|my|ph|sg|tw)$}mi
  header   __AUTH_YAHOO3  From:addr =~ m{\@yahoo\.co\.(id|in|jp|nz|th|uk)$}mi
  header   __AUTH_YAHOO4  From:addr =~ m{\@yahoo\.(ca|cn|de|dk|es|fr|gr|ie|it|no|pl|se)$}mi
  meta     __AUTH_YAHOO   __AUTH_YAHOO1 || __AUTH_YAHOO2 || __AUTH_YAHOO3 || __AUTH_YAHOO4
  describe __AUTH_YAHOO   Author claims to be from Yahoo

  header   __AUTH_GMAIL   From:addr =~ m{\@gmail\.com$}mi
  describe __AUTH_GMAIL   Author claims to be from gmail.com

  header   __AUTH_PAYPAL  From:addr =~ /[\@.]paypal\.(com|co\.uk)$/mi
  describe __AUTH_PAYPAL  Author claims to be from PayPal

  header   __AUTH_EBAY    From:addr =~ /[\@.]ebay\.(com|at|be|ca|ch|de|ee|es|fr|hu|ie|in|it|nl|ph|pl|pt|se|co\.(kr|uk)|com\.(au|cn|hk|mx|my|sg))$/mi
  describe __AUTH_EBAY    Author claims to be from eBay

  meta     NOTVALID_YAHOO !DKIM_VERIFIED && __AUTH_YAHOO && !__VIA_ML
  priority NOTVALID_YAHOO 500
  describe NOTVALID_YAHOO Claims to be from Yahoo but is not

  meta     NOTVALID_GMAIL !DKIM_VERIFIED && __AUTH_GMAIL && !__VIA_ML
  priority NOTVALID_GMAIL 500
  describe NOTVALID_GMAIL Claims to be from gmail.com but is not

  meta     NOTVALID_PAY   !DKIM_VERIFIED && (__AUTH_PAYPAL || __AUTH_EBAY)
  priority NOTVALID_PAY   500
  describe NOTVALID_PAY   Claims to be from PayPal or eBay, but is not

  score    NOTVALID_YAHOO  2.8
  score    NOTVALID_GMAIL  2.8
  score    NOTVALID_PAY    6

  # accept replies from abuse@yahoo.com even if not dkim/dk-signed:
  whitelist_from_rcvd abuse@yahoo.com          yahoo.com
  whitelist_from_rcvd MAILER-DAEMON@yahoo.com  yahoo.com
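The logic of a rule such as NOTVALID_GMAIL can be restated outside SpamAssassin as a plain boolean expression. The Python sketch below is only a paraphrase of the meta rule above; the function and its parameters are hypothetical, and the DKIM and mailing list results are assumed to be computed elsewhere.

```python
import re

# Same pattern as the __AUTH_GMAIL rule above, applied to the From address.
AUTH_GMAIL = re.compile(r"@gmail\.com$", re.IGNORECASE)


def notvalid_gmail(from_addr, dkim_verified, via_mailing_list):
    """True if mail claims a gmail.com author but lacks a valid signature."""
    return (not dkim_verified
            and AUTH_GMAIL.search(from_addr) is not None
            and not via_mailing_list)
```

Mail arriving through a mailing list is exempted because, as with the __VIA_ML test in the rules, a list may have invalidated an otherwise good signature.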

Some experience with DKIM and DomainKeys

Recent versions of software components must be used to avoid bugs and known interoperability problems:

Several big players are already signing mail from their customers or employees: Yahoo! (worldwide), Gmail, eBay, Earthlink, google.com, Amazon, Springer, CNN, Skype, YouTube, Cisco, many universities, etc.

Mail transformations as performed by some mailing lists are probably the most challenging problem facing DKIM deployment (and other schemes as well). Nevertheless, mailing lists can be configured to either avoid transformations which invalidate mail signatures, or can re-sign fan-out mail. Examples of mailing lists which work very well with DKIM (and DomainKeys), preserving existing signatures provided by posters, are the postfix-users list ( postfix-users@postfix.org ) and the SpamAssassin users list ( users@spamassassin.apache.org ). Examples of re-signing mailing lists are Yahoo groups. A representative of another type of mailing list is Mailman, which often modifies the mail body and strips out original signatures, unless explicitly configured not to.

When signatures are missing on mail from domains which are known to be signing all their mail (yahoo.com, gmail.com), the most common reason is that a sender submitted his mail through some other provider, but supplied his Yahoo or gmail e-mail address in the From header field. Similar to other schemes designed to prevent faking of sending address, the DKIM (and the DomainKeys) encourages mail submission only through a domain which is used in the From address.

People need to become aware that their best choice is to submit mail through their native domain to prevent their messages from being treated as second-class. With widespread support for authorized mail submission for roaming users (SASL, TLS) through a mail submission port (tcp port 587, RFC 4409), supported by practically all modern clients and mailers, there is no longer any good excuse for submitting mail through foreign mail submission agents.

Note that some spam is also being signed by DomainKeys or DKIM lately, which is a good thing: it indicates the sender owns (or ownz) a domain they are sending mail from. This either shows the sender's sincere desire not to hide behind a faked sender mail address (in which case such mail can be easily filtered if necessary), or they are using a short-lived temporary domain (domain kiting), which can be counteracted by black lists of freshly registered domains only a few days old (such as http://support-intelligence.com/dob/ or spameatingmonkey.net) or by other reputation schemes. Signing and verifying mail is a good mechanism for companies to reliably whitelist mail from their partner companies or frequent clients.

Links


mm
Last updated: 2010-10-20
