USING SQL FOR LOOKUPS, LOG/REPORTING AND QUARANTINE =================================================== This text only describes SQL specifics for a MySQL database and provides a schema. For general aspects of lookups, please see README.lookups. For general SQL notes and further examples please see README.sql. For PostgreSQL-specific notes and schema please see README.sql-pg (which in most respects also applies to a SQLite database). SERIAL can be used instead of INT UNSIGNED NOT NULL AUTO_INCREMENT with databases which do not recognize AUTO_INCREMENT; The attribute SERIAL was introduced with MySQL 4.1.0, but it implicitly creates an additional UNIQUE index, which is redundant. Instead of declaring a time_iso field in table msgs as a string: time_iso char(16) NOT NULL, one may want to declare is as: time_iso TIMESTAMP NOT NULL DEFAULT 0, in which case $timestamp_fmt_mysql *MUST* be set to 1 in amavisd.conf to avoid MySQL inability to accept ISO 8601 timestamps with zone Z and ISO date/time delimiter T; failing to set $timestamp_fmt_mysql makes MySQL store zero time on INSERT and write current local time on UPDATE if auto-update is allowed, which is different from the intended mail timestamp (localtime vs. UTC, off by seconds) Field quarantine.mail_text should be of data type 'blob' and not 'text' as suggested in earlier documentation; this is to prevent it from being unjustifiably associated with a character set, and to be able to store any byte value; to convert existing field from type 'text' to type 'blob' the following clause may be used: ALTER TABLE quarantine CHANGE mail_text mail_text blob; Although MySQL is not particularly picky in checking and enforcing data types, it is appropriate to declare fields users.email, mailaddr.email, and maddr.email as byte strings with no associated character set (which is what these fields are, according to RFC 2821) instead of CHAR or VARCHAR. The following clauses convert pre-2.6.0 tables into the now recommended form: ALTER TABLE users CHANGE email email varbinary(255); ALTER TABLE mailaddr CHANGE email email varbinary(255); ALTER TABLE maddr CHANGE email email varbinary(255); If VARBINARY data type is chosen for these three fields, the setting $sql_allow_8bit_address should be set to true to let the amavisd program use the appropriate data type in SQL commands: $sql_allow_8bit_address = 1; # maddr.email: VARCHAR (0), VARBINARY/BYTEA (1) Similarly, semantics of some other fields is strings of octets too, with no associated character set (which is a characteristic of char and varchar data types), so the following alternations to pre-2.6.2 schema may be beneficial to prevent SQL server from checking validity of octets data against non-applicable arbitrary character set constraints: ALTER table msgs CHANGE mail_id mail_id varbinary(16); ALTER table msgs CHANGE secret_id secret_id varbinary(16); ALTER table msgs CHANGE quar_loc quar_loc varbinary(255); ALTER table msgrcpt CHANGE mail_id mail_id varbinary(16); ALTER table quarantine CHANGE mail_id mail_id varbinary(16); and for good measure: ALTER table msgrcpt CHANGE rid rid bigint unsigned; ALTER table msgs CHANGE sid sid bigint unsigned; Starting with amavisd-new-2.7.0, three fields need to be added to table msgrcpt and one to table msgs: ALTER TABLE msgrcpt ADD rseqnum integer DEFAULT 0 NOT NULL; ALTER TABLE msgrcpt ADD content char(1) DEFAULT ' ' NOT NULL; ALTER TABLE msgrcpt ADD is_local char(1) DEFAULT ' ' NOT NULL; ALTER TABLE msgs ADD originating char(1) DEFAULT ' ' NOT NULL; If you need to create a primary key on table msgrcpt for some reason (clustering?), try something like: UPDATE msgrcpt SET rseqnum=1+floor(999999999*rand()) WHERE rseqnum=0; ALTER TABLE msgrcpt ADD PRIMARY KEY (partition_tag,mail_id,rseqnum); -- local users CREATE TABLE users ( id int unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY, -- unique id priority integer NOT NULL DEFAULT '7', -- sort field, 0 is low prior. policy_id integer unsigned NOT NULL DEFAULT '1', -- JOINs with policy.id email varbinary(255) NOT NULL UNIQUE, fullname varchar(255) DEFAULT NULL -- not used by amavisd-new -- local char(1) -- Y/N (optional field, see note further down) ); -- any e-mail address (non- rfc2822-quoted), external or local, -- used as senders in wblist CREATE TABLE mailaddr ( id int unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY, priority integer NOT NULL DEFAULT '7', -- 0 is low priority email varbinary(255) NOT NULL UNIQUE ); -- per-recipient whitelist and/or blacklist, -- puts sender and recipient in relation wb (white or blacklisted sender) CREATE TABLE wblist ( rid integer unsigned NOT NULL, -- recipient: users.id sid integer unsigned NOT NULL, -- sender: mailaddr.id wb varchar(10) NOT NULL, -- W or Y / B or N / space=neutral / score PRIMARY KEY (rid,sid) ); CREATE TABLE policy ( id int unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY, -- 'id' this is the _only_ required field policy_name varchar(32), -- not used by amavisd-new, a comment virus_lover char(1) default NULL, -- Y/N spam_lover char(1) default NULL, -- Y/N unchecked_lover char(1) default NULL, -- Y/N banned_files_lover char(1) default NULL, -- Y/N bad_header_lover char(1) default NULL, -- Y/N bypass_virus_checks char(1) default NULL, -- Y/N bypass_spam_checks char(1) default NULL, -- Y/N bypass_banned_checks char(1) default NULL, -- Y/N bypass_header_checks char(1) default NULL, -- Y/N virus_quarantine_to varchar(64) default NULL, spam_quarantine_to varchar(64) default NULL, banned_quarantine_to varchar(64) default NULL, unchecked_quarantine_to varchar(64) default NULL, bad_header_quarantine_to varchar(64) default NULL, clean_quarantine_to varchar(64) default NULL, archive_quarantine_to varchar(64) default NULL, spam_tag_level float default NULL, -- higher score inserts spam info headers spam_tag2_level float default NULL, -- inserts 'declared spam' header fields spam_tag3_level float default NULL, -- inserts 'blatant spam' header fields spam_kill_level float default NULL, -- higher score triggers evasive actions -- e.g. reject/drop, quarantine, ... -- (subject to final_spam_destiny setting) spam_dsn_cutoff_level float default NULL, spam_quarantine_cutoff_level float default NULL, addr_extension_virus varchar(64) default NULL, addr_extension_spam varchar(64) default NULL, addr_extension_banned varchar(64) default NULL, addr_extension_bad_header varchar(64) default NULL, warnvirusrecip char(1) default NULL, -- Y/N warnbannedrecip char(1) default NULL, -- Y/N warnbadhrecip char(1) default NULL, -- Y/N newvirus_admin varchar(64) default NULL, virus_admin varchar(64) default NULL, banned_admin varchar(64) default NULL, bad_header_admin varchar(64) default NULL, spam_admin varchar(64) default NULL, spam_subject_tag varchar(64) default NULL, spam_subject_tag2 varchar(64) default NULL, spam_subject_tag3 varchar(64) default NULL, message_size_limit integer default NULL, -- max size in bytes, 0 disable banned_rulenames varchar(64) default NULL, -- comma-separated list of ... -- names mapped through %banned_rules to actual banned_filename tables disclaimer_options varchar(64) default NULL, forward_method varchar(64) default NULL, sa_userconf varchar(64) default NULL, sa_username varchar(64) default NULL ); -- R/W part of the dataset (optional) -- May reside in the same or in a separate database as lookups database; -- REQUIRES SUPPORT FOR TRANSACTIONS; specified in @storage_sql_dsn -- -- MySQL note ( http://dev.mysql.com/doc/mysql/en/storage-engines.html ): -- ENGINE is the preferred term, but cannot be used before MySQL 4.0.18. -- TYPE is available beginning with MySQL 3.23.0, the first version of -- MySQL for which multiple storage engines were available. If you omit -- the ENGINE or TYPE option, the default storage engine is used. -- By default this is MyISAM. -- -- Please create additional indexes on keys when needed, or drop suggested -- ones as appropriate to optimize queries needed by a management application. -- See your database documentation for further optimization hints. With MySQL -- see Chapter 15 of the reference manual. For example the chapter 15.17 says: -- InnoDB does not keep an internal count of rows in a table. To process a -- SELECT COUNT(*) FROM T statement, InnoDB must scan an index of the table, -- which takes some time if the index is not entirely in the buffer pool. -- -- Wayne Smith adds: When using MySQL with InnoDB one might want to -- increase buffer size for both pool and log, and might also want -- to change flush settings for a little better performance. Example: -- innodb_buffer_pool_size = 384M -- innodb_log_buffer_size = 8M -- innodb_flush_log_at_trx_commit = 0 -- The big performance increase is the first two, the third just helps with -- lowering disk activity. Consider also adjusting the key_buffer_size. -- provide unique id for each e-mail address, avoids storing copies CREATE TABLE maddr ( partition_tag integer DEFAULT 0, -- see $partition_tag id bigint unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY, email varbinary(255) NOT NULL, -- full mail address domain varchar(255) NOT NULL, -- only domain part of the email address -- with subdomain fields in reverse CONSTRAINT part_email UNIQUE (partition_tag,email) ) ENGINE=InnoDB; -- information pertaining to each processed message as a whole; -- NOTE: records with NULL msgs.content should be ignored by utilities, -- as such records correspond to messages just being processes, or were lost -- NOTE: instead of a character field time_iso, one might prefer: -- time_iso TIMESTAMP NOT NULL DEFAULT 0, -- but the following MUST then be set in amavisd.conf: $timestamp_fmt_mysql=1 CREATE TABLE msgs ( partition_tag integer DEFAULT 0, -- see $partition_tag mail_id varbinary(16) NOT NULL, -- long-term unique mail id, dflt 12 ch secret_id varbinary(16) DEFAULT '', -- authorizes release of mail_id, 12 ch am_id varchar(20) NOT NULL, -- id used in the log time_num integer unsigned NOT NULL, -- rx_time: seconds since Unix epoch time_iso char(16) NOT NULL, -- rx_time: ISO8601 UTC ascii time sid bigint unsigned NOT NULL, -- sender: maddr.id policy varchar(255) DEFAULT '', -- policy bank path (like macro %p) client_addr varchar(255) DEFAULT '', -- SMTP client IP address (IPv4 or v6) size integer unsigned NOT NULL, -- message size in bytes originating char(1) DEFAULT ' ' NOT NULL, -- sender from inside or auth'd content char(1), -- content type: V/B/U/S/Y/M/H/O/T/C -- virus/banned/unchecked/spam(kill)/spammy(tag2)/ -- /bad-mime/bad-header/oversized/mta-err/clean -- is NULL on partially processed mail -- (prior to 2.7.0 the CC_SPAMMY was logged as 's', now 'Y' is used; -- to avoid a need for case-insenstivity in queries) quar_type char(1), -- quarantined as: ' '/F/Z/B/Q/M/L -- none/file/zipfile/bsmtp/sql/ -- /mailbox(smtp)/mailbox(lmtp) quar_loc varbinary(255) DEFAULT '', -- quarantine location (e.g. file) dsn_sent char(1), -- was DSN sent? Y/N/q (q=quenched) spam_level float, -- SA spam level (no boosts) message_id varchar(255) DEFAULT '', -- mail Message-ID header field from_addr varchar(255) CHARACTER SET utf8mb4 COLLATE utf8_bin DEFAULT '', -- mail From header field, UTF8 subject varchar(255) CHARACTER SET utf8mb4 COLLATE utf8_bin DEFAULT '', -- mail Subject header field, UTF8 host varchar(255) NOT NULL, -- hostname where amavisd is running PRIMARY KEY (partition_tag,mail_id) -- FOREIGN KEY (sid) REFERENCES maddr(id) ON DELETE RESTRICT ) ENGINE=InnoDB; CREATE INDEX msgs_idx_sid ON msgs (sid); CREATE INDEX msgs_idx_mess_id ON msgs (message_id); -- useful with pen pals CREATE INDEX msgs_idx_time_num ON msgs (time_num); -- alternatively when purging based on time_iso (instead of msgs_idx_time_num): -- CREATE INDEX msgs_idx_time_iso ON msgs (time_iso); -- When using FOREIGN KEY contraints, InnoDB requires index on a field -- (an the field must be the first field in the index). Hence create it: -- CREATE INDEX msgs_idx_mail_id ON msgs (mail_id); -- per-recipient information related to each processed message; -- NOTE: records in msgrcpt without corresponding msgs.mail_id record are -- orphaned and should be ignored and eventually deleted by external utilities CREATE TABLE msgrcpt ( partition_tag integer DEFAULT 0, -- see $partition_tag mail_id varbinary(16) NOT NULL, -- (must allow duplicates) rseqnum integer DEFAULT 0 NOT NULL, -- recip's enumeration within msg rid bigint unsigned NOT NULL, -- recipient: maddr.id (dupl. allowed) is_local char(1) DEFAULT ' ' NOT NULL, -- recip is: Y=local, N=foreign content char(1) DEFAULT ' ' NOT NULL, -- content type V/B/U/S/Y/M/H/O/T/C ds char(1) NOT NULL, -- delivery status: P/R/B/D/T -- pass/reject/bounce/discard/tempfail rs char(1) NOT NULL, -- release status: initialized to ' ' bl char(1) DEFAULT ' ', -- sender blacklisted by this recip wl char(1) DEFAULT ' ', -- sender whitelisted by this recip bspam_level float, -- per-recipient (total) spam level smtp_resp varchar(255) DEFAULT '', -- SMTP response given to MTA PRIMARY KEY (partition_tag,mail_id,rseqnum) -- FOREIGN KEY (rid) REFERENCES maddr(id) ON DELETE RESTRICT, -- FOREIGN KEY (mail_id) REFERENCES msgs(mail_id) ON DELETE CASCADE ) ENGINE=InnoDB; CREATE INDEX msgrcpt_idx_mail_id ON msgrcpt (mail_id); CREATE INDEX msgrcpt_idx_rid ON msgrcpt (rid); -- mail quarantine in SQL, enabled by $*_quarantine_method='sql:' -- NOTE: records in quarantine without corresponding msgs.mail_id record are -- orphaned and should be ignored and eventually deleted by external utilities CREATE TABLE quarantine ( partition_tag integer DEFAULT 0, -- see $partition_tag mail_id varbinary(16) NOT NULL, -- long-term unique mail id chunk_ind integer unsigned NOT NULL, -- chunk number, starting with 1 mail_text blob NOT NULL, -- store mail as chunks of octets PRIMARY KEY (partition_tag,mail_id,chunk_ind) -- FOREIGN KEY (mail_id) REFERENCES msgs(mail_id) ON DELETE CASCADE ) ENGINE=InnoDB; -- field msgrcpt.rs is primarily intended for use by quarantine management -- software; the value assigned by amavisd is a space; -- a short _preliminary_ list of possible values: -- 'V' => viewed (marked as read) -- 'R' => released (delivered) to this recipient -- 'p' => pending (a status given to messages when the admin received the -- request but not yet released; targeted to banned parts) -- 'D' => marked for deletion; a cleanup script may delete it -- ===================== -- Example data follows: -- ===================== INSERT INTO users VALUES ( 1, 9, 5, 'user1+foo@y.example.com','Name1 Surname1', 'Y'); INSERT INTO users VALUES ( 2, 7, 5, 'user1@y.example.com', 'Name1 Surname1', 'Y'); INSERT INTO users VALUES ( 3, 7, 2, 'user2@y.example.com', 'Name2 Surname2', 'Y'); INSERT INTO users VALUES ( 4, 7, 7, 'user3@z.example.com', 'Name3 Surname3', 'Y'); INSERT INTO users VALUES ( 5, 7, 7, 'user4@example.com', 'Name4 Surname4', 'Y'); INSERT INTO users VALUES ( 6, 7, 1, 'user5@example.com', 'Name5 Surname5', 'Y'); INSERT INTO users VALUES ( 7, 5, 0, '@sub1.example.com', NULL, 'Y'); INSERT INTO users VALUES ( 8, 5, 7, '@sub2.example.com', NULL, 'Y'); INSERT INTO users VALUES ( 9, 5, 5, '@example.com', NULL, 'Y'); INSERT INTO users VALUES (10, 3, 8, 'userA', 'NameA SurnameA anywhere', 'Y'); INSERT INTO users VALUES (11, 3, 9, 'userB', 'NameB SurnameB', 'Y'); INSERT INTO users VALUES (12, 3,10, 'userC', 'NameC SurnameC', 'Y'); INSERT INTO users VALUES (13, 3,11, 'userD', 'NameD SurnameD', 'Y'); INSERT INTO users VALUES (14, 3, 0, '@sub1.example.net', NULL, 'Y'); INSERT INTO users VALUES (15, 3, 7, '@sub2.example.net', NULL, 'Y'); INSERT INTO users VALUES (16, 3, 5, '@example.net', NULL, 'Y'); INSERT INTO users VALUES (17, 7, 5, 'u1@example.org', 'u1', 'Y'); INSERT INTO users VALUES (18, 7, 6, 'u2@example.org', 'u2', 'Y'); INSERT INTO users VALUES (19, 7, 3, 'u3@example.org', 'u3', 'Y'); -- INSERT INTO users VALUES (20, 0, 5, '@.', NULL, 'N'); -- catchall INSERT INTO policy (id, policy_name, virus_lover, spam_lover, banned_files_lover, bad_header_lover, bypass_virus_checks, bypass_spam_checks, bypass_banned_checks, bypass_header_checks, spam_modifies_subj, spam_tag_level, spam_tag2_level, spam_kill_level) VALUES (1, 'Non-paying', 'N','N','N','N', 'Y','Y','Y','N', 'Y', 3.0, 7, 10), (2, 'Uncensored', 'Y','Y','Y','Y', 'N','N','N','N', 'N', 3.0, 999, 999), (3, 'Wants all spam','N','Y','N','N', 'N','N','N','N', 'Y', 3.0, 999, 999), (4, 'Wants viruses', 'Y','N','Y','Y', 'N','N','N','N', 'Y', 3.0, 6.9, 6.9), (5, 'Normal', 'N','N','N','N', 'N','N','N','N', 'Y', 3.0, 6.9, 6.9), (6, 'Trigger happy', 'N','N','N','N', 'N','N','N','N', 'Y', 3.0, 5, 5), (7, 'Permissive', 'N','N','N','Y', 'N','N','N','N', 'Y', 3.0, 10, 20), (8, '6.5/7.8', 'N','N','N','N', 'N','N','N','N', 'N', 3.0, 6.5, 7.8), (9, 'userB', 'N','N','N','Y', 'N','N','N','N', 'Y', 3.0, 6.3, 6.3), (10,'userC', 'N','N','N','N', 'N','N','N','N', 'N', 3.0, 6.0, 6.0), (11,'userD', 'Y','N','Y','Y', 'N','N','N','N', 'N', 3.0, 7, 7); -- sender envelope addresses needed for white/blacklisting INSERT INTO mailaddr VALUES (1, 5, '@example.com'); INSERT INTO mailaddr VALUES (2, 9, 'owner-postfix-users@postfix.org'); INSERT INTO mailaddr VALUES (3, 9, 'amavis-user-admin@lists.sourceforge.net'); INSERT INTO mailaddr VALUES (4, 9, 'makemoney@example.com'); INSERT INTO mailaddr VALUES (5, 5, '@example.net'); INSERT INTO mailaddr VALUES (6, 9, 'spamassassin-talk-admin@lists.sourceforge.net'); INSERT INTO mailaddr VALUES (7, 9, 'spambayes-bounces@python.org'); -- whitelist for user 14, i.e. default for recipients in domain sub1.example.net INSERT INTO wblist VALUES (14, 1, 'W'); INSERT INTO wblist VALUES (14, 3, 'W'); -- whitelist and blacklist for user 17, i.e. u1@example.org INSERT INTO wblist VALUES (17, 2, 'W'); INSERT INTO wblist VALUES (17, 3, 'W'); INSERT INTO wblist VALUES (17, 6, 'W'); INSERT INTO wblist VALUES (17, 7, 'W'); INSERT INTO wblist VALUES (17, 5, 'B'); INSERT INTO wblist VALUES (17, 4, 'B'); -- $sql_select_policy setting in amavisd.conf tells amavisd -- how to fetch per-recipient policy settings. -- See comments there. Example: -- -- SELECT *,users.id FROM users,policy -- WHERE (users.policy_id=policy.id) AND (users.email IN (%k)) -- ORDER BY users.priority DESC; -- -- $sql_select_white_black_list in amavisd.conf tells amavisd -- how to check sender in per-recipient whitelist/blacklist. -- See comments there. Example: -- -- SELECT wb FROM wblist,mailaddr -- WHERE (wblist.rid=?) AND (wblist.sid=mailaddr.id) AND (mailaddr.email IN (%k)) -- ORDER BY mailaddr.priority DESC; -- NOTE: the SELECT, INSERT and UPDATE clauses as used by the amavisd-new -- program are configurable through %sql_clause; see amavisd.conf-default Some examples of a query: -- mail from last two minutes: SELECT UNIX_TIMESTAMP()-msgs.time_num AS age, SUBSTRING(policy,1,2) as pb, msgrcpt.content AS c, dsn_sent as dsn, ds, bspam_level AS level, size, SUBSTRING(sender.email,1,18) AS s, SUBSTRING(recip.email,1,18) AS r, SUBSTRING(msgs.subject,1,10) AS subj FROM msgs LEFT JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id LEFT JOIN maddr AS sender ON msgs.sid=sender.id LEFT JOIN maddr AS recip ON msgrcpt.rid=recip.id WHERE msgrcpt.content IS NOT NULL AND UNIX_TIMESTAMP()-msgs.time_num < 120 ORDER BY msgs.time_num DESC; -- clean messages ordered by count, grouped by domain: SELECT count(*) as cnt, avg(bspam_level), sender.domain FROM msgs LEFT JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id LEFT JOIN maddr AS sender ON msgs.sid=sender.id LEFT JOIN maddr AS recip ON msgrcpt.rid=recip.id WHERE msgrcpt.content='C' GROUP BY sender.domain ORDER BY cnt DESC LIMIT 50; -- top spamy domains with >10 messages, sorted by spam average, -- grouped by domain: SELECT count(*) as cnt, avg(bspam_level) as spam_avg, sender.domain FROM msgs LEFT JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id LEFT JOIN maddr AS sender ON msgs.sid=sender.id LEFT JOIN maddr AS recip ON msgrcpt.rid=recip.id WHERE bspam_level IS NOT NULL GROUP BY sender.domain HAVING count(*) > 10 ORDER BY spam_avg DESC LIMIT 50; -- sender domains with >100 messages, sorted on sender.domain: SELECT count(*) as cnt, avg(bspam_level) as spam_avg, sender.domain FROM msgs LEFT JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id LEFT JOIN maddr AS sender ON msgs.sid=sender.id LEFT JOIN maddr AS recip ON msgrcpt.rid=recip.id GROUP BY sender.domain HAVING count(*) > 100 ORDER BY sender.domain DESC LIMIT 100; Upgrading from pre 2.4.0 amavisd-new SQL schema to the 2.4.0 schema requires adding column 'quar_loc' to table msgs, and creating FOREIGN KEY constraint to facilitate deletion of expired records. The following clauses should be executed for upgrading pre-2.4.0 amavisd-new SQL schema to the 2.4.0 schema. Creating a FOREIGN KEY ... ON DELETE CASCADE constraint may (or may not) facilitate deletion of expired records. -- mandatory change: ALTER TABLE msgs ADD quar_loc varchar(255) DEFAULT ''; -- optional, avoids need to purge tables msgrcpt and quarantine explicitly: ALTER TABLE quarantine ADD FOREIGN KEY (mail_id) REFERENCES msgs(mail_id) ON DELETE CASCADE; ALTER TABLE msgrcpt ADD FOREIGN KEY (mail_id) REFERENCES msgs(mail_id) ON DELETE CASCADE; -- the following two ALTERs are not essential; if data type of maddr.id is -- incompatible with msgs.sid and msgs.rid (e.g. BIGINT vs. INT) and MySQL -- complains, don't bother to apply the constraint: ALTER TABLE msgs ADD FOREIGN KEY (sid) REFERENCES maddr(id) ON DELETE RESTRICT; ALTER TABLE msgrcpt ADD FOREIGN KEY (rid) REFERENCES maddr(id) ON DELETE RESTRICT; EXAMPLE of a log/report/quarantine database housekeeping ======================================================== Using a changing partition_tag, perhaps by using an ISO 8601 week number (value 1 to 53) as a partition_tag: $partition_tag = sub { my($msginfo)=@_; sprintf("%02d",iso8601_week($msginfo->rx_time)) }; allows for probably the fastest method of purging old records, e.g.: DELETE FROM msgs WHERE partition_tag >= 13 AND partition_tag <= 23; DELETE FROM msgrcpt WHERE partition_tag >= 13 AND partition_tag <= 23; DELETE FROM quarantine WHERE partition_tag >= 13 AND partition_tag <= 23; DELETE FROM maddr WHERE partition_tag >= 13 AND partition_tag <= 23; Note: using native SQL table partitioning as offered by MySQL, the above may be even faster by dropping entire partitions. Not documented here. Alternatively, purge records from table msgs by their creation time: DELETE FROM msgs WHERE time_num < UNIX_TIMESTAMP() - 21*24*3600; DELETE FROM msgs WHERE time_num < UNIX_TIMESTAMP() - 3600 AND content IS NULL; Optionally certain content types may be given shorter lifetime: DELETE FROM msgs WHERE time_num < UNIX_TIMESTAMP() - 7*24*3600 AND (content='V' OR (content='S' AND spam_level > 20)); ( or equivalently, if a data type of msgs.time_iso is TIMESTAMPS and in amavisd.conf the $timestamp_fmt_mysql is set to true: DELETE FROM msgs WHERE time_iso < UTC_TIMESTAMP() - INTERVAL 21 day; DELETE FROM msgs WHERE time_iso < UTC_TIMESTAMP() - INTERVAL 1 hour AND content IS NULL; DELETE FROM msgs WHERE time_iso < UTC_TIMESTAMP() - INTERVAL 7 day AND (content='V' OR (content='S' AND spam_level > 20)); ) Then delete unreferenced records from tables msgrcpt and quarantine, unless they were already automatically deleted while purging the msgs table and FOREIGN KEY ... ON DELETE CASCADE is in place: DELETE FROM msgrcpt WHERE NOT EXISTS (SELECT 1 FROM msgs WHERE mail_id=msgrcpt.mail_id); DELETE FROM quarantine WHERE NOT EXISTS (SELECT 1 FROM msgs WHERE mail_id=quarantine.mail_id); ( or equivalently: DELETE msgrcpt FROM msgrcpt LEFT JOIN msgs USING(mail_id) WHERE msgs.mail_id IS NULL; DELETE quarantine FROM quarantine LEFT JOIN msgs USING(mail_id) WHERE msgs.mail_id IS NULL; ) Finally delete unreferenced records from table maddr: DELETE FROM maddr WHERE NOT EXISTS (SELECT 1 FROM msgs WHERE sid=id) AND NOT EXISTS (SELECT 1 FROM msgrcpt WHERE rid=id); SOME FURTHER THOUGHTS on a log/report/quarantine database housekeeping ====================================================================== Discarding indexes makes deletion faster; if we expect a large proportion of records to be deleted it may be quicker to discard index, do deletions, and re-create index; for daily maintenance this does not pay off DROP INDEX msgs_idx_sid ON msgs; DROP INDEX msgrcpt_idx_rid ON msgrcpt; DROP INDEX msgrcpt_idx_mail_id ON msgrcpt; Re-create indexes (if they were removed in the first step): CREATE INDEX msgs_idx_sid ON msgs (sid); CREATE INDEX msgrcpt_idx_rid ON msgrcpt (rid); CREATE INDEX msgrcpt_idx_mail_id ON msgrcpt (mail_id); Optionally, optimize tables once in a while: OPTIMIZE TABLE msgs, msgrcpt, quarantine, maddr;