Apache SpamAssassin

Apache SpamAssassin is a computer program used for e-mail spam filtering. It uses a variety of spam-detection techniques, including DNS-based and fuzzy-checksum techniques, Bayesian filtering, external programs, blacklists and online databases. It is released under the Apache License 2.0 and has been part of the Apache Software Foundation since 2004.

Developer(s): Apache Software Foundation[1]
Initial release: April 20, 2001
Stable release: 4.0.0[2][3] / 17 December 2022
Repository: SpamAssassin Repository
Written in: Perl, C
Operating system: Cross-platform
Type: Spam filter
License: Apache License 2.0
Website: spamassassin.apache.org

The program can be integrated with the mail server to automatically filter all mail for a site. It can also be run by individual users on their own mailbox and integrates with several mail programs. Apache SpamAssassin is highly configurable; if used as a system-wide filter it can still be configured to support per-user preferences.

History

Apache SpamAssassin was created by Justin Mason, who had maintained a number of patches against an earlier program named filter.plx by Mark Jeftovic, which in turn was begun in August 1997. Mason rewrote all of Jeftovic's code from scratch and uploaded the resulting codebase to SourceForge on April 20, 2001.[4]

In summer 2004 the project became an Apache Software Foundation project and was later officially renamed Apache SpamAssassin.[5]

The SpamAssassin 3.4.2 release in September 2018 was the first in over three years; according to the developers, "The project has picked up a new set of developers and is moving forward again."[6]

In December 2019, version 3.4.3 of SpamAssassin was released.

In April 2021, version 3.4.6 of SpamAssassin was released. It was announced that development of version 4.0.0 would become the project's focus.[7]

Methods of usage

Apache SpamAssassin is a Perl-based application (Mail::SpamAssassin in CPAN) which is usually used to filter all incoming mail for one or several users. It can be run as a standalone application or as a subprogram of another application (such as a Milter, SA-Exim, Exiscan, MailScanner, MIMEDefang, Amavis) or as a client (spamc) that communicates with a daemon (spamd). The client/server or embedded mode of operation has performance benefits, but under certain circumstances may introduce additional security risks.
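The client/daemon mode described above can be driven from another program. The following is a minimal sketch, assuming that spamc is installed, that a spamd daemon is reachable, and that the -c (check only) option behaves as described in the spamc manual page: it prints "score/threshold" and exits with status 1 when the message is spam. The function names are hypothetical and not part of SpamAssassin.

```python
import subprocess
from typing import Tuple


def parse_check_output(text: str) -> Tuple[float, float]:
    """Parse the 'score/threshold' line that `spamc -c` is assumed to print."""
    score, threshold = text.strip().split("/", 1)
    return float(score), float(threshold)


def check_with_spamc(raw_message: bytes) -> Tuple[bool, float, float]:
    """Pipe one raw message through spamc in check-only mode.

    Requires a running spamd; returns (is_spam, score, threshold).
    """
    proc = subprocess.run(
        ["spamc", "-c"],
        input=raw_message,
        capture_output=True,
        check=False,  # exit status 1 means "spam" here, not a failure
    )
    score, threshold = parse_check_output(proc.stdout.decode("ascii", "replace"))
    return proc.returncode == 1, score, threshold
```

Only parse_check_output can be exercised without a daemon; check_with_spamc needs a working spamc/spamd installation.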

Typically either variant of the application is set up in a generic mail filter program, or it is called directly from a mail user agent that supports this, whenever new mail arrives. Mail filter programs such as procmail can be made to pipe all incoming mail through Apache SpamAssassin with an adjustment to a user's procmailrc file.

Operation

Apache SpamAssassin comes with a large set of rules which are applied to determine whether an email is spam or not. Most rules are based on regular expressions that are matched against the body or header fields of the message, but Apache SpamAssassin also employs a number of other spam-fighting techniques. The rules are called "tests" in the SpamAssassin documentation.

Each test has a score value that will be assigned to a message if it matches the test's criteria. The scores can be positive or negative, with positive values indicating "spam" and negative "ham" (non-spam messages). A message is matched against all tests and Apache SpamAssassin combines the results into a global score which is assigned to the message. The higher the score, the higher the probability that the message is spam.

Apache SpamAssassin has an internal (configurable) score threshold to classify a message as spam. Usually a message will only be considered as spam if it matches multiple criteria; matching just a single test will not usually be enough to reach the threshold.
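The scoring model described above can be illustrated with a small sketch. The rule names, patterns and scores below are invented for illustration and are not real SpamAssassin tests, and the threshold of 5.0 is an assumed value. The sketch only shows how positive and negative test scores add up to a global score that is compared against a threshold.

```python
import re

# Hypothetical rules: (name, pattern, score). Not real SpamAssassin tests.
RULES = [
    ("HYPO_FREE_MONEY", re.compile(r"free money", re.I), 2.5),
    ("HYPO_ALL_CAPS_SUBJECT", re.compile(r"^Subject: [A-Z !]+$", re.M), 1.5),
    ("HYPO_CLICK_HERE", re.compile(r"click here", re.I), 1.5),
    ("HYPO_KNOWN_LIST", re.compile(r"^List-Id: ", re.M), -1.0),  # ham indicator
]
THRESHOLD = 5.0  # assumed threshold for this sketch


def score_message(raw, rules=RULES):
    """Run every rule against the raw message; return (total, matched names)."""
    hits = [(name, score) for name, pattern, score in rules if pattern.search(raw)]
    total = sum(score for _, score in hits)
    return total, [name for name, _ in hits]


def is_spam(raw, threshold=THRESHOLD):
    """Classify as spam when the combined score reaches the threshold."""
    total, _ = score_message(raw)
    return total >= threshold


spam = "Subject: FREE MONEY!!!\n\nClick here for free money\n"
ham = ("Subject: Meeting notes\nList-Id: <team.example.org>\n\n"
       "Please click here for the agenda.\n")
```

In this sketch the ham message still matches one positive rule, but a single match is not enough to reach the threshold, and the negative rule lowers its score further.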

If Apache SpamAssassin considers a message to be spam, it can rewrite the message. In the default configuration, the original mail is attached as a MIME attachment to a new message whose body contains a brief excerpt and a description of the tests that caused the mail to be classified as spam. If the score is below the configured threshold, information about the tests that matched and the total score is by default still added to the email headers and can be used in post-processing for less severe actions, such as tagging the mail as suspicious.
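Post-processing based on the added headers might look like the following sketch. It assumes the commonly seen X-Spam-Status layout ("Yes|No, score=... required=... tests=..."); the exact header format depends on configuration, and the rule names, the triage function and its margin parameter are hypothetical.

```python
import re
from email import message_from_string
from typing import List, NamedTuple, Optional

# Assumed header layout; adjust the pattern if the local configuration differs.
STATUS_RE = re.compile(
    r"^(?P<flag>Yes|No),\s*score=(?P<score>-?\d+(?:\.\d+)?)"
    r"\s+required=(?P<required>-?\d+(?:\.\d+)?)"
    r"(?:\s+tests=(?P<tests>\S*))?"
)


class SpamStatus(NamedTuple):
    is_spam: bool
    score: float
    required: float
    tests: List[str]


def read_spam_status(raw: str) -> Optional[SpamStatus]:
    """Extract the verdict, score, threshold and matched tests from a message."""
    value = message_from_string(raw).get("X-Spam-Status")
    if value is None:
        return None
    match = STATUS_RE.match(" ".join(value.split()))  # collapse folded whitespace
    if match is None:
        return None
    tests = [t for t in (match.group("tests") or "").split(",") if t]
    return SpamStatus(match.group("flag") == "Yes", float(match.group("score")),
                      float(match.group("required")), tests)


def triage(status: SpamStatus, margin: float = 2.0) -> str:
    """Hypothetical policy: tag near-threshold mail as suspicious."""
    if status.is_spam:
        return "spam"
    if status.score >= status.required - margin:
        return "suspicious"
    return "ok"


SAMPLE = ("X-Spam-Status: No, score=3.1 required=5.0 "
          "tests=HYPO_RULE_A,HYPO_RULE_B autolearn=no\n"
          "Subject: hi\n\nbody\n")
```

With the sample message, the score of 3.1 is below the threshold but within the chosen margin, so the hypothetical policy tags it as suspicious rather than spam.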

Apache SpamAssassin allows for a per-user configuration of its behavior, even if installed as system-wide service; the configuration can be read from a file or a database. In their configuration users can specify individuals whose emails are never considered spam, or change the scores for certain rules. The user can also define a list of languages which they want to receive mail in, and Apache SpamAssassin then assigns a higher score to all mails that appear to be written in another language.

Apache SpamAssassin is based on heuristics (pattern recognition), and such software exhibits false positives and false negatives.

Network-based filtering methods

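Apache SpamAssassin also supports:

  • DNS-based blacklists and DNS-based whitelists
  • Fuzzy-checksum-based spam detection filters such as the Distributed Checksum Clearinghouse, Vipul's Razor and the Cloudmark Authority plugins (commercial)
  • Hashcash email stamps based on proof-of-work
  • Sender Policy Framework and DomainKeys Identified Mail
  • URI blacklists such as SURBL or URIBL, which track spam websites
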
More methods can be added reasonably easily by writing a Perl plug-in for Apache SpamAssassin.
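As an illustration of one network-based check, a DNS-based blacklist lookup conventionally reverses the octets of the sending IPv4 address and queries that name under the blacklist's zone. The sketch below follows that general convention and is not SpamAssassin's own code; the zone dnsbl.example.org and both function names are placeholders.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError for non-IPv4 input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the blacklist zone has an A record for this address.

    Performs a real DNS lookup; a resolution failure is treated as "not listed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

For example, dnsbl_query_name("192.0.2.1", "dnsbl.example.org") yields "1.2.0.192.dnsbl.example.org".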

Bayesian filtering

Apache SpamAssassin reinforces its rules through Bayesian filtering, in which a user or administrator "feeds" examples of good (ham) and bad (spam) mail into the filter so that it learns the difference between the two. For this purpose, Apache SpamAssassin provides the command-line tool sa-learn, which can be instructed to learn a single mail or an entire mailbox as either ham or spam.

Typically, the user will move unrecognized spam to a separate folder, and then run sa-learn on the folder of non-spam and on the folder of spam separately. Alternatively, if the mail user agent supports it, sa-learn can be called for individual emails. Regardless of the method used to perform the learning, SpamAssassin's Bayesian test will help score future e-mails based on this learning to improve the accuracy.
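The idea of learning from ham and spam examples can be shown with a toy naive Bayes classifier. This is a deliberately simplified sketch and not SpamAssassin's actual tokenizer or probability-combining method; the ToyBayes class and its whitespace tokenization are assumptions made for illustration.

```python
import math
from collections import Counter


class ToyBayes:
    """Token-presence naive Bayes with add-one smoothing."""

    def __init__(self):
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        """Record one example message as 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.token_counts[label].update(set(text.lower().split()))

    def spam_probability(self, text):
        """Estimate P(spam | tokens) from the learned examples."""
        n_spam = self.message_counts["spam"]
        n_ham = self.message_counts["ham"]
        log_odds = math.log((n_spam + 1) / (n_ham + 1))  # smoothed prior
        for token in set(text.lower().split()):
            p_spam = (self.token_counts["spam"][token] + 1) / (n_spam + 2)
            p_ham = (self.token_counts["ham"][token] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1.0 / (1.0 + math.exp(-log_odds))


classifier = ToyBayes()
classifier.learn("cheap pills online", "spam")
classifier.learn("cheap watches online", "spam")
classifier.learn("meeting agenda attached", "ham")
classifier.learn("lunch meeting tomorrow", "ham")
```

After these four examples, a message such as "cheap pills now" receives a spam probability above 0.5 and "meeting tomorrow" one below 0.5, which mirrors how learned examples shift the score of future mail.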

Licensing

Apache SpamAssassin is free/open source software, licensed under the Apache License 2.0. Versions prior to 3.0 are dual-licensed under the Artistic License and the GNU General Public License.

sa-compile

sa-compile is a utility distributed with Apache SpamAssassin that compiles a SpamAssassin ruleset into a deterministic finite automaton that allows Apache SpamAssassin to use processor power more efficiently.

Testing Apache SpamAssassin

Apache SpamAssassin is designed to trigger on the GTUBE, a 68-byte string similar to the antivirus EICAR test file. If this string is inserted in an RFC 5322 formatted message and passed through the Apache SpamAssassin engine, Apache SpamAssassin will trigger with a weight of 1000.
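A test message of this kind can be built with Python's standard email package. The sketch below places the GTUBE string on a line of its own in the body of a minimal RFC 5322 message; the addresses, the subject and the function name are placeholders.

```python
from email.message import EmailMessage

# The 68-byte GTUBE test string; it must appear intact on a single line.
GTUBE = "XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X"


def build_gtube_message(sender: str, recipient: str) -> EmailMessage:
    """Return a minimal message whose body is the GTUBE string."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "GTUBE test"
    message.set_content(GTUBE + "\n")
    return message


test_message = build_gtube_message("sender@example.org", "recipient@example.org")
```

The serialized form, test_message.as_string(), can then be passed to the filter in the same way as any other message.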

See also

  • Anti-spam techniques

Notes

  1. ^ http://svn.apache.org/repos/asf/spamassassin/trunk/CREDITS
  2. ^ "[ANNOUNCE] Apache SpamAssassin 4.0.0 available". 17 December 2022. Retrieved 17 December 2022.
  3. ^ https://github.com/apache/spamassassin/releases/tag/spamassassin_release_4_0_0; publication date: 14 December 2022; retrieved: 17 December 2022.
  4. ^ "SpamAssassin Prehistory". Apache Foundation. Retrieved 19 December 2018.
  5. ^ "SpamAssassin Project Incubation Status". Apache Foundation. Retrieved 19 December 2018.
  6. ^ "SpamAssassin is back". LWN.net. Retrieved 19 December 2018.
  7. ^ "SpamAssassin: News and Announcements". spamassassin.apache.org. Retrieved 2021-04-12.

External links

  • Official website
  • Apache SpamAssassin Wiki
  • Apache SpamAssassin Rule Updates Wiki: automatically updating Apache SpamAssassin
  • KAM.cf: KAM Ruleset for Apache SpamAssassin
