Bayesian spam filtering

From Wikipedia the free encyclopedia, by MultiMedia

Bayesian spam filtering is the process of using Bayesian statistical methods to classify documents into categories.

Bayesian filtering was proposed by Sahami et al. (1998) and gained attention in 2002 when it was described in Paul Graham's essay "A Plan for Spam". Since then it has become a popular mechanism for distinguishing spam from legitimate email. Many modern mail programs, such as Mozilla Thunderbird, implement Bayesian spam filtering. Server-side email filters, such as SpamAssassin and ASSP, also use Bayesian spam filtering techniques, and the functionality is sometimes embedded in the mail server software itself.

Advantages

The advantage of Bayesian spam filtering is that it can be trained on a per-user basis.

The spam that a user receives is often related to that user's online activities. For example, a user may have been subscribed to an online newsletter that the user considers to be spam. Every issue of this newsletter is likely to contain the same words, such as the name of the newsletter and its originating email address. As the user marks these messages as spam, a Bayesian spam filter will eventually assign a higher spam probability to those words, based on the user's specific patterns.

The legitimate e-mails a user receives will tend to be different. For example, in a corporate environment, the company name and the names of clients or customers will be mentioned often. The filter will assign a lower spam probability to emails containing those names.
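
As a rough illustration of how such per-user word probabilities could be derived, the sketch below counts how many of a user's spam and legitimate messages contain each word and applies Bayes' rule with equal priors. The tokenizer, the add-one smoothing, the function names and the sample messages are all assumptions made for the example, not the method of any particular mail program.

```python
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()

def word_spam_probability(word, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate P(spam | word) from one user's message counts."""
    # Smoothed fraction of spam / legitimate messages that contain the word.
    p_word_given_spam = (spam_counts[word] + 1) / (n_spam + 2)
    p_word_given_ham = (ham_counts[word] + 1) / (n_ham + 2)
    # Bayes' rule, assuming spam and legitimate mail are equally likely a priori.
    return p_word_given_spam / (p_word_given_spam + p_word_given_ham)

# Hypothetical training mail for a single user.
spam_messages = ["weekly deals newsletter unsubscribe",
                 "deals newsletter special offer"]
ham_messages = ["acme quarterly report attached",
                "meeting with acme client tomorrow"]

# Count each word at most once per message.
spam_counts = Counter(w for m in spam_messages for w in set(tokenize(m)))
ham_counts = Counter(w for m in ham_messages for w in set(tokenize(m)))

p_newsletter = word_spam_probability(
    "newsletter", spam_counts, ham_counts, len(spam_messages), len(ham_messages))
p_acme = word_spam_probability(
    "acme", spam_counts, ham_counts, len(spam_messages), len(ham_messages))

print(f"newsletter: {p_newsletter:.2f}, acme: {p_acme:.2f}")
```

With this toy data the newsletter word ends up with a high spam probability and the company name with a low one, mirroring the two examples above.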

The word probabilities are unique to each user and can evolve over time with corrective training whenever the filter incorrectly classifies an email. As a result, Bayesian spam filtering accuracy after training is often superior to pre-defined rules.
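
Corrective training can be pictured as moving a message's words from one set of counts to the other. The following is a minimal sketch under that assumption; the `UserFilter` class and its methods are hypothetical names invented for the illustration and do not correspond to the API of Thunderbird, SpamAssassin or any other product.

```python
from collections import Counter

class UserFilter:
    """Hypothetical per-user store of word counts (illustration only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}

    def train(self, words, label):
        """Record one message under the given label ('spam' or 'ham')."""
        self.counts[label].update(set(words))
        self.totals[label] += 1

    def correct(self, words, wrong_label, right_label):
        """Undo an earlier wrong classification and retrain with the user's label."""
        self.counts[wrong_label].subtract(set(words))
        self.totals[wrong_label] -= 1
        self.train(words, right_label)

    def word_probability(self, word):
        """Smoothed estimate of P(spam | word), assuming equal priors."""
        p_spam = (self.counts["spam"][word] + 1) / (self.totals["spam"] + 2)
        p_ham = (self.counts["ham"][word] + 1) / (self.totals["ham"] + 2)
        return p_spam / (p_spam + p_ham)

user_filter = UserFilter()
user_filter.train(["nigeria", "transfer"], "spam")
before = user_filter.word_probability("nigeria")

# The filter wrongly files a friend's travel email as spam ...
message = ["nigeria", "trip", "photos"]
user_filter.train(message, "spam")
# ... and the user corrects it, so the counts move to the legitimate side.
user_filter.correct(message, "spam", "ham")
after = user_filter.word_probability("nigeria")

print(f"before: {before:.3f}, after: {after:.3f}")
```

After the correction, the estimated spam probability of the word drops, so similar messages are less likely to be misclassified for this user in future.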

It can perform particularly well in avoiding false positives, where legitimate email is incorrectly classified as spam. For example, if the email contains the word "Nigeria", which frequently appeared in a long-running spam campaign, a pre-defined rules filter might reject it outright. A Bayesian filter would mark the word "Nigeria" as a probable spam word, but would also take into account other important words that usually indicate legitimate e-mail. For example, the name of a spouse may strongly indicate that the e-mail is not spam, which could outweigh the presence of the word "Nigeria".
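
To make the "Nigeria" example concrete, the sketch below combines per-word spam probabilities with the usual naive Bayes formula, which treats the words as independent: p = (p1 × ... × pn) / (p1 × ... × pn + (1 - p1) × ... × (1 - pn)). The word probabilities and the spouse's name are made-up values chosen for the illustration.

```python
from math import prod

def combine(probabilities):
    """Combine per-word spam probabilities, assuming the words are independent."""
    probabilities = list(probabilities)
    spam_evidence = prod(probabilities)
    ham_evidence = prod(1.0 - p for p in probabilities)
    return spam_evidence / (spam_evidence + ham_evidence)

# Hypothetical per-word spam probabilities learned for one user.
word_probs = {
    "nigeria": 0.95,  # frequent in a long spam campaign
    "alice": 0.01,    # the user's spouse, almost never seen in spam
    "dinner": 0.20,   # mostly appears in personal mail
}

score_nigeria_only = combine([word_probs["nigeria"]])
score_full = combine(word_probs.values())

print(f"'nigeria' alone: {score_nigeria_only:.3f}")
print(f"with spouse's name and 'dinner': {score_full:.3f}")
```

On its own, the spam-associated word gives a score of 0.95, but the strongly legitimate words pull the combined score well below one half.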

Some spam filters combine the results of both Bayesian spam filtering and pre-defined rules, resulting in even higher filtering accuracy. Recent spammer tactics include inserting random innocuous words that are not normally associated with spam, which lowers the email's spam score and makes it more likely to slip past a Bayesian spam filter.
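
The effect of inserting innocuous words can be seen with a small calculation. This sketch uses the naive Bayes combination in log-odds form, which is mathematically equivalent to multiplying the per-word odds; all of the word probabilities are hypothetical, and a real filter may weight or select words differently.

```python
import math

def spam_score(probabilities):
    """Naive Bayes combination of per-word spam probabilities, in log-odds space."""
    log_odds = sum(math.log(p) - math.log(1.0 - p) for p in probabilities)
    return 1.0 / (1.0 + math.exp(-log_odds))

spammy_words = [0.99, 0.97, 0.95]  # hypothetical words typical of a spam pitch
padding_words = [0.20] * 10        # hypothetical innocuous filler words

score_plain = spam_score(spammy_words)
score_padded = spam_score(spammy_words + padding_words)

print(f"without padding: {score_plain:.4f}")
print(f"with padding:    {score_padded:.4f}")
```

Under these assumed values, ten mildly legitimate-looking words are enough to push a message that would otherwise score above 0.99 below one half.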

External links

Guide to Bayesian spam filters: part 1, part 2.

References

(Sahami et al., 1998): M. Sahami, S. Dumais, D. Heckerman, E. Horvitz: A Bayesian approach to filtering junk e-mail, AAAI'98 Workshop on Learning for Text Categorization, 1998.

This guide is licensed under the GNU Free Documentation License. It uses material from the Wikipedia.
