Machine Learning
Everyday Encounters: Spam Detection
May 11, 2017 Kira Jacobsen

Have you ever waited for an important email, only to discover it had been sitting in your junk folder the whole time? Machine learning has helped make unsolicited emails in your inbox largely a thing of the past, but a few exceptions still slip through.


Spammers are getting smarter about how they send emails, but so are the algorithms that catch them. Statistics show that about 40% of emails today are spam, which is about 15.4 billion messages a day. Without machine learning and other email filtering techniques, we would have a lot more to sift through every day.

 

Before machine learning, the main way to sort out spam was to create rules. You could set up rules so that any message with a certain title, or a certain word in the title, would automatically be marked as spam. With machine learning, you instead give the algorithm a set of training samples, emails already labeled as spam or not spam, and it learns the patterns that separate the two. Once it has worked through all the example emails, it can sort new incoming messages into either category without the constant need to keep updating rules.
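To make the rule-based approach concrete, here is a minimal sketch in Python. The word list and the function name are made up for illustration and are not taken from any real mail filter.

```python
# Hypothetical rule list: any subject containing one of these words is flagged.
BLOCKED_SUBJECT_WORDS = {"lottery", "winner", "prize"}

def is_spam_by_rules(subject: str) -> bool:
    """Return True if any word in the subject matches a blocked word."""
    for word in subject.lower().split():
        # Strip common punctuation so "WINNER!" still matches "winner".
        if word.strip(".,!?:;") in BLOCKED_SUBJECT_WORDS:
            return True
    return False

print(is_spam_by_rules("You are a WINNER!"))   # matches the rule for "winner"
print(is_spam_by_rules("You are a W1NNER!"))   # a small misspelling avoids the rule
```

The second call hints at why hand-written rules need constant updating: a spammer only has to change one character to get past them.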

 

There are several machine learning techniques that can be used to filter spam messages. Naïve Bayes is a classic technique that is commonly used for this. In this method, the words in a message are checked against how often each word appeared in known spam and known legitimate email during training, so each word carries a probability of indicating spam. These word probabilities are combined into an overall score for the email, which is then filtered into your inbox or junk folder accordingly.
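The idea can be sketched in a few lines of Python using only the standard library. This is a toy illustration, not how any particular email provider does it: the four training messages are invented, the function names are hypothetical, and add-one smoothing is an assumption added here so that unseen words do not produce a zero probability.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train(samples):
    """samples: list of (text, label) pairs, label is 'spam' or 'ham'.
    Assumes at least one example of each label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in samples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def spam_score(text, model):
    """Log-probability of spam minus log-probability of ham."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_prob = {}
    for label in ("spam", "ham"):
        # Start from how common this label was in the training set.
        lp = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing (an assumption of this sketch).
            lp += math.log((word_counts[label][word] + 1)
                           / (total_words + len(vocab)))
        log_prob[label] = lp
    return log_prob["spam"] - log_prob["ham"]

def classify(text, model):
    return "spam" if spam_score(text, model) > 0 else "ham"

# Invented training samples, for illustration only.
model = train([
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
])

print(classify("claim your free money", model))
print(classify("agenda for monday lunch", model))
```

A positive score means the words in the message were seen more often in the spam examples than in the legitimate ones, which is the "overall total" used to pick the inbox or the junk folder.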

 

Having to go through every unwanted message sent to your inbox takes a lot of time, but machine learning has helped control this problem. It is a never-ending battle: spammers come up with new techniques to get their messages to reach you, and machine learning algorithms are retrained to keep those messages out of your inbox.


Kira Jacobsen
Marketer