
Database of spam messages?

I was wondering if anyone has (or knows where to find) a database of spam messages.
I'd like to try out an idea of mine regarding spam, and I need some messages (preferably in English) to test it. O:-)

Once upon a time I had a forum that I set up specifically as a spam trap, but I can't remember where it was. xD
Bugs creator.
Features destroyer.
Template killer.


Re: Database of spam messages?

Reply #2
What about Stop Forum Spam?  There used to be an SMF mod tied to it.  A good one too.

Re: Database of spam messages?

Reply #3
I'm interested in running some experiments, not in implementing an existing service. ;)
Bugs creator.
Features destroyer.
Template killer.

Re: Database of spam messages?

Reply #4
I can send you some emails that my Google account filtered out as spam. But that's not what you need, right?

Re: Database of spam messages?

Reply #5
This corpus is a decade or so newer than the ones mentioned by @Spuds: http://www.cc.gatech.edu/projects/doi/WebbSpamCorpus.html

Some other stuff:
http://csmining.org/index.php/enron-spam-datasets.html
http://www.cs.cmu.edu/%7Eenron/ (even has a 2015 version)
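
If it helps with the experiments, here is a minimal sketch of loading such a corpus into labeled messages. It assumes the archive has been unpacked into a folder with `ham/` and `spam/` subfolders of `.txt` files; that layout and the `load_corpus` name are my assumptions, so adjust them to whatever you actually download.

```python
from pathlib import Path


def load_corpus(root):
    """Return a list of (label, text) pairs read from root/ham and root/spam.

    The ham/ and spam/ subfolder layout is an assumption; adapt it to
    however the corpus you downloaded is actually organized.
    """
    messages = []
    for label in ("ham", "spam"):
        folder = Path(root) / label
        if not folder.is_dir():
            continue  # skip a label whose folder is missing
        for path in sorted(folder.glob("*.txt")):
            # Old mail corpora often contain odd bytes, so decode leniently.
            text = path.read_text(encoding="utf-8", errors="replace")
            messages.append((label, text))
    return messages
```

From there, `load_corpus("path/to/corpus")` gives you a plain list you can feed to whatever idea you want to test.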

Quote
What about Stop Forum Spam?  There used to be an SMF mod tied to it.  A good one too.
I haven't found the time to polish it but I ported it to Elk in February: http://www.elkarte.net/community/index.php?topic=4277.msg30732#msg30732

It's not related to what the OP is asking though. It doesn't do anything with the content of the messages (storing or otherwise). To quote from the readme I wrote:
Quote
[Stop Spammer] Blocks spam registrations by checking nickname, IP, and e-mail against the "Stop Forum Spam" DB.

The first and best defense against spam available in ElkArte is to enable Bad Behavior coupled with the Project Honey Pot http:BL. You can enable Bad Behavior in your administration interface under Configuration ⇒ Security and Moderation ⇒ Bad Behavior, where you can also enable checking against the Project Honey Pot database by adding an http:BL API key. This will prevent the vast majority of spammers from so much as browsing your site, where they would waste CPU and bandwidth and could collect information such as usernames, insofar as these are publicly viewable.
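
For the curious, an http:BL check is a DNS lookup, and a rough sketch of it might look like the following. This is not Bad Behavior's code. The query format (access key, reversed IPv4 octets, then `dnsbl.httpbl.org`) and the meaning of the answer octets reflect my understanding of the Project Honey Pot documentation, and the function names are made up for illustration.

```python
import socket


def httpbl_query_name(access_key, ipv4):
    """Build the DNS name to look up for an IPv4 address."""
    reversed_octets = ".".join(reversed(ipv4.split(".")))
    return f"{access_key}.{reversed_octets}.dnsbl.httpbl.org"


def decode_httpbl_answer(answer):
    """Split an answer such as '127.3.5.1' into its three data fields."""
    first, days, threat, visitor_type = (int(part) for part in answer.split("."))
    if first != 127:
        raise ValueError("not a valid http:BL answer")
    return {
        "days_since_last_activity": days,
        "threat_score": threat,
        "visitor_type": visitor_type,  # assumed to be a bitmask of visitor categories
    }


def lookup(access_key, ipv4):
    """Query http:BL; return the decoded answer, or None if nothing comes back."""
    try:
        answer = socket.gethostbyname(httpbl_query_name(access_key, ipv4))
    except socket.gaierror:
        return None  # no record: the address is not listed (or the lookup failed)
    return decode_httpbl_answer(answer)
```

Only `lookup` touches the network; the other two functions are pure string handling, so they are easy to try out without an API key.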

Stop Spammer jumps in when a spammer manages to get past this enormously effective line of defense by checking their registration attempt against the Stop Forum Spam database. Only malicious bots that try to spam either through posts or profiles will do this in the first place, so enabling Bad Behavior is quite important.
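
As a rough illustration of what such a registration check involves (a sketch, not the add-on's actual code), the snippet below builds a Stop Forum Spam query and interprets a JSON reply. The endpoint, the `&json` flag, and the `success` / `appears` fields are my assumptions about the public API, and the function names are hypothetical.

```python
import json
from urllib.parse import urlencode

API_URL = "https://api.stopforumspam.org/api"  # assumed endpoint


def build_query_url(username, ip, email):
    """Build one query covering the three fields checked at registration."""
    params = {"username": username, "ip": ip, "email": email}
    # The bare "json" flag asking for a JSON reply is an assumption.
    return API_URL + "?" + urlencode(params) + "&json"


def is_listed(response_text):
    """Return True if any of the three fields appears in the database."""
    data = json.loads(response_text)
    if not data.get("success"):
        return False  # treat a failed query as "not listed" rather than blocking
    return any(
        isinstance(data.get(field), dict) and data[field].get("appears")
        for field in ("username", "ip", "email")
    )
```

Whether a failed query should let the registration through or block it is a policy choice; the sketch lets it through.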

 

Re: Database of spam messages?

Reply #6
Sorry guys, I didn't know what they may or may not offer in that regard.  Just trying to help!  ;)