I was wondering if anyone has (or knows where to find) a database of spam messages.
I'd like to try out an idea of mine regarding spam, and I need some messages (preferably in English) to test it. O:-)
Once upon a time I had a forum I set up exactly as a spam trap, but I can't remember where it was. xD
There's http://archive.ics.uci.edu/ml/datasets/Spambase, but it is a bit dated. You could always log into the ElkArte Gmail account and look in the spam folder!
ETA: Also old but ... http://spamassassin.apache.org/old/publiccorpus/
What about Stop Forum Spam? There used to be an SMF mod tied to it. A good one too.
I'm interested in doing some experiments, not in implementing an existing service. ;)
I can send you some mails my Google account filtered out as spam. But that's not what you need, right?
This corpus is a decade or so newer than the ones mentioned by @Spuds: http://www.cc.gatech.edu/projects/doi/WebbSpamCorpus.html
Some other stuff:
http://csmining.org/index.php/enron-spam-datasets.html
http://www.cs.cmu.edu/%7Eenron/ (even has a 2015 version)
I haven't found the time to polish it but I ported it to Elk in February: http://www.elkarte.net/community/index.php?topic=4277.msg30732#msg30732
It's not related to what the OP is asking though. It doesn't do anything with the content of the messages (storing or otherwise). To quote from the readme I wrote:
Sorry guys, didn't know what they may, or may not, offer in those regards. Just trying to help! ;)