Abstract
Signature-based collaborative spam detection (SCSD) systems offer a promising solution to many of the problems facing statistical spam filters, the most widely adopted technology for detecting junk email. In particular, some SCSD systems can identify previously unseen spam messages as such, although intuitively this would appear to be impossible. However, the SCSD approach usually relies on huge databases of email signatures, which demands substantial resources for signature lookup as well as for storing, transmitting and merging signature databases. In this paper, we report our enhancements to two representative SCSD systems. With our enhancements, a signature lookup can be performed in O(1) time, independent of the number of signatures in the database. A space-efficient representation significantly reduces the size of the signature database, even before any data compression algorithm is applied. A simple but fast algorithm for merging different signature databases is also supported. We achieve all of this using the Bloom filter and a novel variant of it.
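As a rough illustration of why a Bloom filter gives constant-time lookup and simple merging, the following is a minimal sketch of a standard Bloom filter only; it does not attempt to reproduce the novel variant or the two SCSD systems discussed in the paper. The class name, the parameter defaults and the use of SHA-256 with double hashing are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Standard Bloom filter over byte-string signatures (illustrative only)."""

    def __init__(self, num_bits=1024, num_hashes=4):
        # num_bits and num_hashes are hypothetical defaults, not values from the paper.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, signature):
        # Derive num_hashes bit positions from one digest via double hashing.
        digest = hashlib.sha256(signature).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, signature):
        for pos in self._positions(signature):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, signature):
        # Cost depends on num_hashes only, not on how many signatures were added.
        # A True result may be a false positive; a False result is always correct.
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7))
            for pos in self._positions(signature)
        )

    def merge(self, other):
        # Two filters with identical parameters merge by a bitwise OR of their bits.
        if (self.num_bits, self.num_hashes) != (other.num_bits, other.num_hashes):
            raise ValueError("filters must share num_bits and num_hashes")
        merged = BloomFilter(self.num_bits, self.num_hashes)
        merged.bits = bytearray(a | b for a, b in zip(self.bits, other.bits))
        return merged
```

In this sketch a lookup touches a fixed number of bits regardless of database size, the database occupies a fixed number of bits regardless of signature length, and merging two databases is a single pass of bitwise OR. The price is a tunable false-positive rate, which the sketch does not try to estimate.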
Keywords
Collaborative spam detection, Bloom filter, security
CS-TR No 973 Enhancing Signature-based Collaborative Spam Detection with Bloom Filters
School of Computing Science, Newcastle University, Jun 2006
