How Good Are Our Weapons in the Spam Wars?
VOLUME 25, NUMBER 1, SPRING 2006
BOGDAN HOANCA

Will there ever be an end to “spam wars?” We are bombarded with unsolicited bulk email, we deploy newer and supposedly better ways of filtering spam, and yet spammers continue to devise creative ways to circumvent our defenses. Over the years, the volume of spam has increased steadily, in spite of the continued efforts to contain it. Although spam is difficult to define satisfactorily, we can “focus on the quantity, cost-shifting and intrusiveness of spam, rather than on its commercial nature per se” [1]. In order to understand the spam problem and to explore control methods, it is useful to analyze spam control techniques by looking at spam generation and transmission as a system. We find that not all spam control methods are equal, because each operates at different points in the system, where their leverage is widely different. In the end, no email control technique is likely to solve the spam problem definitively.
Spam – Past and Present
At the root of the spam problem is the fact that, historically, the Internet started as an open medium for people to communicate and share information. After the U.S. Congress opened the Internet for electronic commerce in 1995, the proliferation of both users and businesses on the Internet led to a dramatic increase in the volume of email, and above all in the volume of unsolicited bulk email (UBE) and unsolicited commercial email (UCE), also known as spam.
The range of opinions on spam in media and academic reports goes from mild annoyance to predictions of doom. At one extreme, “just hit delete” seems like a simple solution; Crews [2] argues that spam email is a minor annoyance. At the other extreme, authors fear that spam will overwhelm users and make them stop using email altogether. By one prediction, spam will exceed 95% of all email traffic by 2015 [3], and people will give up email accounts in frustration. A prominent anti-spam organization, Spamhaus, believes that the 95% level will be reached as early as 2006 [4].
It is difficult to put the spam problem in concrete figures. In 1997, America Online (AOL) estimated that 5-30% of its 10 million email messages per day were spam [5]. That amounted to “only” 0.5-3 million spam messages per day. According to company news, the average daily number of spam messages peaked in 2003 at 2.4 billion, but dropped to “only” 1.2 billion by 2004. These figures refer to messages recognized as spam and blocked at the AOL gateway, not messages that actually reached the users’ inboxes [6]. According to the same AOL report, the amount of spam reported by users (spam that actually reached their inboxes) dropped by 75% from 2003 to 2004. Even with this seemingly encouraging trend, the figures indicate an almost 100,000% increase in spam volume from 1997 to 2004. At the University of Alaska Anchorage, where the author is employed, 70% of all email messages received at the enterprise gateway are spam.
A recent Spamhaus report quotes spam levels of 75% of email [4].
There are several costs associated with spam. First, the sheer volume of unwanted email is lowering the productivity of email users by an estimated 1.4-3.1% [7], a significant loss. The cost of transporting and delivering spam is not borne by spammers, but by ISPs, and is ultimately passed on to end users.1
There are indirect costs as well. Email is now the vehicle for delivering viruses and phishing attacks, which can lead to data loss, financial losses or even identity theft. Also, the use of filtering to reduce spam has led to the risk of false positives, where legitimate and sometimes very important emails do not get delivered. In 2002, for example, AOL filters blocked 100 emails from Harvard to successful applicants [8]. Problems with false positives in email filtering have led to a decrease in users’ confidence in email as a communications medium, and in some cases to a higher churn rate among ISP customers. According to a recent report by the Organization for Economic Co-operation and Development [9], the effects of spam are greatly magnified in developing countries as compared to developed countries, because of higher Internet access costs, laxer security measures at local ISPs, and lower available bandwidth.
Not insignificantly, controlling spam may lead to reduced freedom of speech, or even outright censorship, if not properly implemented. The costs associated with this are much more difficult to estimate than other spam costs. To complicate matters further, anonymity is needed in some cases in email-based communications, and spam control solutions sometimes lead to a loss of anonymity.
Systems Analysis of Spam
A simple model of email transmission is to consider four components along the path of an email message: sender client, sender server, receiving server, and receiving client. Clients and servers here are software and hardware subsystems. Examples of email clients are Outlook Express or Eudora running on a client computer. Examples of email servers are Exchange and Sendmail running on a server computer. Sometimes clients and servers are combined, for example in the case of web based email services (e.g., gmail.google.com), where the client and server are integrated behind a web server.
A sender uses their client side to compose a message. The client connects to the sending server and delivers the message in an outgoing queue. From there, the sending server connects to the receiving server, validates the existence of the recipient on the receiving server, and delivers the message. The message is stored on the receiving server until retrieved by its addressee. Eventually, the human recipient uses the client on their side to connect to the receiving server and retrieve the message.
Any technique that can reduce the volume of spam at the receiving client will reduce costs associated with productivity loss for the human recipient. However, all costs associated with delivering the message, storing it on the receiving server, and delivering it to the client are borne by the owner of the receiving server, and ultimately passed on to the end user. These costs can be reduced if spam is stopped before it reaches the receiving server. The ideal case is when spam never even leaves the sending client. The spam control techniques we will analyze in the next section operate at various points in the transmission link. The closer to the sending client these techniques stop spam, the more efficient they would be. Things are a bit more complicated when also considering the reaction of the spammer to spam control techniques.
Because it takes time to write an email message, users have a finite throughput (number of messages per day) that they can reasonably generate. One would expect the aggregate email traffic to increase almost linearly with the number of users on the Internet. In contrast, the rate of increase of spam traffic is relatively independent of the number of users on the Internet. Spam volume is driven by the spammers’ desire to increase their profits. Spammers can send one copy of a message to a large number of recipients, so the increase in spam volume is limited only by per message costs. But since costs are borne by recipients, spammers are in fact limited only by how fast they can acquire lists of email addresses, or how fast they can mount dictionary attacks (where they try every possible letter combination as an address in a domain). Moreover, spammers benefit from economies of scale: the more recipients they target with a given email broadcast, the lower the cost per recipient per message.
To actually reduce the volume of spam, a negative feedback loop is needed in the system. If spammers were made to bear more of the costs, they would be less inclined to send large volumes of spam. The spam control techniques we describe in the next section increase spammers’ costs by either decreasing the throughput rate (by filtering out certain messages) or by directly affecting the hardware or connectivity costs of the spammers.
In addition to spam control techniques operating (ideally) as negative feedback loops, there are several positive feedback loops that increase the volume of spam. First of all, the more email spammers send, the more responses they get, and the more profit they get; this drives up the volume of email, with a view to increasing profits. Second, as spam control techniques reduce spam throughput, spammers must send out more messages even to maintain a given success rate.
Both of these positive feedback loops increase the volume of spam at the sender side.
Spammers’ costs are almost always borne by end users, because spammers often steal hardware and network resources. Spammers use networks of hijacked computers (botnets) as email clients. They also use open relay servers. Protected email servers allow access only to users that are either valid senders or valid receivers on the system. Open relays, in contrast, allow email communication between senders and receivers that are not among the registered users on the machine. Spammers can use an open relay to reach a whole list of users at various destinations or even to fake the sender’s address. The number of open relays had been increasing until the end of 2001, and has remained at around 225,000 since then. More than a third of the open relays are still in the U.S., with much smaller numbers in China, Korea and other countries [10].
Perversely, even though users bear the costs of spamming, the costs of spam to most individual users are relatively low. Thus, users have little incentive to mount a sustained opposition to spammers. At the same time, the aggregate costs to the economy are very high. Some reports estimate the cost of spam in the billion dollar range for the U.S. ([11], quoting a European Commission report no longer available). Only in recent years, with the dramatic surge in spam, have governments, businesses, non-profits and individuals started to take more aggressive action in fighting spam.
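The difference between a protected server and an open relay can be made concrete with a minimal Python sketch. The domain name and function names are hypothetical, chosen only for illustration. The sketch also trusts the sender address as given, which is an oversimplification, since sender addresses are easily forged.

```python
# Hypothetical set of domains this server is responsible for.
LOCAL_DOMAINS = {"campus.example"}


def domain_of(address: str) -> str:
    """Return the lower-cased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def protected_accepts(sender: str, recipient: str,
                      local_domains=LOCAL_DOMAINS) -> bool:
    """Protected server: at least one party must be a local user."""
    return (domain_of(sender) in local_domains
            or domain_of(recipient) in local_domains)


def open_relay_accepts(sender: str, recipient: str) -> bool:
    """Open relay: forwards mail between any two parties."""
    return True
```

Under these assumptions, a message between two outside parties is refused by the protected server but forwarded by the open relay, which is exactly the property spammers exploit.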
Legal and Technological Weapons for Fighting the Spam Battle
Many recent papers deal with spam fighting techniques [1], [12], [13]. Most of these articles limit themselves to listing techniques and evaluating their pros and cons. By looking at email transmission as a system, we can evaluate the likelihood of success of each technique, based on the leverage it can have in the system.
Legal Means to Control Spam
Legislative measures have been among the least effective means for fighting spam. For one, technology evolves quickly, and laws are often slow to keep up. Laws also tend to be too broad based. In the effort to balance spam protection against freedom of speech, legislators must leave large loopholes, enough for spammers to continue almost unperturbed. Spam laws [40] have mainly succeeded in pushing spammers offshore, outside of the jurisdiction of the law, or into using quick attack-and-withdraw tactics from temporary accounts. According to Spamhaus [14], 80% of spam in Europe and North America originates from fewer than 200 spammers operating illegally.
Of the four system components (sending client and server, and receiving client and server), the law has the most power over the servers, by regulating ISPs and other entities that manage the Internet. Spammers are usually not associated with the servers, but tend to steal resources as discussed earlier. Prosecuting individual spammers has had some limited effects on the sending client side, but as mentioned above, the main effect has been to push spammers outside the reach of the law, as opposed to eliminating them.
Spam Filtering
Among techniques for fighting spam, the first one to be used historically is probably the least effective one: filtering at the receiving end. Sadly, this technique is also the most widespread, for reasons of ease of use.
The first attempts to filter spam were based on discarding email from certain rogue addresses (so-called block listing, for addresses that are blocked, or blacklisting). As spammers change addresses easily or use zombies or botnets, this type of filtering soon proved of limited use, even more so because blacklisting raises concerns about censorship. The converse of blacklisting is whitelisting, where only email from certain addresses is delivered. This solution works well, but does not allow new correspondents, which is clearly impractical for most email users.
Filtering can also be done by keywords, whether in the message header or in the body (for example by discarding messages containing the word “Viagra” in the subject line). As spammers adapted to such simple filtering [15], new filtering techniques have been proposed that use more complicated rules, sometimes computer generated, based on Bayesian decision making; most spam filters in use are now based on Bayesian filtering.
Filtering that is done on the receiver client is easiest to deploy by an end user, but least effective, because spam has already used up network and storage resources. Filtering on the receiving server can drop spam even before it is stored locally. More importantly, it can make better decisions, by aggregating information across multiple spam recipients. Enterprise solutions can detect messages that are delivered to multiple users and that are likely spam, or can even connect to centralized repositories of spam information. Such repositories can be all computer constructed (based on individual message scores) or can include ratings by trusted human users [16].
The best known collaborative filtering systems are Vipul’s Razor (razor.sourceforge.net) and Distributed Checksum Clearinghouse (www.rhyolite.com/anti-spam/dcc/). Spammers have adapted to such collaborative techniques by making slight changes in messages from one recipient to another, but collaborative techniques still work by identifying commonalities across messages.
As described in the previous section, filtering at the receiving end does nothing about the costs of spam delivery and storage, which are still borne by the receiver. The reduction in spam throughput itself is not sufficient to provide a strong negative feedback loop that will deter spammers from their activities. Instead, there is a positive feedback loop, when spammers send larger spam volumes to compensate for the reduced throughput due to filtering. One often receives several copies of the same spam message at the same time, indicating spammers’ indiscriminate use of various email lists they have acquired.
The main reason that false positives (valid messages flagged as spam) are bound to occur is our ability to communicate using several media. Outside of email contacts, our contacts list will continue to expand, for example through face-to-face or phone meetings. Information about these new contacts is relevant for spam filtering, but is not automatically available to spam control tools. Similarly, changes in likes or dislikes (sudden interest in Viagra for research purposes) are difficult to update into the settings of spam control tools.
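To illustrate the Bayesian filtering mentioned in this section, here is a minimal sketch of a naive Bayes scorer in Python. It is a simplified illustration, not the algorithm of any particular product: the whitespace tokenizer, the add-one smoothing, and the class and method names are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Very crude tokenizer: lower-case words split on whitespace."""
    return text.lower().split()


class NaiveBayesFilter:
    def __init__(self):
        # Number of training messages of each class containing each word.
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labeled message; label is 'spam' or 'ham'."""
        self.counts[label].update(set(tokenize(text)))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | words present), using add-one smoothing."""
        n_spam, n_ham = self.messages["spam"], self.messages["ham"]
        log_odds = math.log((n_spam + 1) / (n_ham + 1))  # smoothed prior
        for word in set(tokenize(text)):
            p_spam = (self.counts["spam"][word] + 1) / (n_spam + 2)
            p_ham = (self.counts["ham"][word] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1.0 / (1.0 + math.exp(-log_odds))
```

A filter like this would flag a message when the returned probability exceeds a chosen threshold. The threshold trades false negatives against the false positives discussed above, and the sketch makes it easy to see why an unexpected legitimate message containing "spammy" words can be misclassified.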
Rate Throttling Approaches
More effective in reducing overall spam volume are a host of techniques that aim to reduce spam before it can reach the receiving server. Teergrubing [17] is the process of delaying the receipt of a message. As the sender server contacts a teergrubing receiving server to deliver a message, the receiving server delays answering requests. Ideally, this should have a minimal impact on servers sending out messages to single recipients. For spammers sending out large volumes of email, teergrubing could greatly slow down their server. The process can be further optimized if the delay is applied only for known or suspected spam sending servers. Other similar ideas include TarProxy [41] and Jackpot.
An ingenious proposal (somewhat similar to teergrubing) is to use the spam score from the client filter to influence the protocols that deliver email, by creating a delivery delay proportional to the likelihood that the message is spam [18]. Email messages are delivered using the transmission control protocol (TCP), one of the main protocols supporting the Internet. This protocol allows for handshaking between the receiver side and the transmitter side, to make sure the receiver is not overwhelmed by a transmitter with a much higher information bandwidth. The receiver must confirm the receipt of each individual data packet and may specify to the transmitter how large a packet to send the next time. Using TCP damping, the receiving server calculates the spam score for an incoming message as the message is delivered and artificially delays confirmation of packets in the message for likely spam candidates. Alternatively, the receiver may specify a very small packet size, which would then subject the transmitter to high overhead and very inefficient transmission. The impact of any one receiver on a transmitter of email would be negligible.
On the other hand, the aggregate effect could be a significant slowdown for a sender who is distributing a message to a large pool of recipients who all flag the message as spam. As an added benefit of this technique, the behavior of a sending server might give additional indications of whether a server is spamming or not. Spamming servers are more likely to give up when the delay increases. Legitimate servers would persist in delivering the message anyway.
This behavior of spammers, more likely to give up when the going gets tough, has led to a solution called greylisting [19]. The idea is to modify the email server on the receiver end to initially refuse the connection for any incoming email from a source not whitelisted. Spammers traditionally broadcast a burst of email. If there are any delivery problems, they are not likely to return to retry later. In contrast, most legitimate email servers attempt retransmission for up to three days. By combining greylisting, whitelisting, and blacklisting, one author reported an 88% reduction in spam [20]. Information about servers that do not retry to send messages could be shared to allow collaborative detection of spammers. More recent reports seem to indicate spotty performance for greylisting. The technique has a high false positive rate: it delays 98% of junk but also 40-50% of good mail [21]. Additionally, greylisting may pose problems with poorly configured legitimate servers that might drop connections [22].
A product based on limiting the spam bandwidth, Symantec’s Mail Security 8100 Series appliance, was reported in Network World [23]. The device connects to the Brightmail servers and retrieves the spam history associated with the sender’s email address. The administrators of the receiving server have the option of independently throttling back message throughput for up to ten different ranges of spam scores. This device operates on the network, even before messages can reach the receiving email servers.
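The greylisting decision described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation from [19]: keying on the (client address, sender, recipient) triplet, the five-minute minimum retry delay, and all names are assumptions made for the example.

```python
class Greylist:
    def __init__(self, min_delay=300, whitelist=()):
        self.min_delay = min_delay        # seconds before a retry is accepted (assumed)
        self.whitelist = set(whitelist)   # client addresses that bypass greylisting
        self.first_seen = {}              # triplet -> time of first delivery attempt

    def check(self, client_ip, sender, recipient, now):
        """Return 'accept' or 'tempfail' for one delivery attempt at time `now`."""
        if client_ip in self.whitelist:
            return "accept"
        key = (client_ip, sender, recipient)
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"    # the sender came back after waiting: likely legitimate
        return "tempfail"      # ask the sender to retry later
```

A legitimate server that retries after the delay is accepted on a later attempt, while a spammer that broadcasts once and never returns is never accepted. The sketch also shows where the reported delays to good mail come from: every new triplet is refused at least once.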
Several possible problems arise with all rate throttling approaches. Techniques based on individual messages’ spam score are only as good as the filters that determine these scores, and are challenged by spammers’ modifying their message format. Second, delaying the spammers’ server would also unfairly penalize other legitimate users on that server. This could be a blessing in disguise, if it would force the operators of the compromised server to fix the problem. Finally, TCP damping will require rewriting the code for email clients on the receiver side so that they will be able to use the filtering score to affect the network delay. Rate throttling, if accepted by users and widely adopted, can effectively limit spam, because the feedback loop reaches back to the sending server. From a freedom of speech point of view, rate throttling approaches avoid blacklisting (the equivalent of IP level censorship), and allow the feedback to a message to be democratically decided by the recipients. Under rate throttling pressure from several recipients, the spammer will incur higher costs, to buy more powerful hardware or to use a larger number of zombie computers in their botnet.
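The aggregate effect of score-based throttling can be illustrated with a small Python sketch. The linear mapping from spam score to delay, the 30-second ceiling, and the function names are assumptions chosen for illustration; they are not taken from [18] or from any product.

```python
def damping_delay(spam_score, max_delay=30.0):
    """Delay (seconds) one receiver adds, proportional to a score in [0, 1]."""
    score = min(max(spam_score, 0.0), 1.0)   # clamp out-of-range scores
    return score * max_delay


def aggregate_delay(scores, max_delay=30.0):
    """Total delay a sender accumulates if deliveries happen one after another."""
    return sum(damping_delay(s, max_delay) for s in scores)
```

Under these assumptions, a message scored 0.05 by a single recipient costs its sender 1.5 seconds, which is negligible. The same mechanism applied by 1,000 recipients who each score a message at 0.9 adds up to about 27,000 seconds, or 7.5 hours of sender time, if the deliveries are made serially. A spammer can reduce that figure by sending in parallel, but only by paying for more hardware or a larger botnet.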
Payment Based Spam Control
Another class of spam control techniques is based on increasing the costs to spammers, proportional to the volume of email sent. A sender is required to post a financial bond, certifying that they will not spam. Any recipient that believes she has received spam from that sender can request that the sender be debited for a compensation amount. One such service is free to email receivers, and costs senders an application fee, an annual licensing fee and a per spam complaint fee (currently $20) that is deducted from the amount of the posted bond and transferred to “an independent, disinterested non-profit organization” [42]. Other companies offer similar approaches.
Instead of this fixed payment, Dai and Li [24] propose a dynamic pricing mechanism that would control the flow of spam to maximize the recipient’s utility function. At the client side, a filter would determine a spam score of a message, and based on the spam score would then determine an “attention price” related to the time opportunity of the reader of the message. For relevant messages, the price would be low (or zero), and for higher spam likelihood the price would be higher. The optimum price is only dependent on information available to the recipients, and not on the sender’s payoff (which will most certainly be unknown to the recipient). The message is only delivered to the recipient if the sender agrees to pay the calculated price. The dynamic pricing feature does not account for senders with different needs (those using mobile or low powered devices).
Another form of payment, this time in terms of time commitment, is what is known as a challenge response technique (also known as a Completely Automated Public Turing Test to Tell Computers and Humans Apart, or CAPTCHA [43]). Email recipients would maintain a whitelist and a blacklist. Senders that are not on either list will receive a question that is simple enough for a human to answer, but difficult to answer automatically.
A spammer sending out a large number of messages would not be able to respond to such a large number of challenges, and would have to give up the practice. Legitimate users would only receive the challenge the first time they establish a new contact. After that they will presumably be on the contact’s whitelist. Legitimate mass mailings would also be handled by the recipients’ adding the sender’s name to their whitelists (a clear opt in action on the part of the user).
The scheme has strong proponents as well as opponents. It has the potential to act directly at the sender client level, thus minimizing the costs associated with spam. One of the weaknesses mentioned by Levine [25] is the case of two users communicating with each other for the first time. If one party’s message is received by the challenge response system of the other party, the result might be a duel between the two challenge response systems. In Levine’s view, challenge response systems are like spam, because they offload work onto others (people not using them). Most troubling, Levine dreads the possibility that spammers would masquerade as people on the receiver’s whitelist, leading to a total loss of confidence in email. Other researchers believe that these problems can be solved [26].
Yet another form of payment is to use what could be called a computational stamp. Proponents of this method suggest that the sender calculates an electronic signature [27], based on the contents of the message, the destination, and the timestamp of the message. The method to calculate the signature would be time consuming (taking up significant CPU time on the sender’s computer, of the order of ten seconds), but verifying the correctness of the signature would be practically instantaneous.
This way, a sender of large volume of email would face prohibitive costs in calculating individual signatures for each message to be sent (each destination would require a different signature, so mass mailings would require a long series of single destination transmissions). A casual sender of email will incur a negligible penalty. The computation could even be designed so that results could be used for a constructive purpose, for example the Search for Extraterrestrial Intelligence project (using so called bread pudding protocols [28]). To increase efficiency, recipients would maintain white and blacklists, and would discard mail that does not have the proper signature. Preliminary analysis of this technique seems to indicate that it might not operate as well as expected [29]. Legitimate users could be sending email from a variety of platforms, including older and slower computers, mobile devices with limited computing power, or other types of thin clients. Professional spammers on the other hand will likely be using a server farm with high-speed computers, dedicated hardware, or large botnets, and would have a few orders of magnitude advantage in speed. In deciding on the appropriate weight of the computational stamp, it might be very difficult to come up with a fair scheme that thwarts spammers but does not affect legitimate users with older or lightweight hardware. This scheme has nonetheless a lot of potential, since it has strong influence on the sender. Even if some of the recipients do not require a stamp, the sender will need to use one, or run the risk that his messages will be discarded. The biggest issue with payment based spam control techniques is that they will require changes to the email transmission protocols. In turn this poses problems with the migration from the current email system to the new one. 
There are financial costs associated with this migration, there will be a learning curve, and there is the danger that people who take longer to convert will have their email dropped.
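The asymmetry behind the computational stamp (expensive to produce, cheap to verify) can be sketched in Python. The sketch substitutes a hashcash-style search for a SHA-256 digest with a given number of leading zero bits; this is an assumption made for illustration and not necessarily the signature scheme proposed in [27]. The field layout and function names are likewise hypothetical.

```python
import hashlib
from itertools import count


def _digest(message, recipient, timestamp, nonce):
    """Hash the stamp fields together with a candidate nonce."""
    data = f"{recipient}|{timestamp}|{message}|{nonce}".encode("utf-8")
    return hashlib.sha256(data).digest()


def leading_zero_bits(digest):
    """Number of zero bits at the start of the digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mint_stamp(message, recipient, timestamp, difficulty=16):
    """Sender side: brute-force search, about 2**difficulty hashes on average."""
    for nonce in count():
        digest = _digest(message, recipient, timestamp, nonce)
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify_stamp(message, recipient, timestamp, nonce, difficulty=16):
    """Receiver side: a single hash confirms that the work was done."""
    digest = _digest(message, recipient, timestamp, nonce)
    return leading_zero_bits(digest) >= difficulty
```

Because the recipient is part of the hashed data, a stamp minted for one destination is very unlikely to verify for another, so a mass mailing needs one search per recipient. Each extra bit of difficulty doubles the expected work. The fairness problem raised above shows up directly in the choice of `difficulty`: a value that costs a fast server farm a fraction of a second may cost a mobile device many times longer.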
Information-Hiding-Based Spam Control
We include under this heading several types of solutions that rely on information hiding. A possible solution to spam is to hide one’s email address, by using one-time email addresses [30]. Users would generate a temporary alias for the purpose of a particular transaction, and would retire the address if spam starts to arrive via that address. Several companies offer this service (Spamex.com, Sneakemail.com and Spammotel.com among them).
Other, more complex solutions require not just significant rewriting of the email client and server code, but a change of the fundamentals of how email works, for example by separating the channel from the message [3]. The openness of the current email system would be replaced by a two-step protocol where the sender first approaches the intended receiver and requests an opportunity to communicate. If and when this is approved, the sender can follow up with the actual messages that are part of the conversation. The scheme would also give early and strong feedback to spammers, if their requests for conversations were refused. This scheme is reminiscent of the computational stamp and of the challenge response schemes, in that the spammer must prove their credentials before their message is allowed to proceed. Other schemes would force the sender to keep a copy of the message on their server instead of relaying the message to the recipient’s server [31].
A user level spam control technique that prevents spam at the source is the basic advice to keep one’s email address from falling into the hands of spammers in the first place.
According to an FTC study, the likelihood of receiving spam depends on where the address is posted on the Internet: 100% for email addresses posted in chat rooms (with spam received as soon as eight minutes after posting), 86% for addresses posted on newsgroups and web pages, 50% for free personal Web page services, 27% for message board postings, and only 9% for email service directories [32]. Much spam can be avoided for an email address posted online by listing it in a form readable only by humans, for example “myemail at domain dot com,” instead of “myemail@domain.com” (a procedure called address munging).
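Address munging as described here is simple to automate, as the following Python sketch suggests. The function names are hypothetical, and the " at " and " dot " substitutions are just the convention used in the example above. The second function shows the other side of the arms race: a harvester that knows the convention can reverse it just as easily.

```python
def munge(address):
    """Rewrite user@host.tld in a form meant for human readers only."""
    local, _, domain = address.partition("@")
    return f"{local} at {domain.replace('.', ' dot ')}"


def unmunge(text):
    """What a harvester aware of the convention could do to undo it."""
    return text.replace(" at ", "@", 1).replace(" dot ", ".")
```

With these definitions, `munge("myemail@domain.com")` returns `"myemail at domain dot com"`, and `unmunge` restores the original address, which suggests that munging only helps against harvesters that do not bother to look for the pattern.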
Authentication-Based Spam Control
Last but not least are various sender authentication protocols. The idea is that people who can be authenticated can also be held accountable for their email practices. Brad Templeton [33] proposes to automatically whitelist senders that can be held accountable, for example because they are under the jurisdiction of anti-spam laws or because they have a reputation to protect. Senders who are not willing to submit to being identified and held responsible will be subject to filtering, or even denied access. Instead of filtering out spam, this approach filters in good email and throws away everything else.
The danger is that some spammers may establish a new identity, use it only briefly to spam and then move on to a new identity. According to a recent news report, 16% of the 400,000 messages in a spam archive had been sent by spammers that comply with authentication requirements (Sender ID, explained below) [34]. To avoid this, authentication can be combined with reputation ranking. Email from authenticated but not yet reputable senders would be scrutinized closely. As an alternative to an established reputation, new users may use bonded sender systems to prove their good intentions and to be able to reach their intended recipients in the meantime.
Several joint projects between major Internet companies operate in this arena, including Sender Policy Framework [35], DomainKeys Identified Mail (DKIM) supported by Cisco and Yahoo, and Microsoft’s Sender ID. These solutions operate at the receiving server and give higher priority to email from authenticated senders. The receiving server checks whether the sender is a valid server in the domain where it claims to be. This check is based on looking up information in the Domain Name System (DNS) databases distributed on the Internet (the technique is generally referred to as reverse DNS lookup or reverse MX record lookup, RMX).
The direct feedback to the sender server – refused or delayed email if the sender IP address is not authenticated – makes this type of technique very effective in reducing spam. Still, as we explained above, authentication alone is not the perfect solution. Moreover, some legitimate users may see their email dropped if their DNS entry is not properly configured. Finally, recipients may be swamped even by email from authenticated legitimate senders if there are a large number of such senders. This is even truer if the sender’s computer has been hijacked by a computer worm [36].
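The server-side check described in this section can be sketched as follows in Python. To keep the example self-contained, a small in-memory table stands in for the records a real receiving server would retrieve from DNS; the domains, address ranges, and function names are all hypothetical, and the logic is a simplification of what protocols such as Sender Policy Framework actually specify.

```python
import ipaddress

# Hypothetical stand-in for DNS: the networks each domain says it sends from.
PUBLISHED_SENDERS = {
    "example.org": ["192.0.2.0/24"],
    "example.net": ["198.51.100.25/32"],
}


def is_authorized(sender_domain, client_ip, published=PUBLISHED_SENDERS):
    """True if the connecting address is listed for the claimed domain."""
    ip = ipaddress.ip_address(client_ip)
    networks = published.get(sender_domain.lower(), [])
    return any(ip in ipaddress.ip_network(net) for net in networks)


def delivery_action(sender_domain, client_ip, published=PUBLISHED_SENDERS):
    """Feedback to the sending server: accept, or refuse/delay the message."""
    if is_authorized(sender_domain, client_ip, published):
        return "accept"
    return "refuse_or_delay"
```

The sketch also makes the misconfiguration risk visible: a legitimate domain with no published entry, or one that omits an outgoing server, gets the same "refuse_or_delay" result as a forger.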
Combined Solutions
Most anti-spam solutions currently in use involve a combination of the techniques mentioned above. Several vendors sell anti-spam appliances that combine multiple techniques, sometimes in a collaborative enterprise-level or distributed fashion, and that allow remote management. The performance of combinations of techniques tends to be superior to that of individual techniques, as expected.
Effectiveness and Limitations
After analyzing the techniques above, one may wonder whether any single technique or combination thereof is the silver bullet that will completely eradicate spam. Sadly, in this author’s opinion the answer is no. There are promising techniques, but they also have limitations. The most powerful spam control techniques are those acting directly on the sending server or client; unfortunately, these techniques require widespread changes to email protocols, which take significant expense and time to implement. Additionally, the migration to these techniques needs to accommodate late adopters without shutting them down or putting them at great disadvantage.
Another effort that has already worked well is combined legal and educational actions to reduce the number of open relays, in turn reducing the flow of spam. This will continue to be a relatively straightforward and highly effective technique. Additionally, authentication-based solutions have been and continue to be highly effective. In this direction, Brad Templeton’s solution of combining accountability and restrictions [33] is very promising. Most of the spammers are organizations that avoid accountability, and they would be subjected to restrictions on the amount of traffic they can generate. Email providers, including Google and Yahoo!, have had very good results in using sender authentication-based spam control.
While these solutions are and will continue to be successful in reducing the volume of spam, no solution to spam is likely to be completely satisfactory. Two extreme cases demonstrate the range of issues faced by spam control techniques. One of the most challenging cases for spam control techniques is that of a businessperson communicating with her clients mainly via email. For a highly paid businessperson, the opportunity loss associated with false negatives (spam that has not been filtered out) could be considerable.
On the other hand, false positives could be even more damaging. The user cannot use whitelisting, because that would filter out any new clients. Authentication-based techniques would be acceptable, although prohibitive for clients from a small startup that has no reputation and few extra resources to post a sender bond. Freedom of speech poses another, more formidable challenge to all the spam solutions we proposed. A whistleblower broadcasting a message via email would like to remain anonymous, and hence would not be willing to be authenticated at all. He would need to send the message to a large number of recipients who are unlikely to have signed up to receive it. Using email relays to preserve anonymity was the solution in the past, but most of those services are being shut down. If freedom of speech is important for email users, something needs to be done to avoid treating the whistleblower’s anonymous mailing as spam; none of the existing techniques allows that. Apparently, technology is not yet available to allow all the benefits of email (openness, potential for anonymity, almost instantaneous communication), and at the same time to allow users to filter spam effectively. Even if such technologies were available, as technology evolves, so does spam.
Future Outlook As new anti-spam techniques are implemented, spammers will continue to invent new ways of bypassing and attacking anti-spam software. A spam-fighting technique that may seem perfect today will eventually be overcome by attackers. As email senders are held more accountable for their actions, users will be spending more resources on proving their accountability, and attackers will spend more resources on breaking into users’ accounts to use them to send spam. This way, the burden of dealing with spam will continue to shift towards less well-protected organizations and towards less technology-savvy Internet users. Ultimately, as email becomes more spam resistant, spammers will move on to other, less well-protected communications technologies. This is already happening. Spam over instant messaging (spim) is received by 30% of the IM users in the US, according to a report by the Pew Internet and American Life Project [37]. That amounts to 17 million spim victims. The first criminal case in connection with the CAN-SPAM Act and spim activities occurred on February 16, 2005, when Anthony Greco, 18, of Cheektowaga, New York was arrested in Los Angeles. Allegedly, he sent more than 1.5 million spim messages to users of MySpace.com, then contacted the company and tried to negotiate exclusive rights to continue sending spim to site users. He threatened that he would otherwise tell other spammers how to send spim [38]. Another emerging spam problem is on cell phones, where users may receive spam text messages. Eighty percent of wireless users acknowledged having received spam on their mobile phone, according to an industry study [39]. Apparently, 83% of telecommunications industry respondents see spam to mobile phones as a threat for the immediate future. Some phones are set to beep when text messages arrive, making this particularly annoying for recipients of a high volume of spam. To add insult to injury, some wireless companies charge users per message received. 
There is no software currently available to filter spam on wireless phones, although some wireless companies are starting to place spam filters on their networks and to prosecute spammers. In the end, email-based spam is only one chapter in the never-ending story of technology evolving and changing the face of crime. As we perfect spam-fighting devices and techniques, we will continue to reduce the incidence of spam, but we will never manage to eradicate it. Spam is already shifting from email to other communications media (IM and cell phones) and may become more of a threat for those new media. Ultimately, only a society without any crime will be free of technology-supported crime. The effort to eradicate spam should include not only deploying technology, but also educating people and addressing crime in a broader sense, beyond just computer and communications crime.
Author Information
Bogdan Hoanca is Assistant Professor of Computer and Information Systems, College of Business and Public Policy, University of Alaska Anchorage, 3211 Providence Dr., Room BEB 307J, Anchorage, AK 99508; email: afbh@uaa.alaska.edu.
References
[1] P. Dougan, “Legal and technical responses to unsolicited commercial e-mail (‘spam’),” 2001; http://www.strath.ac.uk/Other/staffclub/web2law/spam.pdf, accessed Aug. 1, 2005.
[2] C.W. Crews, Jr., “Why canning "spam" is a bad idea,” Cato Policy Analysis, no. 408, July 26, 2001; http://www.cato.org/pubs/pas/pa408.pdf, accessed Aug. 1, 2005.
[3] B. Whitworth and E. Whitworth, “Spam and the social technical gap,” IEEE Computer, vol. 37, no. 10, pp. 38-45, Oct. 2004.
[4] “Increasing spam threat from proxy hijackers;” http://www.spamhaus.org/news.lasso?article=156, accessed Aug. 1, 2005.
[5] M. Krochmal, “Spammer says "Uncle" to AOL;” http://content.techweb.com/wire/story/TWB19971218S0007, accessed Aug. 1, 2005.
[6] “America Online announces breakthroughs in fight against spam,” Dec. 27, 2004; http://media.timewarner.com/media/newmedia/cb_press_view.cfm?release_num=55254331, accessed Aug. 1, 2005.
[7] Spam: The serial ROI killer, Res. note E50, June 2004; http://www.nucleusresearch.com/research/e50.pdf, accessed Aug. 1, 2005.
[8] L. Walker, “Weeding in the garden of good e-mail,” Washington Post, pp. E.01, Jan. 31, 2002.
[9] “Spam issues in developing countries,” OECD, Rep. JT00185109, May 26, 2005; http://www.oecd.org/dataoecd/5/47/34935342.pdf, accessed Aug. 1, 2005.
[10] Statistical data on the Open Relay Database web site; http://ordb.org/statistics/.
[11] Quick FAQ, Coalition against Unsolicited Commercial Email; http://www.cauce.org/about/faq.shtml, accessed Aug. 1, 2005.
[12] S. de Freitas and M. Levene, “Spam on the Internet: Is it here to stay or can it be eradicated?” Joint Information Systems Committee, Techwatch Pap, TS-04-01; http://www.jisc.ac.uk/uploaded_documents/ACF11A8.pdf, accessed Aug. 1, 2005.
[13] S. Hird, “Technical solutions for controlling spam,” in Proc. AUUG 2002, Melbourne, Australia, Sept. 4-6, 2002; http://security.dstc.edu.au/papers/technical_spam.pdf, accessed Aug. 1, 2005.
[14] ROKSO, Register of Known Spam Operations; http://www.spamhaus.org/rokso/index.lasso, accessed Aug. 1, 2005.
[15] J. Graham-Cumming, “The Spammer’s Compendium,” last updated on Apr. 15, 2005; http://www.jgc.org/tsc/, accessed Aug. 1, 2005.
[16] A. Gray and M. Haahr, “Personalised, collaborative spam filtering,” in Proc. First Conf. on Email and Anti-Spam (CEAS 2004), Mountain View, CA, 2004; http://www.ceas.cc/papers-2004/132.pdf, accessed Aug. 1, 2005.
[17] L. Donnerhacke, Teergrubing FAQ; http://www.iks-jena.de/mitarb/lutz/usenet/teergrube.en.html, accessed Aug. 1, 2005.
[18] K. Li, C. Pu, and M. Ahamad, “Resisting SPAM delivery by TCP damping,” in Proc. First Conf. on Email and Anti-Spam (CEAS 2004), Mountain View, CA, 2004; http://www.cs.uga.edu/~kangli/src/ceas2004_kangli.pdf, accessed Aug. 1, 2005.
[19] E. Harris, “The next step in the spam control war: Greylisting;” http://projects.puremagic.com/greylisting/whitepaper.html, accessed Aug. 1, 2005.
[20] A. Jones, “Greylisting performance;” http://users.aber.ac.uk/auj/spam/greyperf.shtml, accessed Aug. 1, 2005.
[21] R.D. Twining, M.M. Williamson, M. Mowbray, and M. Rahmouni, “Email prioritization: Reducing delays on legitimate mail caused by junk mail,” Hewlett-Packard, Rep. HPL-2004-5R1, 2004; http://www.hpl.hp.com/techreports/2004/HPL-2004-5R1.pdf, accessed Aug. 1, 2005.
[22] J. Levine, “Experiences with greylisting,” presented at Second Conf. on Email and Anti-Spam, CEAS 2005; http://ceas.cc/papers-2005/120.pdf, accessed Aug. 1, 2005.
[23] J. Snyder, “Symantec slows spam at the edge,” Network World, Apr. 11, 2005.
[24] R. Dai and K. Li, “Shall we stop all unsolicited email messages?” in Proc. First Conference on Email and Anti-Spam (CEAS 2004), Mountain View, CA, 2004; http://www.ceas.cc/papers-2004/189.pdf, accessed Aug. 1, 2005.
[25] J. Levine, Email communication posted online; http://www.politechbot.com/p-04746.html, accessed Aug. 1, 2005.
[26] B. Templeton, “Proper principles for Challenge/Response anti-spam systems;” http://www.templetons.com/brad/spam/challengeresponse.html, accessed Aug. 1, 2005.
[27] C. Dwork and M. Naor, “Pricing via processing or combatting junk mail,” in Advances in Cryptology - CRYPTO '92, E.F. Brickell, Ed., 1992, pp. 139-147.
[28] M. Jakobsson and A. Juels, “Proofs of work and bread pudding protocols,” in Proc. IFIP TC6 and TC 11 Joint Working Conf. on Communications and Multimedia Security (CMS’99), 1999.
[29] B. Laurie and R. Clayton, “‘Proof-of-Work’ proves not to work,” version 0.2, September 2004; http://www.cl.cam.ac.uk/users/rnc1/proofwork2.pdf, accessed Aug. 1, 2005.
[30] J.-M. Seigneur and C.D. Jensen, “Privacy recovery with disposable email addresses,” IEEE Security & Privacy Mag., vol. 1, no. 6, pp. 35-39, 2003.
[31] Z. Duan, Y. Dong, and K. Gopalan, “DiffMail: A differentiated message delivery architecture to control spam;” http://www.cs.fsu.edu/research/reports/TR-041025.pdf, accessed Aug. 1, 2005.
[32] Prepared Statement of the Federal Trade Commission on “Unsolicited Commercial Email” before the Committee on Commerce, Science and Transportation, U.S. Senate, Washington, DC, May 21, 2003; http://www.ftc.gov/os/2003/05/spamtestimony.pdf, accessed Aug. 1, 2005.
[33] B. Templeton, “Best way to end spam;” http://www.templetons.com/brad/spam/endspam.html, accessed Aug. 1, 2005.
[34] T. Claburn, “Spammers hijack Sender ID,” Information Week, Sept. 9, 2004.
[35] M. W. Wong, “Important considerations for implementers of SPF and/or Sender ID;” http://www.maawg.org/about/whitepapers/spf_sendID/, accessed Aug. 1, 2005.
[36] B. Watson, “Beyond identity: Addressing problems that persist in an electronic mail system with reliable sender identification,” in Proc. First Conf. on Email and Anti-Spam (CEAS 2004), Mountain View, CA, 2004; http://www.ceas.cc/papers-2004/140.pdf, accessed Aug. 1, 2005.
[37] Pew Internet Report, “The advent of spim;” http://www.pewinternet.org/PPF/p/1052/pipcomments.asp, accessed Aug. 1, 2005.
[38] “Against Internet Messaging Company and sending more than 1.5 million spam messages,” USAO/CDCA, press release; http://www.usdoj.gov/usao/cac/pr2005/034.html, accessed Aug. 1, 2005.
[39] “First empirical global spam study indicates more than 80 percent of mobile phone users receive spam;” http://www.bmdwireless.com/main.php?content=newsflash_02200509, accessed Aug. 1, 2005.
[40] D.E. Sorkin, Spam Laws, http://www.spamlaws.com/.
[41] Open Source Technology Group, 2006; http://sourceforge.net/projects/tarproxy.
[42] Return Path, Inc., 2006; www.bondedsender.com.
[43] School of Computer Science, Carnegie-Mellon University, 200-2005; http://www.captcha.net/.
1 The average size of a spam message is relatively small, usually no more than 10 KB. In contrast, many legitimate messages contain attachments, some of considerable size (up to megabytes of data). The actual percentage of traffic (in bytes) associated with spam is thus lower than the figures reported here. Although user annoyance and wasted time are proportional to the number of spam messages (70% of total email traffic), the bandwidth wasted by spam is proportional to the spam volume in bytes, which is lower than 70% of the total email traffic.
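The footnote's point can be made concrete with a small calculation. The sketch below takes the 70% message share and the 10 KB spam size from the text; the 50 KB average size for legitimate messages is purely an assumed value for illustration, not a figure from this article, and the function name is hypothetical.

```python
def spam_byte_share(spam_msg_share: float, avg_spam_kb: float, avg_legit_kb: float) -> float:
    """Fraction of email bytes attributable to spam, given its share of message count."""
    spam_bytes = spam_msg_share * avg_spam_kb
    legit_bytes = (1.0 - spam_msg_share) * avg_legit_kb
    return spam_bytes / (spam_bytes + legit_bytes)


# 70% of messages are spam at about 10 KB each (from the text);
# 50 KB per legitimate message is an assumption made for this example.
share = spam_byte_share(0.70, 10.0, 50.0)
print(f"Spam share of bytes: {share:.1%}")  # prints "Spam share of bytes: 31.8%"
```

Under these assumed sizes, spam accounts for 70% of messages but less than a third of the bytes carried, which is the distinction the footnote draws.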
© 2006 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from IEEE.