This might sound aggressive, but it’s 100% true. We all complain nowadays about the amount of spam landing in our inboxes. My mail server’s traffic is more like 80% spam to 20% legit emails. This is a huge problem for any mail provider and, in the end, for the end user. So, I decided to run a small experiment. I wrote a small Perl crawler (I might discuss this in a later post) and put it out to collect emails. This was a quest to find out how easily a spammer can collect addresses to send their stupid spam to. I decided to make it ridiculously easy. I wouldn’t look for emails “disguised” in one way or another. I would be looking for plain ones. For instance, I wouldn’t be looking for “foo [at] bar.com”, which is a common disguise nowadays, but for “foo@bar.com”.
So, I started coding. About half an hour later I had the little bugger ready. About thirty lines of code. That easy. Now, all I needed was a place to start searching from. It would operate like a spider. Starting from one site, it would search for emails, then find references to other pages and sites, and then visit those and do the same. I decided to go with the most “promising” category, the bloggers, such as me and you people. So, the starting page was a blog submission catalog. From there the little spider would find its way to thousands of blogs.
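To give an idea of how little code such a spider needs, here is a minimal sketch in Python. It is not the original Perl crawler, which is not shown in this post. The names `EMAIL_RE`, `LinkParser` and `crawl` are made up for illustration, and the `fetch` argument is assumed to be any function that takes a URL and returns the page's HTML as text, or `None` if the page can't be loaded.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag

# Deliberately naive pattern: only plain addresses such as foo@bar.com.
# Disguised forms like "foo [at] bar.com" are not matched.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl starting at start_url.

    fetch(url) is an assumed helper that returns HTML text or None.
    Returns the set of plain emails found and the number of pages visited.
    """
    seen = {start_url}
    queue = deque([start_url])
    emails = set()
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        html = fetch(url)
        visited += 1
        if html is None:
            continue
        # Step 1: grab every plain email on the page.
        emails.update(EMAIL_RE.findall(html))
        # Step 2: queue every http(s) link we have not seen yet.
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link, _fragment = urldefrag(urljoin(url, href))
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                queue.append(link)
    return emails, visited
```

Passing `fetch` in as a parameter keeps the sketch testable with a plain dictionary of pages. A real run would need a function that downloads pages over HTTP, plus some rules to keep the spider out of pages that lead nowhere.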
I fired it up and kept an eye on it for an hour or so, keeping it out of “trouble”. By that I mean meaningless pages, deep crawls into “nothing” and other creepy things like that. After the hour, the results were a bit discouraging for my little experiment. But that was because the spider had only just indexed the blog submission directory and built a base to work from. So, I let it run and went to bed.
The next day the results were stunning. It had run for well over 10 hours. Are you ready for this? It had visited over 100,000 pages and the little bastard had collected more than 1,000 emails! Can you imagine? Roughly one in 100 pages contains a plain text email. One that a dumb little robot can pick up! And I mean business people. The emails were real ones alright. There were names of people on Gmail, service and domain emails, support addresses etc. Now, I know that all the email providers run very sensitive filters, but don’t forget, spammers are creative and bypass those filters every day! Go check your email and you will get what I mean :)
I decided to let it run for one more day to see if the rhythm would keep up, building a solid spamming base this way. It ran for another 24 hours or so. The results? Well, it had visited a total of 370,000+ pages and managed to collect well over 4,000 emails! Yes, impressive, I know. The rate was a bit higher, but I think that’s within the margin of statistical error. Now, imagine training the robot to understand disguised forms like the tricky one I mentioned above. The numbers would be much, much higher, I bet.
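How hard would that "training" be? Probably not very. The following Python sketch is only an illustration, not something that was run in this experiment. It assumes the disguise is limited to "[at]" / "(at)" and "[dot]" / "(dot)" markers, rewrites them back to "@" and ".", and then applies the same kind of plain-address pattern. All names in it are hypothetical.

```python
import re

# Plain addresses only, e.g. foo@bar.com.
PLAIN_EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

# Assumed disguises: "[at]" or "(at)", "[dot]" or "(dot)",
# with optional spaces around and inside the brackets.
AT_RE = re.compile(r"\s*[\[(]\s*at\s*[\])]\s*", re.IGNORECASE)
DOT_RE = re.compile(r"\s*[\[(]\s*dot\s*[\])]\s*", re.IGNORECASE)


def find_emails(text):
    """Undo the simple disguises, then collect plain addresses."""
    normalized = DOT_RE.sub(".", AT_RE.sub("@", text))
    return set(PLAIN_EMAIL_RE.findall(normalized))
```

With this, a string like "foo [at] bar.com" yields "foo@bar.com", which the plain pattern alone would have missed. Real pages use many more tricks than these two, so treat this as the lower bound of what a motivated spammer would do.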
So, I decided to write this post. Do you remember why you shouldn’t forward chain emails? This one is even more basic. Stop adding your email as plain text to your website or blog, or you’re doomed to get spam. A good question would be, “how can I add my email then?”. Well, a few pointers I can summarize are:
- Do not add your email as plain text. Open up Paint, type it there and put it on your website as an image. That should cut the chances of it being picked up by 99%.
- If you are a company, stop using emails like “info@mycompany.com” or “support@mycompany.com”. These are easily guessed by spammers; all they have to do is try out various combinations. Change “info” to “information” and “support” to “cust-support”. You get the idea, I guess.
- When giving your email on forums and other social websites, make sure you tick the “keep private” box if it’s there as an option.
In general, do not post your email publicly, unless you have to. And in that case take special precautions, for instance put it in an image.
I hope this post has got you thinking a little, and next time you start writing your email using your keyboard (and not a pen and paper that is) you will think twice before hitting submit.
Photo credit by jeffsmallwood
I completely agree with you. Though most of the time I take care not to write down my email ID in plain text, sometimes I don’t, and sheer laziness is to blame.
I never write my email ID publicly. On my blog, I use a contact form.
Anyways, nice experiment.
Well, I have a link to my email on my website and I’m not stupid :) But I also don’t complain about getting spam.
It’s a gmail account (not my main one) that forwards to another account. I want to make it as easy as possible for potential clients to contact me. And people are lazy – they just want to click the link and be done. A lot more people have contacted me via the email address than through my contact form.
Gmail catches almost all of the spam so it really isn’t a problem. I’m just not going to lose potential business because of spammers.
But then I also have a phone number on there… people use that more than the contact form too.
@Kim: Well, at least you are aware of the problem and chose this way in the trade-off. And you have taken precautions (it’s not your main email). You see people complaining about spam while, on the other hand, giving out their email as if there were nothing to it.
Nice idea, stratos.
I’m also very keen on not posting my email address in public.
Another good way, I feel, is the Gmail method: use variants like
test+mail@gmail.com
test.mail@gmail.com
in public, and then redirect mail sent to them to a separate folder with a mail rule.
This helps me keep my inbox super clean, and important mail never goes out of my sight.
@Harsh: That sounds like a nice way to avoid spam… Thanks for dropping by, guys and gals… :)
Wow, nice experiment. Even I try to avoid writing my email directly on any web page.
I’m quite interested in seeing the source code of your crawler (I personally love programming, and was told to check out Perl, so this would be pretty cool to me). It is pretty easy to find and collect emails. I’ve written crawlers in Java before and I can just imagine it being a piece of cake to create a spam bot to collect emails for you.
Anyways, I hope this was just an experiment for you and that you don’t become a full out spammer :-D. Pretty interesting results though.
@Funny Stuff: Well, the code is not sophisticated at all. Writing it in Perl is much, much easier than writing it in Java. I don’t have the structure but I have the flexibility. Anyway, I might write a post about it in the coming days… And no, this was just an experiment. I won’t turn into a spammer :)
Oh thanks for the post!!
I expose my email to everybody online..
Will use some type of contact form to get comments!!
Thanks again mate!!
Of course it’s true. Spammers don’t magically acquire your email address. I suggest having one address for all the net-based stuff (this one will fill up with spam) and another one for your personal contacts (provided you don’t post that address anywhere, the chances of spammers getting hold of it are not high).
@Gry: Well, that’s one tactic worth keeping. But the thing is that the majority of users just don’t value their own privacy. That results in spam for them and their friends, then the friends’ friends etc…
This is really great information. I thought when I went over to gmail I would no longer have to deal with spam…boy was I wrong. I thought google had all the answers, but I guess some things are even beyond them at this point.
Well, spamming is something beyond machines, in my opinion. It’s creatively reinvented every day to avoid spam detection… There are many means to protect the end user but there is no ultimate solution for it. The only thing we can do is stop exposing our email that much. Other than that, just “delete” :) Thanks for dropping by!
Would those types of scripts be able to pick up the email in ‘contact form’ plugins?
@Sire: The main reason why people started using contact forms (besides giving an easy way of communicating) was to avoid exposing the email… When someone uses a contact form, the text is sent to the web server, which knows which email address to use. So, the address is never exposed to the client :) Protecting the form itself from being spammed (with a CAPTCHA, for instance) is another thing…
Thanks for that, mate. I will now check my blogs to make sure I remove any email links I may have left on there.