Email address obfuscation

From Market Ruler Help
Revision as of 20:54, 4 February 2011 by Admin (talk | contribs)

Email Obfuscation, or Email Hiding, is the practice of disguising a valid email address on a web page so that automated scrapers cannot harvest it and use it to send spam.

Email Obfuscation differs from email encryption in that obfuscation conceals the actual email address from web scraping, while email encryption conceals the contents of an email message in transit.

Email Obfuscation falls into a few categories:

  • Image: Email addresses are embedded in an image, which may be distorted to hinder OCR
  • Encoded: Email addresses are encoded using HTML character entities or a simple reversible encoding, and are decoded by the browser or by JavaScript when the page is displayed
  • Puzzle: Email addresses are written using a simple "trick" explained in textual instructions, such as "remove NOSPAM"
  • Complete: Email addresses are not accessible without submitting a form or completing a CAPTCHA

All of the above methods replace the Internet-standard mailto: URL scheme, which is normally used in the href attribute of an HTML anchor (<a>) tag:

<a href="mailto:support@example.com">Contact Support</a>

Plain Email

To show how each obfuscation technique works, the sections below all start from the same plain link:

<a href="mailto:support@example.com">Contact Support</a>

Each of the techniques listed above is applied to this link in turn.

Image Email Obfuscation

A simple method of communicating an email address to a person is an image containing rasterized text of the address. The web page visitor then has to type the address by hand into their email program in order to send a message.

Similarly, Adobe Flash can be used to display an email address.

Advantages

Disadvantages

  • Copying the email address by sight is error-prone for the site visitor
  • OCR technology is advanced enough that the email address can easily be recovered automatically
  • Copy and paste does not work with Image Email Obfuscation

Encoded Email Obfuscation

This method uses HTML character encodings or JavaScript to hide or otherwise obscure an email address. The simplest technique is to write the email address as HTML entities (numeric character references), which the browser decodes automatically:

<a href="&#109;a&#105;l&#116;&#111;:&#115;&#117;&#112;&#112;&#111;&#114;&#116;&#64;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#99;&#111;&#109;">Contact Support</a>
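The entity form does not have to be written by hand. The following is a minimal sketch of how such a link could be generated, assuming the address contains only ASCII characters; the function names `toHtmlEntities` and `entityEncodedLink` are made up for this illustration and do not come from any particular tool. Unlike the example above, it encodes every character of the URL, including the colon.

```javascript
// Hypothetical helper: write every character of a string as a decimal
// HTML entity (numeric character reference). Assumes ASCII input.
function toHtmlEntities(text) {
  var out = '';
  for (var i = 0; i < text.length; i++) {
    out += '&#' + text.charCodeAt(i) + ';';
  }
  return out;
}

// Hypothetical helper: build a mailto link whose href is entity-encoded.
// The label is inserted as-is, so it must not contain the address itself.
function entityEncodedLink(email, label) {
  return '<a href="' + toHtmlEntities('mailto:' + email) + '">' + label + '</a>';
}

var link = entityEncodedLink('support@example.com', 'Contact Support');
```

The resulting markup contains neither the literal text "mailto" nor an "@" sign, yet a browser treats it exactly like the plain link.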

Alternative methods store the link in a simple reversible encoding (here, a fixed character shift) and use JavaScript to decode it and write it into the page:

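<script type="text/javascript">
// The link markup, with every character shifted up by 5 code points and then percent-encoded.
var e = decodeURIComponent("Af%25mwjkB%27rfnqyt%3FxzuutwyEj%7Dfruqj3htr%27CHtsyfhy%25XzuutwyA4fC");
var i, p = '';
// Undo the shift: ((c - 37) % 240) + 32 equals c - 5 for the characters used here.
for (i = 0; i < e.length; i++) { p += String.fromCharCode(((e.charCodeAt(i) - 37) % 240) + 32); }
// Write the decoded anchor tag into the page.
document.write(p);
</script>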

When run in the browser, the above code writes an anchor tag into the page that is identical to the plain email link.
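The encoded string in the script has to be produced ahead of time, typically on the server. The sketch below shows one way this could be done; `shiftEncode` and `shiftDecode` are hypothetical names, and the encoder assumes the markup is printable ASCII. It uses `encodeURIComponent`, which leaves a few characters (such as the apostrophe) unescaped, so its output can differ slightly from the string shown above while still decoding to the same link.

```javascript
// Hypothetical encoder matching the decoder in the example above.
// Assumes the input is printable ASCII (code points 32 to 126).
function shiftEncode(html) {
  var shifted = '';
  for (var i = 0; i < html.length; i++) {
    // The decoder computes ((c - 37) % 240) + 32, which is c - 5,
    // so the encoder adds 5 to each code point.
    shifted += String.fromCharCode(html.charCodeAt(i) + 5);
  }
  // Percent-encode so the result can sit in a double-quoted string literal.
  return encodeURIComponent(shifted);
}

// Same arithmetic as the decoder on the page, wrapped in a function
// so that the round trip can be checked.
function shiftDecode(encoded) {
  var e = decodeURIComponent(encoded);
  var p = '';
  for (var i = 0; i < e.length; i++) {
    p += String.fromCharCode(((e.charCodeAt(i) - 37) % 240) + 32);
  }
  return p;
}

var plain = '<a href="mailto:support@example.com">Contact Support</a>';
var encoded = shiftEncode(plain);
```

Passing `encoded` to `shiftDecode` returns the original markup, which is what the page script hands to `document.write`.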

Advantages

  • Wide variety of tools to enable encoding to be done automatically on the server.
  • Behaves identically to standard email links

Disadvantages

  • Character-encoded email addresses are easy to decode, since any HTML parser resolves the entities
  • JavaScript-encoded email addresses do not appear at all unless JavaScript is enabled in the visitor's web browser

Puzzle Email

A puzzle email is a primitive form of CAPTCHA which requires the user to solve a simple puzzle to determine the email address:

<a href="mailto:supNOSPAMport@exaNOSPAMmple.com">Contact Support</a> (Remove NOSPAM)

Another example:

Contact Support: support -at- example -dot- com

In both cases, the site visitor has to read the accompanying text and perform a simple manipulation or substitution to recover the actual email address.
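Both puzzle styles are plain string substitutions, as the following sketch illustrates. The helper names are hypothetical, and `insertMarker` places the marker at the midpoint of each part, which is an arbitrary choice that does not reproduce the exact positions used in the example above.

```javascript
// Hypothetical helpers for the two puzzle styles shown above.

// "support@example.com" becomes "support -at- example -dot- com".
function toSpelledOut(email) {
  return email.replace(/@/g, ' -at- ').replace(/\./g, ' -dot- ');
}

// Insert a marker word into the middle of the local part and of the domain.
// Assumes the address contains exactly one "@".
function insertMarker(email, marker) {
  var parts = email.split('@');
  function mid(s) {
    var k = Math.floor(s.length / 2);
    return s.slice(0, k) + marker + s.slice(k);
  }
  return mid(parts[0]) + '@' + mid(parts[1]);
}

// What the visitor is asked to do by hand: delete every copy of the marker.
function removeMarker(text, marker) {
  return text.split(marker).join('');
}

var spelled = toSpelledOut('support@example.com');
var marked = insertMarker('support@example.com', 'NOSPAM');
```

Since `removeMarker` is a one-line function, a scraper that knows the convention can undo the puzzle just as easily as a visitor can.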

Advantages

  • Very easy to implement with little technology
  • Usable on public sites

Disadvantages

  • Some users may find the puzzle difficult to solve
  • Errors may occur when the visitor copies or reconstructs the email address

Complete Email Obfuscation

This method involves programming in a web server scripting language. The process is:

  • An email address is displayed as a link in an incomplete form, such as: support@...
  • Visitor clicks the link and visits a page which collects message information (From, Subject, Body) and may require the visitor to complete a CAPTCHA
  • Upon successfully filling out the form, the web server sends the email on behalf of the site visitor
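The server-side part of this process can be sketched roughly as follows, here in JavaScript as it might run under a server-side runtime. Everything in it is an assumption made for illustration: the `RECIPIENTS` table, the `buildMessage` function, the form field names, and the idea that the CAPTCHA result arrives as a boolean. Actually sending the message is left to whatever mail facility the server provides.

```javascript
// Hypothetical server-side lookup table: the page only ever exposes the
// key ("support"), never the full address.
var RECIPIENTS = { support: 'support@example.com' };

// Validate a submitted form and build the message the server would send.
// Throws an Error describing the first problem found.
function buildMessage(form, captchaPassed) {
  if (!captchaPassed) {
    throw new Error('CAPTCHA not completed');
  }
  var known = Object.prototype.hasOwnProperty.call(RECIPIENTS, form.to);
  if (!known) {
    throw new Error('Unknown recipient');
  }
  // Reject line breaks in header fields to block email header injection.
  if (/[\r\n]/.test(form.from) || /[\r\n]/.test(form.subject)) {
    throw new Error('Invalid header value');
  }
  // Deliberately loose address check, assumed sufficient for a sketch.
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(form.from)) {
    throw new Error('Invalid sender address');
  }
  return {
    to: RECIPIENTS[form.to],
    replyTo: form.from,
    subject: form.subject,
    body: form.body
  };
}
```

Because the recipient is looked up on the server, the full address never appears in any page sent to the browser.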

Advantages

  • Email addresses are protected completely

Disadvantages

  • Additional steps may make visitors less likely to use it
  • More complex implementation than other methods
  • May expose the web server to code injection attacks (for example, email header injection) if the server-side code is not properly secured
