Protecting Your Web Site

A determined thief will find ways to steal your stuff. They will disable JavaScript, search their browser caches, perform screen captures, and use hacking tools to get what they want. Nothing in this article is guaranteed to work 100% of the time. However, a combination of techniques will slow down determined thieves, stop less skilled ones, and possibly remind them that they are stealing your intellectual property.

The Google Quality Guidelines for Webmasters clearly state, "Don't create multiple pages, subdomains, or domains with substantially duplicate content." Search engine rankings can fall because your content has been duplicated, either because someone stole your content or because you run duplicate web sites. If your web site has exclusive content, then other sites can only link to your web site. External links pointing to a single web site with exclusive content will earn a higher search engine ranking than the same number of links spread across many sites with stolen or duplicated web content. Some SEOs believe Google penalizes PageRank when it finds duplicate content. The Duplicate Content Penalty can mean the removal of your web site from the search engine index (delisting) or a lower overall rank for your web pages. If stolen web content is indexed first, the site with the original web content can be penalized with a lower ranking, or omitted from the search results. Obviously, this can affect revenue for business sites.

Copyright Notice:

In many countries, including the US, the UK, and the member states of the EU, your web site and web content are protected without an official copyright notice. However, other nations require the official copyright notice. Some nations also require the notice "All Rights Reserved." Including the complete official copyright notice with the optional "All Rights Reserved" notice on all web pages offers limited protection in countries that have accepted the international copyright treaties.

Copyright Registration:

In the UK, http://www.copyrightservice.co.uk keeps an independent record of your copyrighted material.

In the United States, if you are serious about protecting your hard work you'll need to register your web site copyright with the U.S. Copyright Office. Complete information for copyrighting your web site is available at http://www.copyright.gov/circs/circ66.pdf.

Your web site is a published literary work. Your application must contain two deposits of your web site. You can choose to include representative web pages or paper copies of every page of your web site. You can also include the entire web site on a CD-ROM or in another electronic format. Obviously, if you include every page of your web site, you have the best proof that plagiarism occurred. If you registered only a representative sample, it will be much harder to prove that plagiarism occurred.

You cannot file a lawsuit for copyright infringement unless you first obtain a Certificate of Registration or a denial of registration from the Copyright Office. The copyright registration becomes effective on the date the deposit and application are received by the Copyright Office, not the date the office completes processing your application (which could be months after the received date).

If your copyright is registered before an infringement occurs, you can recover attorneys' fees and statutory damages. With the ability to claim statutory damages, you do not have to prove lost profits or the infringer's profits because the court can award up to $30,000 for each infringed copyrighted work. Statutory damages of up to $150,000 plus attorneys' fees can be awarded if the infringement was willful.

If your web site or literary work was not registered before the infringement occurred, you cannot obtain statutory damages or attorneys' fees. You still have the right to obtain an injunction, damages for your lost profits, and the infringer's profits.

Obtaining a Certificate of Registration for your web site will allow you to recover statutory damages and attorneys' fees if future infringements occur. You can obtain a certified copy of your deposit from the Copyright Office if it is needed for litigation. Most of the time, your deposit copy together with your Certificate of Registration issued by the Copyright Office is sufficient for litigation.

Protecting Your Pages:

Check to make sure the permissions on your site folders are set correctly. Occasionally check your log files for any attempted hacks.

robots.txt File:

Well-behaved spiders should follow the instructions in your robots.txt file. Banning specific web robots from your web site can reduce your bandwidth usage. Log files can be used to identify the different robots that visit your site. If a robot is doing something on your web site that you do not like, or comes from a country whose search engines you prefer not to index your site, you can ban that robot. Information about known bots is available at http://www.jafsoft.com/searchengines/webbots.html and http://www.robotstxt.org/wc/active.html. If some content does get into a spidered index by accident, you can request that it be removed.
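
As a minimal sketch of such a ban, a robots.txt file in the root of the site could look like the following. The robot name "BadBot" and the folder "/downloads/" are placeholders, not names from this article; substitute the user-agent strings you actually find in your log files.

```text
# Ban one specific robot from the entire site.
# "BadBot" is a placeholder; use a name taken from your own log files.
User-agent: BadBot
Disallow: /

# All other robots may crawl everything except this (hypothetical) folder.
User-agent: *
Disallow: /downloads/
```

Remember that only well-behaved robots read this file; a rogue robot can simply ignore it.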

.htaccess file:

Some countries have deserved reputations for fraud and content theft. Your software may have absolutely no sales potential in some countries. It makes no sense to waste your bandwidth, and make it easier for visitors from those countries to steal your content. You may also want to ban visitors that are referred from rogue web sites that list cracks and serial numbers.

Web servers follow your instructions placed in .htaccess files. .htaccess files can be placed in any directory on your web site. The .htaccess file can be used to password protect folders, ban specific web robots, ban visitors with specific IP addresses and countries, allow users with specific IP addresses, stop directory listings, ban specific download software, and redirect visitors to other web pages and web sites.
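
The following .htaccess fragment is a sketch of a few of the bans listed above. It assumes an Apache 2.4 server whose hosting configuration allows these overrides; the user-agent "BadDownloader", the referring site "cracks.example", and the address range 192.0.2.0/24 are placeholders for illustration only.

```apache
# Stop directory listings for this folder and everything below it.
Options -Indexes

# Flag unwanted visitors. "BadDownloader" and "cracks.example" are placeholders.
SetEnvIfNoCase User-Agent "BadDownloader" unwanted
SetEnvIfNoCase Referer "cracks\.example" unwanted

# Apache 2.4 syntax: allow everyone except flagged visitors and one address range.
<RequireAll>
    Require all granted
    Require not env unwanted
    Require not ip 192.0.2.0/24
</RequireAll>
```

Older Apache 2.2 servers use the Order, Allow, and Deny directives instead of Require, so check which version your host runs before copying this.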

Displaying your images on a different site by linking directly to the files on your server is called hot linking, and it uses your bandwidth. An .htaccess file placed in your image directory can prevent it. A text line can be included to display an alternate image that is not located in the protected image directory. Or you could replace your stolen web content with a different image indicating the theft.
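
A hot-link block of this kind might be sketched as follows, assuming Apache with mod_rewrite enabled. The domain yourdomain.com and the alternate image nohotlink.gif are placeholders; the alternate image must be stored outside the protected image directory.

```apache
# Hot-link protection sketch for an image directory (requires mod_rewrite).
RewriteEngine On
# Allow requests that send no referrer (direct visits, some proxies).
RewriteCond %{HTTP_REFERER} !^$
# Allow requests that come from your own pages.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yourdomain\.com/ [NC]
# Everyone else is redirected to an alternate image outside this directory.
RewriteRule \.(gif|jpe?g|png)$ http://www.yourdomain.com/nohotlink.gif [NC,R,L]
```

The referrer header is easy to forge or suppress, so this stops casual hot linking rather than a determined thief.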

HTML Meta Tags:

The following example placed in the <head> portion of an HTML web page instructs all web robots to not index or analyze the web page for links:

<meta name="ROBOTS" content="NOINDEX, NOFOLLOW" />

Not all robots understand this HTML tag.

This code placed in the <head> portion of an HTML web page asks Google's crawler not to keep an archived (cached) copy of your content. Use name="ROBOTS" instead to address all well-behaved web bots:

<meta name="GOOGLEBOT" content="NOARCHIVE" />

Images larger than 200 by 200 pixels display an image toolbar allowing the visitor to save the image (IE 6 and greater). Add the following code to the <head> section of your page to disable the image toolbar (.htaccess can also be used). The image could still be stolen by using screen-capture software.

For all images use: <meta http-equiv="imagetoolbar" content="no" />

For individual images use: <img src="mypicture.jpg" galleryimg="no" />

Use Absolute Links for HTML Pages:

An absolute link is the complete URL, for example http://www.yourdomain.com/folder/page.htm. If you use relative links, it is very easy for a plagiarist to copy your web site or individual web pages to a new domain. Absolute links would require the plagiarist to work harder to remove or change all of your absolute links. Remember, a plagiarist is lazy. If the plagiarist fails to change or remove all of your links, your web stats could alert you to your stolen web content.

Using absolute links does require more work on your part, but there is another benefit. Sometimes search engines index your site using yourdomain.com instead of www.yourdomain.com, resulting in fewer internal links. A loss of internal links will likely cause your search engine ranking to drop. Since external site links always use www.yourdomain.com, it makes sense to also use absolute internal links to keep your total link count high.
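
If converting links by hand is too much work, the conversion can be scripted. The sketch below uses the standard URL constructor to resolve a relative link against the page it appears on. The function name toAbsolute and the constant CANONICAL_BASE are made up for this example, and the base assumes you standardize on the www form of your domain.

```javascript
// Assumed canonical form of the site address (placeholder domain).
const CANONICAL_BASE = "http://www.yourdomain.com/";

// Resolve a link found on a page into a complete, absolute URL.
// href:     the link as written in the HTML (relative or absolute)
// pagePath: the path of the page containing the link, relative to the site root
function toAbsolute(href, pagePath = "") {
  // First build the page's own URL, then resolve the link against it.
  const pageUrl = new URL(pagePath, CANONICAL_BASE);
  return new URL(href, pageUrl).href;
}

// A relative link on http://www.yourdomain.com/folder/index.htm
console.log(toAbsolute("page.htm", "folder/index.htm"));
```

Links that are already absolute pass through unchanged, so the function can be applied to every link on a page.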

Naming Files and Folders:

Avoid obvious folder names like secret, personal, private, and protected. Avoid obvious file names like customerorders.txt and creditcardnumbers.txt. Automated hacking tools look for common names. If you have confidential information on the web server, make sure it is encrypted. If you have some images, consider using their ALT attributes to add common misspellings to your web pages. It can help your SEO, and help identify your stolen web content. If the thief performs a search and replace, it is likely to miss the misspelled text.

Index.htm:

Put a file called "index.htm" or "index.html" in every directory on your web site. This prevents thieves from viewing a listing of the other files located in the same directory. An .htaccess file can also be used to prevent the directory listing. Do not use obvious names for your confidential password, email, and order directories and filenames.

Using PHP:

When you install a new CMS (Content Management System) or bulletin board system like phpBB, change every default setting you can. With PHP, set display_errors to 0. Error messages give too many clues to hackers. If you use PHP, the register_globals directive should be turned off. Why? If register_globals is enabled, query-string parameters become script variables, so a script that checks an uninitialized variable such as $authorized can be fooled by an attacker who adds ?authorized=1 to the URL.
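
In php.ini, the two settings mentioned above could be written as in the sketch below. The log_errors line is an optional addition that this article does not discuss, and register_globals applies only to older PHP versions because the directive was removed in PHP 5.4.

```ini
; Do not show error messages to visitors.
display_errors = Off
; Optional addition, not from the text above: keep errors in the server log instead.
log_errors = On
; Do not turn request parameters into global variables (older PHP versions only).
register_globals = Off
```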

Passwords:

A web site can be password protected by both JavaScript and .htaccess. If you are protecting one area of your site, validate the user's login credentials every time. Cookies and input forms are too easy to forge. .htaccess information is available at www.javascript.com/howto/htacess.shtml. .htaccess is more secure than JavaScript.

Data Input:

Make sure all data input that isn't validated is deleted immediately. Even information in a cookie can be manipulated by a hacker. Form data should be submitted as part of a POST, not a GET, and the page should use no-cache tags so that the user's information is not displayed when the back button is pressed in the browser.
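
One way to apply the first rule is whitelist validation: copy only the fields you expect, and only when each matches a strict pattern, so everything else is dropped. The sketch below is illustrative only; the field names, the patterns, and the function name cleanInput are assumptions, not part of any particular script.

```javascript
// Expected fields and the pattern each value must match (assumed examples).
const RULES = {
  name: /^[A-Za-z][A-Za-z '\-]{0,49}$/,
  email: /^[^\s@]+@[^\s@]+\.[^\s@]+$/,
  quantity: /^[1-9][0-9]{0,2}$/,
};

// Return a new object holding only the fields that passed validation.
function cleanInput(raw) {
  const clean = {};
  for (const [field, pattern] of Object.entries(RULES)) {
    const value = raw[field];
    // Only strings are accepted; surrounding whitespace is trimmed first.
    if (typeof value === "string" && pattern.test(value.trim())) {
      clean[field] = value.trim();
    }
  }
  // Unknown or invalid fields were never copied, so they are discarded.
  return clean;
}

// The forged "admin" field and the malformed quantity are both dropped.
console.log(cleanInput({ name: "Terry", quantity: "2; DROP TABLE", admin: "1" }));
```

Validation in the browser can be bypassed, so a check like this belongs on the server, where the same approach applies to cookie values as well as form fields.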

Scripts:

Check scripts you use on your web site for security holes. Enter your script name and the word 'security' into search engines to try to find fixes or alternative safe scripts. Rename common scripts before installing them. Check regularly for updates and patches. Many scripts have a lot of extra features you do not need. Either remove or turn off those extra features. Use the administration section to change file permissions as necessary.

Disable Right-Click:

JavaScript can be used to disable the mouse right-click. Selections on the pop-up menu can be used to view your source code and save your images. Disabling the right-click, although not fully effective, reminds the thief that you own the site's copyright. However, disabling the right-click also disables other menu choices like Add to Favorites. Smart users can still use View Source to see what you've done, or enter javascript:void(document.oncontextmenu=null) in the address bar to break your protection. Thieves can browse their cache to get your images and use an offline browser to download your web pages. Thieves can also use a screen capture program to save your images.

The following script disables the mouse right-click.
<script>
   document.oncontextmenu=function() { return false; }
</script>

HTML Compression:

An HTML compression program removes line returns from your HTML code. A free version of HTML Shrinker is available at www.thepluginsite.com/products/htmlshrinker/

HTML Encryption:

The HTML page is scrambled, with a script added at the beginning to unscramble the code so the browser can display the content. View Source will only show the scrambled version. However, search engines will not be able to read your scrambled version either. CGIScript at www.scriptsearch.com disables right-clicking and encrypts your code so that it can't be saved from a browser or viewed.

Use www.dynamicdrive.com/dynamicindex9/encrypter.htm to paste a section of your HTML code, encrypt it, and copy it back to your HTML editor.

CSS Options:

To prevent printing, insert these commands in the main stylesheet:

@media print {
body {display:none;}
}

To prevent text from being selected, the text can be placed inside <div></div> tags:

<div onselectstart="return false;" unselectable="on" style="-moz-user-select: none;">
Text that cannot be selected.
</div>

Place a CSS layer over the top of an image. When the visitor right-clicks on the image, they'll actually be clicking on the layer instead of the image.

To disable the clipboard in Internet Explorer, add the following attribute to the opening <body> tag. JavaScript names are case sensitive, so the capitalization matters:

onload="setInterval('window.clipboardData.clearData()', 20)"

The clipboard remains disabled until that browser window is closed, which also affects other programs.

Using Frames:

Place your site in an invisible frame. Create a frameset for your main page, with no right-click and no menu bar, using this code:

<frameset rows="1%,99%" border="0" framespacing="0" frameborder="0">
<frame src="invisible.html" name="invisible" scrolling="no">
<frame src="contents.html" name="main">
</frameset>

Image Protection:

Convert your images to Flash movies. However, a Flash decompiler can extract the contents of your Flash applets. Flash applets can check their serving location, so have the applet check the serving location at random times. Google indexes SWF files.

Secure Image from Artistscope is a Java-based application that encrypts your images so they can only be served from a specified URL. Since the images are part of the application, traditional methods of stealing your images will not work. www.artistscope.net/secure_image/

Photographers, Artists, and Illustrators - don't include links to the high resolution versions of your work. If you do, make sure your watermark is clearly displayed.

Split your images into pieces, with the browser reassembling to display the original image.
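
The geometry of such a split can be sketched in a few lines. The function below only computes the crop rectangle for each piece; cutting the image and laying the pieces out in a table or with CSS is left to your image editor and page markup. The name tileRects and its parameters are invented for this example.

```javascript
// Compute the crop rectangles for cutting an image into cols x rows pieces.
// Edges are rounded down so the pieces cover every pixel exactly once,
// even when the image size does not divide evenly.
function tileRects(width, height, cols, rows) {
  const rects = [];
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      const x = Math.floor((c * width) / cols);
      const y = Math.floor((r * height) / rows);
      const right = Math.floor(((c + 1) * width) / cols);
      const bottom = Math.floor(((r + 1) * height) / rows);
      rects.push({ x, y, width: right - x, height: bottom - y });
    }
  }
  return rects;
}

// Four pieces for a 210 by 210 pixel image, listed row by row.
console.log(tileRects(210, 210, 2, 2));
```

A visitor who saves one piece gets only a fragment, although a screen capture still captures the reassembled image.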

Paint Shop Pro and Photoshop can use a Digimarc plugin. Personal information about the author can be added to the image file. This information can be read by any Digimarc-enabled imaging program to prove you are the creator. A basic subscription costs $79.00 for one year with the ability to watermark up to 1000 images.

Watermarks:

Add watermarks or text captions to your images. A watermark can be added directly to a copy of the image or the watermark can be saved with a transparent background for a layer.

<div style="position:absolute; left:10px; top:15px; width:210px; height:210px; background-image: url(mypicture.jpg); layer-background-image: url(mypicture.jpg);"><img src="watermark.gif" height=210 width=210></div>

The software iWatermark from www.scriptsoftware.com can add text or graphical watermarks to an entire folder of images at one time. This software is available for $20.

ASP member DiVision Software at www.batchphoto.com offers BatchPhoto software for $29.95. This software can add text and image watermarks, comments and date information to a batch of photos. This software can also perform additional photo manipulation (convert image formats, rename, touch-up, and add effects).

Available Software:

HTML Protector software from www.antssoft.com provides a variety of ways to automate the protection of your web content and coding. The software also includes some protection methods not discussed in this article. The software is available for $39.95.

ASP member Andreas Wulf at www.aw-soft.com offers HTML Guard software for $15.00. This software also automates many HTML protection methods.

Terry Jepson
www.wiscocomputing.com