How can I prevent hotlinking to PDF files

 
 
Kitka

 

Posts: 2507
Joined: 1/31/2002
From: Australia
Status: offline

 
How can I prevent hotlinking to PDF files - 1/27/2006 0:01:36   
One site of ours contains hundreds of brochures and hefty user manuals (all pdfs) intended mainly for clients' use.

I have the directories banned via robots.txt which deters the casual surfer, but it seems that someone somewhere (I can't find out who) is hotlinking to many of them, and costing us bandwidth.

All of them arrive with no referrer - which seems to be standard for pdf downloads, even if the user downloads from a page in our site. So I can't ban the "no-referrer" requests via .htaccess. Therefore they get the pdfs and never see any page in our site.

Does anyone have any suggestions as to how to protect the files from hotlinking, but still enable any people legitimately visiting our site to download them with little difficulty?

< Message edited by Kitka -- 1/27/2006 0:59:24 >


_____________________________

Kitka
**It is impossible to make anything foolproof because fools are so ingenious.**

womble

 

Posts: 5461
Joined: 3/14/2005
From: Living on the edge
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 1/27/2006 5:00:05   
Have them in a secured log-in area? Or is that making things too complicated?

_____________________________

~~ "A cruel god ain't no god at all" ~~
:)

(in reply to Kitka)
Kitka

 

Posts: 2507
Joined: 1/31/2002
From: Australia
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 1/27/2006 5:27:24   
quote:

Or is that making things too complicated?


Mmnn, not complicated, but maybe clarification is in order.

Client hires out a broad range of very specialised and expensive equipment. Potential hirers need access to brochures (to decide if equipment meets their needs) and subsequently user manuals (to know how to use it).

Our client wants all brochures and manuals for both current equipment and ex-hire equipment available as he feels it enhances his business's reputation. He doesn't understand that people can access them at his cost but without ever being aware of his business / services.

Brochures and manuals need to be easily available to genuine visitors to the site - but not hotlinked to sites unknown. Clients are not static - they vary from day to day.

Visitor log in would be fine, but it needs to be generic - e.g. User: Guest, Password: Anon. But how would I implement this, such that it admits visitors who access files only from our site, as opposed to from someone else's site?? In other words, how do I force them to be aware of the company providing the manuals/ brochures, and prevent downloads that do not originate from a product page in our site.

<edit> Hosted on apache/linux, not windoze </edit>

< Message edited by Kitka -- 1/27/2006 5:40:20 >



(in reply to womble)
Nicole

 

Posts: 2802
Joined: 9/15/2004
From: Nambucca / Kempsey, Australia
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 1/27/2006 6:02:54   
Kitka,

I don't know the answer to your questions, but have you considered searching for specific unique key phrases or document titles to try to find out who it might be? I also wondered if Copyscape could be used to see who might be plagiarising any document titles or key phrases?

Hope that helps.

Nicole

_____________________________

Nambucca Valley & Kempsey Web Design | NixDesign
Get Netscape Navigator 9

(in reply to Kitka)
caz

 

Posts: 3470
Joined: 10/10/2001
From: Somewhere south of Chester, UK
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 1/27/2006 6:38:05   
You could put password security on the individual pdfs and display that password on the html pages on your site so that the only people able to open the pdfs are those who have followed the correct route to do so.

I am assuming that you have Acrobat or similar to make pdfs. :)

_____________________________

Do not meddle in the affairs of cats, for they are subtle and will dance, or more on your keyboard.
Cheshire cat. www.doracat.co.uk

I remember when it took less than 4hrs to fly across the Atlantic.

(in reply to Nicole)
Kitka

 

Posts: 2507
Joined: 1/31/2002
From: Australia
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 1/27/2006 7:09:33   
Caz: I have Acrobat (v5), and could probably add a password, although all these pdfs have been downloaded from various manufacturers' sites, so do not originate with us.

But supposing I place a password on the files and mention it on each product page, what is to stop the hotlinkers from displaying that password on their site?

The hotlinkers have found the files by coming to our site in the first place, so would have seen the password - the files themselves are not listed in reputable search engines - and most disreputable ones have been banned via .htaccess.

Nicole: I have searched Google, Yahoo and MSN for all links to our site and also even more specific terms relating to the pdfs in question - all of which turned up nothing suspicious. Yet there is a constant stream of isolated accesses, mainly from Europe, all with no referrer. Copyscape only helps if web page text has been duplicated, this problem involves only links. Many thanks for the suggestion though :)


(in reply to caz)
caz

 

Posts: 3470
Joined: 10/10/2001
From: Somewhere south of Chester, UK
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 1/27/2006 7:14:44   
quote:

what is to stop the hotlinkers from displaying that password on their site?


You could keep changing the passwords, but I know that is a lot of hassle for you. Have you tried looking in the Adobe User to User forums for answers to this?

(in reply to Kitka)
golfer

 

Posts: 1730
Joined: 1/5/2005
From: Bath, Wiltshire, UK
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 1/27/2006 8:28:34   
Is it feasible to have a section on his web page which directs the visitor to a secure area, and to show in that section a password that needs to be typed in to access the information?

That way the hotlinks would be broken and the documents would only be available to his site visitors.

Hope I'm not spouting rubbish here:)

_____________________________

Ian

'You'll miss me when I've gone'

(in reply to caz)
caz

 

Posts: 3470
Joined: 10/10/2001
From: Somewhere south of Chester, UK
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 1/27/2006 8:40:58   
Alternatively you could do the hotlinking yourself, to the manufacturers' sites for the pdfs :) But your client probably would not go for that if he wants to look like a one stop shop, as it were.

You could wrap each pdf in another pdf which says something like "Brought to you by <company name> and here is the password that you need to open the document/manual..." You would link to the passworded manual from within the containing pdf. A bit like a Russian doll :)

I had a shufti around the Adobe forums, but didn't find anything apart from password security. But I did come across this site about preventing deep linking, if it's of any use to you. http://wordworx.com/


< Message edited by caz -- 1/27/2006 9:10:37 >


(in reply to golfer)
jeepless

 

Posts: 213
Joined: 12/20/2003
From: Smack in the middle of USA
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 1/27/2006 10:22:23   
One solution I've often seen used to discourage hotlinking of any files is to rename the file in question, then substitute a bogus file that uses the old file name. So your original file called "my.pdf" might become "onlymy.pdf", and in place of the old file you add another PDF file called "my.pdf". This new file could be a single-page PDF containing in big bold letters, "This document was stolen from XYZ Company", or it might include the actual link to the correct file. Then when the other site hotlinks to this "my.pdf" file, their visitors will get the bogus file with the stolen message or a link to the correct file. And chances are good it will take some time before the other website realizes what you did.

You could also just rename your current file so their visitors will get a broken link, and that would save the bandwidth, but it's likely the other website will catch on rather soon. Not breaking their link may very well "hide" what you did for quite a while.

It's a game of "cat-and-mouse", but it works.
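
With hundreds of files the rename-and-substitute trick is tedious by hand, so it could be scripted. The following is only a rough Python sketch of the idea described above, not something anyone in this thread has run. The function name, the "only" prefix (borrowed from the "onlymy.pdf" example) and the assumption that the decoy PDF is kept outside the PDF folder are all illustrative.

```python
import shutil
from pathlib import Path


def swap_in_decoy(pdf_dir: Path, decoy: Path, prefix: str = "only") -> dict:
    """Rename each real PDF and leave a copy of the decoy under the old name.

    Returns a mapping of old file name -> new file name, so the site's own
    pages can be updated to point at the renamed files.
    Assumes the decoy file lives outside pdf_dir.
    """
    renamed = {}
    # sorted() takes a snapshot, so files created in the loop are not revisited.
    for pdf in sorted(pdf_dir.glob("*.pdf")):
        new_path = pdf.with_name(prefix + pdf.name)
        # Skip files already handled on an earlier run (either half of a pair).
        if pdf.name.startswith(prefix) or new_path.exists():
            continue
        pdf.rename(new_path)          # real manual moves, e.g. my.pdf -> onlymy.pdf
        shutil.copyfile(decoy, pdf)   # decoy now answers at the old URL
        renamed[pdf.name] = new_path.name
    return renamed
```

The returned mapping is what you would use to update the links on your own product pages. Note that a genuine file whose name already starts with the prefix would be skipped, so pick a prefix that does not occur in the existing names.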

Hope that helps...


_____________________________

The problem with designing a system that's foolproof is that designers underestimate complete fools.

(in reply to caz)
caz

 

Posts: 3470
Joined: 10/10/2001
From: Somewhere south of Chester, UK
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 1/27/2006 10:29:15   
I think that Kitka has rather a lot of files to work with. All the same, that's an idea.

(in reply to jeepless)
Kitka

 

Posts: 2507
Joined: 1/31/2002
From: Australia
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 1/31/2006 8:26:07   
Many thanks for the ideas (especially jeepless') - but most are impracticable because of the large number of files in question. Until I asked my question here, I had been dealing with it by changing the name of the folder containing the pdfs - but whoever is doing it now appears to be checking our site regularly from a bookmark (so no referrer) and adjusting their links. They seem to be located in Sweden, not that it matters much.

Many thanks for the link to that article, Caz. It was very helpful in precisely describing my problem but didn't give any easy answers. It did, however, explain why I haven't been able to trace the links (and hence the culprit) in the SEs - it is being done with JavaScript <doh>. Links from JavaScript don't send a referrer and SEs can't read them.

I'm searching Google now to see if there are any free anti-leech scripts for apache around. I found one called Hotlink Reverser, but it costs US$99. :) There are a number of references to using PHP to dynamically deliver links but I haven't found anything readymade yet.
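
One common reading of "dynamically deliver links" is that the product page generates a short-lived, signed download link each time it is shown, so a link copied to another site stops working after a few minutes. The sketch below is a minimal Python illustration of that idea only; it is not the method used by Hotlink Reverser or any other script mentioned here. The `/download` path, the parameter names and the `SECRET` value are all made up for the example.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import urlencode

SECRET = b"change-me"  # hypothetical secret, known only to the server


def _sign(filename: str, expires: int) -> str:
    """HMAC over the file name and expiry time, as a hex string."""
    message = f"{filename}:{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def make_link(filename: str, ttl: int = 300, now: Optional[float] = None) -> str:
    """Build a download link that stops validating after ttl seconds."""
    current = time.time() if now is None else now
    expires = int(current + ttl)
    query = urlencode(
        {"file": filename, "expires": expires, "sig": _sign(filename, expires)}
    )
    return "/download?" + query  # hypothetical download script path


def is_valid(filename: str, expires: int, sig: str, now: Optional[float] = None) -> bool:
    """Check the signature and expiry before the file is served."""
    current = time.time() if now is None else now
    if current > expires:
        return False
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sig, _sign(filename, expires))
```

The product page would call `make_link()` when it is rendered, and the download script would call `is_valid()` before sending the file. A hotlinker could still fetch a fresh link by loading your page each time, but then the visitor's browser at least requests a page from your site first.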


(in reply to caz)
sal.scozzari

 

Posts: 4
Joined: 2/14/2006
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 2/14/2006 17:10:12   
A little chunk of server code might do the trick.

Replace your hotlinks with link buttons, and store the PDF files somewhere inaccessible from outside. When the button is clicked, the server explicitly loads the PDF file and streams it into the response.

The files can reside on some folder outside the virtual folder hierarchy, or better yet in a database. Effectively, there is no URL that explicitly resolves to any of your docs.

Hope this helps.

(in reply to Kitka)
Kitka

 

Posts: 2507
Joined: 1/31/2002
From: Australia
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 2/14/2006 17:25:47   
Hi Sal,

Welcome to OutFront!

Your idea sounds wonderful, except I don't understand how to implement it.

quote:

store the PDF files somewhere inaccessible from outside


How do I do that?

quote:

Effectively, there is no URL that explicitly resolves to any of your docs.


If there is no URL, how do I link the button to the PDF? Could you give an example of the coding please?


(in reply to sal.scozzari)
sal.scozzari

 

Posts: 4
Joined: 2/14/2006
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 2/15/2006 11:15:01   
I'm assuming you're running ASP.NET; if you're not, then hopefully these suggestions will translate to whatever development environment you're using.

Let's say your server is hosting your web site "http://superwebsite" in the virtual folder "C:\Inetpub\wwwroot\superwebsite". You could store your docs in "C:\superwebsite\Docs". There is no way to navigate to that folder, and there is no direct URL that references that folder.

So, now no-one can get at these documents directly, but you want authentic visitors to your site to get at them. Instead of using hyperlinks (or whatever control that generates the <a href...> thingy ), place a button on your web page.

In the click event for the button, do something like this:

// Needs "using System.IO;" at the top of the code-behind file.

// Read the whole file into a buffer
string sf = @"C:\superwebsite\Docs\supermanual.pdf";
FileStream fs = new FileStream(sf, FileMode.Open, FileAccess.Read);
BinaryReader br = new BinaryReader(fs);

byte[] df = br.ReadBytes((int)fs.Length);

br.Close();
fs.Close();

// Transmit the buffer in the response
Response.Expires = 0;            // do not let the browser cache the reply
Response.Buffer = true;
Response.ClearContent();
Response.ClearHeaders();
Response.ContentType = "application/pdf";
Response.BinaryWrite( df );
Response.Flush();
Response.End();

That's it. When your visitors click the button, the server responds with the requested document. Hope this helps.
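
For anyone not on ASP.NET, the same idea (keep the PDFs outside the web-accessible folder and let a script stream them) can be sketched in Python as a small WSGI application. This is an illustration under assumptions rather than a drop-in script: the `file` query parameter and the `make_app` name are made up for the example, and a real deployment would still need to be wired into the web server.

```python
from pathlib import Path
from urllib.parse import parse_qs


def make_app(docs_dir: Path):
    """Return a WSGI app that streams PDFs from a folder outside the web root."""
    docs_root = docs_dir.resolve()

    def app(environ, start_response):
        query = parse_qs(environ.get("QUERY_STRING", ""))
        name = query.get("file", [""])[0]
        target = (docs_root / name).resolve()

        # Refuse anything that escapes the docs folder, is not a PDF,
        # or does not exist (this blocks requests such as "../secret.pdf").
        allowed = (
            docs_root in target.parents
            and target.suffix.lower() == ".pdf"
            and target.is_file()
        )
        if not allowed:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Not found"]

        data = target.read_bytes()
        start_response(
            "200 OK",
            [
                ("Content-Type", "application/pdf"),
                ("Content-Length", str(len(data))),
            ],
        )
        return [data]

    return app
```

Because no URL maps straight onto the files, the script is the only way in. It is also the natural place to add whatever check you like (a session flag set by the product page, for instance) before the file is sent.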

(in reply to Kitka)
Kitka

 

Posts: 2507
Joined: 1/31/2002
From: Australia
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 2/15/2006 21:27:11   
Whoa, sounds like great stuff, but your suggestion is assuming knowledge that is way above my head!

quote:

I'm assuming you're running ASP.NET;


No, our sites are hosted on Linux/Apache.

All web-accessible files are stored in this folder on the remote server:

/home/username/public_html/

I don't really understand how to create a virtual folder. I have been able to create a folder called pdfs in the root directory (for our account) which is /home/username/ - is that what you meant?

However, I am now all at sea wondering how to call the pdfs contained in that directory from a button on a page. As I don't have access to ASP - what language is most appropriate?

Many thanks for your assistance with this :)


(in reply to sal.scozzari)
Ranger Bob

 

Posts: 4
Joined: 6/6/2006
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 6/6/2006 0:09:07   
Kitka,

I thought this stuff would be over my head but it's not. I had the same problem with people hyperlinking directly to images, PDFs, and such on my website. I also run Apache, and this solved the problem easily enough:

http://ranger-bob.net/?p=423 { More Info }.

Basically,

In your '/{homedir}/pdf' folder, create a file called '.htaccess' and put the rules below in it. Place an appropriate 'leech.gif' image in the root of your website to say that off-site links are not allowed, and include in it a URL where people can find the real download section.

Here is how I set up my .htaccess file. Basically, enter the URLs of the sites you allow to access the content directly.


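# Turn on mod_rewrite for this folder
RewriteEngine On
# Requests that send no referrer at all are let through
RewriteCond %{HTTP_REFERER} !^$
# Requests referred by these sites are let through
RewriteCond %{HTTP_REFERER} !^http://ranger-bob\.net/ [NC]
RewriteCond %{HTTP_REFERER} !^http://lance-taylor\.net/ [NC]
RewriteCond %{HTTP_REFERER} !^http://edmontonobservers\.net/ [NC]
# Never redirect the placeholder image itself
RewriteCond %{REQUEST_URI} !^/leech\.gif [NC]
# Any other request for these file types is redirected to the placeholder
RewriteRule \.(mp4|swf|avi|bmp|mp3|pdf|zip|wmv|mov|gif|jpg)$ http://ranger-bob.net/leech.gif [NC,R,L]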



As a Test -- Try to click on this hyperlinked PDF document to see what happens.

http://ranger-bob.net/eog/Comet.PDF

You will get the 'leech.gif' image displayed instead. Effectively prevents directly hyper-linking to content on your website. Easy.. peasy. If you are still stumped.. jest email me.

Cheers!

Ranger Bob



< Message edited by Ranger Bob -- 6/6/2006 0:23:35 >

(in reply to Kitka)
Ranger Bob

 

Posts: 4
Joined: 6/6/2006
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 6/6/2006 0:18:24   
Oh, now if you want to find out the 'WHO' is hyperlinking aspect.. that was easy with this piece of software. I'm using their 30 day demo right now. Great stuff.. will likely register it soon.

http://www.weblogexpert.com

Easy install, then just point it to the '/Apache/log/access.log' file on your server. Reveals some amazing stuff it does.

In fact, it's how I found this website and your question.. just tracing back some of the URLS that have paid me a visit over time.

(in reply to Ranger Bob)
Kitka

 

Posts: 2507
Joined: 1/31/2002
From: Australia
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 6/6/2006 0:38:21   
Many thanks Bob, but regrettably that method simply doesn't work in this instance. It is great for images, but not PDFs - because frequently, even with links from within the site hosting the file, no referrer is sent when a PDF is requested.

If you check your raw logs, you'll see that I managed to download your four page PDF with ease and no tricks employed. It is titled: "The Discovery of Comet Machholz".

And if you check your stats package, while it should show you my IP address, you will have no idea where I found the link, because there will be no referrer sent.

This was my major problem, and hence the call for assistance.

However, the good news (for me) is that I solved my problem by using MaxMind GeoIP Country lite (which is free) and PHP includes. On the manuals download page I have a PHP conditional script which calls a different include according to the country the visitor's IP address belongs to. Australian and New Zealand visitors get a page that links to the PDFs locally; everyone else gets a page that links to the manufacturers' PDFs on servers elsewhere, wherever I could find them, and I did find 99%. (Thanks Caz for the suggestion! :) )
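
The actual script is PHP and isn't shown in this thread, but the decision it makes can be sketched in a few lines of Python. The names here are illustrative assumptions: `manual_link`, the `/pdfs/` base path and the example manufacturer URL are not from the real site. The country code is assumed to have already been looked up from the visitor's IP address with the GeoIP database, and that lookup is deliberately left out.

```python
from typing import Dict, Optional

# Visitors from these countries get links to the locally hosted PDFs.
LOCAL_COUNTRIES = {"AU", "NZ"}


def manual_link(
    filename: str,
    country_code: str,
    manufacturer_urls: Dict[str, str],
    local_base: str = "/pdfs/",  # hypothetical local path
) -> Optional[str]:
    """Pick the download link for one manual based on the visitor's country.

    Returns None when the visitor is outside the local countries and no
    manufacturer copy of the file is known; the page decides what to show then.
    """
    if country_code.upper() in LOCAL_COUNTRIES:
        return local_base + filename
    return manufacturer_urls.get(filename)


# Example with made-up data:
urls = {"pump-manual.pdf": "http://manufacturer.example/docs/pump-manual.pdf"}
local = manual_link("pump-manual.pdf", "AU", urls)      # "/pdfs/pump-manual.pdf"
remote = manual_link("pump-manual.pdf", "SE", urls)     # the manufacturer URL
```

The effect is that overseas hotlinkers never learn the local file URLs at all, so the local bandwidth is only spent on visitors from the countries the business actually serves.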


(in reply to Ranger Bob)
Ranger Bob

 

Posts: 4
Joined: 6/6/2006
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 6/6/2006 0:47:25   
Hyup, you got me there. I am still trying to figure out how to block all anonymous access (i.e. proxy) to my website also.

216.232.5.247 - - [05/Jun/2006:15:41:36 -0600] "GET /eog/Comet.PDF HTTP/1.1" 200 122880 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:41:43 -0600] "GET /eog/Comet.PDF HTTP/1.1" 200 327680 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:41:48 -0600] "GET /eog/Comet.PDF HTTP/1.1" 206 212992 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:41:48 -0600] "GET /eog/Comet.PDF HTTP/1.1" 206 24576 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:41:49 -0600] "GET /eog/Comet.PDF HTTP/1.1" 206 24576 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:41:49 -0600] "GET /eog/Comet.PDF HTTP/1.1" 206 24576 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:41:50 -0600] "GET /eog/Comet.PDF HTTP/1.1" 206 32768 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:42:01 -0600] "GET /eog/Comet.PDF HTTP/1.1" 206 524288 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"


If the inbound request does not have a referrer.. it gets in. Which is usually the hacker spambots etc. That be my own quest these days.

When you test the link above in this (Anti-Leech) test page you'll see what you should get.
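
Log lines like the ones above are in Apache's combined log format, so the blank-referrer PDF requests can be counted per IP address with a short script. This is a minimal Python sketch, assuming the combined format exactly as shown; the function name is made up, and a different LogFormat on the server would need a different pattern.

```python
import re
from collections import Counter

# One combined-format line: ip, two unused fields, [time], "request",
# status, size, "referrer", "user agent".
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def blank_referrer_pdf_hits(lines):
    """Count PDF requests that arrived without a referrer, per client IP."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # not a combined-format line, ignore it
        no_referrer = match["referer"] in ("", "-")
        is_pdf = match["path"].lower().endswith(".pdf")
        if no_referrer and is_pdf:
            hits[match["ip"]] += 1
    return hits
```

Run over the raw access log (for example `blank_referrer_pdf_hits(open("access.log"))`), this shows which addresses keep fetching PDFs with no referrer. Keep in mind that one download can appear as several lines, because the browser's PDF plug-in fetches the file in ranges (the 206 responses above).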

http://www.xentrik.net/htaccess/linktest.php



< Message edited by Ranger Bob -- 6/6/2006 1:09:02 >

(in reply to Kitka)
Kitka

 

Posts: 2507
Joined: 1/31/2002
From: Australia
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 6/6/2006 1:07:49   
quote:

Interesting indeed.. cause I assumed this line of code was handling the 'no refferal'.

RewriteCond %{HTTP_REFERER} !^$


That line actually specifically "allows" access if there is no referrer sent. The condition only matches requests whose referrer is not empty, so a request with a blank referrer never reaches the RewriteRule. This is 100% necessary because, as I mentioned above, even people clicking on a link to a PDF from within a site often will not send a referrer. Why? I have no idea - referrers are normally sent for html, image, php files etc, but PDFs, favicons and SWF files rarely if ever have a referrer.

If you didn't have that line to allow blank referrers, you'd find you were blocking genuine visitors to your own site from downloading the protected files.
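
To make the logic concrete, here is a small Python sketch that imitates how those RewriteCond lines combine. It is only a model for reasoning about the rule, not how Apache evaluates it internally, and it assumes the request is already for one of the protected file types and is not for leech.gif itself. The function name is made up.

```python
import re

# The sites allowed to link directly, as in the .htaccess example above.
ALLOWED_REFERRERS = [
    r"^http://ranger-bob\.net/",
    r"^http://lance-taylor\.net/",
    r"^http://edmontonobservers\.net/",
]


def is_redirected(referrer: str, allowed=ALLOWED_REFERRERS) -> bool:
    """Return True if the request would be sent to leech.gif."""
    # RewriteCond %{HTTP_REFERER} !^$
    # Holds only when the referrer is NOT empty, so blank referrers pass.
    if referrer == "":
        return False
    # RewriteCond %{HTTP_REFERER} !^http://site/ [NC]   (one per allowed site)
    # Conditions are ANDed: the redirect happens only if no allowed site matches.
    return not any(re.match(p, referrer, re.IGNORECASE) for p in allowed)
```

So `is_redirected("")` is False (the download goes ahead), `is_redirected("http://ranger-bob.net/page.html")` is False, and `is_redirected("http://some-other-site.example/")` is True. That first case is exactly why this kind of rule cannot stop hotlinks that arrive with no referrer.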

quote:

Will see what else I can dig up on the net.. sounds like we both have the same problem to solve.


The best method is as detailed by sal in message 15 above. But I was unable to find a free PHP version of it and I know next to nothing about writing my own scripts.


(in reply to Ranger Bob)
Ranger Bob

 

Posts: 4
Joined: 6/6/2006
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 6/6/2006 1:10:33   
Cool. Will check into it.... thanks! (OBTW, yer pretty quick on the reply too.) :)

And here I was just doing an RTFM.

http://httpd.apache.org/docs/1.3/misc/rewriteguide.html

< Message edited by Ranger Bob -- 6/6/2006 1:18:29 >

(in reply to Kitka)
Kitka

 

Posts: 2507
Joined: 1/31/2002
From: Australia
Status: offline

 
RE: How can I prevent hotlinking to PDF files - 6/6/2006 1:30:09   
quote:

OBTW, yer pretty quick on the reply too.


:)

Good luck in your hunt, and do let us know if you find something worthwhile that deals with the problem. :)


(in reply to Ranger Bob)