Kitka
Posts: 2507 Joined: 1/31/2002 From: Australia Status: offline
|
How can I prevent hotlinking to PDF files - 1/27/2006 0:01:36
One site of ours contains hundreds of brochures and hefty user manuals (all PDFs) intended mainly for clients' use. I have the directories disallowed in robots.txt, which keeps them out of the search engines and so deters the casual surfer, but it seems that someone somewhere (I can't find out who) is hotlinking to many of them and costing us bandwidth. All of the requests arrive with no referrer, which seems to be standard for PDF downloads, even when the user downloads from a page in our site. So I can't ban the "no-referrer" requests via .htaccess without also blocking our own visitors. As a result, outsiders get the PDFs and never see any page in our site. Does anyone have any suggestions as to how to protect the files from hotlinking, but still enable people legitimately visiting our site to download them with little difficulty?
< Message edited by Kitka -- 1/27/2006 0:59:24 >
_____________________________
Kitka **It is impossible to make anything foolproof because fools are so ingenious.**
|
|
|
|
Kitka
Posts: 2507 Joined: 1/31/2002 From: Australia Status: offline
|
RE: How can I prevent hotlinking to PDF files - 1/27/2006 5:27:24
quote:
Or is that making things too complicated?
Mmnn, not complicated, but maybe clarification is in order. Our client hires out a broad range of very specialised and expensive equipment. Potential hirers need access to brochures (to decide if the equipment meets their needs) and subsequently user manuals (to know how to use it). Our client wants all brochures and manuals for both current and ex-hire equipment available, as he feels it enhances his business's reputation. He doesn't understand that people can access them at his cost without ever being aware of his business or services.
Brochures and manuals need to be easily available to genuine visitors to the site, but not hotlinked from sites unknown. Clients are not static; they vary from day to day. A visitor login would be fine, but it needs to be generic, e.g. User: Guest, Password: Anon. But how would I implement this so that it admits only visitors who access the files from our site, as opposed to from someone else's site? In other words, how do I force them to be aware of the company providing the manuals and brochures, and prevent downloads that do not originate from a product page in our site?
<edit> Hosted on Apache/Linux, not Windows </edit>
< Message edited by Kitka -- 1/27/2006 5:40:20 >
|
|
|
|
Nicole
Posts: 2802 Joined: 9/15/2004 From: Nambucca / Kempsey, Australia Status: offline
|
RE: How can I prevent hotlinking to PDF files - 1/27/2006 6:02:54
Kitka, I don't know the answer to your questions, but have you considered searching for specific unique key phrases or document titles to try to find out who it might be? I also wondered whether Copyscape could be used to see who might be plagiarising any document titles or key phrases. Hope that helps. Nicole
_____________________________
Nambucca Valley & Kempsey Web Design | NixDesign
|
|
|
|
caz
Posts: 3470 Joined: 10/10/2001 From: Somewhere south of Chester, UK Status: offline
|
RE: How can I prevent hotlinking to PDF files - 1/27/2006 7:14:44
quote:
what is to stop the hotlinkers from displaying that password on their site?
You could keep changing the passwords, but I know that is a lot of hassle for you. Have you tried looking in the Adobe User to User forums for answers to this?
_____________________________
Do not meddle in the affairs of cats, for they are subtle and will dance, or more on your keyboard. Cheshire cat. www.doracat.co.uk I remember when it took less than 4hrs to fly across the Atlantic.
|
|
|
|
jeepless
Posts: 213 Joined: 12/20/2003 From: Smack in the middle of USA Status: offline
|
RE: How can I prevent hotlinking to PDF files - 1/27/2006 10:22:23
One solution I've often seen used to discourage hotlinking of any files is to rename the file in question, then substitute a bogus file that uses the old file name. So your original file called "my.pdf" might become "onlymy.pdf", and in place of the old file you add another PDF file called "my.pdf". This new file could be a single-page PDF containing, in big bold letters, "This document was stolen from XYZ Company", or it might include the actual link to the correct file. Then when the other site hotlinks to "my.pdf", their visitors will get the bogus file with the stolen message or a link to the correct file. Chances are good it will take some time before the other website realizes what you did.
You could also just rename your current file so their visitors get a broken link, and that would save the bandwidth, but it's likely the other website will catch on rather soon. Not breaking their link may very well "hide" what you did for quite a while. It's a game of cat-and-mouse, but it works. Hope that helps...
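With hundreds of files, the rename-and-substitute trick would be tedious by hand, so here is a rough Python sketch of how it could be scripted. The function name, the "only" prefix (borrowed from the "onlymy.pdf" example) and the folder layout are assumptions for illustration, and the decoy PDF is assumed to live outside the folder being processed. Links on your own pages would still need updating to the new names.

```python
import shutil
from pathlib import Path


def swap_in_decoys(pdf_dir: Path, decoy: Path, prefix: str = "only") -> dict:
    """Rename each PDF to prefix + name and copy the decoy into the old name.

    Returns a mapping of old file name -> new file name so the links on
    your own pages can be updated to match.
    """
    renamed = {}
    # sorted() takes a snapshot, so files created in the loop are not revisited.
    for pdf in sorted(pdf_dir.glob("*.pdf")):
        new_path = pdf.with_name(prefix + pdf.name)
        # Skip files that were renamed on an earlier run, and decoys whose
        # real counterpart already exists, so nothing gets overwritten.
        if pdf.name.startswith(prefix) or new_path.exists():
            continue
        pdf.rename(new_path)         # the real document moves to the new name
        shutil.copyfile(decoy, pdf)  # the decoy now answers at the old URL
        renamed[pdf.name] = new_path.name
    return renamed
```

Note that any genuine file whose name already starts with the prefix is skipped, so pick a prefix that none of the real files use.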
_____________________________
The problem with designing a system that's foolproof is that designers underestimate complete fools.
|
|
|
|
caz
Posts: 3470 Joined: 10/10/2001 From: Somewhere south of Chester, UK Status: offline
|
RE: How can I prevent hotlinking to PDF files - 1/27/2006 10:29:15
I think that Kitka has rather a lot of files to work with. All the same, that's an idea.
|
|
|
|
sal.scozzari
Posts: 4 Joined: 2/14/2006 Status: offline
|
RE: How can I prevent hotlinking to PDF files - 2/14/2006 17:10:12
A little chunk of server code might do the trick. Replace your hotlinks with link buttons, and store the PDF files somewhere inaccessible from outside. When the button is clicked, the server explicitly loads the PDF file and streams it into the response. The files can reside on some folder outside the virtual folder hierarchy, or better yet in a database. Effectively, there is no URL that explicitly resolves to any of your docs. Hope this helps.
|
|
|
|
Kitka
Posts: 2507 Joined: 1/31/2002 From: Australia Status: offline
|
RE: How can I prevent hotlinking to PDF files - 2/14/2006 17:25:47
Hi Sal, Welcome to OutFront! Your idea sounds wonderful, except I don't understand how to implement it.
quote:
store the PDF files somewhere inaccessible from outside
How do I do that?
quote:
Effectively, there is no URL that explicitly resolves to any of your docs.
If there is no URL, how do I link the button to the PDF? Could you give an example of the coding please?
|
|
|
|
sal.scozzari
Posts: 4 Joined: 2/14/2006 Status: offline
|
RE: How can I prevent hotlinking to PDF files - 2/15/2006 11:15:01
I'm assuming you're running ASP.NET; if you're not, then hopefully these suggestions will translate to whatever development environment you're using.
Let's say your server is hosting your web site "http://superwebsite" in the virtual folder "C:\Inetpub\wwwroot\superwebsite". You could store your docs in "C:\superwebsite\Docs". That folder sits outside the virtual folder, so there is no way to navigate to it and no direct URL that references it.
So now no-one can get at these documents directly, but you want authentic visitors to your site to get at them. Instead of using hyperlinks (or whatever control generates the <a href...> tag), place a button on your web page. In the click event for the button, do something like this:
// Needs "using System.IO;" at the top of the code-behind file
// Read the file into a buffer
string sf = @"C:\superwebsite\Docs\supermanual.pdf";
FileStream fs = new FileStream(sf, FileMode.Open, FileAccess.Read);
BinaryReader br = new BinaryReader(fs);
byte[] df = br.ReadBytes((int)fs.Length);
br.Close();
fs.Close();
// Transmit the buffer in the response
Response.Expires = 0;
Response.Buffer = true;
Response.ClearContent();
Response.ClearHeaders();
Response.ContentType = "application/pdf";
Response.BinaryWrite(df);
Response.Flush();
Response.End();
That's it. When your visitors click the button, the server responds with the requested document. Hope this helps.
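For anyone not on ASP.NET, the same idea can be sketched in other environments. Below is a minimal Python version using only the standard library's WSGI support, offered as an illustration rather than a drop-in solution. The "/download/" URL prefix, the function names and the docs folder location are all assumptions made up for the example.

```python
from pathlib import Path
from typing import Optional
from wsgiref.util import FileWrapper


def resolve_document(name: str, root: Path) -> Optional[Path]:
    """Map a requested name to a PDF under root, or None if it is not allowed."""
    root = root.resolve()
    candidate = (root / name).resolve()
    # Reject anything that escapes the docs folder, e.g. "../../etc/passwd".
    if root not in candidate.parents:
        return None
    if candidate.suffix.lower() != ".pdf" or not candidate.is_file():
        return None
    return candidate


def make_app(docs_root: Path):
    """Build a WSGI app that streams PDFs stored outside the web root."""
    prefix = "/download/"  # hypothetical URL prefix for this sketch

    def app(environ, start_response):
        path_info = environ.get("PATH_INFO", "")
        path = None
        if path_info.startswith(prefix):
            path = resolve_document(path_info[len(prefix):], docs_root)
        if path is None:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Not found"]
        start_response("200 OK", [
            ("Content-Type", "application/pdf"),
            ("Content-Length", str(path.stat().st_size)),
        ])
        # Stream the file in chunks instead of reading it all into memory.
        return FileWrapper(path.open("rb"))

    return app
```

One caveat with this sketch: the "/download/..." address is itself a URL that someone could link to. The gain is that every request now passes through your own code, so the handler is the place to add whatever check you settle on (for example, a cookie set by the product page) before the file is sent.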
|
|
|
|
Ranger Bob
Posts: 4 Joined: 6/6/2006 Status: offline
|
RE: How can I prevent hotlinking to PDF files - 6/6/2006 0:09:07
Kitka, I thought this stuff would be over my head, but it's not. I had the same problem with people hyperlinking directly to images, PDFs and such on my website. I also run Apache, and I solved the problem easily enough; details here: http://ranger-bob.net/?p=423 { More Info }. Basically, in your '/{homedir}/pdf' folder, create a file called '.htaccess' and put the rules below in it. Place an appropriate 'leech.gif' image in the root of your website to say that off-site links are not allowed, and include a URL showing where to find the real download section. Here is how I set up my .htaccess file. Basically, enter the URLs of the sites you allow to access the content directly.
RewriteEngine On
# Let requests with an empty referrer through
RewriteCond %{HTTP_REFERER} !^$
# Let requests coming from these sites through
RewriteCond %{HTTP_REFERER} !^http://ranger-bob\.net/ [NC]
RewriteCond %{HTTP_REFERER} !^http://lance-taylor\.net/ [NC]
RewriteCond %{HTTP_REFERER} !^http://edmontonobservers\.net/ [NC]
# Never redirect the warning image itself
RewriteCond %{REQUEST_URI} !^/leech\.gif [NC]
# Any other request for these file types is redirected to the warning image
RewriteRule \.(mp4|swf|avi|bmp|mp3|pdf|zip|wmv|mov|gif|jpg)$ http://ranger-bob.net/leech.gif [NC,R,L]
As a test, try clicking this hyperlinked PDF document to see what happens: http://ranger-bob.net/eog/Comet.PDF You should get the 'leech.gif' image displayed instead. This effectively prevents direct hyperlinking to content on your website. Easy peasy. If you are still stumped, just email me. Cheers! Ranger Bob
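To make it easier to see which requests a rule chain like this catches, here is a small Python model of the conditions. It is only an approximation written for illustration (the function name is made up, and it matches extensions case-insensitively), not something Apache itself runs.

```python
import re

# Sites allowed to link directly, mirroring the RewriteCond lines.
ALLOWED_REFERERS = [
    re.compile(r"^http://ranger-bob\.net/", re.I),
    re.compile(r"^http://lance-taylor\.net/", re.I),
    re.compile(r"^http://edmontonobservers\.net/", re.I),
]
PROTECTED = re.compile(r"\.(mp4|swf|avi|bmp|mp3|pdf|zip|wmv|mov|gif|jpg)$", re.I)


def is_redirected(uri: str, referer: str) -> bool:
    """Return True if the request would be sent to leech.gif."""
    if referer == "":
        return False  # RewriteCond %{HTTP_REFERER} !^$ fails, so the rule is skipped
    if any(p.search(referer) for p in ALLOWED_REFERERS):
        return False  # request came from an allowed site
    if re.match(r"^/leech\.gif", uri, re.I):
        return False  # never redirect the warning image itself
    return bool(PROTECTED.search(uri))
```

Calling is_redirected("/eog/Comet.PDF", "http://example.com/page") gives True, while the same request with an empty referrer gives False, so anything that arrives without a referrer is served as normal.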
< Message edited by Ranger Bob -- 6/6/2006 0:23:35 >
|
|
|
|
Ranger Bob
Posts: 4 Joined: 6/6/2006 Status: offline
|
RE: How can I prevent hotlinking to PDF files - 6/6/2006 0:18:24
Oh, and if you want to find out WHO is hyperlinking, that was easy with this piece of software. I'm using their 30-day demo right now. Great stuff, and I'll likely register it soon. http://www.weblogexpert.com It is an easy install; then just point it at the '/Apache/log/access.log' file on your server. It reveals some amazing stuff. In fact, it's how I found this website and your question, just by tracing back some of the URLs that have paid me a visit over time.
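If you would rather poke at the raw log yourself, a few lines of script can do a rough version of the same tally. The Python sketch below assumes the log is in Apache's "combined" format (the one that records referrer and user agent); the function name is made up for the example.

```python
import re
from collections import Counter

# Combined format: host ident user [time] "request" status bytes "referer" "agent"
LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "[^"]*"'
)


def pdf_referers(lines):
    """Count how often each referrer value appears on requests for PDF files."""
    counts = Counter()
    for line in lines:
        m = LINE.match(line)
        if m is None:
            continue  # not a combined-format line; ignore it
        path = m.group("path").split("?", 1)[0]  # drop any query string
        if path.lower().endswith(".pdf"):
            counts[m.group("referer")] += 1
    return counts
```

Typical use would be something like pdf_referers(open("access.log")) followed by .most_common(20). A referrer of "-" means none was sent, so those requests cannot be traced back this way.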
|
|
|
|
Kitka
Posts: 2507 Joined: 1/31/2002 From: Australia Status: offline
|
RE: How can I prevent hotlinking to PDF files - 6/6/2006 0:38:21
Many thanks Bob, but regrettably that method simply doesn't work in this instance. It is great for images, but not PDFs, because frequently no referrer is sent when a PDF is requested, even with links from within the site hosting the file. If you check your raw logs, you'll see that I managed to download your four-page PDF with ease and no tricks employed. It is titled "The Discovery of Comet Machholz". And if you check your stats package, it should show you my IP address, but you will have no idea where I found the link, because no referrer was sent. This was my major problem, and hence the call for assistance.
However, the good news (for me) is that I solved my problem by using MaxMind's free GeoLite Country database and PHP includes. On the manuals download page I have a PHP conditional script which calls a different include according to the country the visitor's IP address belongs to. So I serve a page that links to the PDFs locally for Australian and New Zealand visitors, and a different one that links to the manufacturer's PDFs on servers elsewhere, where I could find them (and I did find 99%). (Thanks Caz for the suggestion!)
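For anyone wanting to try something similar, the selection logic might look roughly like the Python sketch below. This is not the actual script from the site, which is in PHP. The function name, the "/manuals/" path and the idea of returning nothing when no manufacturer copy is known are assumptions for illustration. The country code is assumed to have been looked up already from the visitor's IP address.

```python
from typing import Dict, Optional

LOCAL_COUNTRIES = {"AU", "NZ"}  # visitors who get the locally hosted PDFs


def manual_link(filename: str, country_code: str,
                manufacturer_urls: Dict[str, str],
                local_base: str = "/manuals/") -> Optional[str]:
    """Pick the download link to show for one manual.

    country_code is the two-letter code already obtained from a GeoIP
    lookup; manufacturer_urls maps file names to the manufacturer's copy.
    """
    if country_code.upper() in LOCAL_COUNTRIES:
        return local_base + filename
    # Elsewhere: point at the manufacturer's copy if one was found.
    # Returning None when there is none is an assumption of this sketch.
    return manufacturer_urls.get(filename)
```

The design choice is that overseas visitors never receive a URL on your own server, so a link copied from what they see costs you no bandwidth.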
|
|
|
|
Ranger Bob
Posts: 4 Joined: 6/6/2006 Status: offline
|
RE: How can I prevent hotlinking to PDF files - 6/6/2006 0:47:25
Hyup, you got me there. I am still trying to figure out how to block all anonymous access (i.e. proxy) to my website also.
216.232.5.247 - - [05/Jun/2006:15:41:36 -0600] "GET /eog/Comet.PDF HTTP/1.1" 200 122880 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:41:43 -0600] "GET /eog/Comet.PDF HTTP/1.1" 200 327680 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:41:48 -0600] "GET /eog/Comet.PDF HTTP/1.1" 206 212992 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:41:48 -0600] "GET /eog/Comet.PDF HTTP/1.1" 206 24576 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:41:49 -0600] "GET /eog/Comet.PDF HTTP/1.1" 206 24576 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:41:49 -0600] "GET /eog/Comet.PDF HTTP/1.1" 206 24576 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:41:50 -0600] "GET /eog/Comet.PDF HTTP/1.1" 206 32768 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
216.232.5.247 - - [05/Jun/2006:15:42:01 -0600] "GET /eog/Comet.PDF HTTP/1.1" 206 524288 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
If the inbound request does not carry a referrer, it gets in, and that usually means the hacker spambots etc. Stopping those is my own quest these days. If you test the link above at this URL (an anti-leech test), you'll see what you should get: http://www.xentrik.net/htaccess/linktest.php
< Message edited by Ranger Bob -- 6/6/2006 1:09:02 >
|
|
|
|
Kitka
Posts: 2507 Joined: 1/31/2002 From: Australia Status: offline
|
RE: How can I prevent hotlinking to PDF files - 6/6/2006 1:07:49
quote:
Interesting indeed.. cause I assumed this line of code was handling the 'no referral'. RewriteCond %{HTTP_REFERER} !^$
That line actually specifically "allows" access if no referrer is sent. This is 100% necessary because of the fact I mentioned above: even people clicking on a link to a PDF from within a site often will not send a referrer. Why? I have no idea. Referrers are normally sent for HTML, image, PHP files etc., but requests for PDFs, favicons and SWF files rarely if ever have one. If you didn't have that line to allow blank referrers, you'd find you were blocking genuine visitors to your own site from downloading the protected files.
quote:
Will see what else I can dig up on the net.. sounds like we both have the same problem to solve.
The best method is the one detailed by sal in message 15 above. But I was unable to find a free PHP version of it, and I know next to nothing about writing my own scripts.