Secure, fast, low-resource file delivery

Here’s the problem. I have a handful of files between 500 MB and 800 MB that I need to host on my website. I need to get these files to members of a certain vBulletin community, and only to members of that community. Furthermore, the people of that community aren’t the sharpest knives in the drawer, so it has to be easy. I have limited resources, which include a shared LAMP webserver and a 150/150 VPS.

The requirements? Just a few:

  • Must be able to track users to sessions to monitor and log abuse
  • Links generated should be anti-leech and secured with one-time use tokens
  • Links generated can only be accessible from this vBulletin community
  • Links must be easy to use

The limitations? It’s like being handcuffed to a telephone pole:

Things I’ve tried that failed to work:

  • Doing the classic PHP readfile() and writing to the buffer
    • Far too slow
    • Persistent PHP session isn’t nice to shared hosting
    • readfile() under both lighttpd and Apache causes the process to consume massive amounts of memory for large files, so it doesn’t work on the VPS
    • More than 5 users or so and the load spikes to high heaven. Server /suicides
  • Using lighttpd and mod_secdownload
    • Far faster than PHP readfile()
    • Will only work on VPS as shared hosting is Apache only
    • If you don’t have enough RAM (which I don’t on the VPS) – the process will explode and hang. I can’t kill it, and the server refuses connections until the watchdog resets the process or until I reboot the VPS. Server /suicides again.
  • Using mod_rewrite in conjunction with rewritemap and prg in Apache
    • Sounds perfect – you can create tokens, and then revoke them with the rewritemap program, then use mod_rewrite to hide path source!
    • …Until you realize that you need to modify your apache.conf to specify the rewritemap program (requires root or sudo, not happening on either VPS or shared)
    • Your rewrite program is executed on server start, and is persistent – which is not allowed on shared hosting anyways. I /suicide
  • Apache with mod_xsendfile
    • Can’t install custom modules on Apache in shared or VPS hosting
    • Apache is still a fat hog of a server, and using Apache to send binary files is about as efficient as using Humvees for mass transit.
  • Dynamic mod_rewrite definitions in .htaccess files
    • Use a manual lockfile with a SQL database to dynamically generate .htaccess mod_rewrite definitions
    • Each rewrite entry for a unique URL has a time limit
    • Each successful request for token updates both the database and the .htaccess file
    • This seems like the best solution – it will work with the most hosting environments too
    • This behavior is basically emulating mod_secdownload for lighttpd in Apache with mod_rewrite
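
The last approach in the list could be sketched roughly as follows. This is only an illustration, written in Python rather than PHP, and every name in it (TOKENS, issue_token, redeem_token, render_htaccess, the /dl/ URL prefix and the /protected/ directory) is made up for the example. It uses random tokens instead of the hash scheme mod_secdownload uses, a dictionary stands in for the SQL table, and the lockfile handling and the actual writing of the .htaccess file are left out.

```python
import secrets
import time

# Hypothetical in-memory stand-in for the SQL table of issued tokens.
# Maps token -> (user_id, relative file path, expiry as a Unix timestamp).
TOKENS = {}


def issue_token(user_id, rel_path, ttl_seconds=300, now=None):
    """Create a random token tied to one user and one file."""
    now = time.time() if now is None else now
    token = secrets.token_hex(16)  # 32 hex characters
    TOKENS[token] = (user_id, rel_path, now + ttl_seconds)
    return token


def redeem_token(token, now=None):
    """Return the file path once, then revoke the token. None if invalid."""
    now = time.time() if now is None else now
    entry = TOKENS.pop(token, None)  # pop makes the token single-use
    if entry is None:
        return None
    _user_id, rel_path, expires = entry
    if now > expires:
        return None
    return rel_path


def render_htaccess(now=None):
    """Build mod_rewrite rules for every token that has not expired."""
    now = time.time() if now is None else now
    lines = ["RewriteEngine On"]
    for token, (_user_id, rel_path, expires) in sorted(TOKENS.items()):
        if expires >= now:
            # Assumes plain file names with no spaces or regex characters.
            lines.append(f"RewriteRule ^dl/{token}$ /protected/{rel_path} [L]")
    return "\n".join(lines) + "\n"
```

Keeping the user ID next to each token is what would let abuse be traced back to a forum account. A real version would regenerate the .htaccess file from the database after every issue or redeem, under the lock.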

The last entry (emulating mod_secdownload in Apache with mod_rewrite) looks good, and is probably what I’m going to use – I’ll be posting the results here as soon as I finish coding it. Weeeee!

    Continue Reading

    Fears of swine flu worse than swine flu itself

    Why is everyone absolutely going nuts? Each day, 2700 people die of malaria, but it doesn’t make national news. To date, fewer than 30 people have died of swine flu, oh, sorry, H1N1, and out of that 30 or so, only one has died in the US, and it wasn’t a US citizen to begin with. It was an infant that came from Mexico.

    Continue Reading

    Stats on Dreamhost

    If you’re running a CMS with dynamic redirection using mod_rewrite on DreamHost, you may run into a problem trying to access your statistics – since the .htaccess file specifies files and links, and not server configuration redirects like the /stats/ directory, you can be in a bit of trouble. When you try to access the /stats/ virtual directory, the url will get rewritten, your CMS will get confused, and you will get an error instead of seeing your precious website statistics. I had been using Google Analytics, but that requires that your website’s code be loaded in order for the stats to be generated. This does not catch hotlinking or deeplinking into images or other media. 

    I had been searching for weeks on how to write a mod_rewrite exception, but anyone that knows mod_rewrite code knows it will make you want to jump off a cliff. It’s honestly almost as bad as writing a config file for sendmail or bind before they introduced the structured layout.

    Today I had a thought: since this problem is nearly unique to DreamHost, why not search there for potential solutions? Ten minutes of digging in the DreamHost wiki and I had my solution – Making stats accessible with htaccess – this simple rule, inserted before your permalink rules, catches requests for the /stats/ virtual directory and exempts them from rewriting:

    RewriteEngine On
    RewriteBase /
    RewriteCond %{REQUEST_URI} ^/(stats|failed_auth\.html).*$ [NC]
    RewriteRule . - [L]
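
To see which URLs the condition above actually catches, the same regular expression can be tried out in a few lines of Python. This is just a quick check, not part of the DreamHost recipe; the function name bypasses_cms and the sample URLs are made up, and re.IGNORECASE stands in for the [NC] flag.

```python
import re

# Same pattern as the RewriteCond above; re.IGNORECASE mirrors the [NC] flag.
STATS_EXCEPTION = re.compile(r"^/(stats|failed_auth\.html).*$", re.IGNORECASE)


def bypasses_cms(request_uri):
    """True if the URI matches the condition and would skip later rewrite rules."""
    return STATS_EXCEPTION.match(request_uri) is not None


for uri in ["/stats/", "/Stats/index.html", "/failed_auth.html",
            "/statsfoo", "/blog/stats/"]:
    print(uri, bypasses_cms(uri))
```

Note that the pattern is anchored only at the start, so a path like /statsfoo is exempted as well, while /blog/stats/ still goes to the CMS.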

    Continue Reading

    Dreamhost Site Speeding-Up-Ness

    For those of you on shared hosting with Dreamhost – we all know they’re oversold up to their eyeballs. While the prices are great and the proposed services offered are even greater, their servers are overloaded and very slow. Very very slow. I took a dive into their wiki today and found this gem: PHP FastCGI 

    After a few seconds of careful mucking in the .htaccess file, and writing a new spawn script for the PHP process, I have sped up a few of my websites by a noticeable amount. WordPress wasn’t exactly the most efficient CMS either, so this in combination with WP-SuperCache has made the loading times…liveable…for now.

    Normally, the Apache server spawns a child process for each request on the website (overly simplified) – each one of those child processes then reads your PHP code and makes the HTML output that you see on the website. It works great, except when you have eleventy billion people on your website, or in my case, an overloaded web server. The process of spawning a child process, then forcing that child process to interpret its own PHP code, is very resource intensive. So what do we do to fix it?

    The hack above makes two discrete child processes that process PHP code. Those two are it – which is great because instead of spawning, starting, processing, stopping, and despawning each time a visitor hits the page, you just do the processing, which makes things go much quicker. The downside is the website may not scale well to unexpected spikes in traffic, like a Digg or Slashdot. But I have that covered as well – chances are the content you see on my site is all static HTML. There is a script that runs in the background and runs all the PHP code beforehand to generate the pages. When a visitor comes along, all that is served is the static HTML page with no hits to the PHP interpreter. Only when the content changes does the PHP get hit again. Great for now, but I hope to get a more powerful server in the future.

    Continue Reading

    Nintendo DS and Access Point Configuration

    If you have a Nintendo DS/DS Lite/DSi and you are trying to connect to your access point, you may have a problem where the access point shows up in the list but you cannot connect to it. Look in the advanced settings for your access point for an option labeled Basic Rate, and make sure it is set to 1-2Mbps!

    Continue Reading

    SEO By Hand With WordPress

    For the sites I have spawned (GenesisDriven.com, AndrewPeng.net, Over-Boost.net), I’ve been reading up on how to do SEO (Search Engine Optimization) to make it faster and easier for Google and other search engines to index my sites and make them searchable. According to this page:

    Display a three-digit number. The URL for each article must contain a unique number consisting of at least three digits. For example, we can’t crawl an article with this URL: http://www.google.com/news/article23.html. We can, however, crawl an article with this URL: http://www.google.com/news/article234.html. Keep in mind that if the only number in the article consists of an isolated four-digit number that resembles a year, such as http://www.google.com/news/article2006.html, we won’t be able to crawl it. 

    And so I set about setting the permalink structure for mod_rewrite to this structure:

    /posts/%year%/%monthnum%/%day%%post_id%-%postname%.html

    Which is great, because WordPress will automatically redirect any of the standard permalink structures to your custom one, so people linking in will still have their links work. Content like pictures still uses the same link structure, so that’s not anything to worry about. Notice that I prefixed %post_id% with %day% – this allows early posts whose post IDs have fewer than three digits to get indexed too.
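
A quick way to check the digit count is to fill in the structure by hand. The sketch below is in Python, not WordPress code; the permalink function and the sample post are made up, and it assumes WordPress pads %monthnum% and %day% to two digits.

```python
from datetime import date


def permalink(post_date, post_id, postname):
    """Fill in /posts/%year%/%monthnum%/%day%%post_id%-%postname%.html."""
    return (
        f"/posts/{post_date.year}/{post_date.month:02d}/"
        f"{post_date.day:02d}{post_id}-{postname}.html"
    )


# Post ID 7 written on the 5th becomes "057", which is three digits.
print(permalink(date(2009, 4, 5), 7, "hello-world"))
# prints /posts/2009/04/057-hello-world.html
```

With a two-digit day in front, even a single-digit post ID yields a number of at least three digits.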

    Continue Reading

    You’re Gonna Love My Nuts

    Who is this guy? And where does he get all his energy from? I get the impression that he only has this kind of energy when he’s filming; he probably spends the rest of the day lying in a drunken stupor on the floor of the bar throwing bottles at people. But wait, no he isn’t; he’s actually one of the good guys – a victim of Scientology. No, I’m not making this up – I read the story of how Scientology ruined his life here after a friend suggested it in #hardocp. Read it – and get to brush up on Vince Offer, his real name, and how Scientology ruined his life. No. I’m not kidding. Link here.

    Turns out, Headset Vince is actually one of the good guys, a hero. We need to stop mocking him and start celebrating him. And we need to buy his towels and nut choppers and his DVDs.

    Continue Reading

    Making coffee with the Aerobie AeroPress, Among Other Things

     

    The result of being unbelievably bored. Video taken with a Logitech Quickcam Pro 9000, heavily post-processed with VirtualDub. Interesting story: I was looking for a camera replacement for my aging Quickcam Messenger. One night the whole family takes a trip out to do some shopping. I decide to go to the Circuit City to check out some of the closeout deals. They still had absolutely insane prices: $60 for an EverQuest expansion that came out nearly 6 years ago? $650 for a computer in “As-is” condition with missing parts? No thanks, I go next door to the Target. What do I see? A clearance boxed Logitech Quickcam Pro 9000 sitting on the shelf for the incredible price of $49.99 – at least $20 cheaper than online, and no shipping costs either. According to Cowboy Frank’s Webcam Review site, this was one of the best webcams on the market right now. I bought it. Screw you, Circuit City.

    Continue Reading