
When you run a big online service (big in terms of code, in this example) there are always a number of bugs. Most of them are minor issues that don’t affect the functionality of the service. For example, on the Secure Delivery forums I know there is a slight table misalignment in the forum navigation bar that puts it 10px too far to the left. Bugs like this hurt no one, and to be honest I don’t think most people would even notice them.
Every now and then one of our users will uncover what we call a “show stopper”: a critical bug that immediately stops them from being able to use the service. These are the most severe sort of bug. They are the ones that lead to canceled accounts if you don’t work quickly and decisively to fix them.
Luckily, we have had very few show stopper bugs. I think 3 or 4 have been reported in total, and all were fixed within 24 hours. Typically, when we confirm a show stopper bug that prevents a subscriber from using the service, we refund their entire month of subscription fees as a way of apologizing for the inconvenience and thanking them for working with us to resolve the issue. This seems to work well, and our attrition rate among people who report errors is very low.
Unfortunately, every now and then you get a show stopper bug that you are unable to reproduce. “The Bug From Hell” (TBFH) was one of these bugs. Large product uploads were not being saved to the product database, even though the user got the proper feedback and was allowed to complete the setup process.
In Secure Delivery each product has a little icon that shows whether it is ready to be sold. Most products have a green check mark under the status column (a “green light”) and are 100% good to go. If there is a problem during product creation, Secure Delivery flags the product with a yellow exclamation point icon, or what we call a “yellow light”. This shows the user that the product is not ready to be sold. It usually results from bailing out of the product creation process before entering all the required information, losing connectivity to the Internet during product setup, etc.
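To make the green light / yellow light rule concrete, here is a minimal sketch in Python. It is only an illustration of the rule as described above, not Secure Delivery’s actual code: the `Product` class and the `has_required_info` and `upload_saved` fields are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    GREEN = "green light"    # ready to be sold
    YELLOW = "yellow light"  # something went wrong during product creation


@dataclass
class Product:
    # Hypothetical fields, assumed for illustration only.
    name: str
    has_required_info: bool  # user finished entering the required information
    upload_saved: bool       # the uploaded file actually reached the database


def readiness(product: Product) -> Status:
    """Return the status icon a product would show under this sketch."""
    if product.has_required_info and product.upload_saved:
        return Status.GREEN
    return Status.YELLOW
```

Under these assumptions, a product whose setup was completed but whose upload never reached the database would still get a yellow light.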
Well, TBFH failed to save the upload, which caused a “yellow light” to show up for that product’s status. Let me tell you, the most maddening thing a customer can ever see after spending 30 minutes uploading their big-ass product is a yellow light when they think they are done.
TBFH only affected the 0.3% of our users who uploaded big products, and it was completely random at that. By big uploads I mean in excess of 90MB, when our average product size is somewhere around 12MB. Sometimes everything worked great (we have 700MB ISO files successfully stored in our database), and sometimes it would just grind the server to a halt. Unfortunately, that 0.3% were also paying customers, so the result was usually a peeved person who was paying for Secure Delivery (free accounts are limited to 50MB products, so they could never experience the error).
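The reason free accounts could never hit the bug is plain arithmetic on the two sizes mentioned above, and it can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes that paid accounts have no size cap, which is a simplification for this example rather than a stated fact.

```python
FREE_LIMIT_MB = 50   # free accounts are limited to 50MB products
BIG_UPLOAD_MB = 90   # "big" uploads are those in excess of 90MB


def upload_allowed(size_mb: float, paid: bool) -> bool:
    # Assumption: paid accounts have no size cap in this sketch.
    return paid or size_mb <= FREE_LIMIT_MB


def in_tbfh_range(size_mb: float, paid: bool) -> bool:
    """True if an upload is both allowed and big enough to risk the bug."""
    return upload_allowed(size_mb, paid) and size_mb > BIG_UPLOAD_MB
```

Because 50 is less than 90, no upload a free account is allowed to make can fall in the affected range, while something like a 700MB ISO from a paying customer can.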
The worst thing about TBFH was that we knew it existed, but until now we were unable to reproduce it, let alone fix it. Long story short, last night we were finally able to identify the cause of the bug and fix it after a subscriber reported the issue to us. The fix came down to tweaking a couple of files, and we have been busy testing it this morning. Shortly we will push the bug fix to production and update the customer on its status (we don’t typically push updates at night, because we like to be awake if something breaks).
On the plus side, the tweak made to correct The Bug From Hell will improve overall performance for everyone. It’s not often that there is something wrong with Secure Delivery that we can’t fix, even if it affects only a small minority of our users. I’ve actually lost sleep over this one, so let me tell you, after testing the fix last night I slept like a baby :)
In an hour or so TBFH will be history and I feel great!


Woohoo! That’s great to hear :D I’ve always loved a happy ending…
Hi Jason,
It’s been a while! I love the pic you used for the article… very fitting! Glad to hear that you got the bug fixed and saved the day! Talk to you soon,
Luc