Back From a Retreat

Hi everyone,

I just got back from a retreat and found that the daily verses on Facebook had also gone into a ‘retreat’!

The verses and images are back online.  Read on if you want to know the gritty details.

There were three main failures leading to the ‘retreat’ of the verses.

1. New web hosting environment
2. Change in Imgur APIs for public albums
3. Automatic suspension of monitoring services

1. New web hosting environment
I switched to a new web hosting company a few days before going for my retreat.

Mistake #1.

Never make major changes before going on a retreat.  (I also uploaded an update of the Android app back then … leading to app settings not saved etc … another story!)

The old web host was demanding an increase in hosting fees (200% ~ 300%) and it was due in Oct.  With the new hosting company, the hosting fees are ‘locked’ in perpetually.  What this means is that it promises not to hike prices and will let existing customers renew at their sign-on rates.

A part of me secretly thinks that it means they will honour it until they go bust.

Another reason for switching, a technical one, is the frequent unscheduled downtime of the web server.  I am perfectly fine with server downtime for scheduled maintenance or upgrades.  But when the server just goes up and down and has poor response times, it means that this company is signing on more sheep than their barn can hold.

Time to go to another farm.  If I’m going to get fleeced, I want to make sure I get adequate leg-room and good chomp to boot.

Effectively, a hosting company just has to email me before they power down their servers for maintenance, and I’m a happy camper.  I don’t care if it is because they want to upgrade to a shiny new SSD or just to watch customers’ sites go down.  If you keep me posted, that’s good enough.

The catch is that you cannot give advance notice for failures or traffic load.  Not easily anyway.  So a decent web hosting company should do just fine, just not the shady fly-by-night ones.

With the new host, all is shiny and good.  At some point, they decided to enforce a PHP safe_mode lockdown.  Good practice.  Except it breaks many otherwise legitimate apps.

This led to multiple warnings and errors being logged on the server, and it disabled the tweeter code.

Failure #1.

2. Change in Imgur APIs for public albums

Imgur is a very useful image hosting service.  It is mostly free for users to upload and view images online, with the option to go pro (read: cough up the silver!).

Like most online services, it has an API (Application Programming Interface) that lets other mobile, web or desktop apps talk to it and consume the image hosting service.  So instead of having a human being go to its site, copy and paste the image link etc, you get an app to do it.

Effectively, your app talks to the Imgur server app.  But there needs to be a way for them to understand each other.  Apps are pretty dumb, in that you cannot simply get two apps to talk without first establishing how they should talk.  They need a way to Interface, hence Application Programming Interface.  It’s like trade lingo, except it’s for apps, and in this case, specifically for apps to talk to Imgur.
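To make the ‘trade lingo’ idea concrete, here is a minimal sketch of what the two sides have to agree on: the shape of the request URL and the shape of the reply.  Everything in it is made up for illustration (the `images.example.com` endpoint, the `page` parameter and the JSON layout are assumptions, not Imgur’s real API).

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint for illustration only; this is NOT Imgur's real API.
BASE_URL = "https://images.example.com/api/album"


def build_album_request(album_id, page=0):
    """Build the request URL in the format both sides have agreed on."""
    return f"{BASE_URL}/{album_id}?{urlencode({'page': page})}"


def parse_album_response(body):
    """Pull the image links out of a JSON reply with the assumed layout."""
    data = json.loads(body)
    return [item["link"] for item in data["images"]]


# A canned reply standing in for what the server would send back.
sample_body = '{"images": [{"link": "https://images.example.com/a.jpg"}]}'

print(build_album_request("abc123"))
print(parse_album_response(sample_body))
```

If either side changes the URL format or the reply layout without telling the other, the client code above stops working, even though nothing in it was touched.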

Imgur decides how other apps talk to its server apps.  All went well until I think mid or late August or so.  Imgur decided to speak French.  Ok, maybe not French, but a slight change in the URL format.

The bane of consuming web services.  Unlike apps that you find on a CD-ROM (anyone still use that??) and install on your PC, web services / apps can, will and do change.  Sometimes without notice.  Sometimes with notice when you are away on a retreat!!

So, that broke the interface for the image retrieval code.

This error also led to timeouts, leading to the next problem.

Failure #2.
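One way to soften this kind of breakage is to try more than one known URL format and to give up quickly instead of hanging.  This is only a sketch and not the code I actually run: the URL templates below are invented placeholders, and the 5 second timeout is an arbitrary assumption.

```python
import urllib.request

# Hypothetical URL formats for illustration; these are NOT Imgur's real endpoints.
URL_TEMPLATES = [
    "https://images.example.com/v2/album/{album_id}",    # assumed newer format
    "https://images.example.com/album/{album_id}.json",  # assumed older format
]


def http_get(url, timeout):
    """Fetch a URL, giving up after `timeout` seconds instead of hanging."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_album(album_id, fetch=http_get, timeout=5.0):
    """Try each known URL format in turn; return (url, body), or None if all fail."""
    for template in URL_TEMPLATES:
        url = template.format(album_id=album_id)
        try:
            return url, fetch(url, timeout)
        except OSError:
            # URLError, HTTPError and socket timeouts are all OSError subclasses.
            continue
    return None
```

The `fetch` parameter is there so the function can be exercised with a fake fetcher, without touching the network.  A `None` result means every format failed, and the caller can log it and move on rather than time out.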


3. Automatic suspension of monitoring services

The Facebook app that I wrote used to run on a cron service to send out daily verses to users according to their timezone.  One of my previous web hosting companies decided that cron jobs are dangerous and routinely disabled them without notice.  (I was running the cron jobs 24 x 2 times a day: once per hour, plus a second run as backup to complete the job if the earlier one failed somehow.)

So I switched to an external monitoring service.  I piggyback on its site monitoring and get it to check my server app, which effectively calls it 24 x 2 times a day!

So far so good.
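For the curious, the hourly job with a backup run can be sketched roughly like this.  It is a simplified illustration, not the real app: the 8am delivery hour, the whole-hour timezone offsets and the in-memory `sent` set are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def offsets_due(now_utc, delivery_hour=8):
    """Return the whole-hour UTC offsets where the local hour is delivery_hour."""
    return [
        offset
        for offset in range(-12, 15)  # UTC-12 through UTC+14
        if (now_utc.hour + offset) % 24 == delivery_hour
    ]


def run_hourly(now_utc, users, sent, send, delivery_hour=8):
    """Send today's verse to every user whose local hour has come up.

    users: dict of user id -> UTC offset in whole hours (hypothetical layout).
    sent:  set of (user id, local date) pairs already delivered, so that the
           backup run in the same hour does not send anything twice.
    """
    due = set(offsets_due(now_utc, delivery_hour))
    for user_id, offset in users.items():
        if offset not in due:
            continue
        local_date = (now_utc + timedelta(hours=offset)).date()
        key = (user_id, local_date)
        if key not in sent:
            send(user_id)
            sent.add(key)


users = {"alice": 8, "bob": -5}
sent, delivered = set(), []
now = datetime(2024, 1, 1, 0, 30, tzinfo=timezone.utc)

run_hourly(now, users, sent, delivered.append)  # main run
run_hourly(now, users, sent, delivered.append)  # backup run, sends nothing new
print(delivered)
```

Whatever calls the job, cron or a monitoring service, only has to hit it once an hour plus once more as backup; the `sent` record is what makes the second call harmless.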

When the above two failures persisted over a few hours, the monitoring service would suspend that monitor request.  The good thing about this service is that it keeps you notified through email notifications.

The bad thing is that after receiving thousands of notifications, I had set a Gmail filter to mark them as read and archive them.

Also, once a request is suspended, the service no longer sends you any further updates.

Oh, and did I mention that I was on a retreat?

 

So, now everything is back online.

Hmmm … I think there is some Dharma learning to glean from the above … but I’m heading for lunch now.  Why don’t you post your learning below? 🙂