
The price of reliability is the pursuit of simplicity.
Azure is an approach that provides reliable applications instead of trying to provide reliable servers.
Challenges to reliable software:
Challenge - State
Answer
- go horizontal, go stateless
- store state in Azure storage. (Microsoft believes in this so much that they have built it into the defaults.)
- a session state provider is available for Azure.
Challenge - Unreliable Components - you can't trust that a single instance of an app or role will be up all the time. Build the app on the Starbucks model: you don't wait at the cash register for your coffee. You place the order, it gets queued, and you pick it up when it is ready.
Answer
- build loosely coupled
- use Azure queues for messaging
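A minimal sketch of the Starbucks model, assuming an in-process `queue.Queue` as a stand-in for a durable cloud queue. The names `take_order` and `barista` are made up for illustration and are not part of any Azure API. The point is only that the front end accepts work and returns right away while a separate worker drains the queue.

```python
import queue
import threading

orders = queue.Queue()   # stand-in for a durable message queue (assumption)
completed = []           # drinks the worker has finished

def take_order(drink):
    """Cashier: enqueue the order and return immediately."""
    orders.put(drink)
    return "order accepted: " + drink

def barista():
    """Worker: process orders until a None sentinel arrives."""
    while True:
        drink = orders.get()
        if drink is None:
            break
        completed.append(drink)

worker = threading.Thread(target=barista)
worker.start()
receipts = [take_order(d) for d in ("latte", "mocha", "espresso")]
orders.put(None)         # tell the worker to stop
worker.join()
```

Because the cashier never calls the barista directly, either side could be restarted or scaled out without the other noticing, as long as the queue itself survives.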
Challenge - Varying Load
- your app needs to be able to handle varying loads.
Answer
- add a thermostat to your application. Have it change behavior under different loads.
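One way the thermostat idea might look in code. This is a sketch under assumptions: the load signal is a queue backlog, and the thresholds, mode names, and the `essential` flag on a request are all invented for the example.

```python
def choose_mode(queue_depth, busy_threshold=100, overload_threshold=500):
    """Map a load signal (here: backlog size) to a behavior mode."""
    if queue_depth >= overload_threshold:
        return "shed"        # reject optional work outright
    if queue_depth >= busy_threshold:
        return "degraded"    # serve cheaper, approximate results
    return "normal"

def handle_request(request, queue_depth):
    """Pick a response strategy based on the current mode."""
    mode = choose_mode(queue_depth)
    if mode == "shed" and not request["essential"]:
        return {"status": 503, "body": "try again later"}
    if mode in ("shed", "degraded"):
        return {"status": 200, "body": "cached result"}
    return {"status": 200, "body": "fresh result"}
```

Under light load every request gets a fresh result; as the backlog grows the app first falls back to cached answers and then turns away non-essential requests.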
Challenge - Hardware Failures
Answer
- use local storage as a cache.
- retry on transient failures.
- be idempotent, so that a retried operation has the same effect as running it once.
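A rough sketch of retrying plus idempotency, not tied to any particular storage library. `TransientError`, `retry`, and `apply_once` are hypothetical names, and the `processed` dict stands in for durable storage. The retry loop backs off with jitter so that repeated attempts do not hammer a service that is already struggling.

```python
import random
import time

class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, throttling)."""

def retry(operation, attempts=5, base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying on TransientError with capped, jittered backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise                      # out of attempts: surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))

processed = {}   # message id -> result; stand-in for durable storage (assumption)

def apply_once(message_id, amount, balance):
    """Idempotent handler: replaying the same message changes nothing."""
    if message_id not in processed:
        balance["value"] += amount
        processed[message_id] = balance["value"]
    return processed[message_id]
```

Retries are only safe when the operation behind them is idempotent, which is why the two bullets belong together: a message delivered twice to `apply_once` is applied once.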
Don't launch a DoS attack on yourself: aggressive retries from every instance can swamp a service that is already struggling.
Be predictable.
Don't have any code in your shutdown path.
Know when to throttle and shed load to keep your users happy. Twitter turned off its reply feature when it had scalability problems a while back. Queue batch jobs to run when utilization is low.

Build a knob into the application that tailors your app to do what it needs. Some actions need to be exact; others only need to be approximate.
Handling updates to application:
Need to deal with both code and data changes.
Update code and data one at a time. Whichever you update has to be compatible with both versions of the other.
Use version numbers in your table schema.
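A small sketch of what a schema version number buys you. The field names (`SchemaVersion`, `Name`, `FirstName`, `LastName`) and both layouts are invented for the example. The reader accepts rows written by either version, so the code can be deployed before or after the data is migrated.

```python
def read_customer(entity):
    """Normalize a stored row to one shape, whichever schema version wrote it."""
    version = entity.get("SchemaVersion", 1)   # rows without the field count as v1
    if version == 1:
        # Hypothetical v1 layout: a single "Name" field.
        first, _, last = entity["Name"].partition(" ")
        return {"first": first, "last": last}
    if version == 2:
        # Hypothetical v2 layout: separate name fields.
        return {"first": entity["FirstName"], "last": entity["LastName"]}
    raise ValueError("unknown schema version: %r" % version)
```

Failing loudly on an unknown version is a deliberate choice here: silently guessing at a layout the code has never seen is worse than an error.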
Down time during upgrades:
Windows Azure has rolling upgrades.

We will have precise control to roll out updates over update domains.
Use the Pacific Ocean as the time for upgrades: when the sun is over the Pacific, Internet traffic is at its lowest.

The Mars rover needed a patch because it kept rebooting. Funny thought: the farthest-delivered patch.
Use the local Azure environment to debug and test.
Separate code and config.
Use the Azure config service and config update service.
Implement lots of logging!!!
You can turn logging on and off in production by updating the config file.
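A sketch of config-driven logging using Python's standard `logging` module. The JSON config format and the `log_level` key are assumptions for the example, not the Azure config service's actual format. The idea is that a config update calls `apply_config` again and the verbosity changes without a redeploy.

```python
import json
import logging

logger = logging.getLogger("app")

LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}

def apply_config(config_text):
    """Re-read settings and adjust the log level without restarting."""
    settings = json.loads(config_text)
    name = str(settings.get("log_level", "WARNING")).upper()
    level = LEVELS.get(name, logging.WARNING)   # unknown names fall back to WARNING
    logger.setLevel(level)
    return level

apply_config('{"log_level": "warning"}')   # quiet by default
```

Calling `apply_config('{"log_level": "debug"}')` while chasing a production problem turns on verbose output; applying the original config turns it back off.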
Tag data with a unique ID so you can track it across the system.
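A minimal sketch of the tagging idea, assuming messages are plain dicts. The `correlation_id` field and the stage names are made up for illustration. Each stage copies the ID forward and records it, so one search of the logs can reconstruct the full path of a piece of data.

```python
import uuid

def new_message(payload):
    """Stamp a fresh unique ID on data as it enters the system."""
    return {"correlation_id": uuid.uuid4().hex, "payload": payload}

def forward(message, stage, trail):
    """Record the stage against the ID and pass the same ID along."""
    trail.append((message["correlation_id"], stage))
    return {"correlation_id": message["correlation_id"], "payload": message["payload"]}

trail = []   # stand-in for log output (assumption)
msg = new_message("order #1")
msg = forward(msg, "web role", trail)
msg = forward(msg, "queue", trail)
msg = forward(msg, "worker role", trail)
```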
How do you get notified when something bad happens? Azure has an alert system to keep you informed: email, IM, and phone (text message).
The big red button (panic button): build it into your app so that you can roll back to the last working build.

Last bit of advice: KISS