Incident Response Across Non-Software Industries with Emil Stolarsky
Published October 15, 2017
22 min
    Show notes
    What can software learn from industries like aerospace, transportation, or even retail during national disasters? This week's podcast is with Emil Stolarsky and was recorded live after his talk on the subject at Strange Loop 2017. It covers several stories from Emil's research, including the origin of the checklist, how Walmart pushed decision making down to the store level during a national disaster, and where the formalized conversation structure used onboard aircraft originated. The podcast mentions several resources you can turn to if you want to learn more, and wraps up with some of the ways this research is affecting incident response at Shopify.

    Why listen to this podcast:
    * Established industries like aerospace have built a working history of how to resolve issues; it can apply to software issues as well.
    * Crew Resource Management helps teams work together and take ownership of problems that they can solve, instead of relying on a mandated command-and-control structure.
    * Checklists are automation for the brain.
    * Delegating authority to resolve system outages removes bottlenecks in processes that would otherwise need managerial sign-off.
    * When designing an alerting system, make sure it doesn't flood responders with irrelevant alerts and that there is clear observability into what is going wrong.

    More on this: curated and extended show notes are available on the podcast's landing page on InfoQ.