2020-002-Liz Fong-Jones discusses blog post about Honeycomb.io Incident Response
Published January 23, 2020 | 36 min
    Show notes

    Ms. Berlin's appearance on #misec podcast - https://www.youtube.com/watch?v=Cj2IF0zn_BE with @kentgruber and @quantissIA

    Blog post: 

    https://www.honeycomb.io/blog/incident-report-running-dry-on-memory-without-noticing/

     

    What is Honeycomb.io?

    From the site: 

    “Honeycomb is a tool for introspecting and interrogating your production systems. We can gather data from any source—from your clients (mobile, IoT, browsers), vendored software, or your own code. Single-node debugging tools miss crucial details in a world where infrastructure is dynamic and ephemeral. Honeycomb is a new type of tool, designed and evolved to meet the real needs of platforms, microservices, serverless apps, and complex systems.”

     

    What are SLOs and how do you establish them? Are they anything like SLA (Service level agreements)?

     

    Can you give us an idea of timeline? Length of time from issue to IR to resolution? 



    Are the dashboards mentioned in the blog post your operations dashboards?

    [nope! hashtag no-dashboards]

     

    Leading and lagging indicators (IT and infosec call them detection and mitigation indicators)

        https://kpilibrary.com/topics/lagging-and-leading-indicators

     

    How important is telemetry (or meta-telemetry, since it’s telemetry on telemetry, if I’m reading it right --brbr) in making sure you can understand issues?

     

    Do you have levels of escalation? How do you define those?

     

    When you declared an emergency, how did brainstorming help with addressing the issues? Did that help your org see the way to a proper fix?

        Did you follow any specific methodology? Did you have a war room or web conference?

       

     

    Communications:

    https://twitter.com/lizthegrey/status/1192036833812717568

     

    Can being overly transparent be detrimental?

     

    Communication methods in an IR:

        Slack

        Phone Tree

        Ticket system

        Emails

       

        What does escalation look like for Ms. Berlin? Mr. Boettcher? (stories or examples?)

     

    Confirmation bias (or “it’s never in our house”) fallacy

        “I’ve seen and been a part of that, very prevalent in IT” --brbr

        Especially when the bias is based on previous outages/issues

     

    From the blog: “We quickly found ourselves locked in a state of confirmation bias…”



    Root Cause Analysis:

        Once you diagnosed the issue, how quickly was a fix pushed out?

        What kind of documentation or monitoring was generated/added to ensure this won’t happen again?

     

    Check out our Store on TeePublic! https://brakesec.com/store

    Join us on our #Slack Channel! Send a request to @brakesec on Twitter or email bds.podcast@gmail.com

    #Brakesec Store!:https://www.teepublic.com/user/bdspodcast

    #Spotify: https://brakesec.com/spotifyBDS

    #RSS: https://brakesec.com/BrakesecRSS

    #Youtube Channel:  http://www.youtube.com/c/BDSPodcast

    #iTunes Store Link: https://brakesec.com/BDSiTunes

    #Google Play Store: https://brakesec.com/BDS-GooglePlay

    Our main site:  https://brakesec.com/bdswebsite

    #iHeartRadio App:  https://brakesec.com/iHeartBrakesec

    #SoundCloud: https://brakesec.com/SoundcloudBrakesec

    Comments, Questions, Feedback: bds.podcast@gmail.com

    Support Brakeing Down Security Podcast by using our #Paypal: https://brakesec.com/PaypalBDS OR our #Patreon: https://brakesec.com/BDSPatreon

    #Twitter: @brakesec @boettcherpwned @bryanbrake @infosystir

    #Player.FM : https://brakesec.com/BDS-PlayerFM

    #Stitcher Network: https://brakesec.com/BrakeSecStitcher

    #TuneIn Radio App: https://brakesec.com/TuneInBrakesec
