
Disasters Bingo - Call List

Use this randomly generated list as your call list when playing the game. There is no need to say the BINGO column name. To keep track, mark each item (with an X, a checkmark, a dot, or a tally mark) as you announce it. Alternatively, cut out the items, place them in a bag, and draw them one at a time.


  1. Confused dev and prod environments
  2. Lost prod data (even a bit)
  3. Cloud region down
  4. Couldn’t reach the primary contact
  5. Forgot to test backup
  6. Hit restore, regretted immediately
  7. Restored successfully—into prod by mistake
  8. DR test failed
  9. Got locked out mid-recovery
  10. Backup ran... but didn’t include the database
  11. Wrote a DR plan no one read
  12. Backup tape was corrupted
  13. Power came back... and then went out again
  14. “It worked in dev”
  15. Searched Teams/WhatsApp/Slack for the DR steps
  16. Practiced failover
  17. Created a backup strategy
  18. Ran a failover, forgot the firewall rules
  19. Woke up at midnight to a standby call
  20. Tested DR... in prod by accident
  21. Logged into the wrong cloud account
  22. Got called during dinner
  23. Alert fatigue
  24. DR test passed... because no one actually tested anything
  25. Dependency failed silently
  26. DB migration failed
  27. Logged incident... to the wrong team
  28. Fix required physical access (no one had keys)
  29. Ignored an alert that was real this time
  30. Vendor said, “That’s not covered”
  31. Applied the wrong config to prod
  32. Spent 2 hours debugging—then found it was a typo
  33. Deployed during a major incident
  34. On call during a holiday
  35. Found passwords on a sticky note
  36. Found the backup in the wrong format
  37. Restored from backup
  38. Ran a DR simulation game
  39. DR plan included a retired employee
  40. The “hot site” was actually cold
  41. Discovered the backup drive was full
  42. Realized you were restoring the wrong day’s backup
  43. Network outage
  44. Realized the DR test broke something else
  45. Deployment broke prod
  46. Saw a mysterious cron job labeled “do not delete”
  47. Team used five different definitions of “RTO”
  48. Accidentally deleted data
  49. Misread a severity alert
  50. Got called mid-flight (tried to troubleshoot over airplane Wi-Fi)
  51. Didn’t have a backup
  52. Called vendor support—hit voicemail
  53. Discovered half the infra was never documented
  54. Started a DR drill—no one showed up
  55. System alert missed because alert rule was too specific
  56. Recovery took >1 day
  57. Backup password was changed but not shared
  58. Alert false positive
  59. Ran a chaos test in prod
  60. No runbook available
  61. External service went down
  62. Unreachable DNS
  63. Did a post-mortem
  64. Someone unplugged the “do not touch” server
  65. Conflicting recovery instructions
  66. Custom script failed with no logs
  67. Found a critical system on a personal laptop