You won't need to do that much work: by design, EMR provisions a replacement whenever a core or task node fails. For the master node, you have access to the detailed CloudWatch metrics and events mentioned earlier, and one solution is to set up a Lambda function that provisions a new cluster if your master node fails.
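As a rough illustration of the Lambda idea, here is a minimal sketch of a handler that launches a replacement cluster. It assumes the function is triggered by an EventBridge (CloudWatch Events) rule on "EMR Cluster State Change" events, and that the failed state arrives in `detail.state`. The cluster definition (name, release label, instance types and counts, roles) is hypothetical, so substitute your own. The `emr_client` parameter is only there so the logic can be exercised without AWS access.

```python
# States in which the old cluster is gone and a replacement is needed.
FAILED_STATES = {"TERMINATED_WITH_ERRORS"}

# Hypothetical cluster definition: every value here is a placeholder
# and should be replaced with the settings of your real cluster.
CLUSTER_CONFIG = {
    "Name": "replacement-cluster",
    "ReleaseLabel": "emr-6.15.0",
    "Instances": {
        "InstanceGroups": [
            {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
        ],
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    "JobFlowRole": "EMR_EC2_DefaultRole",
    "ServiceRole": "EMR_DefaultRole",
}


def should_replace(event):
    """Return True if the event reports an EMR cluster that failed."""
    detail = event.get("detail", {})
    return event.get("source") == "aws.emr" and detail.get("state") in FAILED_STATES


def handler(event, context, emr_client=None):
    """Lambda entry point: start a new cluster when the old one has failed."""
    if not should_replace(event):
        return None
    if emr_client is None:
        import boto3  # imported lazily; provided by the Lambda Python runtime
        emr_client = boto3.client("emr")
    response = emr_client.run_job_flow(**CLUSTER_CONFIG)
    return response["JobFlowId"]
```

The function's execution role would also need permission to call `elasticmapreduce:RunJobFlow` and to pass the two EMR roles. In practice you would probably want to add a guard against relaunching in a loop if the new cluster keeps failing for the same reason.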
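Data persistence is a separate issue. In most setups, EMR reads and writes its data in S3 buckets through EMRFS, so the data is accessible from every Availability Zone in the region and is stored with very high durability. Anything kept only in HDFS on the cluster's own nodes is lost when the cluster terminates.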