The video above is a recording of the webinar ‘Real-Time Analytics with Apache Storm’, held on 26th July 2014.
Apache Storm is an open-source, distributed real-time computation system for processing large, fast-moving streams of data. With Storm and MapReduce running side by side in Hadoop on YARN, a single Hadoop cluster can efficiently handle a full range of workloads, from real-time to batch.
Real-Time Analytics with Apache Storm – Topics covered in the Presentation:
Introduction to Apache Storm and the importance of real-time processing.
How does Apache Storm overcome Hadoop’s shortcomings?
Real-world applications of Apache Storm.
What makes Storm ideal for real-time processing?
Architecture of a Storm cluster.
How do Storm and Hadoop fit together?
Data ingestion techniques in Storm.
Managing Hadoop and Storm clusters with Apache Ambari.
Presentation:
Characteristics of Storm that make it ideal for real-time data processing:
Fast – Benchmarked at one million 100-byte messages per second per node.
Scalable – Parallel computations that run across a cluster of machines.
Fault-tolerant – Automatic restart when a worker or node dies.
Reliable – Guarantees that each unit of data (tuple) is processed at least once, or exactly once when using the Trident API.
Easy to Operate – Standard configurations suitable for production from day one.
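The difference between "at least once" and "exactly once" in the reliability point above can be illustrated with a small simulation. This is only a sketch in plain Python and does not use Storm's API: the function `process_stream`, its parameters `lose_ack_once` and `deduplicate`, and the sample messages are all hypothetical names made up for illustration. It assumes a source that replays any message whose acknowledgement never arrives, and it shows one common way to get exactly-once effects (skipping message ids that were already applied), which is not necessarily how Storm implements it internally.

```python
from collections import deque


def process_stream(messages, lose_ack_once, deduplicate=False):
    """Count words from (msg_id, word) pairs with replay on a lost ack.

    lose_ack_once: ids whose first acknowledgement is lost (hypothetical failure model).
    deduplicate:   if True, skip ids that were already applied to the counts.
    """
    queue = deque(messages)   # messages waiting to be processed
    ack_lost = set()          # ids whose ack has already been lost once
    seen_ids = set()          # ids already applied to the counts
    counts = {}

    while queue:
        msg_id, word = queue.popleft()

        # Apply the update unless we are deduplicating and saw this id before.
        if not (deduplicate and msg_id in seen_ids):
            counts[word] = counts.get(word, 0) + 1
            seen_ids.add(msg_id)

        # The work was done, but the ack was lost: the source replays the message.
        if msg_id in lose_ack_once and msg_id not in ack_lost:
            ack_lost.add(msg_id)
            queue.append((msg_id, word))

    return counts


messages = [(1, "storm"), (2, "hadoop"), (3, "storm")]
at_least_once = process_stream(messages, lose_ack_once={2})
exactly_once = process_stream(messages, lose_ack_once={2}, deduplicate=True)
print(at_least_once)  # {'storm': 2, 'hadoop': 2}
print(exactly_once)   # {'storm': 2, 'hadoop': 1}
```

With replay alone, no message is lost, but "hadoop" is counted twice because it was processed again after its ack went missing. Tracking ids that were already applied removes the duplicate, at the cost of keeping extra state.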
Feel free to drop us a line for any clarifications.