23 Jul 2014

Real-Time Analytics with Apache Storm

The video above is a recording of the webinar ‘Real-Time Analytics with Apache Storm’, held on 26th July’14.

Apache Storm is an open source, distributed real-time computation system for processing large, fast-moving streams of data. With Storm and MapReduce running side by side in Hadoop on YARN, a single Hadoop cluster can efficiently handle the full range of workloads, from real-time to batch.

Real-Time Analytics with Apache Storm – Topics covered in the Presentation:

Introduction to Apache Storm and the importance of real-time processing
How Apache Storm overcomes Hadoop’s shortcomings
Real-world applications of Apache Storm
What makes Storm ideal for real-time processing
Architecture of a Storm cluster
How Storm and Hadoop fit together
Data ingestion techniques in Storm
Managing Hadoop and Storm clusters with Apache Ambari

Presentation:

Characteristics of Storm that make it ideal for real-time data processing:

Fast – Benchmarked at over one million 100-byte messages per second per node.
Scalable – Parallel calculations that run across a cluster of machines.
Fault-tolerant – Workers are restarted automatically when a worker or node dies.
Reliable – Guarantees that each unit of data is processed at least once or exactly once.
Easy to Operate – Standard configurations suitable for production from day one.
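
To make the "at least once" guarantee in the list above more concrete, here is a minimal sketch in plain Python. It is a simulation only and does not use Storm's API: the names `run_at_least_once` and `make_flaky_counter` are hypothetical, and the queue stands in for the tracking that a real Storm spout does when a tuple is acked or failed. The idea it illustrates is that a message stays pending until its processing succeeds, and is replayed whenever processing fails.

```python
from collections import deque


def run_at_least_once(messages, process, max_attempts=5):
    """Replay each message until `process` succeeds (at-least-once delivery)."""
    # Each pending entry is (message, attempt number).
    pending = deque((msg, 1) for msg in messages)
    delivered = []  # messages whose processing was acknowledged
    attempts = 0    # total processing attempts, including replays
    while pending:
        msg, attempt = pending.popleft()
        attempts += 1
        try:
            process(msg)
            delivered.append(msg)  # "ack": stop tracking the message
        except Exception:
            if attempt >= max_attempts:
                raise  # give up instead of replaying forever
            pending.append((msg, attempt + 1))  # "fail": queue for replay
    return delivered, attempts


def make_flaky_counter():
    """Build a word counter that fails the first time it sees each word."""
    seen = set()
    counts = {}

    def handler(word):
        if word not in seen:
            seen.add(word)
            raise RuntimeError("simulated worker failure")
        counts[word] = counts.get(word, 0) + 1

    return handler, counts


if __name__ == "__main__":
    handler, counts = make_flaky_counter()
    delivered, attempts = run_at_least_once(["storm", "hadoop", "storm"], handler)
    print(counts, attempts)
```

In this sketch the simulated failure happens before the count is updated, so replays never double count. If a failure happened after the update, the replay would count the word twice, which is why exactly-once processing needs extra machinery on top of replay.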

Feel free to drop us a line for any clarifications.
