Apache Storm is a free and open source, distributed real-time computation system for processing fast, large streams of data. Storm adds reliable real-time data processing capabilities to Apache Hadoop 2.x. Its effective stream processing capabilities are trusted by Twitter and Yahoo for quickly extracting insights from their Big Data.
Video on Apache Storm:
This video is a recording of the webinar “Introduction to Real Time Analytics using Apache Storm”, which was held on 17th May 2014.
The video covers the following topics:
- Introduction to Apache Storm and what the fuss about real-time processing is.
- What Hadoop can’t do and how Storm comes to the rescue.
- Use Cases of Apache Storm.
- Key features and Architecture of a Storm cluster.
- How do Storm and Hadoop fit together?
- Data ingestion techniques in Storm.
- Managing your Hadoop and Storm cluster with Apache Ambari.
If you have any questions, feel free to mention them in the Comments section and we will answer them.
Summary of the Apache Storm Video:
This video discusses some Storm use cases, growing trends, the projects in the curriculum, an overview of key components and other important aspects of Apache Storm.
A Look at the Use Cases:
Telecommunication – Silent Roamer Detection:
In telecommunication, Storm plays an important role in ‘Silent Roamer Detection’. This is vital for telecom providers, as they need information on users who are registered with them but have not used their service. Older analytics approaches could not provide this, but with Storm, telecom providers have access to real-time analysis, which makes a big difference to them. For this reason, telecom providers are adopting Apache Storm.
Banking – Fraud Transaction Detection:
Real-time analytics is imperative for banks, and Storm fits the requirement well. Several banks use similar mechanisms for this purpose, some of them variants of Storm and some its competitors. Though Storm is at an early stage, various banks in the Middle East implemented Storm almost a year and a half ago. This shows the potential and capabilities of Apache Storm. In the rest of the world, Apache Storm is being used at least in proof-of-concept (POC) environments.
Retail – Popular Retailers like Sears and Walmart:
Dynamic pricing is the backbone of retail, and Sears and Walmart have their own dynamic pricing modules for this purpose.
Social Networking – Twitter:
At Twitter, trends are analyzed from the tweets. Twitter is an excellent example of a real-time use case for Storm.
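To make the trend-analysis idea concrete, here is a minimal plain-Java sketch of the counting step that such a pipeline might perform on each incoming tweet. It deliberately does not use the Storm API; the class name HashtagTrends and its methods are hypothetical and only for illustration, and this is not a description of how Twitter actually implements trends. In a real Storm topology, logic along these lines would typically sit inside a bolt that receives tweets from a spout.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: counts hashtags one tweet at a time, the way a
// stream processor handles each record as it arrives instead of in a batch.
public class HashtagTrends {

    // Running count per hashtag (lower-cased so "#Storm" and "#storm" match).
    private final Map<String, Integer> counts = new HashMap<>();

    // Process a single tweet: extract its hashtags and update the counts.
    public void onTweet(String text) {
        for (String token : text.split("\\s+")) {
            if (token.length() > 1 && token.startsWith("#")) {
                counts.merge(token.toLowerCase(), 1, Integer::sum);
            }
        }
    }

    // Current count for one hashtag, or 0 if it has not been seen.
    public int count(String hashtag) {
        return counts.getOrDefault(hashtag.toLowerCase(), 0);
    }

    // The n most frequent hashtags, most frequent first; ties sorted alphabetically.
    public List<String> top(int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }
}
```

Calling onTweet for every tweet as it arrives and then top(10) at any moment gives the current trending hashtags without waiting for a batch job to finish, which is the essence of the real-time use case.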
About the course:
Apache Storm is simple to learn, and the course is focused on the projects in modules 5 and 6. The last two modules, and in fact the overall curriculum of the Apache Storm course, aim to provide more hands-on experience.
It is not mandatory to have Hadoop knowledge for the course, but it would help you learn better. Since Storm is polyglot in nature, applications for it can be written in any language. Apache Storm itself is written in Clojure; since that is a relatively rare language, Java will be used for the projects discussed in the curriculum.
Trending Apache Storm:
According to Google Trends, the search term ‘Apache Storm’ has seen a sudden surge since January 2013. Storm was previously an independent project, but it was recently accepted into incubation by the Apache Software Foundation. The surge is a clear indication of Storm’s potential.
The video also includes an overview of the following topics by the instructor:
- Storm components – Overview of Nimbus, ZooKeeper and supervisor nodes
- Storm topology
- Why Storm is ideal for real-time processing
- How Storm is implemented in the Hadoop 2.0 framework
Companies using Apache Storm: