The Hadoop framework originally performed insufficient authentication and authorization of both users and services. This allowed any user to impersonate another user, to receive blocks directly from DataNodes by bypassing the NameNode, and to snoop on data packets sent by DataNodes to clients. The framework did not perform mutual authentication, so a malicious network user could imitate cluster services. This is where Kerberos comes into the picture. Let’s look at how a simple security flow works.
What is Kerberos?
Kerberos is a network authentication protocol designed to provide strong authentication for client/server applications by means of secret-key cryptography. It lets clients and servers prove their identities to each other over an insecure network without sending passwords across it. The protocol is named after Kerberos (Cerberus), the three-headed dog of Greek mythology.
Components of Kerberos:
Kerberos comprises three components: the Key Distribution Center (KDC), the client user, and the server hosting the service the client wants to access. The KDC performs two service functions:
- Authentication Service (AS)
- Ticket-Granting Service (TGS)
As shown in the above figure, three exchanges occur when the client accesses a server:
- AS Exchange
- TGS Exchange
- Client/Server (CS) Exchange
Hadoop Security Design With Kerberos:
The new Hadoop security design makes use of Delegation Tokens, Job Tokens and Block Access Tokens alongside Kerberos. All three token types are similar in structure.
- Delegation Tokens – Used by clients to authenticate to the NameNode and gain access to HDFS data.
- Block Access Tokens – Issued by the NameNode and checked by DataNodes, so that HDFS file system permissions are enforced when clients read or write blocks.
- Job Tokens – Used to secure communication between the MapReduce engine, the TaskTracker and individual tasks.
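To make the "similar in structure" point concrete, here is a minimal sketch in Python. It assumes each token is an identifier plus an authenticator computed as an HMAC over that identifier with a secret key shared by the issuer and the verifier. The field names (`kind`, `owner`, `block_id`, `modes`, `expires`), the key value and the choice of SHA-256 are illustrative assumptions, not Hadoop's actual wire format.

```python
import hashlib
import hmac
import json
import time


def issue_token(secret_key: bytes, identifier: dict) -> dict:
    """Return a token: the identifier plus an HMAC computed over it."""
    payload = json.dumps(identifier, sort_keys=True).encode()
    authenticator = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return {"identifier": identifier, "authenticator": authenticator}


def verify_token(secret_key: bytes, token: dict, now=None) -> bool:
    """Recompute the HMAC with the shared key, then check the expiry time."""
    payload = json.dumps(token["identifier"], sort_keys=True).encode()
    expected = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["authenticator"]):
        return False  # identifier was altered or the key does not match
    now = time.time() if now is None else now
    return now < token["identifier"]["expires"]


# Hypothetical shared secret and identifier fields, for illustration only.
shared_key = b"hypothetical-shared-secret"
block_token = issue_token(shared_key, {
    "kind": "BLOCK_ACCESS",
    "owner": "alice",
    "block_id": 1073741825,
    "modes": ["READ"],
    "expires": time.time() + 600,
})
print(verify_token(shared_key, block_token))
```

The same two functions would serve for a delegation token or a job token; only the identifier fields and the parties holding the key change. Verification needs no round trip to the KDC, which is the reason for using tokens after the initial Kerberos login.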
How Does Kerberos Work?
Instead of the client sending its password to the application server, it requests a ticket from the authentication server and sends that ticket, along with the encrypted request, to the application server. Now, how can the client request tickets without repeatedly sending its credentials? This is done through a Ticket-Granting Ticket (TGT). The client obtains the TGT once, at login, and then presents it to the Ticket-Granting Service whenever it needs a ticket for another service.
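The flow can be sketched as a toy simulation in Python. This is a simplified model under stated assumptions, not the real protocol: `seal` and `unseal` are a stand-in cipher built from the standard library, the `KDC` class, the `service_accept` function, the service names and the password are all hypothetical, and details such as pre-authentication, replay caches and realms are left out.

```python
import hashlib
import hmac
import json
import os
import time


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, obj: dict) -> bytes:
    """Toy authenticated encryption (illustration only, not Kerberos crypto)."""
    plain = json.dumps(obj).encode()
    nonce = os.urandom(16)
    cipher = bytes(a ^ b for a, b in zip(plain, _keystream(key, nonce, len(plain))))
    tag = hmac.new(key, nonce + cipher, hashlib.sha256).digest()
    return nonce + tag + cipher


def unseal(key: bytes, blob: bytes) -> dict:
    nonce, tag, cipher = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + cipher, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered message")
    plain = bytes(a ^ b for a, b in zip(cipher, _keystream(key, nonce, len(cipher))))
    return json.loads(plain)


def key_from_password(password: str) -> bytes:
    return hashlib.sha256(password.encode()).digest()


class KDC:
    def __init__(self, users: dict, service_keys: dict):
        self.user_keys = {u: key_from_password(p) for u, p in users.items()}
        self.service_keys = service_keys   # service name -> secret key
        self.tgs_key = os.urandom(32)      # known only to the KDC

    def as_exchange(self, user: str):
        """AS: session key sealed for the user, plus a TGT sealed for the TGS."""
        session = os.urandom(32).hex()
        tgt = seal(self.tgs_key, {"user": user, "key": session,
                                  "expires": time.time() + 3600})
        return seal(self.user_keys[user], {"key": session}), tgt

    def tgs_exchange(self, tgt: bytes, authenticator: bytes, service: str):
        """TGS: check the TGT and authenticator, then issue a service ticket."""
        t = unseal(self.tgs_key, tgt)
        tgt_session = bytes.fromhex(t["key"])
        auth = unseal(tgt_session, authenticator)
        if auth["user"] != t["user"] or time.time() > t["expires"]:
            raise ValueError("invalid or expired TGT")
        svc_session = os.urandom(32).hex()
        ticket = seal(self.service_keys[service],
                      {"user": t["user"], "key": svc_session})
        return seal(tgt_session, {"key": svc_session}), ticket


def service_accept(service_key: bytes, ticket: bytes, authenticator: bytes) -> str:
    """CS: the server opens the ticket with its own key, never seeing a password."""
    t = unseal(service_key, ticket)
    auth = unseal(bytes.fromhex(t["key"]), authenticator)
    if auth["user"] != t["user"]:
        raise ValueError("authenticator does not match ticket")
    return t["user"]


# Hypothetical service names and password, for illustration only.
services = {"hdfs/namenode": os.urandom(32), "mapred/jobtracker": os.urandom(32)}
kdc = KDC({"alice": "s3cret"}, services)

# AS exchange: the password is used once, locally, to open the AS reply.
as_reply, tgt = kdc.as_exchange("alice")
tgt_session = bytes.fromhex(unseal(key_from_password("s3cret"), as_reply)["key"])

# TGS and CS exchanges: repeated per service with only the TGT and session key.
accepted = {}
for name, svc_key in services.items():
    auth = seal(tgt_session, {"user": "alice", "ts": time.time()})
    tgs_reply, ticket = kdc.tgs_exchange(tgt, auth, name)
    svc_session = bytes.fromhex(unseal(tgt_session, tgs_reply)["key"])
    cs_auth = seal(svc_session, {"user": "alice", "ts": time.time()})
    accepted[name] = service_accept(svc_key, ticket, cs_auth)
print(accepted)
```

Note that the password appears only in the step that opens the AS reply. Every later ticket request in the loop is authorized by the TGT and its session key, which is the point of the TGT.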