top button
Flag Notify
    Connect to us
      Site Registration

Site Registration

One time Kerberos exception on secured Hadoop cluster

+1 vote
334 views

On a kerberos based Hadoop cluster, a kinit is done and then oozie command is executed. This works every time (thus no setup issues), except once it failed with following error.

Error: AUTHENTICATION : Could not authenticate, GSSException: No valid credentials provided (Mechanism level: Generic error (description in e-text) (60) - PROCESS_TGS).

Any thoughts on what could cause the transient failure? Would any updates on node (e.g. Java etc.) cause such issue? Cluster is working fine with kerberos.

posted Feb 27, 2015 by anonymous

Looking for an answer?  Promote on:
Facebook Share Button Twitter Share Button LinkedIn Share Button

Similar Questions
+3 votes

From the documentation + code, "when kerberos is enabled, all tasks are run as the end user (e..g as user "joe" and not as hadoop user "mapred") using the task-controller (which is setuid root and when it runs, it does a setuid/setgid etc. to Joe and his groups ). For this to work, user "joe" linux account has to be present on all nodes of the cluster."

In a environment with large and dynamic user population; it is not practical to add every end user to every node of the cluster (and drop user when end user is deactivated etc.)

What are other options get this working ? I am assuming that if the users are in a LDAP, can using the PAM for LDAP solve the issue. Any other suggestions?

+1 vote

Recently I have set up Kerberos security for a Hadoop cluster and added a few data nodes to it. While running hdfs balancer, I found that Kerberos ticket is expired and balancer stop.

The Kerberos ticket has 1day lifetime with 7days max renewable lifetime. Are there any options to automatically renew the ticket while running balancer?

Or should I re-start it everyday?

0 votes

I've been trying to secure block data transferred by HDFS. I added below to hdfs-site.xml and core-site xml to the data node and name node and restart both.

 dfs.encrypt.data.transfer
 true

 hadoop.rpc.protection
 privacy

When I try to put a file from the hdfs command line shell, the operation fails with "connection is reset" and I see following from the datanode log:

"org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected encryption handshake from client a /172.31.36.56:48271. Perhaps the client is running an older version of Hadoop which does not support encryption"

I am able to reproduce this on two different deployments. I was following https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SecureMode.html#Authentication , but didn't turn on kerberos authentication. No authentication works in my environment. Can this be the reason the handshake fails?

+2 votes

Does anyone knows how to ‘capture’ the exception which actually failed the job running on Mapper or Reducer at runtime? It seems Hadoop is designed to be fault tolerant that the failed jobs will be automatically rerun for a certain amount of times and won’t actually expose the real problem unless you look into the error log?

In my use case, I would like to capture the exception and make different response based on the type of the exception.

+2 votes

Let we change the default block size to 32 MB and replication factor to 1. Let Hadoop cluster consists of 4 DNs. Let input data size is 192 MB. Now I want to place data on DNs as following. DN1 and DN2 contain 2 blocks (32+32 = 64 MB) each and DN3 and DN4 contain 1 block (32 MB) each. Can it be possible? How to accomplish it?

...