Can anyone provide the steps to grant a particular user access equivalent to that of the "hdfs" user in Hadoop?

0 votes
95 views

The reason is that I want a custom user who can create anything anywhere in the HDFS file system (/).
I tried a couple of links, but none of them were useful. Is there a way to do this by adding or modifying some property tags?

posted Jul 7, 2015 by anonymous

1 Answer

+1 vote

The "hdfs" user (more specifically, whatever user launched the NameNode process) is the HDFS super-user. The super-user has full access to the file system and to administrative operations. You can make additional users super-users through the property dfs.permissions.superusergroup in hdfs-site.xml, which names a group whose members are all treated as super-users. The default value of this property is "supergroup".

    <property>
      <name>dfs.permissions.superusergroup</name>
      <value>supergroup</value>
      <description>The name of the group of super-users.</description>
    </property>

Any user you add to group "supergroup" (or whatever custom group you use if you decided to change dfs.permissions.superusergroup) will be treated as an HDFS super-user.
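
As a rough sketch of what "adding a user to the group" can look like in practice: with the default OS-based group mapping, the NameNode resolves group membership on the NameNode host, so the group has to exist there and the user has to be a member of it there. The function name grant_hdfs_superuser and the user "alice" below are made up for illustration, and the sketch assumes a Linux NameNode host, root privileges for the OS steps, and an existing super-user account named "hdfs". If your cluster uses a different group mapping (LDAP, for example), the membership change belongs in that directory instead.

```shell
# Sketch only: run on the NameNode host, as root.
# grant_hdfs_superuser is a hypothetical helper, not part of Hadoop.
grant_hdfs_superuser() {
  local user="$1"
  # Defaults to the default value of dfs.permissions.superusergroup.
  local group="${2:-supergroup}"

  # Create the group on the NameNode host if it does not exist yet.
  getent group "$group" >/dev/null || groupadd "$group"

  # Append the group to the user's supplementary groups.
  usermod -a -G "$group" "$user"

  # Ask the NameNode to refresh its cached user-to-group mappings.
  # This is an admin call, so it runs as the assumed "hdfs" super-user.
  sudo -u hdfs hdfs dfsadmin -refreshUserToGroupsMappings

  # Print the groups the NameNode resolves for the user.
  hdfs groups "$user"
}
```

Calling grant_hdfs_superuser alice would add "alice" to "supergroup"; pass a second argument if you changed dfs.permissions.superusergroup to a custom group name.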

It's important to keep in mind that this grants both full file system access and full administrative access. That means the user would be able to call sensitive operations like "hdfs dfsadmin -safemode enter". If this isn't appropriate, then you might explore using file system permissions and ACLs to implement your requirements on the file system only.
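
If you go the ACL route instead, a minimal sketch could look like the following. It assumes ACLs are enabled on the NameNode (dfs.namenode.acls.enabled set to true in hdfs-site.xml), that the commands are run by the directory owner or a super-user, and that the target is a directory. The function name grant_hdfs_acl, the user "alice", and the path /data/project are hypothetical.

```shell
# Sketch only: give one user rwx on a single HDFS directory via ACLs.
# grant_hdfs_acl is a hypothetical helper, not part of Hadoop.
grant_hdfs_acl() {
  local user="$1"
  local dir="$2"

  # Access ACL entry: applies to the directory itself.
  hdfs dfs -setfacl -m "user:${user}:rwx" "$dir"

  # Default ACL entry: inherited by files and subdirectories created
  # under the directory later. Default entries are only valid on directories.
  hdfs dfs -setfacl -m "default:user:${user}:rwx" "$dir"

  # Print the resulting ACL for inspection.
  hdfs dfs -getfacl "$dir"
}
```

For example, grant_hdfs_acl alice /data/project gives "alice" file system access to that directory without any of the administrative rights that come with super-user membership.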

More details are in the documentation here:
http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html

answer Jul 7, 2015 by Deepti Singh
Similar Questions
0 votes

I was trying to implement a Hadoop/Spark audit tool, but I ran into a problem: I can't get the input file location and file name. I can get the username, IP address, time, and user command from hdfs-audit.log. But when I submit a MapReduce job, I can't see the input file location in either the Hadoop logs or the Hadoop ResourceManager.

Does Hadoop have an API or log that exposes this information through some configuration? If it does, what should I configure?

+2 votes

Is there a way to mount HDFS directly on Linux and Windows clients? I believe I read something about there being some limitations, but that there is possibly a FUSE solution. Any information on this (with links to a how-to) would be greatly appreciated.

+2 votes

Our YARN application would benefit from maximal bandwidth on HDFS reads, but I'm unclear on how short-circuit reads are enabled. Are they on by default?

Can our application check programmatically to see if the short-circuit read is enabled?

+1 vote

Can anyone please explain what we mean by streaming data access in HDFS?

Data is usually copied to HDFS, and in HDFS the data is split into blocks that are spread across DataNodes.
Say, for example, I have an input file of 10240 MB (10 GB) and a block size of 64 MB. Then there will be 160 blocks.

These blocks will be distributed across the DataNodes. The mappers will then read data from these DataNodes with data locality in mind (i.e. blocks local to a DataNode will be read by the map tasks running on that DataNode).

Can you please point out where "streaming data access in HDFS" comes into the picture here?

...