
How to migrate Apache Hadoop 1.x HDFS to Apache Hadoop 2.x HDFS

+2 votes
534 views

Apache Hadoop 2.x includes HDFS Federation.
Does anyone know how to migrate Apache Hadoop 1.x HDFS to Apache Hadoop 2.x HDFS?

I am getting the following error:

$ bin/hdfs start namenode --config $HADOOP_CONF_DIR -upgrade -clusterId 
Error: Could not find or load main class start 
posted Dec 3, 2013 by Kumar Mitrasen
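
The error itself is not about the upgrade: hdfs has no "start" subcommand, so the shell tries to load a Java class named start. A minimal sketch of the usual form, with --config moved before the subcommand and <cluster-id> standing in for your chosen ID (the -clusterId value was already truncated in the original command):

$ bin/hdfs --config $HADOOP_CONF_DIR namenode -upgrade -clusterId <cluster-id>

To launch the namenode as a background daemon instead, the start/stop wrapper is the daemon script rather than hdfs itself:

$ sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode -upgrade -clusterId <cluster-id>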


Similar Questions
+4 votes

While upgrading/migrating from Apache Hadoop 1.x to 2.x (MRv2/YARN) in a production cluster of several nodes, is there any *ANTICIPATED DOWNTIME* that one needs to be aware of?

+3 votes

I am trying to access a hadoop 1 installation via the hadoop 2.2.0 command-line tools. Is this possible at all?

From hadoop 1 I get:

$ hadoop fs -ls hdfs://127.0.0.1:9000/
Found 2 items
drwxr-xr-x - cs supergroup 0 2014-02-01 08:18 /tmp
drwxr-xr-x - cs supergroup 0 2014-02-01 08:19 /user

From hadoop 2.2.0 I get:

$ hadoop fs -ls hdfs://127.0.0.1:9000/
ls: Failed on local exception: java.io.EOFException; Host Details : 
local host is: "i7/127.0.1.1"; destination host is: "localhost":9000;

I have been trying to find this information via a web search, but so far without success.
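
For what it's worth, that EOFException is characteristic of an RPC version mismatch: Hadoop 2.2.0 clients speak a protobuf-based wire protocol that a Hadoop 1 NameNode does not understand, so direct hdfs:// access across that version boundary fails. A hedged sketch of the usual read-only workaround, going through the version-independent HFTP interface on the NameNode's HTTP port (50070 by default; adjust host and port to your installation):

$ hadoop fs -ls hftp://127.0.0.1:50070/

The same trick is what lets distcp on a new cluster read from an old one, e.g. hadoop distcp hftp://old-nn:50070/src hdfs://new-nn:9000/dst (host names hypothetical).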

+2 votes

The setup consists of hadoop 1.0.1 and hbase 0.94.x. Loading raw data into hdfs and then into hbase consumes a good amount of time for 10 TB of raw data (using the hadoop shell's copyFromLocal and a pig script to load hbase).

  1. My first question: will moving to hadoop 2.x improve performance? If yes, please provide relevant links or docs which explain how that is achieved.

  2. My second question: I do not need my data sorted while loading into hbase, so what are the ways I can disable the sort at the Mapper and at the Reducer? (see the bulk-load sketch after this question)

Any suggestions?
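
On the second question: the shuffle sort itself cannot simply be switched off, and HBase store files must be written in sorted key order anyway. What is usually avoided instead is the per-row HBase write path (WAL, memstore, compactions), by generating HFiles in MapReduce and bulk-loading them. A hedged sketch using the ImportTsv and completebulkload tools that ship with hbase 0.94, assuming TSV-formatted input; the table name mytable, the column mapping, and the paths are hypothetical:

$ hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
    -Dimporttsv.columns=HBASE_ROW_KEY,cf:col1 \
    -Dimporttsv.bulk.output=/tmp/hfiles mytable /input/raw
$ hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /tmp/hfiles mytable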

+1 vote

Assume I have a machine on the same network as a hadoop 2 cluster but separate from it.

My understanding is that by setting certain elements of the config file or local xml files to point to the cluster, I can launch a job without having to log into the cluster, move my jar to hdfs, and start the job from the cluster's hadoop machine.

Does this work? What parameters do I need to set? Where does the jar file need to be? What issues would I see if the machine is running Windows with cygwin installed?
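
This remote-submission setup does work: the client machine only needs its own Hadoop installation whose configuration points at the cluster, and the jar stays local; the job client ships it to the cluster during submission. A minimal sketch passing the three key properties as generic options instead of editing XML files (the host names nn-host and rm-host and the ports are hypothetical placeholders for your cluster, and this form requires the main class to use ToolRunner/GenericOptionsParser):

$ hadoop jar myjob.jar com.example.MyJob \
    -D fs.defaultFS=hdfs://nn-host:9000 \
    -D mapreduce.framework.name=yarn \
    -D yarn.resourcemanager.address=rm-host:8032 \
    <job args>

The same properties can instead be set in the client's core-site.xml, mapred-site.xml, and yarn-site.xml. Under cygwin the usual pain points are path translation and the Windows permissions/native-utility handling in the Hadoop 2 scripts, rather than the submission mechanism itself.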

+2 votes

Is it possible to consolidate two small data volumes (500GB each) into a larger data volume (3TB)?

I'm thinking that as long as the block file names and metadata are unique, I should be able to shut down the datanode and use something like tar or rsync to copy the contents of each small volume to the large volume.

Will this work?
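
In principle yes, for the reason given: block file names are unique, so merging the directory trees should not collide. A heavily hedged sketch, not a tested procedure, assuming the small volumes are mounted at /data1 and /data2 and the large one at /data3 (hypothetical paths), all listed in dfs.datanode.data.dir (the daemon script lives in bin/ on hadoop 1, sbin/ on hadoop 2):

$ sbin/hadoop-daemon.sh stop datanode
$ rsync -a --ignore-existing /data1/dfs/current/ /data3/dfs/current/
$ rsync -a --ignore-existing /data2/dfs/current/ /data3/dfs/current/
# now drop /data1 and /data2 from dfs.datanode.data.dir in hdfs-site.xml
$ sbin/hadoop-daemon.sh start datanode

--ignore-existing keeps /data3's own VERSION and layout files from being overwritten during the merge, and the datanode's next block report tells the namenode where the blocks now live.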

...