
How to migrate Apache Hadoop 1.x HDFS to Apache Hadoop 2.x HDFS

+2 votes
534 views

Apache Hadoop 2.x includes HDFS Federation.
Does anyone know how to migrate Apache Hadoop 1.x HDFS to Apache Hadoop 2.x HDFS?

I am getting the following error:

$ bin/hdfs start namenode --config $HADOOP_CONF_DIR -upgrade -clusterId 
Error: Could not find or load main class start 
posted Dec 3, 2013 by Kumar Mitrasen
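
The error itself is not about the upgrade: hdfs has no "start" subcommand, so the shell tries to load a Java class named start. A minimal sketch of the usual form, with --config moved before the subcommand and <cluster-id> standing in for your chosen ID (the -clusterId value was already truncated in the original command):

$ bin/hdfs --config $HADOOP_CONF_DIR namenode -upgrade -clusterId <cluster-id>

To launch the namenode as a background daemon instead, the start/stop wrapper is the daemon script rather than hdfs itself:

$ sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode -upgrade -clusterId <cluster-id>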


Similar Questions
+4 votes

While upgrading/migrating from Apache Hadoop 1.x to 2.x (MRv2/YARN) in a production cluster of several nodes, is there any *ANTICIPATED DOWNTIME* that one needs to be aware of?

+3 votes

I am trying to access a hadoop 1 installation via the hadoop 2.2.0 command-line tools. Is this possible at all?

From hadoop 1 I get:

$ hadoop fs -ls hdfs://127.0.0.1:9000/
Found 2 items
drwxr-xr-x - cs supergroup 0 2014-02-01 08:18 /tmp
drwxr-xr-x - cs supergroup 0 2014-02-01 08:19 /user

From hadoop 2.2.0 I get:

$ hadoop fs -ls hdfs://127.0.0.1:9000/
ls: Failed on local exception: java.io.EOFException; Host Details : 
local host is: "i7/127.0.1.1"; destination host is: "localhost":9000;

I have been trying to find this information via a web search, but so far without success.
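
For what it's worth, that EOFException is characteristic of an RPC version mismatch: Hadoop 2.2.0 clients speak a protobuf-based wire protocol that a Hadoop 1 NameNode does not understand, so direct hdfs:// access across that version boundary fails. A hedged sketch of the usual read-only workaround, going through the version-independent HFTP interface on the NameNode's HTTP port (50070 by default; adjust host and port to your installation):

$ hadoop fs -ls hftp://127.0.0.1:50070/

The same trick is what lets distcp on a new cluster read from an old one, e.g. hadoop distcp hftp://old-nn:50070/src hdfs://new-nn:9000/dst (host names hypothetical).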

+2 votes

The setup consists of hadoop 1.0.1 and hbase 0.94.x. Loading raw data into hdfs and then into hbase consumes a good amount of time for 10 TB of raw data (using the hadoop shell's copyFromLocal and a pig script to load hbase).

  1. My first question: will moving to hadoop 2.x improve performance? If yes, please provide relevant links or docs which explain how that is achieved.

  2. My second question: I do not need my data sorted while loading into hbase, so what are the ways I can disable the sort at the Mapper and at the Reducer? (see the bulk-load sketch after this question)

Any suggestions?
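
On the second question: the shuffle sort itself cannot simply be switched off, and HBase store files must be written in sorted key order anyway. What is usually avoided instead is the per-row HBase write path (WAL, memstore, compactions), by generating HFiles in MapReduce and bulk-loading them. A hedged sketch using the ImportTsv and completebulkload tools that ship with hbase 0.94, assuming TSV-formatted input; the table name mytable, the column mapping, and the paths are hypothetical:

$ hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
    -Dimporttsv.columns=HBASE_ROW_KEY,cf:col1 \
    -Dimporttsv.bulk.output=/tmp/hfiles mytable /input/raw
$ hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /tmp/hfiles mytable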

+1 vote

Assume I have a machine on the same network as a hadoop 2 cluster but separate from it.

My understanding is that by setting certain elements of the config file or local xml files to point to the cluster, I can launch a job without having to log into the cluster, move my jar to hdfs, and start the job from the cluster's hadoop machine.

Does this work? What parameters do I need to set? Where does the jar file need to be? What issues would I see if the machine is running Windows with cygwin installed?
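
This remote-submission setup does work: the client machine only needs its own Hadoop installation whose configuration points at the cluster, and the jar stays local; the job client ships it to the cluster during submission. A minimal sketch passing the three key properties as generic options instead of editing XML files (the host names nn-host and rm-host and the ports are hypothetical placeholders for your cluster, and this form requires the main class to use ToolRunner/GenericOptionsParser):

$ hadoop jar myjob.jar com.example.MyJob \
    -D fs.defaultFS=hdfs://nn-host:9000 \
    -D mapreduce.framework.name=yarn \
    -D yarn.resourcemanager.address=rm-host:8032 \
    <job args>

The same properties can instead be set in the client's core-site.xml, mapred-site.xml, and yarn-site.xml. Under cygwin the usual pain points are path translation and the Windows permissions/native-utility handling in the Hadoop 2 scripts, rather than the submission mechanism itself.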

+2 votes

Is it possible to consolidate two small data volumes (500GB each) into a larger data volume (3TB)?

I'm thinking that as long as the block file names and metadata are unique, I should be able to shut down the datanode and use something like tar or rsync to copy the contents of each small volume to the large volume.

Will this work?
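
In principle yes, for the reason given: block file names are unique, so merging the directory trees should not collide. A heavily hedged sketch, not a tested procedure, assuming the small volumes are mounted at /data1 and /data2 and the large one at /data3 (hypothetical paths), all listed in dfs.datanode.data.dir (the daemon script lives in bin/ on hadoop 1, sbin/ on hadoop 2):

$ sbin/hadoop-daemon.sh stop datanode
$ rsync -a --ignore-existing /data1/dfs/current/ /data3/dfs/current/
$ rsync -a --ignore-existing /data2/dfs/current/ /data3/dfs/current/
# now drop /data1 and /data2 from dfs.datanode.data.dir in hdfs-site.xml
$ sbin/hadoop-daemon.sh start datanode

--ignore-existing keeps /data3's own VERSION and layout files from being overwritten during the merge, and the datanode's next block report tells the namenode where the blocks now live.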

...