1)
Hadoop 1:
Supports only the MapReduce (MR) processing model. Does not support non-MR tools.
Hadoop 2:
Supports MR as well as other distributed computing models such as Spark, Hama, Giraph, Message Passing Interface (MPI) and HBase coprocessors.
2)
Hadoop 1:
MR handles both data processing and cluster resource management.
Hadoop 2:
YARN (Yet Another Resource Negotiator) handles cluster resource management, and processing is done by separate processing models that run on top of it.
3)
Hadoop 1:
Has limited scalability: up to about 4,000 nodes per cluster.
Hadoop 2:
Scales better: up to about 10,000 nodes per cluster.
4)
Hadoop 1:
Works on the concept of slots. A slot can run either a Map task or a Reduce task only.
Hadoop 2:
Works on the concept of containers. A container can run generic tasks.
5)
Hadoop 1:
A single Namenode manages the entire namespace.
Hadoop 2:
Multiple Namenode servers manage multiple namespaces.
6)
Hadoop 1:
Has a single point of failure (SPOF) because there is only one Namenode. If the Namenode fails, manual intervention is needed to recover.
Hadoop 2:
Overcomes the SPOF with a standby Namenode. If the active Namenode fails, the cluster can be configured to recover automatically.
7)
Hadoop 1:
The MR API is compatible with Hadoop 1.x. A program written for Hadoop 1 runs on Hadoop 1.x without any additional files.
Hadoop 2:
A program written against the Hadoop 1.x MR API needs additional files to run on Hadoop 2.x.
8)
Hadoop 1:
Is limited as a platform for event processing, streaming and real-time operations.
Hadoop 2:
Can serve as a platform for a wide variety of data analytics: event processing, streaming and real-time operations are all possible.
9)
Hadoop 1:
A Namenode failure affects the whole stack.
Hadoop 2:
The components of the Hadoop stack (Hive, Pig, HBase etc.) are all equipped to handle Namenode failure.
10)
Hadoop 1:
Does not support Microsoft Windows.
Hadoop 2:
Adds support for Microsoft Windows.
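
The slot and container models from point 4 can be illustrated with a toy scheduler. This is only a sketch of the idea, not Hadoop or YARN code: the function names, the task labels and the slot and container counts are all made up for illustration, and real YARN containers are also sized by memory and CPU, which is ignored here.

```python
# Toy comparison of Hadoop 1 slots and Hadoop 2 containers.
# All names and numbers are hypothetical; this does not use any Hadoop API.

def schedule_with_slots(map_slots, reduce_slots, tasks):
    """Hadoop 1 style: a slot runs only the task type it was created for."""
    free = {"map": map_slots, "reduce": reduce_slots}
    running, waiting = [], []
    for kind in tasks:
        if free.get(kind, 0) > 0:
            free[kind] -= 1
            running.append(kind)
        else:
            # No slot of this type is free (or the type has no slots at all).
            waiting.append(kind)
    return running, waiting


def schedule_with_containers(containers, tasks):
    """Hadoop 2 style: any free container can run any kind of task."""
    return tasks[:containers], tasks[containers:]


# Six Map tasks, one Reduce task and one non-MR task (labelled "spark").
tasks = ["map"] * 6 + ["reduce"] + ["spark"]

slot_running, slot_waiting = schedule_with_slots(4, 4, tasks)
cont_running, cont_waiting = schedule_with_containers(8, tasks)

print("slots      running:", slot_running, "waiting:", slot_waiting)
print("containers running:", cont_running, "waiting:", cont_waiting)
```

With the same total capacity of eight, the slot version leaves three Reduce slots idle while two Map tasks and the non-MR task wait, whereas the container version runs all eight tasks.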