YARN vs Mesos


A very good discussion of this topic is available on Quora:

http://www.quora.com/How-does-YARN-compare-to-Mesos


Mesos is a meta-scheduler, that is, a scheduler of frameworks, rather than an application scheduler like YARN.
 

Besides the above link, here is some additional (updated) information I found that you might find useful.

There may be many other developments, since the open source community moves very fast and this post may be quite old by the time you read it.

With changes to the Capacity Scheduler, YARN can now schedule CPU as a resource in addition to memory. See JIRA YARN-2 for details.
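
To make the idea of scheduling on more than one resource concrete, here is a minimal Python sketch. It is not YARN's code: the function names, the resource keys (`memory_mb`, `vcores`) and the numbers are assumptions chosen for illustration. It shows a container request being checked against every resource dimension, and a "dominant share" (the largest used/capacity ratio), which is one common way to compare usage across several resource types.

```python
def fits(request, free):
    """Return True only if every requested resource is available.

    Both arguments are dicts such as {"memory_mb": 1024, "vcores": 2}.
    The keys are hypothetical names used only for this illustration.
    """
    return all(amount <= free.get(resource, 0)
               for resource, amount in request.items())


def dominant_share(used, capacity):
    """Return (resource, share) for the resource with the largest used/capacity ratio."""
    shares = {resource: used.get(resource, 0) / total
              for resource, total in capacity.items()}
    name = max(shares, key=shares.get)
    return name, shares[name]


if __name__ == "__main__":
    capacity = {"memory_mb": 8192, "vcores": 8}
    used = {"memory_mb": 2048, "vcores": 4}
    print(dominant_share(used, capacity))
    print(fits({"memory_mb": 1024, "vcores": 2},
               {"memory_mb": 2048, "vcores": 1}))
```

With memory as the only resource, the second request would be accepted; once CPU is also tracked, it is rejected because only one core is free.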

YARN now supports cgroups in containers. A very good related blog post

Storm on YARN can now be used directly.

Starting with version 0.6, Spark on YARN is officially supported.

A GSoC project aims to add security features to Mesos, which it currently lacks and which YARN provides via Kerberos. Wiki on Mesos security website

Lastly, some papers:

Google Omega
http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf

This paper is based on research done at AMPLab and Google on next-generation schedulers for parallel infrastructures.

Mesos
http://bnrg.cs.berkeley.edu/~adj/publications/paper-files/nsdi_mesos.pdf

YARN
http://www.socc2013.org/home/program/a5-vavilapalli.pdf



The Omega paper classifies schedulers into the following types:


Monolithic schedulers use a single, centralized scheduling algorithm for all jobs (our existing scheduler is one of these).

Two-level schedulers have a single active resource manager that offers compute resources to multiple parallel, independent “scheduler frameworks”, as in Mesos and Hadoop-on-Demand (HPC)

The paper classifies YARN as a monolithic scheduler and Mesos as a two-level scheduler.
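
The difference between the two types can be sketched in a few lines of Python. This is a toy model only, not the API of YARN or Mesos: every class, method and parameter name below is hypothetical, and memory is the only resource. In the monolithic case the central scheduler picks the node for each request; in the two-level case the resource manager only offers free resources, and the framework's own scheduler decides what to accept.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_mem_mb: int


class MonolithicScheduler:
    """Central scheduler: applications submit requests, it decides placement."""

    def __init__(self, nodes):
        self.nodes = nodes

    def request(self, app, mem_mb):
        # First-fit placement, chosen entirely by the central scheduler.
        for node in self.nodes:
            if node.free_mem_mb >= mem_mb:
                node.free_mem_mb -= mem_mb
                return node.name
        return None  # no node can satisfy the request


class TwoLevelResourceManager:
    """Resource manager: offers free resources, frameworks decide what to take."""

    def __init__(self, nodes):
        self.nodes = nodes

    def offer_to(self, framework):
        offers = {n.name: n.free_mem_mb for n in self.nodes if n.free_mem_mb > 0}
        accepted = framework.on_offer(offers)  # {node name: MB accepted}
        for node in self.nodes:
            take = accepted.get(node.name, 0)
            if take > node.free_mem_mb:
                raise ValueError("framework accepted more than was offered")
            node.free_mem_mb -= take
        return accepted


class PickyFramework:
    """Hypothetical framework scheduler that only runs on one preferred node."""

    def __init__(self, preferred, mem_mb):
        self.preferred = preferred
        self.mem_mb = mem_mb

    def on_offer(self, offers):
        if offers.get(self.preferred, 0) >= self.mem_mb:
            return {self.preferred: self.mem_mb}
        return {}  # decline the whole offer


if __name__ == "__main__":
    mono = MonolithicScheduler([Node("a", 1024), Node("b", 4096)])
    print(mono.request("job1", 2048))

    manager = TwoLevelResourceManager([Node("a", 1024), Node("b", 4096)])
    print(manager.offer_to(PickyFramework("b", 2048)))
```

In this sketch the placement policy lives in `MonolithicScheduler.request` in the first case and in `PickyFramework.on_offer` in the second, which is the distinction the paper draws.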

It is an interesting read, and it also raises a question about YARN.

I quote:

It might appear that YARN is a two-level scheduler, too. In YARN, resource requests from per-job application masters are sent to a single global scheduler in the resource master, which allocates resources on various machines, subject to application-specified constraints. But the application masters provide job-management services, not scheduling, so YARN is effectively a monolithic scheduler architecture.
At the time of writing, YARN only supports one resource type (fixed-sized memory chunks). Our experience suggests that it will eventually need a rich API to the resource master in order to cater for diverse application requirements, including multiple resource dimensions, constraints, and placement choices for failure-tolerance.

Although YARN application masters can request resources on particular machines, it is unclear how they acquire and maintain the state needed to make such placement decisions.

Google seems to be drifting away from YARN, unlike its counterpart Yahoo.



Quoting Hortonworks from


Architecturally how does YARN compare with Mesos?
Conceptually YARN and Mesos address similar requirements. They enable organizations to pool and share horizontal compute resources across a multitude of workloads. YARN was architected specifically as an evolution of Hadoop 1.x. YARN thus tightly integrates with HDFS, MapReduce and Hadoop security.

