QUICK REVIEW

[Paper Review] M3R: Increased performance for in-memory Hadoop jobs

Avraham Shinnar, David Cunningham | arXiv (Cornell University) | Aug 21, 2012

Cloud Computing and Resource Management | 13 references | 56 citations

TL;DR

M3R is an in-memory implementation of the Hadoop MapReduce API that accelerates iterative workloads small enough to fit in cluster memory by avoiding disk I/O and reducing serialization overhead. It runs existing Hadoop jobs unchanged and reports up to 45x speedup over the Hadoop engine on sparse matrix-vector multiply for some input sizes. In exchange, it gives up resilience to node failure. The engine is built on X10 and keeps job data in an in-memory cache.

ABSTRACT

Main Memory Map Reduce (M3R) is a new implementation of the Hadoop Map Reduce (HMR) API targeted at online analytics on high mean-time-to-failure clusters. It does not support resilience, and supports only those workloads which can fit into cluster memory. In return, it can run HMR jobs unchanged -- including jobs produced by compilers for higher-level languages such as Pig, Jaql, and SystemML and interactive front-ends like IBM BigSheets -- while providing significantly better performance than the Hadoop engine on several workloads (e.g. 45x on some input sizes for sparse matrix vector multiply). M3R also supports extensions to the HMR API which can enable Map Reduce jobs to run faster on the M3R engine, while not affecting their performance under the Hadoop engine.

Motivation & Objective

To address the performance bottleneck of Hadoop MapReduce on iterative, in-memory workloads by eliminating disk I/O and leveraging in-memory execution.
To enable existing Hadoop jobs, including those generated by Pig, Jaql, and SystemML, to run unchanged on a faster, in-memory engine.
To optimize data layout and serialization to reduce overhead and improve cache efficiency in memory-bound workloads.
To demonstrate that significant performance gains are achievable without sacrificing API compatibility or portability.

Proposed method

M3R implements the Hadoop MapReduce API using X10, a language with native support for distributed, in-memory computation and efficient serialization.
It replaces Hadoop’s disk-based shuffle with in-memory key-value storage, reducing I/O and enabling faster data access across nodes.
The engine uses a custom serialization protocol and direct cache access to avoid intermediate serialization/deserialization steps, improving performance for iterative algorithms.
It integrates with existing Hadoop toolchains by patching runtime components (e.g., SystemML) to retrieve data directly from the in-memory cache.
The system supports extensions to the HMR API that further accelerate performance on M3R without degrading Hadoop performance.
It uses X10’s lightweight concurrency model and distributed object model to manage data movement and computation across nodes efficiently.
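The idea of an in-memory shuffle plus a cache that survives between jobs can be illustrated with a toy, single-process sketch. This is not M3R's code or API: the class `InMemoryEngine`, its `cache` dictionary, and the path names are hypothetical, and the example is written in Python rather than X10 or Java. It runs two iterations of sparse matrix-vector multiply, reading the matrix from the cache each time instead of reloading it.

```python
from collections import defaultdict


class InMemoryEngine:
    """Toy stand-in for an in-memory MapReduce engine (hypothetical, not M3R)."""

    def __init__(self):
        # Hypothetical cache: "path" -> list of live (key, value) objects.
        self.cache = {}

    def run_job(self, input_path, output_path, mapper, reducer):
        records = self.cache[input_path]  # no disk read, no deserialization
        # Shuffle: group map output by key in a dict instead of spilling to disk.
        groups = defaultdict(list)
        for key, value in records:
            for out_key, out_value in mapper(key, value):
                groups[out_key].append(out_value)
        output = []
        for key in sorted(groups):
            output.extend(reducer(key, groups[key]))
        # Keep the result in memory so the next job can read it directly.
        self.cache[output_path] = output
        return output


def spmv_mapper(x):
    """Build a mapper that multiplies matrix entry a_ij by x[j]."""
    def mapper(key, value):
        i, j = key
        yield i, value * x.get(j, 0.0)
    return mapper


def sum_reducer(key, values):
    yield key, sum(values)


engine = InMemoryEngine()
# Sparse matrix A = [[2, 0], [1, 3]] stored as ((row, col), value) records.
engine.cache["A"] = [((0, 0), 2.0), ((1, 0), 1.0), ((1, 1), 3.0)]
x = {0: 1.0, 1: 1.0}
for step in range(2):
    out = engine.run_job("A", f"x{step + 1}", spmv_mapper(x), sum_reducer)
    x = dict(out)  # the previous job's output feeds the next iteration

print(x)  # {0: 4.0, 1: 14.0}
```

In this sketch the matrix is loaded once and every later iteration works on objects already in memory, which is the property that benefits iterative jobs. A disk-based engine would write each intermediate vector out and read the matrix back in on every pass.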

Experimental results

Research questions

RQ1: Can a high-performance, in-memory Hadoop MapReduce engine deliver significant speedups on iterative, memory-resident workloads without breaking backward compatibility?
RQ2: How does eliminating disk I/O and optimizing data serialization impact performance in in-memory MapReduce workloads?
RQ3: To what extent can existing Hadoop toolchains (e.g., SystemML, Pig, Jaql) be accelerated by running on an in-memory engine without modification?
RQ4: What architectural changes are required to support in-memory execution while maintaining compatibility with the Hadoop MapReduce API?

Key findings

M3R achieved up to 45x performance improvement on sparse matrix-vector multiplication compared to Hadoop, especially on larger input sizes.
For global non-negative matrix factorization, M3R reduced running time from over 1200 seconds on Hadoop to under 30 seconds at scale.
Linear regression performance on M3R was significantly faster than on Hadoop, with execution times dropping from over 1600 seconds to under 400 seconds for large datasets.
PageRank on M3R showed consistent speedups, reducing running time from over 800 seconds on Hadoop to under 200 seconds for large graphs.
The performance gains were attributed to reduced I/O, optimized serialization, and direct in-memory data access via cache integration.
SystemML workloads saw substantial gains, with performance improvements amplified by M3R’s ability to avoid redundant serialization and leverage in-memory data structures.
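The running times quoted above imply minimum speedup factors, which a few lines of arithmetic make explicit. The sketch below uses only the rounded bounds stated in the findings ("over N seconds" on Hadoop, "under M seconds" on M3R). Because the Hadoop figure is a lower bound and the M3R figure an upper bound, each ratio is a lower bound on the speedup, not a measured value. The variable and function names are illustrative.

```python
# (workload, Hadoop time is over this many seconds, M3R time is under this many)
reported = [
    ("global non-negative matrix factorization", 1200, 30),
    ("linear regression", 1600, 400),
    ("PageRank", 800, 200),
]


def min_speedup(hadoop_over_s, m3r_under_s):
    """Lower bound on the speedup implied by an 'over X' / 'under Y' pair."""
    return hadoop_over_s / m3r_under_s


bounds = {name: min_speedup(h, m) for name, h, m in reported}
for name, factor in bounds.items():
    print(f"{name}: more than {factor:.0f}x")
```

On these figures, matrix factorization comes out at more than 40x, while linear regression and PageRank each come out at more than 4x.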

This review was created by AI and reviewed by human editors.