Data Parallel Computing - Research Paper by Adtyshkhr

Distributed Aggregation for Data-Parallel Computing: Interfaces and Implementations
Yuan Yu
Microsoft Research 1065 La Avenida Ave. Mountain View, CA 94043 [email protected]

Pradeep Kumar Gunda
Microsoft Research 1065 La Avenida Ave. Mountain View, CA 94043 [email protected]

Michael Isard
Microsoft Research 1065 La Avenida Ave. Mountain View, CA 94043 [email protected]

ABSTRACT
Data-intensive applications are increasingly designed to execute on large computing clusters. Grouped aggregation is a core primitive of many distributed programming models, and it is often the most efficient available mechanism for computations such as matrix multiplication and graph traversal. Such algorithms typically require nonstandard aggregations that are more sophisticated than traditional built-in database functions such as Sum and Max. As a result, the ease of programming user-defined aggregations, and the efficiency of their implementation, are of great current interest. This paper evaluates the interfaces and implementations for user-defined aggregation in several state-of-the-art distributed computing systems: Hadoop, databases such as Oracle Parallel Server, and DryadLINQ. We show that: the degree of language integration between user-defined functions and the high-level query language has an impact on code legibility and simplicity; the choice of programming interface has a material effect on the performance of computations; some execution plans perform better than others on average; and, in order to get good performance on a variety of workloads, a system must be able to select between execution plans depending on the computation. The interface and execution plan described in the MapReduce paper, and implemented by Hadoop, are found to be among the worst-performing choices.

1. INTRODUCTION

Many data-mining computations have as a fundamental subroutine a “GroupBy-Aggregate” operation. This takes a dataset, partitions its records into groups according...
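
As a rough illustration of the operation described above, the following Python sketch groups records and then applies a user-defined aggregation to each group. It is a single-machine sketch under stated assumptions, not the interface of Hadoop, DryadLINQ, or any database discussed in the paper. The sentence above is truncated, so grouping by a key function is an assumption here, and the names `group_by_aggregate`, `key_of`, `total_and_max`, and the sample records are all hypothetical.

```python
from collections import defaultdict

def group_by_aggregate(records, key_of, aggregate):
    """Partition records into groups by key, then aggregate each group.

    key_of and aggregate are caller-supplied functions (hypothetical names);
    aggregate receives the full list of records belonging to one group.
    """
    groups = defaultdict(list)
    for record in records:
        groups[key_of(record)].append(record)
    return {key: aggregate(members) for key, members in groups.items()}

def total_and_max(members):
    """A user-defined aggregation that a single built-in Sum or Max does not provide."""
    counts = [count for _, count in members]
    return (sum(counts), max(counts))

# Hypothetical sample data: (word, count) pairs.
records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]

result = group_by_aggregate(records, key_of=lambda r: r[0], aggregate=total_and_max)
print(result)  # {'a': (4, 3), 'b': (7, 5), 'c': (4, 4)}
```

This sketch collects every group in memory before aggregating, which is the simplest possible plan. A distributed system would instead have to choose how to partition and combine the work across machines.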
