


Subject: Hadoop, MapReduce, HBase, Pig, parallel algorithms, distributed algorithms


Year: 2015


Type: Proceeding article



Title: Simplifying parallel implementation of algorithms on Hadoop with Pig Latin


Author: Zdravevski, Eftim
Author: Lameski, Petre
Author: Kulakov, Andrea
Author: Filiposka, Sonja
Author: Trajanov, Dimitar



Abstract: In this paper we present a general technique for parallelizing regular algorithms using the tools of the Hadoop ecosystem: MapReduce, HDFS, HBase and Pig. The technique can be applied to parallelize algorithms for feature selection, clustering, machine learning, etc. It consists of several steps: load the datasets into HDFS, apply transformations if needed, store the datasets in HBase, and implement the algorithm in Pig with the help of User Defined Functions.


Publisher:


Relation: CIIT



Identifier: oai:repository.ukim.mk:20.500.12188/21384
Identifier: http://hdl.handle.net/20.500.12188/21384


