Category Archives: Hadoop

Hadoop on Linux and Windows

Hadoop is a framework that allows for the distributed processing of large data sets across the cluster of computers using simple programming language models like Java.  It is part of the Apache project sponsored by the Apache Software Foundation. It is designed to scale up from single server to thousands of machines .i.e. it can be run on single node or cluster of nodes. Each node in hadoop cluster can execute a computation logic and store data.