Sunday, 2 February 2014

Hadoop and Ubuntu - step 1

Hadoop and Ubuntu - step 1 of setting up Hadoop on the Ubuntu OS

There are basically two ways of using Hadoop:
1. Configure Hadoop on Windows - this involves the Hadoop setup, the Cygwin tool, Java and Eclipse.
I initially configured this on my laptop. However, when I tried to perform the same configuration on another machine (to be used as another Hadoop node), Cygwin broke down.
As a result, I was not able to complete the whole Hadoop setup.

2. Configure Hadoop on Linux - because of the above experience, I decided to go with a Linux based OS for Hadoop.

Using a Linux based OS is the best approach for the reasons below:
1. Hadoop is designed for Linux based systems (yes, it is).
2. Hadoop requires SSH, which is simple to configure on Linux (on Windows it requires Cygwin, which basically gives the feel and experience of Linux on Windows).
3. Hadoop is naturally more secure when it runs on a secure OS, exhibit A - Linux.

STEP 1 - Choose and configure the (Linux) OS of choice on the machine of choice

I chose Ubuntu - it is freely and easily available and has good GUI based support for a heavy Windows user.

I chose to perform the installation on VirtualBox - an open source virtualization tool by Oracle.

Download and install VirtualBox on your machine.

Download Ubuntu and install it in VirtualBox as a guest OS.

Get ready for the further setup of Hadoop on Ubuntu.
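
Before moving on, it can help to confirm that the new Ubuntu guest is in the state you expect. The snippet below is only a minimal sketch, not part of any Hadoop tooling: it assumes Python 3.3 or later is available on the guest, and the function name check_prerequisites and the list of tools it looks for (ssh and java, both mentioned above) are my own choices for illustration. It reports whether the OS is Linux and whether each tool can be found on the PATH.

```python
import platform
import shutil


def check_prerequisites(tools=("ssh", "java")):
    """Return a dict mapping each check name to True or False.

    The tool list is an assumption for illustration; adjust it to
    whatever your own Hadoop setup needs.
    """
    # platform.system() returns "Linux" on Ubuntu and other Linux systems
    results = {"linux": platform.system() == "Linux"}
    for tool in tools:
        # shutil.which returns the executable's path, or None if it is not on PATH
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print("{}: {}".format(name, "found" if ok else "missing"))
```

On a fresh Ubuntu install some of these will probably show up as missing. That is fine at this stage, it only tells you what still has to be installed in the following steps.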

This completes step 1.

You can also learn about usage of Hadoop and about Hadoop architecture on