Jerusalem College of Engineering: ACADEMIC YEAR 2021 - 2022

The document summarizes the objectives and experiments of a Cloud Computing Laboratory course for computer science students. The objectives are to develop web applications in the cloud, learn cloud application design and development, implement parallel programming using Hadoop, and configure compute engines in the cloud. The 8 listed experiments include installing virtual machines, creating web apps on Google App Engine, simulating clouds using CloudSim, transferring files between virtual machines, and installing a single node Hadoop cluster to run simple applications. The course aims to teach skills like cloud application deployment, parallel processing, and private cloud setup and management.
JERUSALEM COLLEGE OF ENGINEERING

(An Autonomous Institution)


(Approved by AICTE, Affiliated to Anna University
Accredited by NBA and NAAC with ‘A’ Grade)
Velachery Main Road, Narayanapuram, Pallikaranai, Chennai – 600 100

CLOUD COMPUTING LABORATORY


[CS8711]

RECORD NOTE BOOK


ACADEMIC YEAR 2021 - 2022
IV YEAR/VII SEMESTER
REGULATION 2019

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
Vision of the Department
The Department of Computer Science and Engineering is dedicated to being a center of excellence in producing graduates who are ethical engineers, innovative researchers, dynamic entrepreneurs and globally competitive technocrats.
Mission of the Department
• To craft students into competent professionals through value-based education and innovative teaching and practices.
• To enhance students' soft skills, personality and ethical responsibilities by augmenting in-plant training, value-added courses and co-curricular activities.
• To facilitate students as researchers by widening their professional knowledge through continuous learning and innovative projects.
• To produce dynamic entrepreneurs through interaction with a network of alumni, industry and academia, and through extracurricular activities.
Program Educational Objectives (PEOs)

PEOs   Program Educational Objectives

PEO1   Graduates will apply engineering basics, laboratory and job oriented experiences to devise and unravel engineering problems in computer science engineering domain.
PEO2   Graduates will be multi faceted researcher and experts in fields like computing, networking, artificial intelligence, software engineering and data science.
PEO3   Graduates will be dynamic entrepreneur and service oriented professional with ethical and social responsibility.
PEO4   Graduates will ingress and endure in core and other prominent organization across the globe and will foster innovation.

Program Specific Objectives (PSOs)

PSOs   Program Specific Objectives

PSO1   The ability to understand, analyze and to develop the design related to real-time system such as IOT, Secured automated systems, machine vision, computer vision and cognitive computing with various complexities, providing orientation towards green computing environment.
PSO2   The ability to apply standard practices and strategies in software project development using open-ended programming environments to deliver a quality product.
PSO3   The ability to innovate, introduce and produce socially relevant products to facilitate transformation of society into a digitally empowered knowledge economy, thereby to chart a successful career with a new dimension to entrepreneurship.
Engineering Graduates will be able to:

POs    Program Outcomes

PO1    Engineering Knowledge: Apply knowledge of mathematics, science, engineering fundamentals and an Engineering Specialization to the solution of complex engineering problems.
PO2    Problem Analysis: Identify, formulate, review research literature and analyze complex engineering problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.
PO3    Design / Development of solutions: Design solutions for complex engineering problems and design system components or processes that meet specified needs with appropriate consideration for public health and safety, cultural, societal, and environmental considerations.
PO4    Conduct Investigations of Complex Problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.
PO5    Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations.
PO6    The Engineer and Society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.
PO7    Environment and sustainability: Understand the impact of the professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable development.
PO8    Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.
PO9    Individual and team work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.
PO10   Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as, being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.
PO11   Project Management and Finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.
PO12   Life-long learning: Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the broadest context of technological change.
JERUSALEM COLLEGE OF ENGINEERING
(An Autonomous Institution)
(Approved by AICTE, Affiliated to Anna University
Accredited by NBA and NAAC with ‘A’ Grade)
Velachery Main Road,Narayanapuram, Pallikaranai , Chennai – 600100

Name……………………………………………………………………………………

Year…………………………………Semester………………Branch……………….

Regulation………………..

Register No.

Certified that this is a Bonafide Record work done by the above student in the

…..……………………………………………. Laboratory during the year 20 - 20

Signature of Lab In-charge Signature of Head of the Department

EXAMINERS

DATE:

INTERNAL EXAMINER EXTERNAL EXAMINER


CS8711-CC LAB DEPARTMENT OF CSE 2021-2022

L T P C
CS8711 CLOUD COMPUTING LABORATORY 0 0 4 2

SYLLABUS
COURSE OBJECTIVES:
• To develop web applications in the cloud.
• To learn the design and development process involved in creating a cloud-based application.
• To learn to implement and use parallel programming using Hadoop.
• To learn to set up an instance and to monitor, allocate and configure compute engines in the cloud.

LIST OF EXPERIMENTS
1. Install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS on top of Windows 7 or 8.
2. Install a C compiler in the virtual machine created using VirtualBox and execute simple programs.
3. Install Google App Engine. Create a hello world app and other simple web applications using Python/Java.
4. Use the GAE launcher to launch the web applications.
5. Simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not present in CloudSim.
6. Find a procedure to transfer files from one virtual machine to another virtual machine.
7. Find a procedure to launch a virtual machine using TryStack (online OpenStack demo version).
8. Install a Hadoop single node cluster and run simple applications like word count.

TOTAL :60 PERIODS


COURSE OUTCOMES:
At the end of the course, the student should be able to
• Configure various virtualization tools such as VirtualBox, VMware Workstation, GCP and AWS.
• Design and deploy web applications, and run notebook programs in a cloud environment.
• Simulate a cloud environment to implement new schedulers.
• Install and use a generic cloud environment that can be used as a private cloud.
• Manipulate large datasets in a parallel environment.

LIST OF EXPERIMENTS

CYCLE 1

S.No   NAME OF THE EXPERIMENT

1. Install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS on top of Windows 7 or 8.
2. Install a C compiler in the virtual machine created using VirtualBox and execute simple programs.
3. Install Google App Engine. Create a hello world app and other simple web applications using Python/Java.
4. Use the GAE launcher to launch the web applications.

CYCLE 2

S.No   NAME OF THE EXPERIMENT

5. Simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not present in CloudSim.
6. Find a procedure to transfer files from one virtual machine to another virtual machine.
7. Find a procedure to launch a virtual machine using TryStack (online OpenStack demo version).
8. Install a Hadoop single node cluster and run simple applications like word count.

ADDITIONAL EXPERIMENTS

S.No   NAME OF THE EXPERIMENT

1. Mount the one-node Hadoop cluster using FUSE.
2. Build a pub/sub messaging system.

CONTENTS

Ex.No   Date   Name of the Experiment   Page No.   Marks   Signature with Date

Average Marks :

Signature :

CLOUD COMPUTING LABORATORY

Mandatory Prerequisite:

1. A 64-bit Linux operating system (the commands given here are for a recent version of Linux).
INSTALLATION OF PACKAGES

Installing KVM (Hypervisor for Virtualization)

1. Check that the virtualization flag is enabled in the BIOS.

Run the following command in a terminal:
egrep -c '(vmx|svm)' /proc/cpuinfo

If the result is greater than 0, virtualization is enabled.

If the result is 0, enable virtualization in the BIOS (consult your system administrator for this step).

2. Check that your OS is 64-bit.

Run the following command in a terminal:

uname -m

If the result is x86_64, your operating system is 64-bit.

3. A few KVM packages are available with the Linux installation.

To check this, run the command

ls /lib/modules/$(uname -r)/kernel/arch/x86/kvm

The three KVM module files installed on your system will be displayed:
kvm-amd.ko kvm-intel.ko kvm.ko

4. Install the KVM packages.

1. Switch to the root (administrator) user:

sudo -i

2. Install the packages by running the following commands (on Ubuntu 18.10 and later, libvirt-bin is replaced by libvirt-daemon-system and libvirt-clients):

apt-get update
apt-get install qemu-kvm
apt-get install libvirt-bin
apt-get install bridge-utils
apt-get install virt-manager
apt-get install qemu-system

5. To verify your installation, run the command

virsh -c qemu:///system list

It shows the output

Id Name State

If VMs are running, their names are listed. If no VM is running, the list is empty, which means your KVM installation is working.

6. To list all VMs, including those that are not running, run the command

virsh --connect qemu:///system list --all

7. Working with KVM

Run the command

virsh

At the virsh prompt, try the following commands:
version (displays the versions of the installed software tools)
nodeinfo (displays your system information)
quit (exits virsh)

8. To test the KVM installation we could create virtual machines, but they would have to be created manually. Skip this and install OpenStack directly.
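
The checks in steps 1 and 2 can also be scripted. The following is a minimal Python sketch, not part of the original procedure; the function names are illustrative. It counts the lines of /proc/cpuinfo that mention vmx or svm (the same count that egrep -c reports) and compares the machine type with x86_64.

```python
import platform
import re


def count_virtualization_flags(cpuinfo_text):
    """Count lines that mention vmx (Intel VT-x) or svm (AMD-V)."""
    pattern = re.compile(r"vmx|svm")
    return sum(1 for line in cpuinfo_text.splitlines() if pattern.search(line))


def is_64_bit(machine):
    """Mirror the `uname -m` check: x86_64 means a 64-bit OS."""
    return machine == "x86_64"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as cpuinfo:
            matches = count_virtualization_flags(cpuinfo.read())
    except FileNotFoundError:
        matches = 0  # /proc/cpuinfo only exists on Linux
    print("lines with vmx/svm:", matches)
    print("64-bit OS:", is_64_bit(platform.machine()))
```

A count of 0 corresponds to the case in step 1 where virtualization has to be enabled in the BIOS.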

Installation of Openstack

1. Add a new user named stack. This stack user is the administrator of the OpenStack services.

To add the new user, run the following command as the root user:

adduser stack

2. Install sudo if it is not already present:

apt-get install sudo -y || yum install -y sudo

3. Be careful when running the following command and check its syntax closely. If the command contains an error, the system may become unusable because of permission errors.

echo "stack ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers

4. Log out of the system and log in as the stack user.

5. Run the command (this installs the git package):

sudo apt-get install git

6. Run the following commands (this clones the latest version of devstack, which is an automated installer for OpenStack):

git clone https://git.openstack.org/openstack-dev/devstack
ls (this shows a folder named devstack)
cd devstack (enter the folder)

7. Create a file called local.conf. To do this, run the command

nano local.conf

8. In the file, make the following entries (contact your network administrator if you have doubts about these values):
[[local|localrc]]
FLOATING_RANGE=192.168.1.224/27
FIXED_RANGE=10.11.11.0/24
FIXED_NETWORK_SIZE=256
FLAT_INTERFACE=eth0
ADMIN_PASSWORD=root
DATABASE_PASSWORD=root
RABBIT_PASSWORD=root
SERVICE_PASSWORD=root
SERVICE_TOKEN=root

9. Save this file.

10. Run the command (this installs OpenStack):
./stack.sh
11. If any error occurs, run the following command to uninstall:
./unstack.sh
Then:
1. Update the packages:
apt-get update
2. Reinstall the package:
./stack.sh


12. Open the browser and go to http://<IP address of your machine>; you will get the OpenStack portal.

13. If you restart the machine, start OpenStack again as follows:

Open a terminal and run:
su stack
cd devstack
./rejoin-stack.sh

14. You can again access the OpenStack services in the browser at http://<IP address of your machine>.
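
As a quick sanity check on the address ranges entered in local.conf (step 8), the following Python sketch uses the standard ipaddress module to show how many addresses each CIDR range covers. The describe helper is an illustrative name, not part of devstack. Under this reading, the /24 fixed range covers 256 addresses, which agrees with FIXED_NETWORK_SIZE=256, and the /27 floating range covers 32 addresses.

```python
import ipaddress


def describe(cidr):
    """Return (address count, first address, last address) for a CIDR range.

    ip_network() raises ValueError if the range has host bits set,
    so a mistyped range is caught before stack.sh is run.
    """
    network = ipaddress.ip_network(cidr)
    return network.num_addresses, str(network[0]), str(network[-1])


FLOATING_RANGE = "192.168.1.224/27"  # value from local.conf
FIXED_RANGE = "10.11.11.0/24"        # value from local.conf

if __name__ == "__main__":
    for name, cidr in (("FLOATING_RANGE", FLOATING_RANGE), ("FIXED_RANGE", FIXED_RANGE)):
        count, first, last = describe(cidr)
        print(f"{name}: {count} addresses, {first} to {last}")
```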


Ex.No: 1
Date :
INSTALL VIRTUALBOX/VMWARE WORKSTATION WITH DIFFERENT FLAVOURS OF LINUX OR WINDOWS OS ON TOP OF WINDOWS 7 OR 8

AIM : To install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS on top of Windows 7 or 8.

PROCEDURE :
This experiment is to be performed through the portal by installing virtual machines.

TO INSTALL VM
Step 1: After launching the setup file, click the Yes button when prompted.
Step 2: Setup Wizard: The setup wizard will install VMware Workstation Pro on your system. Click Next to continue, or Exit to cancel the installation.
Step 3: End-User License Agreement: Accept the terms in the license agreement and click Next.
Step 4: Custom Setup: Select the folder in which to install the VMware application and, in addition, select the enhanced keyboard driver.
Step 5: User Experience Settings: Both options are selected by default. You can uncheck them, but here they are left at the default. Click Next.
Step 6: Application Shortcuts Preference: Select where the shortcut icons for launching the application should be placed. Selecting both options is recommended. Click Next.
Step 7: Install VMware Workstation: The installation is now ready to begin, so click Install.
Step 8: Complete the Setup Wizard: When the installation-complete dialog box appears, click Finish to complete the installation.
VMware Workstation is now installed on your Windows system.


OUTPUT:

RESULT:

Thus the installation of VirtualBox/VMware Workstation with a Linux or Windows OS is completed successfully.


Ex.No: 2
Date :
INSTALL A C COMPILER IN THE VIRTUAL MACHINE AND EXECUTE A SAMPLE PROGRAM

AIM:
To install a C compiler in the virtual machine and execute a sample program.

PROCEDURE:
Create a virtual machine through the OpenStack portal and connect to it through the portal. Log in to the VM and install a C compiler using commands, e.g.:

apt-get install gcc

On most Linux installations the GNU GCC compiler is already installed (our system is Ubuntu Linux). If the C compiler is already installed, the command above reports that; if not, it installs all the necessary packages.

Now open a text editor, write a small C program like the following and save it as demo.c:

#include <stdio.h>

int main(void)
{
    /* Print a greeting and finish the line */
    printf("Welcome to C Programming\n");
    return 0;
}

Now run the following commands to compile and execute the file:

gcc demo.c -o demo
./demo


OUTPUT:

RESULT:

Thus Installation of a C compiler in the virtual machine is done successfully and a sample program is executed.


Ex.No: 3
Date :
INSTALL GOOGLE APP ENGINE. CREATE HELLO WORLD APP AND OTHER SIMPLE WEB APPLICATIONS USING PYTHON/JAVA.

AIM: To create hello world app and other simple web applications using python/java.

PROCEDURE:
Step 1: Download Python release 2.5.4.

Step 2: Install Python 2.5.4 first, and then install Google App Engine.

Step 3: Create a directory named "helloworld" in "/google_appengine". This directory will contain your application files.

Step 4: Open a text editor (Notepad), copy the following code into it and save it as "helloworld.py":

print 'Content-Type: text/plain'
print ''
print 'Hello, world!'

This is your application, written in Python. All it does is print "Hello, world!" every time it is called.

Step 5: Every request must be mapped to the helloworld.py file, so create an "app.yaml" file in your helloworld directory and write the following in it:

application: helloworld
version: 1
runtime: python
api_version: 1

handlers:
- url: /.*
  script: helloworld.py


Step 6: Deployment on a local web server: you can run your application on the local machine before deploying it to the actual cloud.

Step 7: Open a command prompt (type cmd in Run) and go to the directory where you installed Google App Engine. From that directory, start the web server with your application deployed:

dev_appserver.py helloworld/

The command prompt will look something like this :

The web server is started and your helloworld application is deployed in it. You can now send requests to your app by opening the browser and typing: http://localhost:8080
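
For comparison, the same request-to-response mapping can be tried without the App Engine SDK. The sketch below is only an illustration under that assumption: it uses the wsgiref module from the Python 3 standard library, and the function name application is our own choice. Like the url: /.* handler above, it answers every path with the same plain-text body.

```python
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """Answer every request path with the same plain-text greeting."""
    body = b"Hello, world!\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


if __name__ == "__main__":
    # Serve on the same port that dev_appserver.py uses in this exercise.
    with make_server("localhost", 8080, application) as server:
        server.serve_forever()
```

Running the file and opening http://localhost:8080 in a browser should show the greeting; stop the server with Ctrl+C.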


OUTPUT:

RESULT:

Thus Google App Engine is installed and a simple program is executed successfully.


Ex.No: 4
Date :
USE GAE LAUNCHER TO LAUNCH THE WEB APPLICATIONS.

AIM: To use the GAE launcher to launch the web applications.

PROCEDURE:

Deploying the app to App Engine

To upload the guestbook app, run the following command from within the appengine-guestbook-python directory of your application, where the app.yaml and index.yaml files are located:

gcloud app deploy app.yaml index.yaml

Optional flags:

Include the --project flag to specify a Cloud Console project ID other than the one you initialized as the default in the gcloud tool. Example: --project [YOUR_PROJECT_ID]
Include the -v flag to specify a version ID; otherwise one is generated for you. Example: -v [YOUR_VERSION_ID]

The Datastore indexes might take some time to generate before your application is available. If the indexes are still being generated, you will receive a NeedIndexError message when accessing your app. This is a transient error, so try again a little later if you receive it at first.

To learn more about deploying your app from the command line, see Deploying a Python App.

Viewing your deployed application

To launch your browser and view the app at https://PROJECT_ID.REGION_ID.r.appspot.com, run the following command:

gcloud app browse
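
When the deploy command is run from a script, it helps to assemble the argument list in one place. The following Python sketch is illustrative only: deploy_command is a hypothetical helper, and my-project and v1 are placeholder values. It uses only the deploy command and the --project and -v flags described above, and it prints the command rather than running it.

```python
import shlex


def deploy_command(project_id=None, version_id=None):
    """Build the argument list for `gcloud app deploy` with optional flags."""
    command = ["gcloud", "app", "deploy", "app.yaml", "index.yaml"]
    if project_id:
        command += ["--project", project_id]  # alternate project ID
    if version_id:
        command += ["-v", version_id]         # explicit version ID
    return command


if __name__ == "__main__":
    # Placeholder values; replace them with your own project and version IDs.
    print(shlex.join(deploy_command("my-project", "v1")))
```

The returned list can be passed to subprocess.run once the gcloud tool is installed and initialized.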


OUTPUT :

RESULT:

Thus GAE launcher is launched successfully and Web application is also launched.


Ex.No: 5
Date :
SIMULATE A CLOUD SCENARIO USING CLOUDSIM AND RUN A SCHEDULING ALGORITHM THAT IS NOT PRESENT IN CLOUDSIM

AIM : To simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not present in CloudSim.

PROCEDURE : The steps to be followed:

Step 1: Download the CloudSim installable files from https://code.google.com/p/cloudsim/downloads/list and unzip them.
Step 2: Open Eclipse.
Step 3: Create a new Java project: File -> New.
Step 4: Import the unpacked CloudSim project into the new Java project.
Step 5: Initialise the CloudSim library. This is the first step of the simulation:

CloudSim.init(num_user, calendar, trace_flag)

Step 6: Data centres are the resource providers in CloudSim, so creating the data centres is the second step. To create a Datacenter you need a DatacenterCharacteristics object, which stores the properties of a data centre such as the architecture, OS, list of machines, allocation policy (time-shared or space-shared), time zone and price:

Datacenter datacenter9883 = new Datacenter(name, characteristics, new VmAllocationPolicySimple(hostList), storageList, 0);

Step 7: The third step is to create a broker:

DatacenterBroker broker = createBroker();

Step 8: The fourth step is to create a virtual machine, giving the unique ID of the VM (vmid), the ID of the VM's owner (userId), the MIPS rating, the number of CPUs (number of PEs), the amount of RAM, the amount of bandwidth, the amount of storage, the virtual machine monitor, and the cloudletScheduler policy for cloudlets:

Vm vm = new Vm(vmid, brokerId, mips, pesNumber, ram, bw, size, vmm, new CloudletSchedulerTimeShared())

Step 9: Submit the VM list to the broker:

broker.submitVmList(vmList)

Step 10: Create a cloudlet with length, file size, output size, and utilisation model:

Cloudlet cloudlet = new Cloudlet(id, length, pesNumber, fileSize, outputSize, utilizationModel, utilizationModel, utilizationModel)

Step 11: Submit the cloudlet list to the broker:

broker.submitCloudletList(cloudletList)

Step 12: Start the simulation:

CloudSim.startSimulation()
Sample Output from the Existing Example:
Starting CloudSimExample1...
Initialising...
Starting CloudSim version 3.0
Datacenter_0 is starting...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>null
Broker is starting...
Entities started.
0.0 : Broker: Cloud Resource List received with 1 resource(s)
0.0: Broker: Trying to Create VM #0 in Datacenter_0
0.1 : Broker: VM #0 has been created in Datacenter #2, Host #0
0.1: Broker: Sending cloudlet 0 to VM #0
400.1: Broker: Cloudlet 0 received
400.1: Broker: All Cloudlets executed. Finishing...
400.1: Broker: Destroying VM #0
Broker is shutting down...
Simulation: No more future events
CloudInformationService: Notify all CloudSim entities for shutting down.
Datacenter_0 is shutting down...
Broker is shutting down...
Simulation completed.
Simulation completed.
========== OUTPUT ==========
Cloudlet ID STATUS Data center ID VM ID Time Start Time Finish Time
0 SUCCESS 2 0 400 0.1 400.1
*****Datacenter: Datacenter_0*****
User id Debt
3 35.6
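
Before writing a new scheduler in CloudSim, its logic can be tried out in a few lines. The sketch below is plain Python, not CloudSim code, and is only one possible choice of algorithm: shortest-job-first ordering of cloudlets on a single VM that runs one cloudlet at a time. The Cloudlet class and schedule_sjf function are illustrative names, and the length of 400000 instructions (in millions) at 1000 MIPS is an assumed pair of values chosen so that one cloudlet takes the 400 time units seen in the sample output above.

```python
from dataclasses import dataclass


@dataclass
class Cloudlet:
    cloudlet_id: int
    length: int  # length in million instructions (MI)


def schedule_sjf(cloudlets, mips, start_time=0.0):
    """Run cloudlets one after another on a single VM, shortest first.

    Returns a list of (cloudlet_id, start_time, finish_time) tuples.
    """
    clock = start_time
    timeline = []
    for cloudlet in sorted(cloudlets, key=lambda c: c.length):
        run_time = cloudlet.length / mips  # execution time = length / MIPS
        timeline.append((cloudlet.cloudlet_id, clock, clock + run_time))
        clock += run_time
    return timeline


if __name__ == "__main__":
    jobs = [Cloudlet(0, 400000), Cloudlet(1, 100000), Cloudlet(2, 200000)]
    for cloudlet_id, start, finish in schedule_sjf(jobs, mips=1000):
        print(f"Cloudlet {cloudlet_id}: start {start}, finish {finish}")
```

In CloudSim itself the same ordering would have to be implemented inside a broker or cloudlet scheduler class; the sketch only shows the ordering and timing arithmetic.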


OUTPUT:

RESULT:

Thus Cloud simulator is used and a scheduling algorithm is executed in cloud sim and output is verified.


Ex.No: 6
Date :
FILES TRANSFER FROM ONE VIRTUAL MACHINE TO ANOTHER VIRTUAL MACHINE.

AIM : To find a procedure to transfer files from one virtual machine to another virtual machine.

PROCEDURE :

Step 1: Navigate to the folder you want to share.
Step 2: Right-click it and select Properties.
Step 3: Under the Sharing tab, click Advanced Sharing.
Step 4: Check the "Share this folder" box and click OK. You can also click Permissions to change the permissions for users.
Step 5: Run VirtualBox and press Windows + R to open the Run dialog box. Type the IP address of your host machine and press Enter. You can now share files between Windows and VirtualBox.
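
After copying a file through the shared folder, it is worth confirming that the copy is identical to the original. The following Python sketch is an optional check that is not part of the original procedure; the function names are illustrative. It compares the SHA-256 digests of the two files, and the demonstration at the end uses temporary files in place of the real shared-folder paths.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(source, copy):
    """True if both files have identical contents."""
    return sha256_of(source) == sha256_of(copy)


if __name__ == "__main__":
    # Stand-in files; replace them with the original and the transferred file.
    with tempfile.TemporaryDirectory() as folder:
        original = Path(folder) / "original.txt"
        transferred = Path(folder) / "transferred.txt"
        original.write_bytes(b"cloud lab")
        transferred.write_bytes(b"cloud lab")
        print("identical:", same_content(original, transferred))
```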


OUTPUT:

RESULT:

Transfer of files from one virtual machine to another virtual machine is completed successfully.


Ex.No: 7
Date :
TO LAUNCH VIRTUAL MACHINE USING TRYSTACK

AIM : To find a procedure to launch a virtual machine using TryStack.

PROCEDURE :

Step 1: Go to Project -> Instances and click the Launch Instance button; a new window will appear.
Step 2: On the first screen, add a name for your instance, leave the Availability Zone set to nova, use an instance count of one and click Next to continue.
Step 3: Choose a descriptive Instance Name, because this name will be used to form the virtual machine hostname.
Step 4: Next, select Image as the Boot Source, add the Cirros test image created earlier by clicking the + button and click Next to proceed.
Step 5: Allocate the virtual machine resources by adding the flavor best suited to your needs and click Next to move on.
Step 6: Finally, add one of the available OpenStack networks to your instance using the + button and click Launch Instance to start the virtual machine.
Step 7: Once the instance has started, click the arrow to the right of the Create Snapshot menu button and choose Associate Floating IP.
Step 8: Select one of the floating IPs created earlier and click the Associate button to make the instance reachable from your internal LAN.
Step 9: Use the instance View Log utility to obtain the Cirros default credentials.


OUTPUT:

RESULT:

Thus virtual machine is launched successfully using Trystack.


Ex.No: 8
Date :
INSTALL HADOOP SINGLE NODE CLUSTER AND RUN SIMPLE APPLICATIONS LIKE WORDCOUNT

AIM: To install a Hadoop single node cluster and run simple applications like wordcount.

PROCEDURE:

Hadoop MapReduce is a software framework for easily writing applications which process vast amounts
of data (multi-terabyte data-sets) in-parallel on large clusters (thousands of nodes) of commodity
hardware in a reliable, fault-tolerant manner.

A MapReduce job usually splits the input data-set into independent chunks which are processed by the
map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which are then
input to the reduce tasks. Typically both the input and the output of the job are stored in a file system.
The framework takes care of scheduling tasks, monitoring them and re-executing the failed tasks.

Typically the compute nodes and the storage nodes are the same, that is, the MapReduce framework and
the Hadoop Distributed File System are running on the same set of nodes. This configuration allows the
framework to effectively schedule tasks on the nodes where data is already present, resulting in very high
aggregate bandwidth across the cluster.

The MapReduce framework consists of a single master ResourceManager, one slave NodeManager per
cluster-node, and MRAppMaster per application.

Minimally, applications specify the input/output locations and supply map and reduce functions via
implementations of appropriate interfaces and/or abstract-classes. These, and other job parameters,
comprise the job configuration.

The Hadoop job client then submits the job (jar/executable etc.) and configuration to the ResourceManager, which then assumes the responsibility of distributing the software/configuration to the slaves, scheduling tasks and monitoring them, and providing status and diagnostic information to the job client.

Prerequisites:

• You have set up a single-node "cluster" by following the single-node setup.
• We assume that you run commands from inside the Hadoop directory.
• This program uses the Hadoop Streaming API to interact with Hadoop. This API allows you to write code in any language and use a simple text-based record format for the input and output <key, value> pairs.


Inputs and Outputs


The MapReduce framework operates exclusively on <key, value> pairs, that is, the framework views the input to the job as a set of <key, value> pairs and produces a set of <key, value> pairs as the output of the job, conceivably of different types.

The key and value classes have to be serializable by the framework and hence need to implement the Writable interface. Additionally, the key classes have to implement the WritableComparable interface to facilitate sorting by the framework.

Input and Output types of a MapReduce job:

(input) <k1, v1> -> map -> <k2, v2> -> combine -> <k2, v2> -> reduce -> <k3, v3> (output)
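
The pipeline above can be traced in memory before running anything on Hadoop. The following Python sketch is an illustration only and does not use any Hadoop API; map_phase, sum_by_key and word_count are names chosen for this example. Each inner list stands for one input split handled by one map task, the combiner sums counts within a split, and the reducer sums the combined counts across splits.

```python
from collections import defaultdict


def map_phase(line):
    """<k1, v1> is (offset, line); emit one <k2, v2> = (word, 1) per word."""
    return [(word, 1) for word in line.split()]


def sum_by_key(pairs):
    """Sum the values of each key; used as both combiner and reducer."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


def word_count(splits):
    """splits is a list of input splits, each a list of text lines."""
    combined = []
    for split in splits:
        mapped = [pair for line in split for pair in map_phase(line)]
        combined.extend(sum_by_key(mapped))  # combine: local to one map task
    return sum_by_key(combined)              # reduce: across all map tasks


if __name__ == "__main__":
    splits = [["hello world", "hello hadoop"], ["bye world"]]
    for word, count in word_count(splits):
        print(word, count)
```

Because addition is associative, running the combiner does not change the final counts; it only reduces the number of pairs passed on to the reduce step.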

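CODING :

WordCount is a simple application that counts the number of occurrences of each word in a given input set. The listing below is a ProcessUnits job with the same mapper/reducer structure: for each tab-separated input line the mapper emits the first field as the key and the last field as an integer value, and the reducer outputs the values greater than 30.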

package hadoop;

import java.util.*;
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class ProcessUnits
{
    // Mapper class
    public static class E_EMapper extends MapReduceBase implements
            Mapper<LongWritable, /* Input key type */
                   Text,         /* Input value type */
                   Text,         /* Output key type */
                   IntWritable>  /* Output value type */
    {
        // Map function: emits (first field, last field) of a tab-separated line
        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output,
                        Reporter reporter) throws IOException
        {
            String line = value.toString();
            String lasttoken = null;
            StringTokenizer s = new StringTokenizer(line, "\t");
            String year = s.nextToken();

            // Skip ahead to the last field of the line
            while (s.hasMoreTokens())
            {
                lasttoken = s.nextToken();
            }

            // Fails if the line has only one field (lasttoken stays null)
            int avgprice = Integer.parseInt(lasttoken);
            output.collect(new Text(year), new IntWritable(avgprice));
        }
    }

    // Reducer class
    public static class E_EReduce extends MapReduceBase implements
            Reducer<Text, IntWritable, Text, IntWritable>
    {
        // Reduce function: keeps only the values greater than maxavg
        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output,
                           Reporter reporter) throws IOException
        {
            int maxavg = 30;
            int val = Integer.MIN_VALUE;

            while (values.hasNext())
            {
                if ((val = values.next().get()) > maxavg)
                {
                    output.collect(key, new IntWritable(val));
                }
            }
        }
    }

    // Main function: configures and submits the job
    public static void main(String args[]) throws Exception
    {
        JobConf conf = new JobConf(ProcessUnits.class);

        conf.setJobName("max_eletricityunits");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(E_EMapper.class);
        conf.setCombinerClass(E_EReduce.class);
        conf.setReducerClass(E_EReduce.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        // args[0] is the input path, args[1] is the output directory
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
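
To see what the job computes without a cluster, the mapper and reducer logic can be mirrored in a few lines of Python. This is a sketch for checking understanding, not a replacement for the Hadoop job; the function names and the sample records are made up for illustration. As in the Java listing, the key is the first tab-separated field, the value is the last field, and only values greater than 30 are kept.

```python
def map_record(line):
    """Emit (first field, last field as int) for one tab-separated line."""
    fields = [field for field in line.split("\t") if field]
    if len(fields) < 2:
        # The Java mapper also fails on a line with a single field.
        raise ValueError("expected at least two tab-separated fields")
    return fields[0], int(fields[-1])


def reduce_values(key, values, threshold=30):
    """Keep only the values above the threshold, like E_EReduce."""
    return [(key, value) for value in values if value > threshold]


def run_job(lines, threshold=30):
    """Group mapper output by key, then reduce each group in key order."""
    grouped = {}
    for line in lines:
        key, value = map_record(line)
        grouped.setdefault(key, []).append(value)
    results = []
    for key in sorted(grouped):
        results.extend(reduce_values(key, grouped[key], threshold))
    return results


if __name__ == "__main__":
    # Made-up sample records: a year followed by readings, last one the average.
    sample = ["1979\t23\t25", "1980\t26\t31", "1981\t31\t36"]
    for year, value in run_job(sample):
        print(year, value)
```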

EXECUTION:

Environment variables are set as follows:

Hadoop launches jobs from a jar file containing the compiled Java code. In addition, we typically pass two command-line arguments through to the Java program: the input data file or directory, and an output directory for the results from the reduce tasks. A tool called ant makes it quick to create a jar file from the above code.

The ant tool uses an XML file that describes what needs to be compiled and packaged into a jar file. Here is the one used for the WordCount example:

<project name="hadoopCompile" default="jar" basedir=".">
    <target name="init">
        <property name="sourceDir" value="."/>
        <property name="outputDir" value="classes" />
        <property name="buildDir" value="jar" />
        <property name="lib.dir" value="/usr/lib/hadoop"/>

        <path id="classpath">
            <fileset dir="${lib.dir}" includes="**/*.jar"/>
        </path>
    </target>
    <target name="clean" depends="init">
        <delete dir="${outputDir}" />
        <delete dir="${buildDir}" />
    </target>
    <target name="prepare" depends="clean">
        <mkdir dir="${outputDir}" />
        <mkdir dir="${buildDir}"/>
    </target>
    <target name="compile" depends="prepare">
        <javac srcdir="${sourceDir}" destdir="${outputDir}" classpathref="classpath" />
    </target>
    <target name="jar" depends="compile">

        <jar destfile="${buildDir}/wc.jar" basedir="${outputDir}">
            <manifest>
                <attribute name="Main-Class" value="wc.WordCount"/>
            </manifest>
        </jar>
    </target>
</project>
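
Because the default target is jar and each target names one predecessor in its depends attribute, running ant executes the chain from init through jar. The sketch below illustrates that ordering in plain Java. It does not call Ant itself, and the class name AntTargetOrder and its methods are assumptions introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of how the depends attributes determine execution order.
class AntTargetOrder {

    // Target name -> the target it depends on (null when it has none),
    // copied from the depends attributes of the build file.
    public static Map<String, String> dependsOn() {
        Map<String, String> deps = new LinkedHashMap<>();
        deps.put("init", null);
        deps.put("clean", "init");
        deps.put("prepare", "clean");
        deps.put("compile", "prepare");
        deps.put("jar", "compile");
        return deps;
    }

    // Walk back from the requested target to the root, so that
    // prerequisites come first in the returned list.
    public static List<String> executionOrder(String target) {
        Map<String, String> deps = dependsOn();
        List<String> order = new ArrayList<>();
        for (String t = target; t != null; t = deps.get(t)) {
            order.add(0, t);
        }
        return order;
    }

    public static void main(String[] args) {
        // prints [init, clean, prepare, compile, jar]
        System.out.println(executionOrder("jar"));
    }
}
```

Note that the clean target sits in the chain, so with this build file every run deletes the classes and jar directories before compiling again.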

Compile WordCount.java and create a jar:

$ bin/hadoop com.sun.tools.javac.Main WordCount.java


$ jar cf wc.jar WordCount*.class

Assuming that:

• /user/joe/wordcount/input - input directory in HDFS
• /user/joe/wordcount/output - output directory in HDFS

Sample text-files as input:

• $ bin/hadoop dfs -mkdir /home/bigdata/cse

• $ bin/hadoop dfs -put /home/bigdata/cse/sample.txt /user

Run the application:

$ bin/hadoop jar /home/bigdata/wc.jar /data/sample.txt /output

Applications can specify a comma separated list of paths which would be present in the current working
directory of the task using the option -files. The -libjars option allows applications to add jars to the
classpaths of the maps and reduces. The option -archives allows them to pass comma separated list of
archives as arguments. These archives are unarchived and a link with name of the archive is created in
the current working directory of tasks.

Running wordcount example with -libjars, -files and -archives:

$ bin/hadoop jar /home/bigdata/wc.jar -files <file1>,<file2> -libjars <library.jar> -archives <archive.zip> /data/sample.txt /output

Output:

$ bin/hadoop fs -cat /user/joe/wordcount/output/part-r-00000

OUTPUT :

RESULT:

Thus, a single-node Hadoop cluster was installed and the word count application was executed successfully.

Ex.No:9 MOUNT THE ONE NODE HADOOP CLUSTER USING FUSE

Date :

AIM : To mount the one node HADOOP Cluster using FUSE.

PROCEDURE :

FUSE (Filesystem in Userspace) enables you to write a normal user application as a bridge for
a traditional filesystem interface.

The hadoop-hdfs-fuse package enables you to use your HDFS cluster as if it were a traditional
filesystem on Linux. It is assumed that you have a working HDFS cluster and know the
hostname and port that your NameNode exposes.

To install fuse-dfs on Ubuntu systems:

sudo apt-get install hadoop-hdfs-fuse

To set up and test your mount point:

mkdir -p <mount_point>

hadoop-fuse-dfs dfs://<name_node_hostname>:<namenode_port> <mount_point>

You can now run operations as if they are on your mount point. Press Ctrl+C to end the
fuse-dfs program, and umount the partition if it is still mounted.

Note: To find its configuration directory, hadoop-fuse-dfs uses the HADOOP_CONF_DIR
configured at the time the mount command is invoked.

· If you are using SLES 11 with the Oracle JDK 6u26 package, hadoop-fuse-dfs may exit
immediately because ld.so can't find libjvm.so. To work around this issue, add
/usr/java/latest/jre/lib/amd64/server to the LD_LIBRARY_PATH.

To clean up your test:

$ umount <mount_point>

You can now add a permanent HDFS mount which persists through reboots. To add a system
mount:

1. Open /etc/fstab and add lines to the bottom similar to these:

hadoop-fuse-dfs#dfs://<name_node_hostname>:<namenode_port> <mount_point> fuse allow_other,usetrash,rw 2 0

For example:

hadoop-fuse-dfs#dfs://localhost:8020 /mnt/hdfs fuse allow_other,usetrash,rw 2 0

2. Test to make sure everything is working properly:

$ mount <mount_point>

Your system is now configured to allow you to use the ls command and use that mount point
as if it were a normal system disk.
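
Since the mounted HDFS appears as an ordinary directory, programs can read it with regular file APIs as well as with ls. The following is a minimal sketch in plain Java, assuming the mount point /mnt/hdfs from the example above. The class name MountListing and the method listEntries are hypothetical names introduced for this illustration, and the code uses only the standard java.nio.file API rather than any Hadoop library.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper: lists a directory the same way for a local disk or a FUSE mount.
class MountListing {

    // Returns the entry names directly under dir, sorted alphabetically.
    public static List<String> listEntries(Path dir) {
        List<String> names = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.forEach(p -> names.add(p.getFileName().toString()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Default to the mount point used in the /etc/fstab example; pass another path to override.
        Path mountPoint = Paths.get(args.length > 0 ? args[0] : "/mnt/hdfs");
        for (String name : listEntries(mountPoint)) {
            System.out.println(name);
        }
    }
}
```

If the mount is not active, the directory is simply empty or missing, so an empty listing or an exception here is a quick hint that the mount step needs to be repeated.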

OUTPUT:

RESULT:

Thus, the one-node Hadoop cluster was mounted using FUSE successfully.

Ex.No:10 BUILDING A PUB/SUB MESSAGING SYSTEM IN GCP


Date :

AIM : To build a Pub/Sub messaging system in GCP.

PROCEDURE :

Pub/Sub allows services to communicate asynchronously. Pub/Sub is used for streaming
analytics and data integration pipelines to ingest and distribute data. It is equally effective as
messaging-oriented middleware for service integration or as a queue to parallelize tasks.
Pub/Sub enables you to create systems of event producers and consumers,
called publishers and subscribers. Publishers communicate with subscribers asynchronously by
broadcasting events, rather than by synchronous remote procedure calls (RPCs).

• Set up the GCP cloud environment and create a new project.
• Enable the Pub/Sub API in the project: search for the API and click Enable.
• To create a topic, go to the Pub/Sub browser.
• Click Create topic. In the window that opens, enter avid-grid-323405-topic in the Topic ID
field, and then click Create topic.
• To receive messages, you need to create subscriptions. A subscription needs to have a
corresponding topic. When you created the topic in the previous step, Pub/Sub
automatically created a corresponding subscription named avid-grid-323405-topic-sub.

• To create another subscription, complete the following steps:

• Click Create subscription, and then click Create subscription in the menu that appears.

• Enter avid-grid-323405-topic-sub2 in the Subscription ID field.

• Click Create.

• Return to the Topics page and click avid-grid-323405-topic.

• The avid-grid-323405-topic-sub2 subscription is now attached to the topic
avid-grid-323405-topic. Pub/Sub will deliver all messages sent to avid-grid-323405-topic to this
subscription.

• Publish two messages to the topic:

1. Click Publish message.
2. In the Message window, enter hello.
3. Click Publish. A message at the bottom of the page says "Message published"
if the publish was successful.

• Repeat steps 1-3, but in the Message window, enter goodbye.
• Go to the Subscriptions page and click the name of the avid-grid-323405-topic-sub2
subscription.
• Pull messages from the subscription.
• Click View messages.
• In the Messages pane that opens, click Pull.
Note : You should see the two messages that you just published. The messages have the
data, hello and goodbye, and the time when the messages were published.
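
The console steps above can be pictured with a small in-memory model: one topic, two subscriptions, two published messages, and a pull from the second subscription. This is only a conceptual sketch in plain Java. It does not use the Google Cloud client library or contact GCP, and the class InMemoryTopic with its methods createSubscription, publish and pull is hypothetical. It also simplifies the real service, which requires acknowledgements and does not guarantee ordering by default.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical in-memory model of a topic with several subscriptions.
class InMemoryTopic {

    // Each subscription keeps its own queue of undelivered messages.
    private final Map<String, Queue<String>> subscriptions = new LinkedHashMap<>();

    public void createSubscription(String subscriptionId) {
        subscriptions.putIfAbsent(subscriptionId, new ArrayDeque<>());
    }

    // The publisher does not wait for any subscriber: the message is
    // copied into every subscription's queue and the call returns.
    public void publish(String message) {
        for (Queue<String> queue : subscriptions.values()) {
            queue.add(message);
        }
    }

    // Drains and returns everything currently waiting for one subscription.
    public List<String> pull(String subscriptionId) {
        List<String> delivered = new ArrayList<>();
        Queue<String> queue = subscriptions.get(subscriptionId);
        if (queue == null) {
            return delivered;
        }
        while (!queue.isEmpty()) {
            delivered.add(queue.poll());
        }
        return delivered;
    }

    public static void main(String[] args) {
        InMemoryTopic topic = new InMemoryTopic();
        topic.createSubscription("avid-grid-323405-topic-sub");
        topic.createSubscription("avid-grid-323405-topic-sub2");
        topic.publish("hello");
        topic.publish("goodbye");
        // prints [hello, goodbye]
        System.out.println(topic.pull("avid-grid-323405-topic-sub2"));
    }
}
```

Pulling from the first subscription afterwards would return the same two messages, because every subscription attached to a topic receives its own copy.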

RESULT:

Thus, the Pub/Sub messaging system was built in GCP and executed successfully.
