Introduction to MapReduce training for beginners in Aurora, IL | Map Reduce Training for Beginners | Advanced MapReduce Training

This course introduces distributed data processing and shows how to use MapReduce to process large amounts of data. It focuses on practical, hands-on exercises in which students learn to write MapReduce programs. Advanced features of MapReduce are covered as well.

Course Schedule

Prerequisite

Desired but not required: exposure to, or working proficiency in, Java and SQL.

Course Features

  • 4 weeks, 8 sessions, 16 hours of live instruction in total
  • Training material, instructor handouts and access to useful cloud-hosted resources provided
  • Practical hands-on lab exercises on provided cloud workstations
  • Actual code and scripts provided
  • Real-life Scenarios

Course Outline

1. Introduction to MapReduce

  • MapReduce Overview
  • MapReduce in Hadoop
  • History of MapReduce
  • MapReduce applications
  • Data Flow in MapReduce
  • Map and Reduce operations
  • Job submission flow of MapReduce
  • Map Operation
  • Job Initialization
  • Task Assignment
  • Job Completion
  • Job Scheduling
  • Job Failures
  • Shuffle and sort
  • Word Count Problem, Flow and Solution
  • MapReduce Algorithms
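
To give a feel for the Word Count problem listed above, here is a minimal in-memory sketch in plain Java. It is not Hadoop code and is not taken from the course material: the class name `WordCountSketch` and its methods are hypothetical and only imitate the map, shuffle-and-sort, and reduce steps on a single machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: mimics the MapReduce data flow without Hadoop.
class WordCountSketch {

    // Map step: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle and sort step: group the emitted values by key, keys in sorted order.
    static TreeMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: sum all values collected for one word.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: run map over every line, shuffle, then reduce each group.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(emitted).entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("the quick brown fox", "the lazy dog", "The fox");
        // prints {brown=1, dog=1, fox=2, lazy=1, quick=1, the=3}
        System.out.println(wordCount(lines));
    }
}
```

In a real Hadoop job the framework performs the shuffle and sort between mappers and reducers across the cluster; the sketch collapses that into one method so the flow of key/value pairs is easy to follow.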

2. MapReduce Types and Formats

  • Data Types
  • File Formats
  • Input Formats
  • Output Formats
  • Walkthrough of the Driver, Mapper and Reducer code
  • Configuring development environment – Eclipse
  • Writing Unit Tests
  • Running locally
  • Running on Cluster

3. Understanding MapReduce

  • Data Flow in MapReduce
  • MapReduce example
  • MapReduce Daemons
  • JobTracker
  • TaskTracker
  • Other phases in MapReduce
  • Data Flow in single, multiple and no reduce task

4. MapReduce with YARN

  • Hadoop Architecture
  • Problems with Hadoop 1.x and features of Hadoop 2.x
  • YARN MapReduce Application Execution Flow
  • YARN Workflow
  • Anatomy of MapReduce Program

5. Advanced MapReduce

  • Counters
  • Sorting
  • Input Splits in MapReduce
  • MapReduce Combiner
  • MapReduce Partitioner
  • MapReduce Distributed Cache
  • MRUnit
  • Reduce Join
  • Joins – Map Side and Reduce Side
  • Custom Input Format
  • Sequence Input Format
  • Side Data Distribution

Refund Policy

  • There are no refunds. All sales are final.