Need Help In Hadoop MapReduce and Spark Using Java

Introduction to the Hadoop Platform

Hadoop is a data management and distributed processing system. It contains many components, including:

  • HDFS, a file system that distributes data across many machines

  • MapReduce, a framework for batch parallel computing

MapReduce provides:

  • Automatic parallelization and distribution

  • Fault tolerance

  • Input/output scheduling

  • Monitoring and status updates

How the MapReduce programming model works

  • MapReduce has two main tasks: Map and Reduce

  • The data blocks distributed across different machines are processed by Map tasks in parallel.

  • The Map outputs are grouped by key and aggregated by Reduce tasks (Reducers)

  • Both tasks work only with key/value pairs
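The flow described above can be sketched in plain Java. This is only a local, single-process simulation, not Hadoop code: the class name MapReduceFlowSketch, its methods, and the "product quantity" input lines are all assumptions made for illustration. Each inner list stands for one data block, the parallel stream stands in for Map tasks running on different machines, and the grouping step plays the role of the shuffle that Hadoop performs between Map and Reduce.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Local simulation of Map -> group by key -> Reduce. No Hadoop API is used;
// every name in this class is hypothetical.
class MapReduceFlowSketch {

    // Map task: turns one data block into intermediate (key, value) pairs.
    // Assumption: each line of a block looks like "product quantity".
    static List<Map.Entry<String, Integer>> mapBlock(List<String> block) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : block) {
            String[] parts = line.trim().split("\\s+");
            pairs.add(new SimpleEntry<>(parts[0], Integer.parseInt(parts[1])));
        }
        return pairs;
    }

    // Reduce task: aggregates all the values that share one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    static Map<String, Integer> run(List<List<String>> blocks) {
        // 1. Map: blocks are processed independently, here on a parallel stream.
        List<Map.Entry<String, Integer>> intermediate = blocks.parallelStream()
                .map(MapReduceFlowSketch::mapBlock)
                .flatMap(List::stream)
                .collect(Collectors.toList());

        // 2. Shuffle: group the intermediate values by key.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : intermediate) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }

        // 3. Reduce: one call per distinct key.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> blocks = Arrays.asList(
                Arrays.asList("apple 3", "pear 2"),
                Arrays.asList("apple 4", "plum 1"));
        System.out.println(run(blocks)); // prints {apple=7, pear=2, plum=1}
    }
}
```

In a real Hadoop job the programmer writes only the equivalents of mapBlock and reduce; splitting the input, running tasks in parallel, and grouping by key are handled by the framework.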

MapReduce Key/Value Pairs

All data exchanged between Map and Reduce, and more generally throughout the entire job, takes the form of (key, value) pairs:

  • a key: any type of data (integer, text, ...)

  • a value: any type of data

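Everything is represented this way. For example:

  • a text file is a set of (line number, line) pairs; with Hadoop's default TextInputFormat, the key is the byte offset of the line rather than its number.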

  • a weather file is a set of (date and time, temperature)
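To make the two examples concrete, here is a minimal sketch in plain Java that turns lines of text into such pairs. It does not use the Hadoop API: the class KeyValuePairsSketch, its method names, and the weather line layout ("2024-01-05T14:00 12.5") are assumptions chosen for illustration, and Map.Entry merely stands in for a (key, value) pair.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; Map.Entry plays the role of a (key, value) pair.
class KeyValuePairsSketch {

    // A text file seen as (line number, line) pairs; numbering starts at 1.
    static List<Map.Entry<Integer, String>> textPairs(List<String> lines) {
        List<Map.Entry<Integer, String>> pairs = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            pairs.add(new SimpleEntry<>(i + 1, lines.get(i)));
        }
        return pairs;
    }

    // A weather file seen as (date and time, temperature) pairs.
    // Assumed line layout: "<date-time> <temperature>", separated by whitespace.
    static List<Map.Entry<String, Double>> weatherPairs(List<String> lines) {
        List<Map.Entry<String, Double>> pairs = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            pairs.add(new SimpleEntry<>(parts[0], Double.parseDouble(parts[1])));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // prints [1=hello world, 2=bye]
        System.out.println(textPairs(Arrays.asList("hello world", "bye")));
        // prints [2024-01-05T14:00=12.5]
        System.out.println(weatherPairs(Arrays.asList("2024-01-05T14:00 12.5")));
    }
}
```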

This notion is what makes MapReduce programs look strange to beginners: the Map and Reduce functions both receive and emit such pairs.

MapReduce – Word count example:

Pseudo-code: word count

Map (Long input_key, String input_values):

     foreach word w in input_values:
         EmitIntermediate (w, "1");

Reduce (String key, Iterator intermediate_values):

     int result = 0;

     foreach v in intermediate_values:
         result += ParseInt (v);

     Emit (key, String (result));
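The pseudo-code above can be mirrored in plain Java as follows. This is a sketch that runs locally without Hadoop: the class WordCountSketch and the wordCount driver are hypothetical names, and the driver only imitates the grouping that the framework does between Map and Reduce. In an actual Hadoop job the same two functions would typically be written as subclasses of Mapper and Reducer, with types such as Text and IntWritable instead of String and Integer.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java imitation of the word count pseudo-code; not Hadoop code.
class WordCountSketch {

    // Map: emit (word, 1) for every word of one input line.
    // The input key (position of the line) is unused, as in the pseudo-code.
    static List<Map.Entry<String, Integer>> map(long inputKey, String inputValue) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : inputValue.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Reduce: sum all the counts emitted for one word.
    static int reduce(String key, Iterable<Integer> intermediateValues) {
        int result = 0;
        for (int v : intermediateValues) {
            result += v;
        }
        return result;
    }

    // Driver (hypothetical): stands in for the framework's shuffle step.
    static Map<String, Integer> wordCount(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        long offset = 0; // approximate position of the current line in the input
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(offset, line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
            offset += line.length() + 1;
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            counts.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {be=2, not=1, or=1, to=2}
        System.out.println(wordCount(Arrays.asList("to be or not", "to be")));
    }
}
```

Unlike the pseudo-code, the values here are integers rather than strings, so no parsing is needed in the Reduce step.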


  1. Using MapReduce, calculate the sum of the odd numbers and the sum of the even numbers contained in a text file.

  2. Using MapReduce, calculate the sum of a list of numbers contained in a text file.

  3. Using MapReduce, determine the duplicate elements in a list of numbers contained in a file.

  4. Using MapReduce, determine the maximum and minimum of a list of numbers contained in a file.

  5. Using MapReduce, determine how many odd and even numbers there are in a list of numbers contained in a file.

  6. Using MapReduce, determine the total number of words in a text file.
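As a starting point for exercises 1 and 5, here is one possible sketch in plain Java. It is a local simulation rather than a Hadoop job, and it assumes the file contains whitespace-separated integers; the class ParityExercisesSketch and its methods are hypothetical names. The Map step uses the parity of each number ("even" or "odd") as the key, so that a Reducer sees all the numbers of one parity together and can either sum them or count them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Local sketch for the odd/even exercises; not Hadoop code.
class ParityExercisesSketch {

    // Map: the intermediate key is the parity of the number.
    static String mapKey(long n) {
        return (n % 2 == 0) ? "even" : "odd";
    }

    // Shuffle stand-in: group every number of the file under its parity key.
    // Assumption: numbers are integers separated by whitespace.
    static Map<String, List<Long>> groupByParity(List<String> lines) {
        Map<String, List<Long>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (String token : line.trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue; // skip blank lines
                }
                long n = Long.parseLong(token);
                grouped.computeIfAbsent(mapKey(n), k -> new ArrayList<>()).add(n);
            }
        }
        return grouped;
    }

    // Reduce for exercise 1: sum the values of each key.
    static Map<String, Long> sumByParity(List<String> lines) {
        Map<String, Long> sums = new TreeMap<>();
        for (Map.Entry<String, List<Long>> e : groupByParity(lines).entrySet()) {
            long sum = 0;
            for (long v : e.getValue()) {
                sum += v;
            }
            sums.put(e.getKey(), sum);
        }
        return sums;
    }

    // Reduce for exercise 5: count the values of each key.
    static Map<String, Integer> countByParity(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Long>> e : groupByParity(lines).entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> file = Arrays.asList("1 2 3", "4 5");
        System.out.println(sumByParity(file));   // prints {even=6, odd=9}
        System.out.println(countByParity(file)); // prints {even=2, odd=3}
    }
}
```

The other exercises follow the same pattern with a different key or Reduce step: a single constant key gives the total sum or the minimum and maximum, and using the number itself as the key makes duplicates show up as keys with more than one value.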

