Question

Assuming the input is a large collection of simple text documents, and the following MyMapper and MyReducer (written in Hadoop) will be applied to these text documents:

public static class MyMapper extends Mapper<Object, Text, IntWritable, Text> {

    private Text word = new Text();

    // Emits one (word length, word) pair for every token in the input line.
    @Override
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            String currentWord = itr.nextToken();
            word.set(currentWord);
            context.write(new IntWritable(currentWord.length()), word);
        }
    }
}

// The second type parameter is the type of a single input value (Text),
// not Iterable<Text>; otherwise reduce() below does not override Reducer.reduce().
public static class MyReducer
        extends Reducer<IntWritable, Text, IntWritable, IntWritable> {

    private IntWritable result = new IntWritable();

    // Counts how many words were emitted for this word length.
    @Override
    public void reduce(IntWritable key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (Text val : values) { sum++; }
        result.set(sum);
        context.write(key, result);
    }
}

1.1 (5 points) What is the goal of the above mapper and reducer? answer very briefly

1.2 (3 points) For an input text document which has the following content,

one of the input text file

write the output from the mapper:

1.3 (4 points) write the result of the grouping stage between mapper and reducer

1.4 (2 points) write the output from the reducer:

Answer #1

Answer:

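(1.1) The goal of the above MapReduce program is to count how many words of each length occur in the input documents (a word-length histogram). It is not the classic "Word Count": the key is the word's length, not the word itself.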

Explanation:

Record Reader
-------------
This is the first phase of MapReduce where the Record Reader reads every line from the input text file as text and yields output as key-value pairs.
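As a rough illustration of what the Record Reader hands to the mapper, the sketch below mimics it in plain Java without any Hadoop classes. It assumes the job uses Hadoop's default TextInputFormat (the driver is not shown in the question), where the key is the byte offset of the line and the value is the line itself. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch only; assumes line-oriented input as with TextInputFormat.
class RecordReaderSketch {

    // Hypothetical helper: returns one "offset<TAB>line" string per input line.
    static List<String> records(String fileContent) {
        List<String> out = new ArrayList<>();
        long offset = 0;
        for (String line : fileContent.split("\n")) {
            out.add(offset + "\t" + line);
            // Advance past the line's bytes plus the newline character.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        for (String record : records("one of the input text file\nsecond line\n")) {
            System.out.println(record);
        }
    }
}
```

MyMapper ignores this key (it is declared as Object), so only the line text affects the result.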

Map Phase
----------
The Map phase takes input from the Record Reader, processes it, and produces the output as another set of key-value pairs.

The Map phase reads each key-value pair, splits the value into words using StringTokenizer, and emits one pair per word, with the word's length (IntWritable) as the key and the word itself (Text) as the value.
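For the MyMapper in the question, the map step can be sketched in plain Java as follows. This is a minimal stand-in that uses no Hadoop classes; the class name and the tab-separated string format are assumptions made for illustration, while the tokenizing logic follows the question's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Plain-Java sketch of MyMapper's logic; not the Hadoop API.
class MapPhaseSketch {

    // Hypothetical helper: returns "length<TAB>word" for every token in the line,
    // mirroring context.write(new IntWritable(currentWord.length()), word).
    static List<String> map(String line) {
        List<String> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            String currentWord = itr.nextToken();
            pairs.add(currentWord.length() + "\t" + currentWord);
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (String pair : map("one of the input text file")) {
            System.out.println(pair);
        }
    }
}
```

Each emitted pair is keyed by the word's length, so words of equal length end up in the same group later on.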

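Shuffle and Grouping Phase
--------------------------
The code in the question defines no Combiner, so the stage between the mapper and the reducer is the framework's shuffle and sort. It collects the Map output, sorts it by key, and groups all values that share a key into one key-value collection pair.

Here every word of the same length ends up in the same group. MyReducer could not be reused as a Combiner in any case, because a Combiner must emit the same value type as the mapper (Text), while MyReducer emits IntWritable.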

Reducer Phase
-------------
The Reducer phase takes each key-value collection pair from the grouping stage, processes it, and passes the output as key-value pairs. Here the reducer counts the values in each collection and writes the pair (word length, number of words with that length).
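The grouping and reduce steps for this job can be approximated in plain Java as below. This is only a sketch: a TreeMap stands in for Hadoop's sort-by-key, the class and method names are hypothetical, and real Hadoop does not guarantee the order of values within a group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of grouping by word length, then counting per group.
class ShuffleReduceSketch {

    // Groups words by their length; TreeMap keeps the keys sorted.
    static Map<Integer, List<String>> group(List<String> words) {
        Map<Integer, List<String>> grouped = new TreeMap<>();
        for (String w : words) {
            grouped.computeIfAbsent(w.length(), k -> new ArrayList<>()).add(w);
        }
        return grouped;
    }

    // Mirrors MyReducer: counts the values for each key.
    static Map<Integer, Integer> reduce(Map<Integer, List<String>> grouped) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> e : grouped.entrySet()) {
            int sum = 0;
            for (String ignored : e.getValue()) { sum++; }
            counts.put(e.getKey(), sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("one", "of", "the", "input", "text", "file");
        Map<Integer, List<String>> grouped = group(words);
        System.out.println(grouped);
        System.out.println(reduce(grouped));
    }
}
```

For the sample sentence from the question, the first printed map shows the groups per word length and the second shows the count per length.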

Record Writer
-------------
This is the last phase of MapReduce where the Record Writer writes every key-value pair from the Reducer phase and sends the output as text.

(1.2) Given that the input text document contains

"one of the input text file"

Mapper output:
--------------
<3,one> <2,of> <3,the> <5,input> <4,text> <4,file>

(1.3) Grouping stage output (keys are sorted; the order of values within a group is not guaranteed):

<2,[of]> <3,[one,the]> <4,[text,file]> <5,[input]>

(1.4) Reducer output:
<2,1> <3,2> <4,2> <5,1>

