Question

Suppose you’re given the RDD obtained below. sc = SparkContext("local", "app") text_file = sc.textFile("data.txt") Write a...

Suppose you’re given the RDD obtained below.

sc = SparkContext("local", "app") text_file = sc.textFile("data.txt")

Write a simple Pyspark program that outputs, for each unique word, a list of the line numbers on which it occurs in the data file.

0 0
Add a comment Improve this question Transcribed image text
Answer #1

Answer:
Please find below the program which can be copied. Also, screenshot from python shell, sample txt file and Output are also attached. Thanks

#Pyspark and Python
from pyspark import SparkContext
sc = SparkContext("local", "first app")
logData = sc.textFile("ab.txt").cache()
Countx = logData.filter(lambda s: 'xab' in s).count()
County = logData.filter(lambda s: 'ypq' in s).count()
print "Lines with xab:",Count1, " lines with ypq: ",Count2

Screenshot of code:

Output of code

Sample file

Add a comment
Know the answer?
Add Answer to:
Suppose you’re given the RDD obtained below. sc = SparkContext("local", "app") text_file = sc.textFile("data.txt") Write a...
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for? Ask your own homework help question. Our experts will answer your question WITHIN MINUTES for Free.
Similar Homework Help Questions
  • Write a program to read the contents of data.txt file, and determine the smallest value (min),...

    Write a program to read the contents of data.txt file, and determine the smallest value (min), largest value (max), and the average value (ave). The minimum, maximum, and average values must be calculated in the function calc(). Remember to declare the stdlib.h library, so that you can use the atof function, which will convert the number in the text file into float number. Also, remember to check if there is an error opening the file or not. If there is...

  • The input file should be called 'data.txt' and should be created according to the highlighted instructions...

    The input file should be called 'data.txt' and should be created according to the highlighted instructions below. Note that even though you know the name of the input file, you should not hard-code this name into your program. Instead, prompt the user for the name of the input file. The input file should contain (in order): the weight (a number greater than zero and less than or equal to 1), the number, n, of lowest numbers to drop, and the...

  • c++ write a program that reads all values from a text file "data.txt" and stores them in ID array valuesl I. The input process from the file should be terminated when a negative value is detec...

    c++ write a program that reads all values from a text file "data.txt" and stores them in ID array valuesl I. The input process from the file should be terminated when a negative value is detected. (An example of such a file is shown below). Also, the program should copy from values[ 1 any value which has even sum of digits and its index (location) to two new arrays evenArr 1 and idxArrl I, respectively. In addition, the program must...

  • (Python 3) Write a program that reads the contents of a text file. The program should...

    (Python 3) Write a program that reads the contents of a text file. The program should then create a dictionary in which the keys are individual words found in the file and the values are the number of times each word appears and a list that contains the line numbers in the file where the word (the key) is found. Then the program will create another text file. The file should contain an alphabetical listing of the words that are...

  • Write a program in C++: Using classes, design an online address book to keep track of the names f...

    Write a program in C++: Using classes, design an online address book to keep track of the names first and last, addresses, phone numbers, and dates of birth. The menu driven program should perform the following operations: Load the data into the address book from a file Write the data in the address book to a file Search for a person by last name or phone number (one function to do both) Add a new entry to the address book...

  • Python 3.7 Coding assignment This Program should first tell users that this is a word analysis...

    Python 3.7 Coding assignment This Program should first tell users that this is a word analysis software. For any user-given text file, the program will read, analyze, and write each word with the line numbers where the word is found in an output file. A word may appear in multiple lines. A word shows more than once at a line, the line number will be only recorded one time. Ask a user to enter the name of a text file....

  • You are given a set of ABC cubes for kids (like the one shown below) and a word. On each side of ...

    You are given a set of ABC cubes for kids (like the one shown below) and a word. On each side of each cube a letter is written 7 FI 0 You need to find out, whether it it possible to form a given word by the cubes For example, suppose that you have 5 cubes: B1: MXTUAS B2:OQATGE ВЗ: REwMNA B4: MBDFAC В5: IJKGDE (here for each cube the list of letters written on its sides is given) You...

  • C++ programming: Write a program to score a two player game of scrabble. The program reads...

    C++ programming: Write a program to score a two player game of scrabble. The program reads the player number, word played and the squares on the board used for each word played then reports the scores for each word and the total score for each player. • All information is read from the file, moves.in. • Each line of the file contains a player number (1 or 2), the word and the board spaces used. • The words are entered...

  • The Payroll Department keeps a list of employee information for each pay period in a text...

    The Payroll Department keeps a list of employee information for each pay period in a text file. The format of each line of the file is the following: Write a program that inputs a filename from the user and prints to the terminal a report of the wages paid to the employees for the given period. The report should be in tabular format with the appropriate header. Each line should contain: An employee’s name The hours worked The wages paid...

  • Pythong Program 4 should first tell users that this is a word analysis software. For any...

    Pythong Program 4 should first tell users that this is a word analysis software. For any user-given text file, the program will read, analyze, and write each word with the line numbers where the word is found in an output file. A word may appear in multiple lines. A word shows more than once at a line, the line number will be only recorded one time. Ask a user to enter the name of a text file. Using try/except for...

ADVERTISEMENT
Free Homework Help App
Download From Google Play
Scan Your Homework
to Get Instant Free Answers
Need Online Homework Help?
Ask a Question
Get Answers For Free
Most questions answered within 3 hours.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT