Question

Question about Spark with Python.

I have a dataset named example.csv with three columns: [0] name, [1] name of fruits, [2] number of fruits. Here is the data in the CSV file:

name    name of fruits    number of fruits
Jack    apple             3
Bob     banana            5
Diana   apple             7
Eric    orange            3
frank   mango             2

1. Use PySpark code to plot the total number of each fruit, sorted by the number of fruits.

The x-axis is the name of the fruit; the y-axis is the total number of fruits.

The output should be: apple 10, banana 5, orange 3, mango 2.

2. Use PySpark code to plot the number of apples that each person has.

The x-axis is the person's name; the y-axis is the number of apples that person has.

The output should be: Jack 3, Diana 7.

Answer #1

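PySpark has historically not had plotting functionality of its own. If you want to plot something, bring the (ideally already aggregated) data out of Spark and into your local Python session, for example by collecting the results to the driver. There you can plot it with any of Python's many plotting libraries, such as matplotlib.

For a file this small, you can also read the CSV with plain Python and plot it locally, as in the following code: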

1)

import csv
from operator import itemgetter

import matplotlib.pyplot as plt

# Sum the number of fruits (column 2) for each fruit name (column 1).
# Assumes example.csv is comma-separated and has no header row.
the_dict = dict()

with open('example.csv', 'r') as csvfile:
    plots = csv.reader(csvfile, delimiter=',')
    for row in plots:
        if row[1] in the_dict:
            the_dict[row[1]] += int(row[2])
        else:
            the_dict[row[1]] = int(row[2])

# Sort the (fruit, total) pairs by total, largest first.
sorted_dict = sorted(the_dict.items(), key=itemgetter(1), reverse=True)

x = []
y = []
for i in sorted_dict:
    x.append(i[0])
    y.append(i[1])

plt.xlabel('name of fruits')
plt.ylabel('num of total fruits')
plt.plot(x, y, label='fruits')
plt.legend()
plt.show()

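2)

import csv

import matplotlib.pyplot as plt

x = []
y = []

# Keep only the rows whose fruit (column 1) is 'apple'.
# Assumes example.csv is comma-separated and has no header row.
with open('example.csv', 'r') as csvfile:
    plots = csv.reader(csvfile, delimiter=',')
    for row in plots:
        if row[1] == 'apple':
            x.append(row[0])
            y.append(int(row[2]))  # convert to int so the y-axis is numeric

plt.xlabel('name of the person')
plt.ylabel('number of apples')
plt.plot(x, y, label='apples')
plt.legend()
plt.show()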
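
Since the question asks for PySpark specifically, the "aggregate in Spark, then plot locally" idea can be sketched as below. This is only a rough sketch under several assumptions: example.csv is comma-separated with no header row, pyspark and matplotlib are installed, and the column names (name, fruit, quantity) and all function names are made up for illustration. A bar chart is used here instead of a line plot, which is a choice of this sketch rather than something the question requires.

```python
def split_pairs(pairs):
    """Split [(label, value), ...] into two parallel lists for plotting."""
    labels = [label for label, _ in pairs]
    values = [value for _, value in pairs]
    return labels, values


def load_fruits(spark, path="example.csv"):
    # Assumption: no header row; columns are name, fruit, quantity.
    # The column names are hypothetical and chosen for this sketch.
    df = spark.read.csv(path, header=False, inferSchema=True)
    return df.toDF("name", "fruit", "quantity")


def fruit_totals(df):
    """Task 1: total quantity per fruit, largest total first."""
    from pyspark.sql import functions as F  # imported lazily; needs pyspark

    rows = (
        df.groupBy("fruit")
        .agg(F.sum("quantity").alias("total"))
        .orderBy(F.desc("total"))
        .collect()  # the result is tiny, so collecting to the driver is fine
    )
    return [(row["fruit"], row["total"]) for row in rows]


def apples_per_person(df):
    """Task 2: number of apples held by each person."""
    from pyspark.sql import functions as F

    rows = (
        df.filter(F.col("fruit") == "apple")
        .groupBy("name")
        .agg(F.sum("quantity").alias("apples"))
        .collect()
    )
    return [(row["name"], row["apples"]) for row in rows]


def bar_plot(pairs, xlabel, ylabel):
    """Plot collected (label, value) pairs locally with matplotlib."""
    import matplotlib.pyplot as plt  # imported lazily; needs matplotlib

    labels, values = split_pairs(pairs)
    plt.bar(labels, values)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.show()


def main(path="example.csv"):
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("fruit-plots").getOrCreate()
    df = load_fruits(spark, path)
    bar_plot(fruit_totals(df), "name of fruits", "num of total fruits")
    bar_plot(apples_per_person(df), "name", "num of apples")
    spark.stop()
```

Calling main("example.csv") would run both tasks. The grouping, summing, and sorting happen inside Spark, and only the small aggregated result is collected to the driver for plotting.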

       
