Question
  1. What does the five-number summary tell us about the time spent on email? (Hint: interpret the five-number summary in plain English.) What do the boxplot and the normality test show? Explain.
  2. Use the 1.5xIQR rule to identify possible outliers. List the cutoff points for outliers and show your workings. Explain what you found. (Hint: Is there any excessive time spent on email for Male (1), Female (2), or both?)

SPSS output (transcribed from images; sheet gss2004 read into DataSet1, 575 rows, output created 16-JAN-2019)

Syntax:
GRAPH /BAR(SIMPLE)=COUNT BY sex.
GRAPH /PIE=COUNT BY sex.
EXAMINE VARIABLES=emailhr /PLOT BOXPLOT STEMLEAF HISTOGRAM NPPLOT /COMPARE GROUPS /PERCENTILES(5,10,25,50,75,90,95) HAVERAGE /STATISTICS DESCRIPTIVES /CINTERVAL 95 /MISSING LISTWISE /NOTOTAL.
EXAMINE VARIABLES=emailhr BY sex /PLOT BOXPLOT HISTOGRAM NPPLOT /COMPARE GROUPS /PERCENTILES(5,10,25,50,75,90,95) HAVERAGE /STATISTICS DESCRIPTIVES /CINTERVAL 95 /MISSING LISTWISE /NOTOTAL.

Bar and pie charts of sex: 1 = 239 cases (41.57%), 2 = 336 cases (58.43%).

Case Processing Summary: emailhr has 575 valid cases (100.0%) and 0 missing; by sex, 239 valid for sex 1 and 336 valid for sex 2.

Descriptives for emailhr (standard errors in parentheses; entries marked n/l are not legible in the transcription)

Statistic                   All cases       sex = 1         sex = 2
Mean                        6.10 (.381)     6.33 (.615)     5.93 (.485)
95% CI for mean, lower      5.35            5.12            4.98
95% CI for mean, upper      6.84            7.54            6.88
5% trimmed mean             4.68            4.85            4.57
Median                      2.00            2.00            2.00
Variance                    83.567          90.364          78.924
Std. deviation              9.142           9.506           8.884
Minimum                     n/l             n/l             n/l
Maximum                     50              50              50
Range                       50              50              50
Interquartile range         6               n/l             n/l
Skewness                    2.667 (.102)    2.658 (.157)    2.675 (.133)
Kurtosis                    7.696 (.203)    7.585 (.314)    7.834 (.265)

Percentiles of emailhr (5th and 10th percentiles are not legible)

                                     25th    50th    75th    90th    95th
Weighted average, all cases          1.00    2.00    7.00    18.80   28.00
Weighted average, sex = 1            1.00    2.00    8.00    20.00   30.00
Weighted average, sex = 2            1.00    2.00    7.00    15.90   28.00
Tukey's hinges, all cases            1.00    2.00    7.00
Tukey's hinges, sex = 1              1.00    2.00    7.50
Tukey's hinges, sex = 2              1.00    2.00    7.00

Tests of Normality for emailhr (Sig. values are not legible)

              Kolmogorov-Smirnov(a)       Shapiro-Wilk
              Statistic    df             Statistic    df
All cases     .257         575            .650         575
sex = 1       .253         239            .649         239
sex = 2       .262         336            .651         336
a. Lilliefors Significance Correction

Plots (images not reproduced):
Histogram of emailhr: Mean = 6.1, Std. Dev. = 9.142, N = 575.
Histogram for sex = 1: Mean = 6.33, Std. Dev. = 9.506, N = 239.
Histogram for sex = 2: Mean = 5.93, Std. Dev. = 8.884, N = 336.
Stem-and-leaf plot of emailhr: the rows are garbled; the last row reads "58.00 Extremes".
Normal Q-Q and detrended normal Q-Q plots of emailhr, overall and for each sex.
Boxplots of emailhr, overall and by sex, with numbered cases flagged at the upper end (case labels are not reliably legible).

Answer #1

(first part)

There are many descriptive statistics. Numbers such as the mean, median, mode, skewness, kurtosis, standard deviation, and first and third quartiles, to name a few, each tell us something about our data. The five-number summary is a set of descriptive statistics that summarizes the location and spread of a dataset. It consists of the five most important sample percentiles:

  1. the sample minimum (smallest observation)
  2. the lower quartile or first quartile
  3. the median (the middle value)
  4. the upper quartile or third quartile
  5. the sample maximum (largest observation)
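
The five values listed above can be computed as in the following minimal sketch. The `hours` list is hypothetical and is not taken from the GSS file, and the "inclusive" quartile method is only one of several conventions (SPSS reports both weighted-average percentiles and Tukey's hinges, which can differ slightly).

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # method="inclusive" treats the data as the whole sample; other
    # quartile conventions (e.g. Tukey's hinges) can give other values.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Hypothetical weekly email hours, NOT the actual survey responses.
hours = [0, 1, 1, 2, 2, 3, 7, 10, 50]
summary = five_number_summary(hours)
print(summary)
```

In plain English, a summary like this says that half of the respondents fall between Q1 and Q3, and that a maximum far above Q3 points to a long right tail.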

(second part) In descriptive statistics, a box plot (or boxplot) is a method for graphically depicting a frequency distribution through its five-number summary. Box plots may also have lines (whiskers) extending from the box, indicating variability outside the upper and lower quartiles. Box plots display variation in samples of a statistical population without making any assumptions about the underlying statistical distribution. The spacings between the different parts of the box indicate the degree of spread and skewness in the data, and points plotted beyond the whiskers mark outliers.

(third part) Normality tests, such as the Kolmogorov-Smirnov and Shapiro-Wilk tests in the output, are used to assess whether a data set could plausibly have come from a normal distribution. A small significance value is evidence against normality.
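
As a rough supplement to the formal tests, the skewness and kurtosis in the Descriptives table can each be divided by their standard error. This is a different and cruder check than Kolmogorov-Smirnov or Shapiro-Wilk. The sketch below uses the all-cases values as read from the transcribed output; the 1.96 threshold is a common rule of thumb assumed here, not something SPSS reports.

```python
def z_ratio(statistic, std_error):
    """Divide a shape statistic by its standard error."""
    return statistic / std_error


# Values as read from the SPSS Descriptives table for emailhr (all cases).
skew_z = z_ratio(2.667, 0.102)
kurt_z = z_ratio(7.696, 0.203)

# Rule of thumb (an assumption, not SPSS output): |z| > 1.96 suggests
# a departure from normality at roughly the 5% level.
THRESHOLD = 1.96
looks_normal = abs(skew_z) <= THRESHOLD and abs(kurt_z) <= THRESHOLD
print(round(skew_z, 1), round(kurt_z, 1), looks_normal)
```

Both ratios are far above the threshold, which agrees with the strong right skew visible in the histogram.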

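(fourth part) The raw data is not given, so the individual outlying cases cannot be listed. However, the quartiles can be read from the Percentiles table of the output: for all 575 cases Q1 = 1 and Q3 = 7, so IQR = 7 - 1 = 6.

Any observation below Q1 - 1.5*IQR = 1 - 9 = -8 or above Q3 + 1.5*IQR = 7 + 9 = 16 is a possible outlier. Time cannot be negative, so there are no low outliers. The maximum of 50 hours, reported for both Male (1) and Female (2), is far above the upper cutoff, so both groups include excessive time spent on email.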
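
The 1.5xIQR rule can be sketched as follows. The quartiles are the weighted-average percentiles as read from the transcribed SPSS Percentiles tables, so they are only as reliable as that transcription; the function name `iqr_fences` is made up for this illustration.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower cutoff, upper cutoff) under the k x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# (Q1, Q3) as read from the transcribed output (weighted average).
quartiles = {
    "all cases": (1.0, 7.0),
    "male (1)": (1.0, 8.0),
    "female (2)": (1.0, 7.0),
}
MAXIMUM = 50  # maximum of emailhr reported for every group

for group, (q1, q3) in quartiles.items():
    low, high = iqr_fences(q1, q3)
    print(f"{group}: cutoffs {low} and {high}; maximum exceeds upper cutoff: {MAXIMUM > high}")
```

With Tukey's hinges instead (Q3 = 7.50 for males), the male upper cutoff becomes 17.25 rather than 18.5. The conclusion is the same either way, since the maximum of 50 is above every cutoff.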
