(a) Load the data file data/tips.csv into a pandas DataFrame called tips_df using the pandas read_table()...

Question

(a) Load the data file data/tips.csv into a pandas DataFrame called tips_df using the pandas read_table() function. Check the first five rows.

(b) Create a new dataframe called tips by randomly sampling 6 records from the dataframe tips_df. Refer to the sample() function documentation.

(c) Add a new column to tips called idx as a list ['one', 'two', 'three', 'four', 'five', 'six'] and then later assign it as the index of tips dataframe. Display the dataframe.

(d) Create a new Series called kids as Series([1, 2, 1], index = ['two', 'five', 'six']). Assign the series as a new column in the dataframe.

(e) List the various columns in the dataframe using the columns attribute of the dataframe. Also, check the various column datatypes in the dataframe.

(f) Transpose the dataframe tips.

(g) Check the name of the dataframe index. If there isn't one, assign a new name.

(h) Check the name of the dataframe columns. If there isn't one, assign a new name.

(i) List the rows in the dataframe using the values attribute of the dataframe. Check the datatype of the result.

(j) Check if 'time' is one of the columns in the dataframe. Use set-like operation in.

(k) Check if 'six' is one of the index values in the dataframe. Use set-like operation in.

(l) Check if 'seven' is one of the index values in the dataframe. Use set-like operation in.
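
A minimal sketch of parts (a) to (l) above. It assumes nothing about the real data/tips.csv beyond its column names: the `csv_text` rows are made-up stand-ins so the code runs without the file, and `random_state` and the column-axis name `'field'` are illustrative choices, not part of the exercise.

```python
import io
import pandas as pd

# Made-up stand-in for data/tips.csv (illustrative rows only).
csv_text = """total_bill,tip,sex,smoker,day,time,size
16.5,1.0,Female,No,Sun,Dinner,2
10.0,1.5,Male,No,Sun,Dinner,3
21.0,3.5,Male,No,Sat,Dinner,3
23.5,3.0,Male,Yes,Sat,Dinner,2
24.5,3.5,Female,No,Thur,Lunch,4
25.0,4.5,Male,No,Thur,Lunch,4
9.0,2.0,Male,Yes,Fri,Lunch,2
27.0,3.0,Female,Yes,Fri,Dinner,4
"""

# (a) read_table assumes tab-separated input by default, so pass sep=','.
#     With the real file this would be pd.read_table('data/tips.csv', sep=',').
tips_df = pd.read_table(io.StringIO(csv_text), sep=',')
print(tips_df.head())

# (b) Sample 6 records; random_state only makes the sketch repeatable.
tips = tips_df.sample(n=6, random_state=0).copy()

# (c) Add idx as a column, then promote it to the index.
tips['idx'] = ['one', 'two', 'three', 'four', 'five', 'six']
tips = tips.set_index('idx')
print(tips)

# (d) Column assignment aligns on index labels; unmatched rows get NaN.
kids = pd.Series([1, 2, 1], index=['two', 'five', 'six'])
tips['kids'] = kids

# (e) Column labels and per-column dtypes.
print(tips.columns)
print(tips.dtypes)

# (f) Transpose: rows become columns and vice versa.
tips_t = tips.T

# (g), (h) Name the index and the column axis if they have no name yet.
if tips.index.name is None:
    tips.index.name = 'idx'
if tips.columns.name is None:
    tips.columns.name = 'field'  # hypothetical name

# (i) .values returns the rows as a NumPy array.
rows = tips.values
print(type(rows))

# (j), (k), (l) Membership tests against the column and index labels.
print('time' in tips.columns, 'six' in tips.index, 'seven' in tips.index)
```

Because `kids` only has labels two, five and six, the other three rows of the new column hold NaN, which is why the column ends up as floating point rather than integer.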

---------------------------------------------------------------------------

(a) Add a new row with the following values [18.0, 4.0, 'Male', 'No', 'Mon', 'Lunch', 3, 1.0, True] to the tips dataframe with a duplicate index value six.

(b) Select all occurrences of the index six. Hint: Use the loc attribute for retrieving rows by label.

(c) Reset the index for the dataframe. Hint: Use reset_index.

(d) Reindex using day column. Hint: Use set_index.

(e) Now, revert back to using the index column as the index.

(f) Drop the newly added row from the tips dataframe with duplicate index value six. Hint: First, reset the index, then use drop_duplicates function and reassign the index back to normal.

(g) Drop the row with index value six. Hint: Use drop.

(h) Drop the columns kids and kidcheck.

(i) Drop the column size.

------------------------------------------------------------------------

(a) Select two columns tip and sex from the dataframe.

(b) Select one column sex from the dataframe.

(c) Select the first 3 rows using slicing notation.

(d) Select the first 4 rows using the index labels. Note: Slicing with index labels behaves differently than normal Python slicing.

(e) Select the rows where the value of sex is Male. Hint: Use boolean array.

(f) Select the rows where tip is greater than 2.

(g) Select the column smoker for the rows where tip is greater than 2. Hint: Use loc.

(h) Select the columns smoker and total_bill for the rows where tip is greater than 3. Hint: Use loc.

(i) For the rows where sex is Male, assign the value of tip to 5.

(j) Check what happens when you compare the dataframe with a scalar in the boolean expression tips < 2. Interpret what is happening and why.

(k) Select the third and second columns (in that order) for the third row in the dataframe using integer indexing. Hint: Use iloc.

(l) Select the third and second columns (in that order) for the third and fifth rows (in that order) in the dataframe using integer indexing. Hint: Use iloc.

(m) Select all the rows and the third and second columns (in that order) using integer indexing for cases where the tip value is greater than 3. Hint: Use iloc.
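
A minimal sketch of parts (e) to (m), which the answer below does not reach. The small `tips` frame here is a hand-built stand-in (an assumption, not the sampled data), so the selected rows will differ from those in a real run.

```python
import pandas as pd

# Hand-built stand-in for the tips frame (illustrative values).
tips = pd.DataFrame(
    {
        'total_bill': [44.3, 20.27, 18.28, 18.43, 24.71],
        'tip': [2.5, 2.83, 4.0, 3.0, 5.85],
        'sex': ['Female', 'Female', 'Male', 'Male', 'Male'],
        'smoker': ['Yes', 'No', 'No', 'No', 'No'],
    },
    index=['one', 'two', 'three', 'four', 'five'],
)

# (e) A boolean Series used as a row filter.
males = tips[tips['sex'] == 'Male']

# (f) Rows where tip is greater than 2.
tip_over_2 = tips[tips['tip'] > 2]

# (g) loc takes a row mask and a column label together.
smoker_over_2 = tips.loc[tips['tip'] > 2, 'smoker']

# (h) Same idea with a list of column labels.
cols_over_3 = tips.loc[tips['tip'] > 3, ['smoker', 'total_bill']]

# (i) Assign through loc so the change is written to tips itself.
tips.loc[tips['sex'] == 'Male', 'tip'] = 5

# (j) The comparison is applied element by element. On numeric columns it
#     yields a frame of booleans; string columns cannot be compared with a
#     number, which is expected to raise a TypeError.
print(tips[['total_bill', 'tip']] < 2)
try:
    print(tips < 2)
except TypeError as err:
    print('TypeError:', err)

# (k) Third row, third and second columns (positions are 0-based).
cell_pair = tips.iloc[2, [2, 1]]

# (l) Third and fifth rows, third and second columns.
block = tips.iloc[[2, 4], [2, 1]]

# (m) iloc does not accept a boolean Series as a mask, so convert the
#     mask to a NumPy array first.
mask = (tips['tip'] > 3).to_numpy()
masked = tips.iloc[mask, [2, 1]]
print(masked)
```

Parts (k) to (m) run after the assignment in (i), so every Male row already has a tip of 5 when the mask in (m) is built.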

------------------------------------------------------------------------

(a) Create two sample dataframes with 6 records tips1 and tips2 from tips_df dataframe. tips_df.sample(n = 6).

(b) Append tips2 to tips1.

(c) Assign the value np.nan to the smoker column for all the records in tips1 where sex is Male.

(d) Use forward fill to fill missing values in the smoker column in tips1. Hint: Use fillna.
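
A minimal sketch of parts (a) to (d) of this group, under two assumptions: `tips_df` is replaced by a made-up stand-in frame, and part (c) is read as blanking out smoker for the rows where sex is Male. `DataFrame.append` was removed in pandas 2.0, so the sketch uses `pd.concat`; likewise `Series.ffill()` is used because `fillna(method='ffill')` is deprecated in recent pandas releases.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for tips_df (illustrative values).
tips_df = pd.DataFrame({
    'total_bill': [16.5, 10.0, 21.0, 23.5, 24.5, 25.0, 9.0, 27.0],
    'tip': [1.0, 1.5, 3.5, 3.0, 3.5, 4.5, 2.0, 3.0],
    'sex': ['Female', 'Male', 'Male', 'Male',
            'Female', 'Male', 'Male', 'Female'],
    'smoker': ['No', 'No', 'No', 'Yes', 'No', 'No', 'Yes', 'Yes'],
})

# (a) Two independent samples; random_state only makes the run repeatable.
tips1 = tips_df.sample(n=6, random_state=1).copy()
tips2 = tips_df.sample(n=6, random_state=2).copy()

# (b) "Append" tips2 to tips1 by concatenating along the rows.
combined = pd.concat([tips1, tips2])

# (c) Blank out smoker wherever sex is Male (assumed reading of the task).
tips1.loc[tips1['sex'] == 'Male', 'smoker'] = np.nan
missing_before = int(tips1['smoker'].isna().sum())

# (d) Forward fill copies the last seen value downward. Missing values at
#     the very top stay missing because nothing precedes them.
tips1['smoker'] = tips1['smoker'].ffill()
print(tips1)
```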

------------------------------------------------------------------------

(a) Find the descriptive statistics for the dataframe tips1. Notice how the statistics are reported only for numeric columns.

(b) Create a new dataframe tips3 that only contains columns with numeric values from the dataframe tips1. Find the descriptive statistics for tips3.

(c) Compute the sum of all rows in each column in tips3.

(d) Compute the sum of all columns for each row in tips3.

(e) Compute the cumulative sums for values in each row for every column in tips3.

(f) Compute the correlation and covariance of the columns in tips3.

(g) Use the corrwith DataFrame method to find the correlation of all the columns with the column total_bill.
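
A minimal sketch of parts (a) to (g) of this group, assuming a tiny made-up `tips1` so the numbers are easy to check by hand. Part (e) is read here as a running total down each column.

```python
import pandas as pd

# Made-up stand-in for tips1 (illustrative values).
tips1 = pd.DataFrame({
    'total_bill': [10.0, 20.0, 30.0, 40.0],
    'tip': [1.0, 3.0, 4.0, 6.0],
    'size': [2, 3, 2, 4],
    'sex': ['Female', 'Male', 'Male', 'Female'],
})

# (a) describe() summarises only the numeric columns by default.
print(tips1.describe())

# (b) Keep just the numeric columns.
tips3 = tips1.select_dtypes(include='number')
print(tips3.describe())

# (c) Sum over the rows of each column (one value per column).
col_sums = tips3.sum(axis=0)

# (d) Sum across the columns of each row (one value per row).
row_sums = tips3.sum(axis=1)

# (e) Running totals down each column.
running = tips3.cumsum()

# (f) Pairwise correlation and covariance matrices.
corr = tips3.corr()
cov = tips3.cov()

# (g) Correlation of every column with total_bill.
corr_with_bill = tips3.corrwith(tips3['total_bill'])
print(corr_with_bill)
```

The last row of `running` matches `col_sums`, which is a quick way to confirm the direction of the cumulative sum.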

------------------------------------------------------------------------

(a) Create a new dataframe tips4 that only contains columns with non-numeric values from the dataframe tips1. Describe tips4 data.

(b) Get the counts of unique values of the days in tips4.

(c) Create a boolean array called mask that only retrieves records in tips4 that have day values as Thur or Sat.
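
A minimal sketch of parts (a) to (c) of this group, again on a made-up `tips1` stand-in rather than the sampled data.

```python
import pandas as pd

# Made-up stand-in for tips1 (illustrative values).
tips1 = pd.DataFrame({
    'total_bill': [16.5, 10.0, 21.0, 23.5, 24.5, 25.0],
    'sex': ['Female', 'Male', 'Male', 'Male', 'Female', 'Male'],
    'day': ['Thur', 'Sat', 'Sun', 'Thur', 'Fri', 'Sat'],
    'time': ['Lunch', 'Dinner', 'Dinner', 'Lunch', 'Lunch', 'Dinner'],
})

# (a) Keep only the non-numeric columns; describe() then reports
#     count, unique, top and freq instead of mean and quantiles.
tips4 = tips1.select_dtypes(exclude='number')
print(tips4.describe())

# (b) Number of records for each distinct day.
day_counts = tips4['day'].value_counts()
print(day_counts)

# (c) isin builds a boolean Series that is True for Thur or Sat rows.
mask = tips4['day'].isin(['Thur', 'Sat'])
thur_sat = tips4[mask]
print(thur_sat)
```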

I have been able to complete the top sections I just need the bolded one. The questions are all linked so I am posting them all.

python programming language

engineering Computer-Science


Answer 1


NOTE: FEEL FREE TO ASK ABOUT ANY DOUBTS OR CORRECTIONS IN THE COMMENT SECTION.

NOTE: AS PER THE CHEGG RULES & GUIDELINES, WE CAN ANSWER ONLY 4 SUB-PARTS OF A MULTI-PART QUESTION.

I HAD ANSWERED A FEW OF THESE SUB-PARTS PREVIOUSLY AND AM ATTACHING THEM AGAIN BELOW, FOLLOWED BY THE NEXT 4 SUB-PARTS.
PLEASE RE-UPLOAD THE REMAINING QUESTIONS WITH THE QUESTION NUMBERS MENTIONED. PLEASE UNDERSTAND.

PREVIOUSLY ANSWERED SUB-PARTS:

#!/usr/bin/env python
# coding: utf-8

# ### importing pandas

# In[1]:

import pandas as pd

# ### Rows and Columns list

# In[2]:

rows = ['one','two','three','four','five','six']
column = ['total_bill','tip','sex','smoker','day','time','size','kids']

# ### Data for dataframe from given Question

# In[3]:

data=[
[44.3,2.5,'Female','Yes','Sat','Dinner',3,None],
[20.27,2.83,'Female','No','Thur','Lunch',2,1.0],
[18.28,4.0,'Male','No','Thur','Lunch',2,None],
[18.433,3.0,'Male','No','Sun','Dinner',4,None],
[24.71,5.85,'Male','No','Thur','Lunch',2,2.0],
[16.4,2.5,'Female','Yes','Thur','Lunch',2,1.0]
]

# ### Creating dataframe with above data and column and row

# In[4]:

tips = pd.DataFrame(data,index=rows,columns=column)

# Printing dataframe
print(tips)

# ## Problem a

# ### Creating new data frame with row index 'six' and new column 'kidcheck'

# In[5]:

# [18.0,4.0,'Male','No','Mon','Lunch',3,1.0,True]

newData=[[18.0,4.0,'Male','No','Mon','Lunch',3,1.0,True]]
column.append('kidcheck')

# In[6]:

newRow = pd.DataFrame(newData,index=['six'],columns=column)

# ### Concatenating above data with tips data

# In[7]:

dataframes = [tips, newRow]
tips = pd.concat(dataframes,sort=False)

# In[8]:

# printing data
print(tips)

# ## Problem b

# ### Select all occurrences of index 'six'

# In[9]:

sixIndex = tips.loc['six']
print(sixIndex)

# ## Problem c

# ### Reset index for the dataframe

# In[10]:

tips.reset_index(inplace = True)
print(tips)

# ## Problem d

# ### Re index using 'day' column

# In[11]:

tips=tips.set_index(['day'])
print(tips)

# ## Problem e

# ### Revert back to 'index' column as the 'index'

# In[12]:

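# 'day' is currently the index, and set_index discards the old index,
# so first move 'day' back into a column with reset_index,
# then set the 'index' column as the index again
tips=tips.reset_index().set_index(['index'])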
print(tips)

# ## Problem f

# ### Drop the newly added index from dataframe, drop duplicate index 'six'

# In[13]:

# Reset index
tips.reset_index(inplace = True)
# Dropping duplicate rows at index
tips = tips.drop_duplicates(subset=['index'])
# Setting index as index
tips = tips.set_index(['index'])
# printing dataframe
print(tips)

# ## Problem g

# ### Drop the row with index value 'six'

# In[14]:

tips.drop(['six'],inplace=True)
print(tips)

# ## Problem h

# ### Drop the column kids and kidcheck

# In[15]:

tips.drop(['kids','kidcheck'],axis=1,inplace=True)
print(tips)

# ## Problem i

# ### Drop the column size

# In[16]:

tips.drop(['size'],axis=1,inplace=True)
print(tips)

HERE ARE THE NEXT 4 SUB PARTS FROM THE QUESTION

# # NEXT 4 SUB PARTS

# ### a) select two columns tip and sex from dataframe

# In[17]:

# printing columns tip and sex
print(tips.loc[:,['tip','sex']])

# ### b) select one column sex from dataframe

# In[18]:

# Printing one column sex from dataframe
# a single column label returns a Series rather than a one-column DataFrame
print(tips['sex'])

# ### c) select first 3 rows using slicing notation

# In[19]:

# Printing rows 1 to 3
# 0 index = 1st row
print(tips.iloc[0:3])

# ### d) select the first 4 rows using the index labels. Note: slicing with index labels behaves differently than normal python slicing.

# In[20]:

# Label-based slicing with loc includes BOTH endpoints,
# unlike positional slicing, where the stop position is excluded
# tips.loc[first_label:last_label]
print(tips.loc['one':'four'])

CONSIDER CHEGG RULES & GUIDELINES

DEAR SIR, PLEASE DON'T FORGET TO GIVE AN UP VOTE

Thank YOU :-)
