Question

You have just been hired as an analyst for an investment firm. Your first assignment is to analyze data for stocks in the S&P 500. The S&P 500 is a stock index that contains the 500 largest publicly traded companies.
You have been given two sources of data to work with. The first is an XML file that contains the Symbol (ticker), company name, sector, and industry for every stock in the S&P 500, as of summer 2016.
The second is a CSV file that contains pricing information for stocks in the S&P 500 between August 2009 and August 2010. There is one row in the CSV file for every stock, on every date that the market was open. Each row contains the date as a string, the stock’s ticker, the day’s opening price, the day’s high price, the day’s low price, the day’s closing price, and the volume traded that day.
Provided files: SP500_ind.csv and SP500_symbols.xml

Link1: https://www.cs.odu.edu/~sampath/courses/f19/cs620/files/data/SP500_ind.csv

Link2: https://www.cs.odu.edu/~sampath/courses/f19/cs620/files/data/SP500_symbols.xml

Write a Python module that includes the functions from the following activities.
1.) Read the .csv file into a DataFrame called “csv_data” and the .xml file into a dictionary called “xml_dict” in your Python module.
2.) Generate a list of unique symbol values from csv_data using the unique() method, and name the list “ticker”.
3.) Complete the following functions in your Python module:

def ticker_find(xml_dict, ticker):
    """This function takes in the xml_dict and the list that contains a
    Symbol (ticker). Return the name of the ticker
    Ex: for ticker “A”, the function returns Agilent Technologies Inc
    """

def calc_avg_open(csv_data, ticker):
    """This function takes in the csv_data and a ticker.
    Return the average opening price for the stock as a float.
    """

I am new to Python, and so far I have the code below. It fails with the error: AttributeError: 'str' object has no attribute 'attrib'. I am now stuck on completing the two functions above. I would appreciate your expert answer.

import pandas as pd
import numpy as np
import io
from pandas import DataFrame
import urllib.request
from lxml import etree


# Reading the .csv file into dataframe
csv_data = pd.read_csv(io.BytesIO(uploaded['SP500_ind.csv']))

# Reading the .xml file to a dictionary


file = urllib.request.urlopen('https://www.cs.odu.edu/~sampath/courses/f19/cs620/files/data/SP500_symbols.xml')
data = file.read()

parser = etree.XMLParser(recover=True)
tree = etree.fromstring(data, parser = parser)
xml_dict = tree.getroottree()

# Generate a list of unique symbol values from the csv_data and name the list “ticker” using unique() method.
ticker = csv_data['Symbol'].unique()

def ticker_find(xml_dict, ticker):
    for child in xml_dict:
        if child.attrib['ticker'] == ticker:
            return child.attrib['name']

Answer #1

Note: the XML file is read directly from its URL, but the CSV file was downloaded locally and its path is passed to the read_csv function of pandas.

Please go through the code and the explanations given in the comments to understand more:

------------------------------------------------------------------------------------------

import pandas as pd
import urllib.request
# Third-party library that converts XML data directly into a Python dictionary
# (install it with: pip install xmltodict)
import xmltodict

# Answer to question 1
# =================================================
# Read the .csv file into a DataFrame
csv_data = pd.read_csv('SP500_ind.csv')
# print(csv_data)  # uncomment to inspect the contents

# Read the .xml file into a dictionary
file = urllib.request.urlopen('https://www.cs.odu.edu/~sampath/courses/f19/cs620/files/data/SP500_symbols.xml')
data = file.read()
# xmltodict.parse converts the XML data into a Python dictionary
xml_dict = xmltodict.parse(data)
print(xml_dict)

# Answer to question 2
# =================================================
# Select the 'Symbol' column of csv_data and call .unique() on it, which
# returns the distinct values as an array; .tolist() turns that array into
# the plain Python list the assignment asks for.
ticker = csv_data['Symbol'].unique().tolist()
# print(ticker)


# Answer to question 3
# =================================================
# Q3 is a bit confusing, so here is how ticker_find() works.
# The root element of the XML file is 'symbols', so after conversion the
# top-level key of the dictionary is 'symbols'. Inside 'symbols', the key
# 'symbol' holds a list with one dictionary per <symbol> element, so we
# iterate over xml_dict['symbols']['symbol']. xmltodict prefixes attribute
# names with '@', so the ticker and name are read with the keys '@ticker'
# and '@name'. Print xml_dict if you want to explore the structure further.
def ticker_find(xml_dict, ticker):
    for item in xml_dict['symbols']['symbol']:
        if item['@ticker'] == ticker:
            return item['@name']

print(ticker_find(xml_dict, 'A'))


# DataFrame.loc selects rows and columns by label or by boolean mask.
# csv_data.loc[csv_data['Symbol'] == ticker, 'Open'] keeps the rows whose
# 'Symbol' equals the given ticker and, from those rows, only the 'Open'
# column, since we only need the opening price. The result is a Series;
# .mean() is its sum() divided by its count().
def calc_avg_open(csv_data, ticker):
    open_values = csv_data.loc[csv_data['Symbol'] == ticker, 'Open']
    return float(open_values.mean())


print (calc_avg_open(csv_data, 'A'))

=================================================
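
On the error from the question: 'str' object has no attribute 'attrib' is what Python raises whenever the loop variable is a plain string rather than an XML element. One way that can happen is looping over a dictionary, which yields its keys. I cannot tell for sure what xml_dict held in the original notebook, so treat this as a likely cause rather than a diagnosis. The sketch below does the same lookup with the standard library's xml.etree.ElementTree, where each child really is an element and .attrib is valid. The sample XML is hand-written and assumes the layout described in the comments above (a symbols root with symbol children carrying ticker and name attributes).

```python
import xml.etree.ElementTree as ET

# Hand-written sample, assuming the layout described in the answer:
# a <symbols> root whose <symbol> children carry ticker/name attributes.
SAMPLE_XML = """
<symbols>
  <symbol ticker="A" name="Agilent Technologies Inc"/>
  <symbol ticker="AAPL" name="Apple Inc."/>
</symbols>
""".strip()


def ticker_find(root, ticker):
    """Return the company name for `ticker`, or None if it is not listed."""
    for child in root:  # each child is an Element, so .attrib exists
        if child.attrib.get("ticker") == ticker:
            return child.attrib.get("name")
    return None


root = ET.fromstring(SAMPLE_XML)
print(ticker_find(root, "A"))

# By contrast, iterating over a dict yields its keys, which are strings
# and therefore have no .attrib attribute.
for key in {"symbols": {}}:
    print(type(key).__name__, hasattr(key, "attrib"))
```

Returning None for an unknown ticker is a choice made for this sketch; the assignment does not say what should happen in that case.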

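Command to run the code

Copy the code above into a Python file (xmltodict is a third-party package, so install it first with pip install xmltodict if needed), then run:

>python filename.py

(Remove any print statements you do not need.)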
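
To check calc_avg_open without downloading the real file, here is a small sketch that runs the same filter-then-average idea on a tiny in-memory CSV. The prices are made up for illustration, and the 'Date' column name is an assumption; only 'Symbol' and 'Open' are taken from the code above. The last lines show groupby as one possible way to get the average for every ticker at once.

```python
import io
import math

import pandas as pd

# Made-up prices for illustration only. 'Symbol' and 'Open' are the column
# names used in the answer; 'Date' is an assumed name.
SAMPLE_CSV = """Date,Symbol,Open
2009-08-21,A,10.0
2009-08-24,A,12.0
2009-08-21,AAPL,20.0
"""


def calc_avg_open(csv_data, ticker):
    """Return the average opening price for `ticker` as a float."""
    opens = csv_data.loc[csv_data["Symbol"] == ticker, "Open"]
    return float(opens.mean())  # NaN when the ticker has no rows


sample = pd.read_csv(io.StringIO(SAMPLE_CSV))
avg_a = calc_avg_open(sample, "A")
avg_missing = calc_avg_open(sample, "ZZZ")
print(avg_a)
print(math.isnan(avg_missing))

# Average opening price for every ticker in one pass.
all_avgs = sample.groupby("Symbol")["Open"].mean()
print(all_avgs)
```

For a ticker that does not appear in the data, this version returns NaN instead of raising an error; whether that is the desired behaviour is up to the assignment.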
