Question

Coding for Python – don’t need it to be complex: In this assignment, your goal is to write a Python program to determine whether a given pattern appears in a data series, and if so, where it is locate...

Coding for Python – don’t need it to be complex:

In this assignment, your goal is to write a Python program to determine whether a given pattern appears in a data series, and if so, where it is located in the data series. This type of problem is common in many health disciplines. For example, identifying treatment pathways.

The goal is to detect whether a certain pattern appears in the data series. In the following example, the pattern to be detected is a sequence of 4 numbers (see Fig 2); and the data series contains 10 data points

data_series = [-1, 2, -2, 3, 41, 38, 22, 10, -1, 3]

pattern = [40, 30, 20, 10]

If you compare these two figures, you can see that this pattern appears between 5th (at index 4) to 8th (at index 7) data points of the data series. Note that it is not an exact match, but a close one. Now you can spot the pattern using your eyes, let us see how an algorithm can do it.

Since the given pattern has 4 data points, we will take a segment of 4 consecutive data points from the data series at a time and compute a similarity measure. The similarity measure should have the property that if a segment of the data series is like the given pattern, then the similarity measure of that segment is large, and vice versa. Hence, if we can compute the similarity measures of all possible segments, then we can identify the segment that is most similar to the given pattern. We will now provide more details.

The algorithm begins with computing the similarity measure between the given pattern and the first segment, which is formed by the first 4 data points of the data series. In terms of the two lists above, the similarity measure for the first segment is:

data_series[0]*pattern[0] + data_series[1]*pattern[1] + data_series[2]*pattern[2] + data_series[3]*pattern[3]

After this, we compute the similarity measure between the given pattern and the second segment, which is formed by the second to fifth data points of the data series. For the above example, the similarity measure for this segment is:

data_series[1]*pattern[0] + data_series[2]*pattern[1] + data_series[3]*pattern[2] + data_series[4]*pattern[3]

We then do the same with the segment formed by the third to sixth data points (the third segment), giving a similarity measure equal to:

data_series[2]*pattern[0] + data_series[3]*pattern[1] + data_series[4]*pattern[2] + data_series[5]*pattern[3]

We need to consider the following cases:

  • Case 1 - It is possible that the given data series is shorter than the given pattern, in this case, we return "Insufficient data".
  • Case 2 - All the similarity measures are (strictly) less than the given threshold value. This means none of the segments is similar to the given pattern. In this case, we say the pattern is not present in the data series and return "Not detected". This is not the case for the example just outlined.
  • Case 3 - At least one similarity measure is greater than or equal to the given threshold value. This is the case for the example outlined. The similarity measure plot above shows that the fifth similarity measure is the largest. You can work out that the fifth similarity measure is computed by using the segment formed by the fifth (index 4) to eighth (index 7) data points of the data series. This procedure is therefore telling us that the segment consisting of fifth to eighth data points is most similar to the given pattern. Indeed, this is what we had found by using visual inspection earlier. We will identify the location of the pattern by using the first index of the segment that has the highest similarity measure.
  • function pattern_search_max (described below): we return the index of the highest similarity measure that is also greater than or equal to the given threshold value. In Fig 3, the index of the highest similarity measure is 4.
  • function pattern_search_multiple (described below): Consider the following definition of overlapping indices,

We say two indices are overlapping if the distance between them is less than the width of the pattern (4 in the above example).

The function 'pattern_search_multiple' returns a list of non-overlapping indices that are greater than or equal to the given threshold value, and that satisfy the following criteria:

  • an index is not selected if the value at the index is less than a value at one of its overlapping indices.
  • an index is not selected if it is overlapping with first or last index.

In the following example, selected indices are marked with green circles. Here, index 6 is not selected because the value at index 5 (distance of 1) is greater than the value at index 6. Similarly index 32 is not selected because the value at index 34 (distance 2) is higher. The example in Fig 4 will return the list [5, 16, 34].

There are many different approaches to calculate similarity measures between data segments and a pattern, above we have discussed just one of them! Therefore, a given data series representing similarity measures may or may not have clearly defined 'pyramid' patterns.

You need to implement the following four functions, each in a separate file (provided). You need to submit these four files, each containing one function you implement – see code at bottom of page.

  1. def calculate_similarity(data_segment, pattern):
  • The aim of this function is to calculate the similarity measure between one data segment and the pattern.
  • The first parameter 'data_segment' is a list of (float) values, and the second parameter 'pattern' is also a list of (float) values. The function calculates the similarity measure between the given data segment and pattern and returns the calculated similarity value (float), as described earlier in this assignment outline. The function returns "Error" if 'data_segment' and 'pattern' have different lengths.
  1. def calculate_similarity_list(data_series, pattern):
  • The aim of this function is to calculate the similarity measures between all possible data segments and the pattern.
  • The first parameter 'data_series' is a list of (float) values, and the second parameter 'pattern' is also a list of (float) values. The function calculates the similarity measures, using the above function 'calculate_similarity', of all possible data segments in a sequence, and returns the list of calculated similarity values.
  1. def pattern_search_max(data_series, pattern, threshold):
  • The three possible outcomes are "Insufficient data", "Not detected" and the location of the pattern if the pattern is found in the data series. Here, the location of the pattern refers to the index of the highest similarity measure that is also greater than or equal to the given threshold value.
  • The first parameter 'data_series' is a list of (float) values, the second parameter 'pattern' is a list of (float) values, and the third parameter 'threshold' is a (float) value. In this function, you need to use the function 'calculate_similarity_list'.
  1. def pattern_search_multiple(data_series, pattern_width, threshold):
  • The three possible outcomes are "Insufficient data", "Not detected" and a list of non-overlapping indices that are greater than or equal to the given threshold value, and that satisfy the following criteria:
  • an index is not selected if the value at the index is less than a value at one of its overlapping indices.
  • an index is not selected if it is overlapping with first or last index.

We say two indices are overlapping if the distance between them is less than the width of the pattern.

  • The first parameter 'data_series' is a list of (float) values, the second parameter 'pattern_width' is a (float) value, and the third parameter 'threshold' is a (float) value.
0 0
Add a comment Improve this question Transcribed image text
Answer #1

def calculate_similarity (data_segment, pattern): #checking-if the lengths of data segment and pattern are matching or not ifdef pattern_search max (data_series, pattern, threshold): #checking if the length of pattern is less than data series lengthdef pattern_search multiple (data_series, pattern_width, threshold): selected - [I for i in range (len (data_series)-pattern

Add a comment
Know the answer?
Add Answer to:
Coding for Python – don’t need it to be complex: In this assignment, your goal is to write a Python program to determine whether a given pattern appears in a data series, and if so, where it is locate...
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for? Ask your own homework help question. Our experts will answer your question WITHIN MINUTES for Free.
Similar Homework Help Questions
  • Coding for Python - The pattern detection problem – part 3: def pattern_search_max(data_series, pattern, threshold) Plea...

    Coding for Python - The pattern detection problem – part 3: def pattern_search_max(data_series, pattern, threshold) Please do not use 'print' or 'input' statements. Context of the assignment is: In this assignment, your goal is to write a Python program to determine whether a given pattern appears in a data series, and if so, where it is located in the data series. Please see attachments below: We need to consider the following cases: Case 1 - It is possible that the...

  • Coding for Python - The pattern detection problem – part 2: def calculate_similarity_list(data_series, pattern) Please do not use 'print' or 'input' statements. Context of the assignme...

    Coding for Python - The pattern detection problem – part 2: def calculate_similarity_list(data_series, pattern) Please do not use 'print' or 'input' statements. Context of the assignment is: In this assignment, your goal is to write a Python program to determine whether a given pattern appears in a data series, and if so, where it is located in the data series. Please see attachments below: We need to consider the following cases: Case 1 - It is possible that the given...

  • I have to follow functions to perform: Function 1: def calculate_similarity(data_segment, pattern): The aim of this func...

    I have to follow functions to perform: Function 1: def calculate_similarity(data_segment, pattern): The aim of this function is to calculate the similarity measure between one data segment and the pattern. The first parameter 'data_segment' is a list of (float) values, and the second parameter 'pattern' is also a list of (float) values. The function calculates the similarity measure between the given data segment and pattern and returns the calculated similarity value (float), as described earlier in this assignment outline. The...

  • I really need help with this python programming assignment Program Requirements For part 2, i need...

    I really need help with this python programming assignment Program Requirements For part 2, i need the following functions • get floats(): It take a single integer argument and returns a list of floats. where it was something like this def get_floats(n):   lst = []   for i in range(1,n+1):     val = float(input('Enter float '+str(i)+': '))     lst.append(val)   return lst • summer(): This non-void function takes a single list argument, and returns the sum of the list. However, it does not use...

  • Python coding exercise: please include comments Goal #1: import financial data given into your pr...

    Python coding exercise: please include comments Goal #1: import financial data given into your program provided to you as a CSV formatted text file Use the following data for testing (the following is only a sample of the data; there are over 4000 rows of data): *Note: The data you will read in is linear by date (but, non-contiguous due to holidays and weekends,) reflecting a timeline of stock performance in chronological order; however your program should run through the...

  • CSC151 Stock Portfolio GUI Project Goal You are to write a GUI program that will allow...

    CSC151 Stock Portfolio GUI Project Goal You are to write a GUI program that will allow a user to buy, sell and view stocks in a stock portfolio. This document will describe the minimum expected functions for a grade of 90. Your mission is to “go and do better.” You’ll find a list of enhancement options at the end of this document. Objectives By the end of this project, the student will be able to • write a GUI program...

  • Lab Exercise #15 Assignment Overview This lab exercise provides practice with Pandas data analysis library. Data...

    Lab Exercise #15 Assignment Overview This lab exercise provides practice with Pandas data analysis library. Data Files We provide three comma-separated-value file, scores.csv , college_scorecard.csv, and mpg.csv. The first file is list of a few students and their exam grades. The second file includes data from 1996 through 2016 for all undergraduate degree-granting institutions of higher education. The data about the institution will help the students to make decision about the institution for their higher education such as student completion,...

  • In Python Question 3 (13 points): Purpose: To practice your ability to modify lists and compute...

    In Python Question 3 (13 points): Purpose: To practice your ability to modify lists and compute with lists of lists Degree of Difficulty: Moderate For this question, you are given some population estimates, by age group, from Statistics Canada for some provinces. Starter File a5q3 starter.py is a file that contains a list of lists, to which the variable Pop-data refers, which represents 2020 population numbers. The first item in Pop_data is a list whose first item is the string...

  • C++ program, inventory.cpp implementation Mostly need the int load(istream&) function. Implementation: You are supp...

    C++ program, inventory.cpp implementation Mostly need the int load(istream&) function. Implementation: You are supposed to write three classes, called Item, Node and Inventory respectively Item is a plain data class with item id, name, price and quantity information accompanied by getters and setters Node is a plain linked list node class with Item pointer and next pointer (with getters/setters) Inventory is an inventory database class that provides basic linked list operations, delete load from file / formatted print functionalities. The...

  • How to write the insert, search, and remove functions for this hash table program? I'm stuck......

    How to write the insert, search, and remove functions for this hash table program? I'm stuck... This program is written in C++ Hash Tables Hash Table Header File Copy and paste the following code into a header file named HashTable.h Please do not alter this file in any way or you may not receive credit for this lab For this lab, you will implement each of the hash table functions whose prototypes are in HashTable.h. Write these functions in a...

ADVERTISEMENT
Free Homework Help App
Download From Google Play
Scan Your Homework
to Get Instant Free Answers
Need Online Homework Help?
Ask a Question
Get Answers For Free
Most questions answered within 3 hours.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT