Question

Apply different classification and clustering methods on any dataset and evaluate them, with the aim to improve analysis. Pro
0 0
Add a comment Improve this question Transcribed image text
Answer #1

Introduction of Classification and Clustering:

First of all, it’s better to know the differences between classification and prediction before knowing the difference between classification and clustering.Both Classification and Clustering is used for the categorisation of objects into one or more classes based on the features. They appear to be a similar process as the basic difference is minute. In the case of Classification, there are predefined labels assigned to each input instances according to their properties whereas in clustering those labels are missing.

Comparison between Classification and Clustering:

PARAMENTER CLASSIFICATION CLUSTERING
Type used for supervised learning used for unsupervised learning
Basic process of classifying the input instances based on their corresponding class labels grouping the instances based on their similarity without the help of class labels
Need it has labels so there is need of training and testing dataset for verifying the model created there is no need of training and testing dataset
Complexity more complex as compared to clustering less complex as compared to classification
Example Algorithms Logistic regression, Naive Bayes classifier, Support vector machines etc. k-means clustering algorithm, Fuzzy c-means clustering algorithm, Gaussian (EM) clustering algorithm etc.

Differences between Classification and Clustering

  1. Classification is used for supervised learning whereas clustering is used for unsupervised learning.
  2. The process of classifying the input instances based on their corresponding class labels is known as classification whereas grouping the instances based on their similarity without the help of class labels is known as clustering.
  3. As Classification have labels so there is need of training and testing dataset for verifying the model created but there is no need for training and testing dataset in clustering.
  4. Classification is more complex as compared to clustering as there are many levels in classification phase whereas only grouping is done in clustering.
  5. Classification examples are Logistic regression, Naive Bayes classifier, Support vector machines etc. Whereas clustering examples are k-means clustering algorithm, Fuzzy c-means clustering algorithm, Gaussian (EM) clustering algorithm etc.

Applying classification and clustering methods on any dataset:

Now Let’s begin with classification concepts.

Classification Concept

Gender Classification based on hair length #: Male Count # # @: Female Hair Length atas dataaspirant.com

Classification Concepts

In classification, the idea is to predict the target class by analysis the training dataset. This could be done by finding proper boundaries for each target class. In a general way of saying, Use the training dataset to get better boundary conditions which could be used to determine each target class. Once the boundary conditions determined, the next task is to predict the target class as we have said earlier. The whole process is known as classification.

Target class examples:

  • Analysis the customer data to predict whether he will by computer accessories (Target class: Yes or No)
  • Gender classification from hair length (Target classes: Male or Female)
  • Classifying fruits from each fruit feature like colour, taste, size, weight (Targe class: Apple, Orange, Cherry, Banana)

Let’s understand the concept of classification with gender classification using hair length. To classify gender (target class) using hair length as feature parameter we could train a model using any classification algorithms to come up with some set of boundary conditions which can be used to differentiate the male and female genders using hair length as the training feature. In gender classification case the boundary condition could the proper hair length value. Suppose the differentiated boundary hair length value is 15.0 cm then we can say that if hair length is less than 15.0 cm then gender could be male or else female.

Some classification algorithms listed below.

Classification Algorithms

  • Linear classifiers
    • Logistic regression
    • Naive Bayes classifier
    • Fisher’s linear discriminant
  • Support vector machines
    • Least squares support vector machines
  • Quadratic classifiers
  • Kernel estimation
    • k-nearest neighbor
  • Decision trees
    • Random forests
  • Neural networks
  • Learning vector quantization

Application of Classification Algorithms

  • Email spam classification
  • Bank customers loan pay bank willingness prediction.
  • Cancer tumour cells identification.
  • Sentiment analysis.
  • Drugs classification
  • Facial key points detection
  • Pedestrians detection in an automotive car driving.

Clustering Concept

Gender Clustering based on hair length #: Male Count . # # @: Female Hair Length dataaspirant.com

Clustering

In clustering the idea is not to predict the target class as like classification , it’s more ever trying to group the similar kind of things by considering the most satisfied condition all the items in the same group should be similar and no two different group items should not be similar. To group the similar kind of items in clustering, different similarity measures could be used.

Group items Examples:

  • While grouping similar language type documents (Same language documents are one group.)
  • While categorising the news articles (Same news category(Sport) articles are one group )

Let’s understand the concept with clustering genders based on hair length example. To determine gender, different similarity measure could be used to categorise male and female genders. This could be done by finding the similarity between two hair lengths and keep them in the same group if the similarity is less(Difference of hair length is less). The same process could continue until all the hair length properly grouped into two categories.

To get used to different similarity measure to perform clustering we have some popular clustering algorithms. Which are listed below.

Clustering Algorithms

Clustering algorithms can be classified into two main categories Linear clustering algorithms and Non-linear clustering algorithms.

  • Linear clustering algorithm
    • k-means clustering algorithm
    • Fuzzy c-means clustering algorithm
    • Hierarchical clustering algorithm
    • Gaussian(EM) clustering algorithm
    • Quality threshold clustering algorithm
  • Non-linear clustering algorithm
    • MST based clustering algorithm
    • kernel k-means clustering algorithm
    • Density-based clustering algorithm

Application of Clustering Algorithms

  • Recommender systems
  • Anomaly detection
  • Human genetic clustering
  • Genom Sequence analysis
  • Analysis of antimicrobial activity
  • Grouping of shopping items
  • Search result grouping
  • Slippy map optimization
  • Crime analysis
  • Climatology

Conclusion:

Classification: Predicting target class for test dataset from the trained modeled from the training dataset.

Clustering: Using different similarity measure to place the all the similar items in a group.

Add a comment
Know the answer?
Add Answer to:
Apply different classification and clustering methods on any dataset and evaluate them, with the aim to...
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for? Ask your own homework help question. Our experts will answer your question WITHIN MINUTES for Free.
Similar Homework Help Questions
  • 1. The operations manager is planning to send all employees across the nine different stores to a two day course on customer service. The aim is to develop a customer centric culture across the organi...

    1. The operations manager is planning to send all employees across the nine different stores to a two day course on customer service. The aim is to develop a customer centric culture across the organisation as well as provide instruction on exactly how to interact successfully with customers. How can the operations manager ensure that they gather feedback to measure the effectiveness of the training at all four of Kirkpatrick’s levels to improve future learning programs? Propose some tools and...

  • Determine the security updates that apply to your computer. Compile a list of security updates for your computer and provide a summary of the vulnerabilities they prevent from being exploited. Provide...

    Determine the security updates that apply to your computer. Compile a list of security updates for your computer and provide a summary of the vulnerabilities they prevent from being exploited. Provide a summary of the course of action you have taken to secure your computer. If your computer is up-to-date in terms of recommended patches and configuration changes, choose three of the optional enhancements that would apply to your operating system (OS) version and summarize why they would be beneficial....

  • CLINICAL CLASSIFICATION SYSTEMS AND REIMBURSEMENT METHODS CASE 2-12 . Physician Query Polity You have suspected there...

    CLINICAL CLASSIFICATION SYSTEMS AND REIMBURSEMENT METHODS CASE 2-12 . Physician Query Polity You have suspected there are problems in the physician query process for a while now, and you have planned to review the policy and query form to look for any compliance issues. You would rather find theproblem s yourself before the Office of the Inspect r General (OIG) İ ds iheim Yor task toda, is to evaluate the physician query process at your facility 1. Review Figures 2-1...

  • Summarize each parts(provide a thoughtful, complete (yet concise) overview, Do not include any direct quotes from...

    Summarize each parts(provide a thoughtful, complete (yet concise) overview, Do not include any direct quotes from your text or any other source) 1. One of the earliest ethics codes was the Nuremberg Code—a set of 10 principles written in 1947 in conjunction with the trials of Nazi physicians accused of shockingly cruel research on concentration camp prisoners during World War II. It provided a standard against which to compare the behavior of the men on trial—many of whom were eventually...

  • Farming with AgriPro (Case Study) AgriPro is a firm based in Colorado, USA, which does research...

    Farming with AgriPro (Case Study) AgriPro is a firm based in Colorado, USA, which does research on and produces genetically modified wheat seed. Every year AgriPro conducts thousands of experiments on different varieties of wheat seeds in different locations of the USA. In these experiments, the agricultural and economic characteristics, regional adaptation, and yield potential of different varieties of wheat seeds are investigated. In addition, the benefits of the wheat produced, including the milling and baking quality, are examined. If...

  • These weekly exercises provide the opportunity for you to understand and apply statistical methods and analysis....

    These weekly exercises provide the opportunity for you to understand and apply statistical methods and analysis. Unless otherwise stated, use 5% (.05) as your alpha level (cutoff for statistical significance). #1. Define "power" in relation to hypothesis testing. #2. Alpha (a) is used to measure the error for decisions concerning true null hypotheses. What is beta (ß) error used to measure? #3. In the following studies, state whether you would use a one-sample t test or a two-independent-sample t test....

  • Case studies are used to enable you to apply new concepts, use the tools you have...

    Case studies are used to enable you to apply new concepts, use the tools you have mastered, and improve the technical skills you have attained. Through the individual case studies you will discover for yourself the usefulness of quantitative problem solving methods, how to apply them in practice, and their benefit to organizational decision-makers. In this case study, you will act as a consultant for a manufacturing company looking to maximize net profit generated by a production facility subject to...

  • Objectives To evaluate different costing systems and their impact on business decisions. To illustrate the importance...

    Objectives To evaluate different costing systems and their impact on business decisions. To illustrate the importance of hidden (undirected) issues that arise from a detailed analysis. To prepare a coherent report and integrated analysis that meets specific user needs. Instructions In order to complete your case analysis successfully, you must identify the role you are playing, assess user needs analyze user needs or issues (qualitatively and quantitatively), and provide a recommendation and conclusion. An average grade will result from answering...

  • Building a Case Study Analysis Choose an issue of international relevance on any of the topics:...

    Building a Case Study Analysis Choose an issue of international relevance on any of the topics: Week 3: Bitcoin, Tariffs and Subsidies, Global Institutions How to Build the Case Responses Find a minimum of three news articles discussing this issue. The articles should be news articles, written by journalists for publication in newspapers or on news, websites (cannot be blogs or commercial sites). They may also be transcripts of news stories that were broadcast on television or radio. The article...

  • Arizona State University - CSE205 Assignment #9 Due Date Friday, March April 3rd, 5:30pm Important: This...

    Arizona State University - CSE205 Assignment #9 Due Date Friday, March April 3rd, 5:30pm Important: This is an individual assignment. Please do not collaborate. No late assignment will be accepted. Make sure that you write every line of your code. Using code written by someone else will be considered a violation of the academic integrity and will result in a report to the Dean's office. It must be submitted on-line (Course website). Go to "GradeScope" tab on Canvas -> CSE205...

ADVERTISEMENT
Free Homework Help App
Download From Google Play
Scan Your Homework
to Get Instant Free Answers
Need Online Homework Help?
Ask a Question
Get Answers For Free
Most questions answered within 3 hours.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT