What is Research article? Review articles provides summary of current state of the research on a...

Question


Answer 1

INTRODUCTION:

Data Mining can be defined as the procedure of extracting information from large sets of data. In simple language, data mining is the retrieval of knowledge from data.

Data mining is easier to understand when one has a basic knowledge of database concepts such as Structured Query Language (SQL), schemas, the ER model and data warehousing.

Data mining techniques have been born as a result of a long process of research and product development.

Data mining has been applied in the business world because it is supported by three technologies that are now sufficiently mature: massive data collection, powerful multiprocessor computers, and data mining algorithms.

There are two categories of functions involved in data mining, based on the kind of data to be mined: descriptive functions, and classification and prediction functions.

The descriptive function involved in data mining deals with the general properties of data in the database. These descriptive functions are: class/concept description, mining of frequent patterns, mining of associations, mining of correlations, and mining of clusters.

Classification and prediction - Classification refers to the process of finding a model that describes the data classes or concepts. The functions under the classification and prediction category are: classification, prediction, outlier analysis, and evolution analysis.

Data mining also involves data mining primitives. These primitives allow the user to communicate interactively with the data mining system. The data mining primitives are:

The set of task-relevant data to be mined,
The kind of knowledge to be mined,
The background knowledge to be used in the discovery process,
The interestingness measures and thresholds for pattern evaluation,
The representation for visualizing the discovered patterns.

For data mining to succeed, a suitable data mining system must be available. The major factors to consider when acquiring a data mining system include: the data types (the types of data to be processed), system issues (compatibility with your operating system), data sources (the formats of the data to be acted upon), system scalability (row scalability or column scalability), the visualization tools, the data mining functions and methodologies provided, and coupling with databases and data warehouses.

Literature section

In the article “An Outlier detection approach with data mining in wireless sensor network” (M.Govindarajan, 2014), the authors discuss at length how data mining can be used to provide outlier detection in a wireless sensor network (WSN). An outlier is defined as any deviation from the normal behavior in the dataset.

Some of the factors that predispose a wireless sensor network to outliers include: the use of imperfect devices to monitor data; depletion of the batteries that power the devices, which reduces their performance; attacks from adversaries; and lastly, the very large number of nodes connected to the sensor network, which may be a cause of errors.

The reasons why outliers may be introduced into the dataset range from destructive intent to denial of service, depending on the party perpetrating the act.

Outlier detection can constructively be used in fraud detection, processing of loan applications to evaluate the risky nature of clients, detection of intrusions in networks, diagnosis of faults and monitoring of performance of a network, and finally, in the detection of structural faults.

The article goes further to show the different roles different components play in an experiment.

For instance,

The database schema - indicates the information contained by the dataset.
Performance measure – is the ratio of outliers to normal data.
Before processing – notes the detection and the false alarm rate.
After processing – notes the detection rate and false alarm rate and compares the results with the before processing data.

The authors have vividly discussed how data mining can be used to obtain actionable information from the sensor datasets of a wireless sensor network. They have clearly demonstrated, by use of examples, the whole process of extracting this essential information and correctly classifying it.

The paper has greatly contributed to the field of data mining and is clear evidence of how data mining can be embraced in all fields to improve the accuracy and efficiency of the information obtained for decision making. The authors have presented their case in simple yet technical language, with well-structured wording and illustrations. The authors also took their time to expound on the technical terms used in the article, and as a result the paper’s clarity is superb.

In the article titled “Accepting or Rejecting Students’ Self-grading in their Final Marks by using Data Mining” (J. Fuentes, 2014), the authors discuss in detail a proposal to use data mining to predict whether or not the instructor rejects the students’ self-assessment marks. The researchers used a dataset obtained from fifty-three computer science students in the 2012-13 year.

The paper restates higher education’s target of moving closer to personalized and individualized instruction for the growing number of students enrolling in institutions of higher learning. The paper indicates that a student self-grading system will help achieve this target.

On the other hand, the article implicitly regrets that this self-grading system will to a large extent take away the teacher’s role as the sole examiner of the students.

The authors propose a solution for cases where there is a conflict between the mark students award themselves and the mark they should ‘genuinely’ be awarded: the teacher/instructor will be required to personally supervise the evaluation of the student.

From the authors’ demonstration, it is quite clear that it is several times more significant to classify rejection results correctly than acceptance results. The authors finally conclude that cost-sensitive classification gives accurate results, as it is both sensitive and specific on the data set.

Although the article gives insights on how data mining can be used to evaluate the instructor’s decision to accept or reject the student’s self assessment marks, it’s worth noting that there are a few questions that the article has not sufficiently addressed.

For instance, although the article shows how data mining can reduce cases of students’ self-assessment marks being wrongly rejected by the instructor, it has not demonstrated or discussed how to check and minimize the wrong acceptance of marks.

Secondly, the authors’ claim at the beginning of the article that “Grades are, by their nature, somewhat subjective; every instructor uses different criteria to assign them and place a different emphasis on them.” (J. Fuentes, 2014) is to some extent misleading, since grading or evaluation, especially in institutions of higher learning, seeks to reward hard work and should therefore be as objective as possible.

The article makes a good contribution to the field of data mining and has demonstrated the importance of embracing data mining to improve efficiency and accuracy, especially in education and institutions of higher learning. The article is well suited to system designers, information technology experts and administrators who seek to improve the efficiency, accuracy and cost of running institutional systems.

In an article titled “Educational Data Mining: An Advance for Intelligent Systems in Education” (Baker, 2014), the author shows that data mining is relatively young in the field of education, despite having been used for years in other fields such as the sciences.

The potential of data mining in education is enormous and should not be underestimated. Some of the areas in education that can be improved through data mining include: the discovery of learning patterns in people, the prediction of learning, and the understanding of different learning behaviors.

The author has also discussed how data mining has been incorporated into intelligent systems to advance a number of online learning systems for students and the educators. The author further discusses how the models have recorded improved results from students and have been a success story. The author praises the models as being accurate.

The article has shown how data mining can be used to model constructs that can be used in different ways. Some of the ways they can be applied include: to study context, discover previously unknown patterns in student learning, and finally, to determine the pattern that impacts most when used.

The author has discussed the main reasons why data mining took quite some time to be embraced in education. He attributes the delay to the late availability of large datasets in a proper and usable format, and to a history of poorly designed educational policies that relied heavily on common assumptions rather than data.

However, there have been major advances in educational data mining, which can be credited to the creation of the Pittsburgh Science of Learning Center (PSLC) DataShop, the continuous improvement of methodology, which has in turn improved the quality of the models, and the growing awareness that all key information is stored in a single data stream.

In this article, the author has educated readers on areas core to the education system that would gain greatly should data mining be employed in the field of education. For instance, the author has emphasized the need to “design better and smart learning technology” (Baker, 2014). For these technologies to be developed, there needs to be a discovery of how people learn, learning has to be predictable, and learning behavior needs to be understandable.

The author has also introduced the idea of having a common format of storing learning data for better utilization through data mining tools.

Even though the author has presented a relatively good case on how data mining can be used to benefit the field of education, some claims that were made were not sufficiently supported in the paper.

For example, the history of the field of education has been cited as the possible cause of the late emergence of data mining in education. The author further attributes this to educational decision systems having been designed on the basis of “relatively loosely operationalized educational theories and common sense” (Baker, 2014). This claim has not been adequately substantiated in the article.

In another article titled “Application of Higher Education System for Predicting Student Using Data mining Techniques” by (P.Veeramuthu, 2014), the author discusses how data mining techniques, with a focus on predictive classification, can be used to extract valuable information from current student databases in institutions of higher learning. This information can help an institute learn what type of students enroll, uncover the areas in higher learning that require better and more direct support, and prepare institutional resources, such as classrooms, for the incoming students.

The author has noted considerable interest in data mining in institutions of higher learning, as shown by the number of publications on the subject.

According to the author, the techniques that are essential for data mining on a student database are: association analysis, outlier analysis, evolution analysis, classification, and clustering and prediction.

The author has shown that data mining is the tool that institutions of higher learning require to predict the patterns of the students who enroll in their institutes, to aid in resource allocation, and to make decisions on areas that require customer retention management.

On the other hand, the article has failed to give new ideas in relation to data mining and has merely restated common knowledge on the subject. For instance, the paper has failed to show how data mining can also be used to enhance the relationship between the institute and its future students. For this and other reasons, one can easily conclude that the article is an example of a good paper whose ideas are not expressed sufficiently well. The ideas in this paper have not been properly organized and therefore leave a lot to be desired.

In another article titled “Customer Relationship Management through Data Mining” (Subramanian, 2014), the author discusses the role data mining plays in giving firms or organizations a competitive advantage in the market. For data mining to be a success, two conditions must be met: statistical data from the database must be available, and a pruning algorithm must be available. The article has singled out an aspect that can give a firm a better rating: customer relations and support.

The authors expressed their point in relation to data mining. They highlighted the need to maintain customers through innovative ways such as data mining.

The article has also educated readers on the need to automate processes to gain an edge over competitors.

The authors lamented that it is extremely hard to bring back a lost customer compared to acquiring new ones. They also warn that acquiring new customers is an expensive affair for the firm. This forms the basis for the need to implement data mining to solve customer-related issues.

According to the author, data mining can be used to discover previously unique and unknown patterns in the customer’s character, which if harnessed well, can ensure that the customer remains loyal to the firm.

The authors’ focus was on using a data mining algorithm to solve typical customer retention and management issues. The authors indicate where the data required for data mining can be obtained. They note that such sources include database-driven applications such as Enterprise Resource Planning systems. According to the authors, these applications have over the years accumulated the necessary data.

There is a contradiction in the theme of the article, which is to use data mining to solve customer relationship management issues. This was evident when they said “This paper would deal with implementing data mining algorithm for solving a typical CRM problem.” (Subramanian, 2014). As this statement reads, it amounts to solving issues that can easily be solved by traditional algorithms, whereas data mining is supposed to be an advancement on traditional algorithms.

Most of the issues that have been raised by the author were not sufficiently illustrated and require further research in order to clearly bring out the idea of incorporating data mining to help solve customer retention and management issues.

Method

Applications of data mining

Data mining has primarily been used by companies in sectors such as finance, retail, marketing and telecommunications to transform their transactional data and determine product pricing, corporate profits, impact on sales, customer preferences, product positioning, and customer satisfaction. Records of customer purchases at the point of sale can be processed with data mining tools to develop products and promotions that appeal to specific customer segments.

Financial Data Analysis

Financial data in the banking industry is generally of high reliability and quality. This makes it easy to carry out data mining and systematic data analysis. Typical cases in the financial and banking industry include:

Classification and clustering of customers for targeted marketing.
Loan payment prediction and customer credit policy analysis.
The design and construction of data warehouses for multidimensional data analysis and data mining.
The detection of money laundering.
The detection of a number of financial crimes such as favoritism and insider lending.

Healthcare

Data mining has great potential to improve healthcare systems. It uses data and analytics to identify best practices that can improve health care and reduce costs. Healthcare insurers have also used data mining to detect fraudulent claims and abuse, and healthcare facilities have used data mining tools to predict the volume of patients in each category. Through data mining, a number of processes have been developed to ensure that patients get appropriate care at the right place and at the right time.

Below is a typical data mining process, commonly known as SEMMA (Sample, Explore, Manipulate, Model, Assess).

Fig: The SEMMA process

Market Basket Analysis

Market basket analysis is a modeling technique based on the theory that if a customer buys a certain group of items, they are more likely to buy another group of items. This analysis helps retailers better understand the purchase behavior of their buyers. The information can help retailers identify the needs of a segment of buyers and change the store’s item layout accordingly.
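The idea of market basket analysis can be sketched with two common measures, support (how often a pair of items is bought together) and confidence (how often the second item is bought given the first). This is a minimal illustration, not a method taken from the reviewed articles; the baskets, the function name pair_rules and the min_support threshold are all assumptions made up for the example.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.4):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Confidence of "a -> b" is the share of baskets with a that also hold b;
        # the reverse rule is scored separately.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules


# Hypothetical point-of-sale baskets, invented for illustration.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "jam"],
    ["butter", "milk"],
    ["bread", "butter", "jam"],
]

for a, b, support, confidence in sorted(pair_rules(baskets)):
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

A retailer could use rules with high confidence, such as items that almost always accompany another item, when deciding which products to place near each other.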

In the articles researched, the authors used different methodologies to carry out experiments in support of their research. For instance, in the article “Outlier detection approach with data mining in wireless sensor network” (M.Govindarajan, 2014), the paper proposes that sensor data be classified as either normal or outlier upon detection.

The outlier classification process ensures that the classification is a true reflection of the data in the dataset. It proceeds as follows:

Pre-processing

This is a phase that ensures that the data presentation is good and in the format required for processing.

The phase also seeks to check the quality of the data to ensure that it’s complete, devoid of noise and correct.

Outlier detection

This phase seeks to detect outliers on the basis of false alarm rate and detection rates equations. These rates represent outliers and normal occurrences in datasets respectively.
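The two rates mentioned above can be computed from labelled readings. The paper's own equations are not reproduced in this review, so the sketch below assumes the usual confusion-matrix definitions: detection rate as the share of true outliers that are flagged, and false alarm rate as the share of normal readings that are wrongly flagged. The function name and the sample labels are hypothetical.

```python
def detection_and_false_alarm_rates(actual, predicted):
    """Both arguments are sequences of booleans, True meaning 'outlier'."""
    pairs = list(zip(actual, predicted))
    tp = sum(a and p for a, p in pairs)              # outliers correctly flagged
    fn = sum(a and not p for a, p in pairs)          # outliers missed
    fp = sum((not a) and p for a, p in pairs)        # normal readings flagged
    tn = sum((not a) and (not p) for a, p in pairs)  # normal readings passed
    detection_rate = tp / (tp + fn) if tp + fn else 0.0
    false_alarm_rate = fp / (fp + tn) if fp + tn else 0.0
    return detection_rate, false_alarm_rate


# Hypothetical labels for ten sensor readings (assumed, not from the paper).
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]

dr, far = detection_and_false_alarm_rates(actual, predicted)
print(f"detection rate={dr:.2f}, false alarm rate={far:.2f}")
```

Computing both rates before and after preprocessing gives the kind of comparison the article reports in its results.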

Outlier classification

The sensor data’s classification can be done by use of a decision tree, which is based on cross validation.
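To illustrate classification validated by cross validation, the sketch below uses a one-split decision stump as a stand-in for a full decision tree and evaluates it with k-fold cross validation. The stump, the choice of three folds, the single temperature feature and the readings themselves are assumptions for illustration; the paper's actual tree and settings are not described here.

```python
def fit_stump(values, labels):
    """Pick the threshold on one feature that misclassifies the fewest readings.

    The stump predicts 'outlier' (True) when value > threshold.
    """
    best_threshold, best_errors = None, len(values) + 1
    for threshold in sorted(set(values)):
        errors = sum((v > threshold) != y for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def cross_validate(values, labels, k=3):
    """Return the mean accuracy over k folds (fold i holds every k-th reading)."""
    rows = list(zip(values, labels))
    accuracies = []
    for fold in range(k):
        train = [row for i, row in enumerate(rows) if i % k != fold]
        test = [row for i, row in enumerate(rows) if i % k == fold]
        threshold = fit_stump([v for v, _ in train], [y for _, y in train])
        correct = sum((v > threshold) == y for v, y in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k


# Hypothetical temperature readings; the very high values are labelled outliers.
temps = [21.0, 22.5, 23.1, 60.2, 22.0, 58.9, 21.7, 23.4, 61.5]
is_outlier = [False, False, False, True, False, True, False, False, True]

print(f"mean accuracy over 3 folds: {cross_validate(temps, is_outlier):.2f}")
```

Each fold is held out once for testing while the stump is fitted on the remaining readings, so the reported accuracy reflects performance on data the model has not seen.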

Measure of sensor trustfulness

The measure of the trustfulness of the sensor is directly related to its accuracy in correctly classifying the data as either normal or outlier.

Below is the pictorial representation of the above methodology:

Fig: Data preprocessing -> Outlier detection -> Classification (Normal / Outlier) -> Measure sensor trustfulness

In another article titled “Accepting or Rejecting Students’ Self-grading in their Final Marks by using Data Mining” (J. Fuentes, 2014), the authors opted to use the following methodology to carry out experiments.

First, the students in question were asked to take multiple-question tests during the course. Secondly, the students were asked to evaluate themselves and propose their marks for the final examination. The instructor then further evaluates these self-evaluation marks to check that they truly represent each student’s performance in class. The instructor can either accept the marks or reject them. If the instructor rejects the marks, the student concerned will be required to sit the final examination under the instructor’s supervision.

The authors demonstrated two approaches that can be used to ensure that no student’s self-evaluation marks are wrongly rejected. The first is a traditional algorithm that uses three numerical parameters (the student’s score during course work, the student’s proposed score, and the difference between the two scores).

The author further shows how the first approach can be improved to get more accurate results, this time by adding an additional parameter which is cost involved. This approach is referred to as cost sensitive classification.
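One common way to make a classifier cost sensitive is to choose the prediction with the lowest expected cost rather than the most probable one. The sketch below shows that idea only; it is not the authors' implementation. The function name, the probability input and the cost values are hypothetical, and which kind of error the paper penalizes more heavily is treated here as a parameter rather than a known fact.

```python
def cost_sensitive_decision(p_reject, cost_missed_reject=3.0, cost_false_reject=1.0):
    """Return 'reject' or 'accept', whichever has the lower expected cost.

    p_reject is a classifier's estimated probability that the instructor
    would reject the self-assigned mark (an assumed input for this sketch).
    """
    # Predicting 'accept' costs something only if the truth is 'reject'.
    expected_cost_accept = p_reject * cost_missed_reject
    # Predicting 'reject' costs something only if the truth is 'accept'.
    expected_cost_reject = (1.0 - p_reject) * cost_false_reject
    return "reject" if expected_cost_reject < expected_cost_accept else "accept"


# With equal costs the decision follows the more probable class.
print(cost_sensitive_decision(0.3, cost_missed_reject=1.0, cost_false_reject=1.0))
# With a missed rejection costing three times as much, the same case flips.
print(cost_sensitive_decision(0.3, cost_missed_reject=3.0, cost_false_reject=1.0))
```

Raising the cost of one kind of error shifts the decision threshold, which is how a cost parameter can trade accuracy on one class for accuracy on the other.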

Fig: The initial approach (instructors, students, activity scores, proposed scores and their differences, data mining classification, decision)

In an article titled “Customer Relationship Management through Data Mining” (Subramanian, 2014), the author has given the process of knowledge discovery.

Since data mining involves the exploration and analysis of large datasets, a number of steps have to be taken, according to the article, for one to discover knowledge. The steps are illustrated in the figures that follow.

In the article “Application of Higher Education System for Predicting Student Using Data mining Techniques” by (P.Veeramuthu, 2014), the author has outlined the process the institutions need to take to discover knowledge, as illustrated below;

Fig: The process of knowledge extraction from data (data cleaning and integration, data selection and transformation, data mining, pattern evaluation, knowledge discovery)

Fig: Process of knowledge discovery (visualizing data, referencing, association, classification, segmentation, discovering knowledge)

And lastly, in the article “Educational Data Mining: An Advance for Intelligent Systems in Education” (Baker, 2014), the author chose the use of test data to build and test models from time to time. Old models are to be used in coming up with new models in pursuit of better models. Models are to be built in an incremental manner.

Results

The results of the experiments carried out in the research articles above are as follows:

Article 1: “An Outlier detection approach with data mining in wireless sensor network” (M.Govindarajan, 2014)

Table 1: Dataset schema (columns: Date, Time, Epoch, Moteid, Temp, Humidity, Light, Voltage)

This dataset contains data collected from 54 sensors deployed in the Intel Berkeley Research lab between February 28th and April 5th, 2004. It contains 699 instances, which consist of numeric and nominal values (M.Govindarajan, 2014).

Table 2: Performance comparison (detection rate and false alarm rate, before and after preprocessing)

The table above shows “the detection rate and False alarm rate are comparing between before and after preprocessing”, (M.Govindarajan, 2014)

Fig: Detection rate comparison

The figure compares the detection rate before and after preprocessing: the detection rate before preprocessing is low and the detection rate after preprocessing is high. (M.Govindarajan, 2014)

Fig: False alarm rate comparison

Article 2: “Accepting or Rejecting Students’ Self-grading in their Final Marks by using Data Mining” (J. Fuentes, 2014)

Fig: Traditional/Initial classification versus New/Cost-sensitive classification performance, showing accuracy, TP rate and TN rate for Initial, New, New.Cost(2), New.Cost(3), New.Cost(4) and New.Cost(5) (J. Fuentes, 2014)

Article 3: “Educational Data Mining: An Advance for Intelligent Systems in Education” (Baker, 2014)

Fig: Performance of a sensor-free affect detector model, year-by-year. Model performance is shown here as Cohen’s Kappa (how much better than chance the model is). Human agreement on student affect varies by method, but quantitative field observation method. (Baker, 2014)
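The caption above reports model quality as Cohen's Kappa. As a rough illustration of what that number measures, the sketch below computes kappa as (observed agreement - chance agreement) / (1 - chance agreement) for two sets of labels. The label values are invented for the example and are not data from the article.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical affect labels from a model and from a human observer.
model = ["bored", "engaged", "engaged", "bored", "engaged", "engaged", "bored", "engaged"]
human = ["bored", "engaged", "bored", "bored", "engaged", "engaged", "engaged", "engaged"]

print(f"kappa = {cohens_kappa(model, human):.3f}")
```

A kappa of 0 means agreement no better than chance and 1 means perfect agreement, which is why it is a stricter measure than raw accuracy when one label dominates.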

Conclusion

From the articles reviewed, it is evident that data mining is a new way to process huge amounts of data that helps organizations uncover hidden potential and propel themselves to a whole new level of business. In a market characterized by stiff competition for the sale of products, the best way to gain a competitive edge is to use a timely analysis tool that enables a business to adjust to changing customer needs.

Through data mining, businesses can improve customer relations, and learning institutions can streamline their operations and increase their efficiency. Data mining can also be used in almost all sectors of the economy to simplify tedious processes such as fraud detection in health insurance and in credit card access.

Finally, I recommend the use of data mining tools in virtually all organizations to ensure that hidden potentials are discovered and utilized.
