Question

Bioinformatics is one of the most impactful areas of Data Mining. It is the science of storing, analyzing, and utilizing information from biological data such as sequences, molecules, gene expressions, and pathways. Though it is one of the most promising areas, it comes with many challenges. Outline the major research challenges of data mining in Bioinformatics. (15 Marks)

Answer #1

Data Mining

Data Mining is the process of automatic discovery of novel and understandable models and patterns from large amounts of data.

Bioinformatics

Bioinformatics is the science of storing, analyzing, and utilizing information from biological data such as sequences, molecules, gene expressions, and pathways. Development of novel data mining methods will play a fundamental role in understanding these rapidly expanding sources of biological data.

Data mining approaches seem ideally suited for bioinformatics, which is data-rich, but lacks a comprehensive theory of life's organization at the molecular level. The extensive databases of biological information create both challenges and opportunities for developing novel data mining methods.

MAJOR RESEARCH CHALLENGES OF DATA MINING IN BIOINFORMATICS:-

A) Design classifiers to handle ultra-high dimensional classification problems:-

One challenge is how to design classifiers that can handle ultra-high dimensional classification, as found in text mining and drug safety applications. One proposed direction is a new design procedure for a hybrid decision tree classifier, which aims to improve classification efficiency and accuracy on high-dimensional data with a small training sample size.

B) Mining data streams in extremely large databases:-

One important problem is mining data streams in extremely large databases (e.g. 100 TB). Satellite and computer network data can easily be of this scale. However, today’s data mining technology is still too slow to handle data of this scale. In addition, data mining should be a continuous, online process, rather than an occasional one-shot process.
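As an illustration of the continuous, one-pass style of processing described above, here is a minimal sketch in Python. It uses the Misra-Gries heavy-hitters algorithm, which is not named in the answer and is chosen here only as an example of a bounded-memory stream summary. The `kmers` helper and the idea of streaming k-mers from a sequence are assumptions made for the example.

```python
from typing import Dict, Hashable, Iterable, Iterator


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """One-pass approximate heavy hitters using at most k - 1 counters.

    Any item that occurs more than n / k times in a stream of length n
    is guaranteed to be in the result. Stored counts are lower bounds.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all, dropping those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


def kmers(sequence: str, size: int) -> Iterator[str]:
    """Yield overlapping substrings of length `size`, one at a time (hypothetical helper)."""
    for i in range(len(sequence) - size + 1):
        yield sequence[i:i + size]


if __name__ == "__main__":
    # The generator is consumed item by item, so the full list of k-mers is never stored.
    print(misra_gries(kmers("ACGTACGTACGTTTGA", 4), k=4))
```

Memory use depends only on `k`, not on the length of the stream, which is the property that matters when the data cannot be held or rescanned.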

C) Mining complex knowledge from complex data:-

One important type of complex knowledge is in the form of graphs. Recent research has touched on the topic of discovering graphs and structured patterns from large data, but clearly, more needs to be done. Another form of complexity is from data that are non-i.i.d. (independent and identically distributed). This problem can occur when mining data from multiple relations. In most domains, the objects of interest are not independent of each other, and are not of a single type. We need data mining systems that can soundly mine the rich structure of relations among objects, such as interlinked Web pages, social networks, metabolic networks in the cell, etc.
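To make the idea of structured patterns in relational data concrete, the following Python sketch enumerates triangles (three mutually linked objects) in an undirected graph. Triangle enumeration is only one simple example of a graph pattern, and the edge list with names such as `"A"` and `"B"` is hypothetical, standing in for something like an interaction network.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple


def build_graph(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from an edge list, ignoring self-loops."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for u, v in edges:
        if u != v:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return dict(adjacency)


def find_triangles(adjacency: Dict[str, Set[str]]) -> List[Tuple[str, str, str]]:
    """Return every set of three mutually connected nodes, each listed once."""
    triangles = set()
    for node, neighbours in adjacency.items():
        # Two neighbours of `node` that are also linked to each other close a triangle.
        for a, b in combinations(sorted(neighbours), 2):
            if b in adjacency[a]:
                triangles.add(tuple(sorted((node, a, b))))
    return sorted(triangles)


if __name__ == "__main__":
    edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]  # hypothetical links
    print(find_triangles(build_graph(edges)))
```

The objects here are clearly not independent: whether a node belongs to a pattern depends entirely on its links to other nodes, which is the non-i.i.d. situation the paragraph describes.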

D) Mining across multiple heterogeneous data sources: Multi database and multi relational mining:-

The problem of distributed data mining is very important in network problems. In a distributed environment (such as a sensor or IP network), distributed probes are placed at strategic locations within the network. The problem is to correlate the data seen at the various probes and to discover patterns in the global data seen across all of them. There could be different models of distributed data mining here: one could involve a network operations center (NOC) that collects data from the distributed sites, and another could treat all sites equally. The goal is to minimize the amount of data shipped between the sites, that is, to reduce the communication overhead. In distributed mining, one problem is how to mine across multiple heterogeneous data sources: multi-database and multi-relational mining.
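One way to picture the goal of shipping less data is for each site to send a small summary instead of its raw records. The Python sketch below assumes each site holds numeric measurements and that a central collector only needs the global mean and standard deviation; the `Summary` type and the function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Summary:
    """Fixed-size summary a site ships instead of its raw values."""
    count: int
    total: float
    total_sq: float


def summarize(values: Iterable[float]) -> Summary:
    """Compute the local summary at one site in a single pass."""
    count, total, total_sq = 0, 0.0, 0.0
    for v in values:
        count += 1
        total += v
        total_sq += v * v
    return Summary(count, total, total_sq)


def merge(a: Summary, b: Summary) -> Summary:
    """Combine two summaries; the order of merging does not matter."""
    return Summary(a.count + b.count, a.total + b.total, a.total_sq + b.total_sq)


def mean_and_std(s: Summary) -> Tuple[float, float]:
    """Global mean and population standard deviation from a merged summary."""
    if s.count == 0:
        raise ValueError("no data")
    mean = s.total / s.count
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(0.0, s.total_sq / s.count - mean * mean)
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    site_a = summarize([1.0, 2.0, 3.0])
    site_b = summarize([4.0, 5.0])
    print(mean_and_std(merge(site_a, site_b)))
```

Each site transmits three numbers regardless of how many records it holds. The sum-of-squares formula is used for brevity and can lose precision on large values, so a production system would likely prefer a numerically stabler merge.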

E) Mining Non-Relational data:-

Yet another important problem is how to mine non-relational data. The great majority of most organizations' data is held as text rather than in databases, or in more complex formats including image, multimedia, and Web data. Thus, there is a need to study data mining methods that go beyond classification and clustering. Some interesting questions include how to perform better automatic summarization of text, and how to recognize the movement of objects and people from Web and wireless data logs in order to discover useful spatial and temporal knowledge.

F) Automate Data cleaning:-

Data cleaning, also called data cleansing or scrubbing, deals with detecting and removing errors and inconsistencies from data in order to improve its quality. Data quality problems are present in single data collections, such as files and databases, e.g., due to misspellings during data entry, missing information, or other invalid data. When multiple data sources need to be integrated, e.g., in data warehouses, federated database systems, or global web-based information systems, the need for data cleaning increases significantly.
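A small automated cleaning step might look like the Python sketch below. It assumes records are `(identifier, sequence)` pairs of DNA sequences and that the valid alphabet is `ACGTN`; both are assumptions made for this example, not something the answer specifies.

```python
from typing import Iterable, List, Set, Tuple

VALID_DNA = set("ACGTN")  # assumed alphabet; real data may need more symbols


def clean_records(
    records: Iterable[Tuple[str, str]]
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Normalize records, drop exact duplicates, and report rejected identifiers."""
    cleaned: List[Tuple[str, str]] = []
    rejected: List[str] = []
    seen: Set[Tuple[str, str]] = set()
    for identifier, sequence in records:
        ident = identifier.strip()
        # Remove all whitespace inside the sequence and unify the letter case.
        seq = "".join(sequence.split()).upper()
        if not ident or not seq or set(seq) - VALID_DNA:
            rejected.append(ident)  # missing information or invalid symbols
            continue
        if (ident, seq) in seen:
            continue  # duplicate after normalization
        seen.add((ident, seq))
        cleaned.append((ident, seq))
    return cleaned, rejected


if __name__ == "__main__":
    raw = [(" seq1 ", "acg t"), ("seq1", "ACGT"), ("seq2", "ACGX"), ("", "ACGT")]
    print(clean_records(raw))
```

Keeping the rejected identifiers, rather than silently discarding them, lets a person review what the automated step could not repair.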

G) Privacy preserving data mining:-

Privacy-preserving data management is an emerging research area that arose in response to two needs: analyzing data and protecting the privacy of the data owners. Privacy-preserving data publishing emphasizes the need to address the privacy threats that arise when data is shared. A newer approach seeks to protect data not at the infrastructure level but at the level of individual data elements or aggregate data types. This kind of pervasive security can be achieved by classifying data and enforcing access control.
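The idea of classifying data and enforcing access control at the element level can be sketched as follows in Python. The sensitivity levels, the field names in `FIELD_LABELS`, and the `redact` function are all hypothetical and serve only to illustrate the approach.

```python
from enum import IntEnum
from typing import Any, Dict, Mapping


class Sensitivity(IntEnum):
    PUBLIC = 0
    RESTRICTED = 1
    CONFIDENTIAL = 2


# Hypothetical classification of individual data elements.
FIELD_LABELS: Dict[str, Sensitivity] = {
    "sample_id": Sensitivity.PUBLIC,
    "expression_level": Sensitivity.RESTRICTED,
    "patient_name": Sensitivity.CONFIDENTIAL,
}


def redact(
    record: Mapping[str, Any],
    clearance: Sensitivity,
    labels: Mapping[str, Sensitivity] = FIELD_LABELS,
) -> Dict[str, Any]:
    """Return only the fields the caller's clearance permits.

    Unlabelled fields are treated as CONFIDENTIAL (deny by default).
    """
    return {
        field: value
        for field, value in record.items()
        if labels.get(field, Sensitivity.CONFIDENTIAL) <= clearance
    }


if __name__ == "__main__":
    record = {"sample_id": "S1", "expression_level": 2.5, "patient_name": "X", "notes": "n"}
    print(redact(record, Sensitivity.RESTRICTED))
```

Because the check is applied to each field rather than to a whole file or database, the same record can be shared with analysts at different clearance levels without duplicating the storage.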
