Question
You can find the PDF of "Mining Big Data: Current Status, and Forecast to the Future" through a Google search.

This is the article by Wei Fan.
Lab Instructions: Read the articles enclosed with this assignment (Mining Big Data). For each article, write a minimum of paragraphs. Each paragraph should provide your opinion of the article. Paragraphs should be approximately 4-8 sentences each. Do not plagiarize from the articles provided. All work should be your own. Submit your work as a Microsoft Word file. This lab is due Sunday by 11:59pm. Please structure your submission as follows:

Your Name
Article 1 - Mining Big Data
Paragraph
Mining Big Data: Current Status, and Forecast to the Future

Albert Bifet
Yahoo! Research Barcelona
Av. Diagonal 177
Barcelona, Catalonia, Spain
[email protected]

Wei Fan
Huawei Noah's Ark Lab
Hong Kong Science Park
Shatin, Hong Kong
[email protected]

ABSTRACT

Big Data is a new term used to identify the datasets that due to their large size and complexity, we can not manage them with our current methodologies or data mining software tools. Big Data mining is the capability of extracting useful information from these large datasets or streams of data, that due to its volume, variability, and velocity, it was not possible before to do it. The Big Data challenge is becoming one of the most exciting opportunities for the next years. We present in this issue, a broad overview of the topic, its current status, controversy, and forecast to the future. We introduce four articles, written by influential scientists in the field, covering the most interesting and state-of-the-art topics on Big Data mining.

1. INTRODUCTION

Recent years have witnessed a dramatic increase in our ability to collect data from various sensors, devices, in different formats, from independent or connected applications. This data flood has outpaced our capability to process, analyze, store and understand these datasets. Consider the Internet data. The web pages indexed by Google were around one million in 1998, but quickly reached 1 billion in 2000 and have already exceeded 1 trillion in 2008. This rapid expansion is accelerated by the dramatic increase in acceptance of social networking applications, such as Facebook, Twitter, Weibo, etc., that allow users to create contents freely and amplify the already huge Web volume. Furthermore, with mobile phones becoming the sensory gateway to get real-time data on people from different aspects, the vast amount of data that mobile carrier can potentially process to improve our daily life has significantly outpaced our past CDR (call data record)-based processing for billing purposes only. It can be foreseen that Internet of things (IoT) applications will raise the scale of data to an unprecedented level. People and devices (from home coffee ... railway stations and airports) are all loosely connected. Trillions of such connected components will generate a huge ... construction, police activities to our calendar schedules, and perform deep optimization under the tight time constraints. In all these applications, we are facing significant challenges in leveraging the vast amount of data, including challenges in (1) system capabilities (2) algorithmic design (3) business models.

As an example of the interest that Big Data is having in the data mining community, the grand theme of this year's KDD conference was 'Mining the Big Data'. Also there was a specific workshop BigMine'12 in that topic: 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications. Both events successfully brought together people from both academia and industry to present their most recent work related to these Big Data issues, and exchange ideas and thoughts. These events are important in order to advance this Big Data challenge, which is being considered as one of the most exciting opportunities in the years to come.

We introduce Big Data mining and its applications in Section 2. We summarize the papers presented in this issue in Section 3, and discuss about Big Data ... in Section 4. We point the importance of open-source software tools in Section 5 and give some challenges and forecast to the future in Section 6. Finally, we give some conclusions in ...

2. BIG DATA MINING

The term 'Big Data' appeared for first time in 1998 in a Silicon Graphics (SGI) slide deck by John Mashey with the title of "Big Data and the Next Wave of InfraStress" [9]. Big Data mining was very relevant from the beginning, as the first book mentioning 'Big Data' is a data mining book that appeared also in 1998 by Weiss and Indrukya [34]. However, the first academic paper with the words 'Big Data' in the title appeared a bit later in 2000 in a paper by Diebold [8]. The origin of the term 'Big Data' is due to the fact that we are creating a huge amount of data every day. Usama Fayyad [11] in his invited talk at the KDD BigMine'12 Workshop ...
Answer #1

Abstract:

Big Data refers to large volumes of structured and unstructured data that organizations collect on a daily basis. The amount of data is not the important part; the information gathered from that data is the key. Collecting and analyzing Big Data gives organizations enhanced insight, better decision making, and process automation. Almost everyone can agree that big data has taken the business world by storm, but what's next? Will data continue to grow? What technologies will develop around it? Or will big data become a relic as soon as the next trend (cognitive technology? fast data?) appears on the horizon? I believe that big data is only going to get bigger and that companies that ignore it will be left further and further behind. This paper examines what big data is, how it helps organizations extract information, its tools and technologies, and its future.

Introduction:

In this digital era, analysts have enormous amounts of data on hand. Big Data is the term for a collection of unstructured, semi-structured and structured datasets whose volume, complexity and rate of growth make them difficult to capture, manage, process or analyze with typical database software tools and technologies. The data comes in many varieties: text, video, image, audio, web page log files, blogs, tweets, location information, sensor data, etc. Discovering useful insight from such huge datasets requires smart and scalable analytics services, programming tools and applications [1]. Data mining, also known as Knowledge Discovery in Databases (KDD), is an analytical process used in different disciplines to search for significant relationships among variables in large data sets. Analyzing fast and massive stream data may lead to new valuable knowledge and theoretical concepts. Big data has the potential to help organizations improve operations and make faster and more intelligent decisions.

Big Data mining:

Big data is a term for data sets that are so large or intricate that traditional data processing applications are inadequate. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying and information privacy. In short, big data is data that is too big for traditional data management to handle. "Big", of course, is subjective. That is why it is usually explained along three vectors: volume, velocity, and variety, the three Vs. Two more Vs are often added: variability and value.

Global Pulse - Big Data for development:

One example is the work that Global Pulse is doing, using Big Data to improve life in developing countries. Global Pulse is a United Nations initiative, launched in 2009, that functions as an innovative lab and is based on mining Big Data for developing countries.

Its strategy consists of 1) researching innovative methods and techniques for analyzing real-time digital data to detect emerging vulnerabilities early; 2) assembling a free and open source technology toolkit for analyzing real-time data and sharing ...

Challenges and Opportunities:

  • Early warning: develop a fast response in times of crisis by detecting anomalies in the usage of digital media
  • Real-time awareness: design programs and policies with a more fine-grained representation of reality
  • Real-time feedback: check which policies and programs fail by monitoring them in real time, and use this feedback to make the needed changes
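
The early-warning idea above (detecting anomalies in a stream as it arrives) can be sketched in a few lines. This is only an illustrative sketch, not the method Global Pulse or the article uses: the class name, the z-score threshold of 3.0, the warm-up length, and the sample numbers are all assumptions made for the example. It keeps a running mean and variance (Welford's online update), so the stream never has to be stored.

```python
import math


class StreamingAnomalyDetector:
    """Flags values that deviate strongly from the running mean of a stream."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # z-score cutoff (assumed value)
        self.warmup = warmup        # observations needed before flagging
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, value):
        """Return True if value looks anomalous, then fold it into the stats."""
        is_anomaly = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                is_anomaly = True
        # Welford's online update: one pass, constant memory
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return is_anomaly


# Hypothetical hourly counts of a keyword on social media
hourly_mentions = [10, 12, 11, 9, 10, 11, 95, 10]
detector = StreamingAnomalyDetector()
flags = [detector.update(v) for v in hourly_mentions]
anomalous_hours = [i for i, flagged in enumerate(flags) if flagged]
print(anomalous_hours)
```

With this made-up data only the spike to 95 is flagged. A real system would likely need to handle seasonality and would probably avoid folding confirmed anomalies back into the baseline.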

Contributed articles:

The articles contributed on Big Data mining are:

- Scaling Big Data Mining Infrastructure: The Twitter Experience
- Mining Heterogeneous Information Networks: A Structural Analysis Approach
- Big Graph Mining: Algorithms and Discoveries
- Mining Large Streams of User Data for Personalized Recommendations

Controversy about Big Data:

As Big Data is a new hot topic, there has been a lot of controversy about it:

  • There is no need to distinguish Big Data analytics from data analytics, as data will continue growing, and it will never be small again.
  • In real-time analytics, data may be changing. In that case, what is important is not the size of the data but its recency.
  • Limited access to Big Data creates new digital divides. There may be a digital divide between the people and organizations that are able to analyze Big Data and those that are not.
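
The recency point above can be made concrete with a sliding time window, where only events from the last N seconds count and older ones are discarded no matter how many there were. The following is a minimal sketch under assumed names (`SlidingWindowCounter`, a 60-second window, made-up timestamps); it is not taken from the article, and it assumes timestamps arrive in non-decreasing order.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts only the events that happened within the last window_seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # timestamps, oldest on the left

    def add(self, timestamp):
        """Record an event and drop everything that has aged out."""
        self.events.append(timestamp)
        self._evict(timestamp)

    def count(self, now):
        """Number of events still inside the window at time now."""
        self._evict(now)
        return len(self.events)

    def _evict(self, now):
        # Remove timestamps that are window_seconds old or older
        while self.events and self.events[0] <= now - self.window_seconds:
            self.events.popleft()


counter = SlidingWindowCounter(window_seconds=60)
for t in [0, 10, 50, 70, 100]:  # hypothetical event times in seconds
    counter.add(t)

recent = counter.count(now=100)  # only events after t=40 are counted
later = counter.count(now=200)   # everything has aged out by now
print(recent, later)
```

Memory use depends on how many events fall inside the window, not on the total size of the stream, which is the sense in which recency matters more than volume here.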

Tools: Open Source Revolution:

Big Data infrastructure is built around Hadoop and other related software, such as:

  • Apache Hadoop
  • Apache Hadoop related projects
  • Apache S4
  • Storm

In Big Data Mining, there are many open source initiatives. The most popular are the following:

  • Apache Mahout
  • R
  • MOA
  • Vowpal Wabbit

Forecast to the future:

There are many important future challenges in Big Data management and analytics that arise from the nature of the data:

1. Data volumes will continue to grow. In today's Internet-connected world we will keep generating large amounts of data as the number of handheld and Internet-connected devices grows exponentially.

2. Ways to analyze data will improve. According to Ovum, while SQL remains the standard, Spark is emerging as a complementary analytical tool and will continue to grow.

3. More tools for analysis (without the analyst) will emerge. Microsoft and Salesforce both recently announced features that let non-coders create apps to view business data.

4. Prescriptive analytics will be built into business analytics software. IDC predicts that before 2020 half of all business analytics software will include this kind of intelligence. Users will want to be able to use data to make decisions in real time, with programs like Kafka and Spark.

5. Big data will face huge challenges around privacy, especially with the new privacy regulation by the European Union. Companies will be forced to address the 'elephant in the room' around their privacy controls and procedures. According to Gartner, about 50% of business ethics violations will be related to data by 2018.

Conclusion:

Big Data is going to continue growing during the next years, and each data scientist will have to manage a much larger amount of data every year. This data is going to be more diverse, larger, and faster. In this paper we discussed some insights about the topic, what we consider to be the main concerns, and the main challenges for the future.

References:

[1] SAMOA, http://samoa-project.net, 2013.

[2] C. C. Aggarwal, editor. Managing and Mining Sensor Data. Advances in Database Systems. Springer, 2013.

[3] Apache Hadoop, http://hadoop.apache.org.

[4] Apache Mahout, http://mahout.apache.org.

[5] A. Bifet, G. Holmes, R. Kirkby, and B. Pfahringer. MOA: Massive Online Analysis, http://moa.cms.waikato.ac.nz/. Journal of Machine Learning Research (JMLR), 2010.

[6] C. Bockermann and H. Blom. The streams Framework. Technical Report 5, TU Dortmund University, 12 2012.
