Question

Describe the IMPACT cycle. Why does its order of the processes and its recursive nature make sense?
What is the purpose of a data dictionary? Identify four different attributes that could be stored in a data dictionary and describe the purpose of each.
In the ETL process, one important step to process when transforming the data is to work with NULL, N/A, and zero values in the dataset. If you have a field of quantitative data (e.g., number of years each individual in the table has held a full-time job), what would be the effect of the following?
a. Transforming NULL and N/A values into blanks
b. Transforming NULL and N/A values into zeroes
c. Deleting records that have NULL and N/A values from your dataset
Answer #1

1) Describe the IMPACT cycle. Why does its order of the processes and its recursive nature make sense?

Solution:-

The IMPACT cycle is a process that requires you to specify your goal first. It has three main stages:

1. Identify

2. Learn

3. Improve

First, you learn about the current situation, working with whoever is involved. You then accept what you have found as the reality of the subject, and on that basis set a personal goal to achieve under present conditions.

Second, having accepted that reality, you work toward the goal properly. This means studying every point of the subject equally and weighing whatever choices are available to you, which also matters. You keep learning throughout this stage.

Last, you confirm your direction, review your progress, improve your approach, and then move on to planning your final actions for the subject. The order makes sense because each stage depends on the one before it, and the cycle is recursive because the review in the last stage can send you back to set a new goal.

2. What is the purpose of a data dictionary? Identify four different attributes that could be stored in a data dictionary and describe the purpose of each.

Solution:-

A data dictionary is a structured place to keep details of the contents of data flows, processes, and data stores. It is also known as data about data.

It contains four parts:

1. Data Element.

The smallest unit of data.

2. Data Structure.

A group of data elements.

3. Data Flow.

A group of data structures in motion.

4. Data Stores.

A group of data structures at rest.
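The four parts listed above can be sketched as nested records. This is only an illustration, not a standard schema: the class names, the attributes (`name`, `data_type`, `description`, `source`, `destination`) and the sample entries are all hypothetical, chosen to show how each part is built from the one before it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataElement:
    """Smallest unit of data, with a few attributes a dictionary might record."""
    name: str          # field name as it appears in the table
    data_type: str     # e.g. "integer", "text"
    description: str   # what the field means


@dataclass
class DataStructure:
    """A named group of data elements."""
    name: str
    elements: List[DataElement] = field(default_factory=list)


@dataclass
class DataFlow:
    """Data structures moving from a source to a destination."""
    name: str
    source: str
    destination: str
    structures: List[DataStructure] = field(default_factory=list)


@dataclass
class DataStore:
    """Data structures held in one place."""
    name: str
    structures: List[DataStructure] = field(default_factory=list)


def element_names(structure: DataStructure) -> List[str]:
    """List the field names documented for a structure."""
    return [e.name for e in structure.elements]


# Hypothetical entries for an employment table.
person_id = DataElement("person_id", "integer", "Unique key for each individual")
years = DataElement("years_employed", "integer", "Years in a full-time job")
employment = DataStructure("employment_record", [person_id, years])
load = DataFlow("load_employment", "hr_export", "employee_table", [employment])
store = DataStore("employee_table", [employment])
```

Calling `element_names(employment)` returns the documented field names, which is the kind of lookup a data dictionary exists to support.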

3) In the ETL process, one important step when transforming the data is to work with NULL, N/A, and zero values in the dataset. If you have a field of quantitative data, what would be the effect of the following?

a. Transforming NULL and N/A values into blanks.

Solution:-

Incorporating NULL into the relational data warehouse can vastly complicate querying and cause much end-user confusion. If IT specialists (programmers, developers, and DBAs) can’t grasp the idea or importance of NULL, then how can we expect non-technical end users to understand NULL? We must assume that business end users are querying the data warehouse for statistical information; how will NULL affect the answers they get from the data? If the end user queries the warehouse and filters on a nullable value, is it valid to leave the entire record out of the analysis? Would this matter? If a value is unknown at the time of measurement, which is a plausible condition with slowly changing dimensions, would it be better to omit the record from the analysis, or should you include it with some sort of zero or “temporarily unknown” condition? How would you do that? If a value exists in the source database but for some reason the extraction, transformation, and loading (ETL) process failed to bring it into the data warehouse, is it better to treat that value as NULL (the true condition), or should you calculate and assign a median or average value to it?
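For the quantitative field in the question, one concrete effect of blanks can be shown with a small sketch. It assumes a blank is stored as an empty string and uses made-up sample values; `years_employed`, `to_blank`, and `numeric_mean` are hypothetical names, and a real ETL tool may represent blanks differently.

```python
from statistics import mean

# Hypothetical sample: years of full-time employment; None stands in for NULL / N/A.
years_employed = [5, None, 12, 0, None, 3]


def to_blank(values):
    """Replace missing values with an empty string (a 'blank')."""
    return ["" if v is None else v for v in values]


def numeric_mean(values):
    """Average only the entries that are real numbers, skipping blanks."""
    numbers = [v for v in values if isinstance(v, (int, float))]
    return mean(numbers)


blanked = to_blank(years_employed)

# The column now mixes integers and strings, so a direct sum fails.
try:
    sum(blanked)
    direct_sum_works = True
except TypeError:
    direct_sum_works = False

# Aggregation only works once the blanks are filtered out again.
average_skipping_blanks = numeric_mean(blanked)
```

Under these assumptions the field stops being purely numeric, so every later calculation has to filter the blanks out first. The average is then taken over the four known values only.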

b. Transforming NULL and N/A values into zeros.

Solution:-

Step 1: Filter your column's null values into a separate branch using a Filter Row action.

Step 2: Add an Add Constants action to the null branch that adds a new column with whatever name (Zero Value is what I used) with a value set to 0.

Step 3: Add a Select Columns action to the null branch and select the columns you need except the column that holds the null values (the column you set the filter on in Step 1).

Step 4: Left join your filtered branch with your input data set on your unique key.

Step 5: Combine the original column on which you set the filter in Step 1 with the constant column you added to the null rows in Step 2, using a Combine Columns action.
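The five steps above name actions from a visual data-preparation tool. A rough plain-Python equivalent is sketched below, with hypothetical rows and column names (`id`, `years_employed`, `zero_value`). It also compares the average before and after, since that is the effect the question asks about.

```python
from statistics import mean

# Hypothetical rows keyed by a unique id; None stands in for NULL / N/A.
rows = [
    {"id": 1, "years_employed": 5},
    {"id": 2, "years_employed": None},
    {"id": 3, "years_employed": 12},
    {"id": 4, "years_employed": 0},
    {"id": 5, "years_employed": None},
    {"id": 6, "years_employed": 3},
]

# Step 1: filter the rows whose value is null into a separate branch.
null_branch = [r for r in rows if r["years_employed"] is None]

# Steps 2-3: add a constant column set to 0 and keep only the key and the constant.
zero_branch = {r["id"]: {"zero_value": 0} for r in null_branch}

# Step 4: left join the branch back onto the input on the unique key.
joined = [{**r, **zero_branch.get(r["id"], {"zero_value": None})} for r in rows]

# Step 5: combine the original column with the constant column.
filled = [
    r["years_employed"] if r["years_employed"] is not None else r["zero_value"]
    for r in joined
]

# Effect on a simple statistic: compare the known values with the zero-filled column.
mean_known_only = mean([r["years_employed"] for r in rows if r["years_employed"] is not None])
mean_zero_filled = mean(filled)
```

With this sample the zero-filled average is lower than the average of the known values, because each missing person is now counted as having zero years of employment. The genuine zero in row 4 also becomes indistinguishable from the filled-in zeros.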

c. Deleting records that have NULL and N/A values from your data set.

Solution:

(As a side note, I sometimes prefer to talk about a "reporting database" rather than a "data warehouse", because it keeps things in perspective. Some DBAs and developers start making plans for huge server farms and multi-year ETL projects as soon as they hear the words "data warehouse", but in the end it's just a reporting database.)

Anyway, it isn't completely clear where you want to use NULL but it looks like it may be an attribute on a dimension.

I (probably) wouldn't use any of your three approaches, but it depends on the meaning of your data. Importing the data as-is is not useful because part of the value of a data warehouse is that the data has been cleaned and is consistent, which makes querying and comparing data along other dimensions much easier.

Replacing empty strings with 'Unknown' may or may not be correct: what does an empty string mean in the source system? There's a big difference between "it means there's no suburb" and "it means we don't know if there's a suburb". Assuming that an empty string means "no suburb" and NULL means "unknown" then I would import the empty strings as they are, but replace NULL with 'Unknown'. The main reason for doing that is that if the Suburb field will be used as a filter condition in a report, it's easier for users (and possibly your reporting tool) to work with a non-NULL value like 'UNKNOWN'. And if there is no consistency in the source system and you don't know what empty strings and NULLs mean, then you need to clarify that first and ideally fix the source system too (another benefit of a DWH is that it helps to identify inconsistencies and data handling errors in source systems).

Your last idea to convert NULLs to empty strings is the same issue: what does a NULL actually mean in the source system? If it means "no suburb" then replacing it with an empty string is probably a good idea, but if it means something else then you should handle it as something else.

So to summarize, my preference would be to import empty strings as-is, and convert NULL to 'UNKNOWN', but I can't be sure that this actually makes sense in your case. There's no single answer to this question because it all depends on your specific data and what it means. But there's no problem with using NULL in a data warehouse (or any other database) as long as you do it consistently and with a clear understanding of how the source systems handle data.
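For the quantitative field in the original question, the effect of deleting records can be sketched as follows. The records, the extra `age` column, and all values are hypothetical. The point is only that dropping a row removes every field in it, not just the missing one.

```python
from statistics import mean

# Hypothetical records; None stands in for NULL / N/A in years_employed.
records = [
    {"id": 1, "years_employed": 5, "age": 30},
    {"id": 2, "years_employed": None, "age": 22},
    {"id": 3, "years_employed": 12, "age": 45},
    {"id": 4, "years_employed": 0, "age": 19},
    {"id": 5, "years_employed": None, "age": 24},
    {"id": 6, "years_employed": 3, "age": 28},
]

# Delete every record whose years_employed is missing.
complete = [r for r in records if r["years_employed"] is not None]

rows_dropped = len(records) - len(complete)

# The average of the target field now rests on fewer observations.
mean_years_complete = mean([r["years_employed"] for r in complete])

# Other fields shift too, because the deleted rows took their values with them.
mean_age_all = mean([r["age"] for r in records])
mean_age_complete = mean([r["age"] for r in complete])
```

In this sample two of six records disappear, and the average age changes even though no age value was missing. If the missing values are not random (here the dropped rows are younger people), the remaining data is biased.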
