Memory operations currently take 30% of execution time. A new widget called a cache speeds up 80% of memory operations by a factor of 4. A second new widget called an L2 cache speeds up half of the remaining 20% by a factor of 2. What is the total speedup?
Answer is as follows:

The method is to apply Amdahl's law for multiple enhancements:

Speedup = 1 / [(unenhanced fraction) + (enhanced fraction 1)/speedup1 + (enhanced fraction 2)/speedup2 + ...]

Plugging in the given values:

Speedup = 1 / [0.7 + (0.3*0.8)/4 + (0.3*0.2*0.5)/2 + 0.3*0.2*0.5] = 1/0.805 = 1.2422

Note that the last term, 0.3*0.2*0.5 = 0.03, is the half of the remaining memory operations that is NOT sped up, so it belongs in the unenhanced part of the denominator.
Explanation:

Memory operations = 30% of execution time = 30/100 = 0.3
Cache: 80% of memory operations = 0.3 * 0.80 = 0.24 of total time; sped up 4x, this now takes 0.24/4 = 0.06 seconds, saving 0.18 seconds.
Remaining 20% of memory operations = 0.3 * 0.20 = 0.06
L2 cache: half of that remainder = 0.03; sped up 2x, this now takes 0.03/2 = 0.015 seconds, saving 0.015 seconds. Total saved = 0.18 + 0.015 = 0.195 seconds.
What had taken 1 second now takes 1 - 0.195 = 0.805 seconds.
Speedup = 1/0.805 = 1.2422 (0.805 is exactly the denominator of the formula above).
So the total speedup for the given problem is 1.2422 (about a 24% improvement).
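The arithmetic above is easy to check programmatically. Here is a minimal Python sketch of the multi-enhancement Amdahl's-law formula; the function name and the list-of-pairs interface are my own choices for illustration, not from the problem:

```python
def amdahl_speedup(enhancements):
    """Overall speedup given [(fraction_of_original_time, speedup_factor), ...].

    Each fraction is measured against the ORIGINAL execution time; whatever
    is not covered by an enhancement runs at its original speed.
    """
    enhanced = sum(f for f, _ in enhancements)
    # New total time = unenhanced portion + each enhanced portion divided
    # by its speedup factor (the denominator of Amdahl's law).
    new_time = (1 - enhanced) + sum(f / s for f, s in enhancements)
    return 1 / new_time

# Values from this problem: memory is 30% of time; the cache covers 80% of
# that at 4x, and the L2 cache covers half of the remaining 20% at 2x.
mem = 0.30
cache_frac = mem * 0.80        # 0.24 of total time, sped up 4x
l2_frac = mem * 0.20 * 0.5     # 0.03 of total time, sped up 2x

print(round(amdahl_speedup([(cache_frac, 4), (l2_frac, 2)]), 4))  # 1.2422
```

The untouched half of the remaining memory operations (0.03 of total time) never appears as an enhancement, so it automatically falls into the `1 - enhanced` term, matching the hand calculation of 1/0.805.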
If there is any query, please ask in the comments.