In the hierarchical model, the memory consists of separate instruction and data caches along with main memory. The ARM processor runs at 1 GHz.
a) Given,
The instruction cache always hits, whereas the data cache has a 5% miss rate.
The processor takes 60 ns to access main memory on a miss.
At 1 GHz the clock period is 1 ns, and a cache hit is taken as one clock cycle. The average memory access time is:
T = 1 ns + 0.05 * 60 ns
T = 4 ns
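The calculation in part (a) can be checked with a short script. This is a minimal sketch: the function name `average_memory_access_time` and its parameter names are illustrative, and it assumes, as the calculation above does, that a cache hit costs one 1 ns clock cycle.

```python
def average_memory_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average access time = hit time + miss rate * miss penalty."""
    return hit_time_ns + miss_rate * miss_penalty_ns


clock_hz = 1e9              # 1 GHz processor clock (from the problem)
cycle_ns = 1e9 / clock_hz   # clock period in ns; assumed to equal the hit time

# Data cache: 5% miss rate, 60 ns main-memory access on a miss.
amat_ns = average_memory_access_time(cycle_ns, 0.05, 60.0)
print(f"Average memory access time: {amat_ns:.2f} ns")
```

With the values from part (a) this evaluates to about 4 ns, which is 4 clock cycles at 1 GHz.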
b) For the non-ideal memory system, the load and store word instructions contribute the following to the average CPI:
CPI = (5 * 25 + 4 * 10) / 100
CPI = 1.65
This uses 5 cycles for load instructions (25% of the mix) and 4 cycles for store instructions (10% of the mix).
c) The average CPI for the benchmark is:
ACPI = (5*25 + 4*10 + 3*11 + 3*2 + 4*52)/100
ACPI = 4.12
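The weighted sums in parts (b) and (c) can be reproduced as follows. This is a sketch only: the percentages and cycle counts come from the formulas above, but apart from loads and stores the answer does not name the instruction classes, so the labels `class_c`, `class_d`, and `class_e` are placeholders.

```python
# Instruction mix: label -> (percent of instructions, cycles per instruction).
# Only "load" and "store" are named in the answer; the other labels are
# placeholders for the remaining terms of the formula.
mix = {
    "load": (25, 5),
    "store": (10, 4),
    "class_c": (11, 3),
    "class_d": (2, 3),
    "class_e": (52, 4),
}


def weighted_cpi(classes):
    """Sum of percent * cycles over the given classes, divided by 100."""
    return sum(percent * cycles for percent, cycles in classes) / 100


# Part (b): contribution of loads and stores only.
load_store_cpi = weighted_cpi([mix["load"], mix["store"]])

# Part (c): every class in the benchmark.
average_cpi = weighted_cpi(mix.values())

print(f"Load/store contribution: {load_store_cpi:.2f}")
print(f"Average CPI: {average_cpi:.2f}")
```

The percentages sum to 100, so dividing the weighted total by 100 gives cycles per instruction directly.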
d) Now, for a 7% miss rate, the average CPI is given as:
ACPI = 4.12 + 2.45
ACPI = 6.57
(25 + 10 = 35 of every 100 instructions are loads and stores; 35 * 0.07 = 2.45)
Exercise 8.16 You are building a computer with a hierarchical memory system that consists of separate...
Consider a processor with a CPI of 1.5, excluding memory stalls. The instruction cache has a miss rate of 1.5%, whereas the miss rate of the data cache is 3.5%. The miss penalty of the data cache is 80 cycles. The percentage of load/store instructions within the running programs is 25%. If the CPI of the whole system, including memory stalls, is 2.5, calculate the miss penalty of the instruction cache. Miss penalty of the instruction cache: ____ cycles.
4B, 20%) Compare performance of a processor with cache vs. without cache. Assume an ideal processor with 1-cycle memory access, CPI = 1. Assume main memory access time of 8 cycles. Assume 40% of instructions require memory data access. Assume cache access time of 1 cycle. Assume hit rate 0.90 for instructions, 0.80 for data. Assume miss penalty (time to read memory into cache and from cache to processor) is 10 cycles. Compare execution times of 100-thousand instructions: 4B,...
6. Memory Access Time [15 points] Consider a MIPS processor that includes a cache, a main memory, and a hard drive. Access times of cache memory, main memory, and hard drive are 5 ns, 200 ns, and 1000 ns, respectively. Assume that cache memory is divided into instruction cache and data cache. Assume that data cache has a 90% hit rate. Assume that main memory has a 98% hit rate and hard drive is perfect (it has a 100% hit...
Consider a memory hierarchy using one of the three organizations for main memory shown in a figure below. Assume that the cache block size is 32 words, that the width of organization b is 4 words, and that the number of banks in organization c is 2. If the main memory latency for a new access is 10 cycles, sending address time is 1 cycle and the transfer time is 1 cycle, what are the miss penalties for each of...
Suppose you have a machine with separate I- and D-caches. The miss rate on the I-cache is 2.6%, and on the D-cache 3.8%. On an I-cache hit, the value can be read in the same cycle the data is requested. On a D-cache hit, one additional cycle is required to read the value. The miss penalty is 100 cycles for data cache, 150 for I-cache. 40% of the instructions on this RISC machine are LW or SW instructions,...
virtual memory support into our baseline 5-stage MIPS pipeline using the TLB miss handler. Assume that accessing the TLB does not incur an extra cycle in memory access in case of hits. Without virtual memory support (i.e. she had only a single address space for the entire system, or a physical address is same as a logical address), the average cycles per instruction (CPI) was 2 to run Program X. If the TLB misses 10 times for instructions and 20...
2. Cache hierarchy You are building a computer system with in-order execution that runs at 1 GHz and has a CPI of 1, with no memory accesses. The memory system is a split L1 cache. Both the I-cache and the D-cache are direct mapped and hold 32 KB each, with a block size of 64 bytes. ...
Table 1: Load 26%; Compare 14%; Shift left and shift right 4%; Store 9%; Load immediate 4%; AND 3%; Add 14%; Conditional branch 17%; OR 5%; Sub 0%; Jump 1%; Other register-register instructions (XOR, NOT, etc.) 1%; Multiply 0%; Call 1%; Divide 0%; Return 1%. Using the data in Table 1, which of the following two enhancements will result in faster execution of the five benchmark programs that are described by the instruction frequency data? Assume that the computer used...
Computer Architecture The format of this document is as follows: First, I give a practice problem for which the solution is also provided. In bold italic font, I slightly modify the problem for your homework. 3) The 4-Stage Pipeline below suffers from the memory access resource conflict as shown below (instruction i and i+2 want to access memory at the same time and i+2 needs to be denied, so it waits for the next cycle; in the next cycle it...
1 Overview The goal of this assignment is to help you understand caches better. You are required to write a cache simulator using the C programming language. The programs have to run on iLab machines. We are providing real program memory traces as input to your cache simulator. The format and structure of the memory traces are described below. We will not give you improperly formatted files. You can assume all your input files will be in proper format as...