Question

Consider a VEX-executing VLIW machine with the following characteristics:

  • The machine supports 4 slots (4-wide machine) with the following resources:
    1. 2 memory units each with a load latency of 3 cycles
    2. 2 integer-add/sub functional units with a latency of 2 cycles
    3. 1 integer-multiply functional unit with a latency of 4 cycles
  • Each functional unit in the machine is pipelined and can be issued a new operation at each cycle. However, the results of an operation are only available after the latency of the operation has passed.
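
As a rough illustration of the latency rule, the Python sketch below computes the first cycle in which a dependent operation could issue. It assumes that cycles are numbered from 1 and that a result of an operation issued in cycle c with latency L can be read by an operation issued in cycle c + L. The helper name `earliest_consumer_cycle` is hypothetical and is not part of VEX.

```python
# Latencies (in cycles) taken from the machine description above.
LATENCY = {"ldw": 3, "add": 2, "sub": 2, "mpy": 4}


def earliest_consumer_cycle(opcode, issue_cycle):
    """First cycle in which an operation reading this result may issue.

    Assumption: an operation issued in cycle c with latency L makes its
    result readable in cycle c + L.
    """
    return issue_cycle + LATENCY[opcode]


# A load issued in cycle 1 can feed a consumer no earlier than cycle 4.
print(earliest_consumer_cycle("ldw", 1))  # 4
# The units are pipelined, so the same memory unit can accept another load
# in cycle 2; that second result is readable from cycle 5.
print(earliest_consumer_cycle("ldw", 2))  # 5
```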

Given the following Sequential VEX Code (a total of 16 instructions):

1. ldw $r1 = 4[$r10]

2. add $r2 = $r1, $r10

3. ldw $r3 = 8[$r10]

4. ldw $r5 = 12[$r20]

5. sub $r4 = $r3, $r2

6. ldw $r6 = 0[$r21]

7. add $r8 = $r20, 4

8. sub $r7 = $r23, $r22

9. mpy $r9 = $r8, $r5

10. ldw $r13 = 12[$r27]

11. ldw $r14 = 4[$r27]

12. mpy $r14 = $r13, $r14

13. add $r11 = $r7, $r6

14. ldw $r12 = 4[$r9]

15. sub $r15 = $r14, 4

16. add $r16 = $r12, 10

Note: An instruction cannot be scheduled in a given slot unless all of its operands are available from other instructions on which it is dependent.

  1. Draw the data dependence graph for the above code.
    • Each node in the graph is an instruction. You can simply write the instruction number (1-16) on the node; you do not need to write the instruction itself.
    • An arrow from node A to node B signifies that the instruction in B is dependent on the instruction in A. Note: You are not required to include the latencies in your submission, but it may prove useful for you to draw the graph with latencies included in order to find the critical path.
  2. Using the dependence graph and given latencies, find the smallest (in terms of execution time) possible valid instruction schedule. Make your schedule as follows:
    • Place the instructions in the earliest slots possible where the dependencies are satisfied.
      • Hint: When you must choose among several instructions whose dependencies are all satisfied, pick the ones that are on the critical path of the code.
    • Ensure that the number of instructions assigned to each type of unit in a cycle does not exceed the number of available memory or functional units of that type.
    • Don’t forget the latencies of the respective operations.
    • It is OK if operations scheduled in the same VLIW instruction read from the same register, as long as no other operation in that VLIW instruction writes to that register.
      • For example, it is valid to schedule instructions 1 and 3 in the same VLIW instruction, even though both read register $r10. However, this does not apply to instructions 12 and 15: although both read register $r14, instruction 12 also writes to $r14.
  3. Calculate the number of cycles it takes to execute the VLIW code.
  4. How many slots remain empty where no operation could be scheduled? Provide the ratio of empty slots to the total number of slots present in the instruction schedule. (Do not simplify; we want to see that you selected the right numbers of empty slots and total slots.)
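
The steps above can be explored with a small script. The Python sketch below is one possible approach, not a reference solution: it derives read-after-write dependences from the 16 instructions and then applies a greedy list scheduler that prefers operations with the longest latency-weighted path to the end of the code. The function names (`parse`, `raw_edges`, `list_schedule`) are hypothetical. It assumes that cycles are numbered from 1, that a result is readable `latency` cycles after issue, and that only read-after-write dependences matter for this code. Greedy list scheduling is a heuristic, so its output should be checked by hand against the dependence graph.

```python
import re

# Machine description from the question.
LATENCY = {"ldw": 3, "add": 2, "sub": 2, "mpy": 4}
UNIT = {"ldw": "mem", "add": "alu", "sub": "alu", "mpy": "mul"}
UNITS_AVAILABLE = {"mem": 2, "alu": 2, "mul": 1}
ISSUE_WIDTH = 4

CODE = """\
ldw $r1 = 4[$r10]
add $r2 = $r1, $r10
ldw $r3 = 8[$r10]
ldw $r5 = 12[$r20]
sub $r4 = $r3, $r2
ldw $r6 = 0[$r21]
add $r8 = $r20, 4
sub $r7 = $r23, $r22
mpy $r9 = $r8, $r5
ldw $r13 = 12[$r27]
ldw $r14 = 4[$r27]
mpy $r14 = $r13, $r14
add $r11 = $r7, $r6
ldw $r12 = 4[$r9]
sub $r15 = $r14, 4
add $r16 = $r12, 10
"""


def parse(code):
    """Return a list of (opcode, destination register, source registers)."""
    ops = []
    for line in code.strip().splitlines():
        opcode, rest = line.split(None, 1)
        dest, srcs = rest.split("=")
        ops.append((opcode, dest.strip(), re.findall(r"\$r\d+", srcs)))
    return ops


def raw_edges(ops):
    """Read-after-write edges (producer, consumer), numbered from 1."""
    last_writer, edges = {}, []
    for i, (_, dest, srcs) in enumerate(ops, start=1):
        for reg in dict.fromkeys(srcs):  # drop duplicate sources, keep order
            if reg in last_writer:
                edges.append((last_writer[reg], i))
        last_writer[dest] = i  # updated after reading, so 12 reads 11's $r14
    return edges


def list_schedule(ops, edges):
    """Greedy list scheduling; returns {instruction number: issue cycle}."""
    preds = {i: [] for i in range(1, len(ops) + 1)}
    succs = {i: [] for i in preds}
    for a, b in edges:
        preds[b].append(a)
        succs[a].append(b)
    lat = {i: LATENCY[ops[i - 1][0]] for i in preds}

    # Priority: longest latency-weighted path from each node to the end.
    # Edges always point from a lower to a higher number, so walk backwards.
    height = {}
    for i in sorted(preds, reverse=True):
        height[i] = lat[i] + max((height[s] for s in succs[i]), default=0)

    issue, cycle = {}, 1
    while len(issue) < len(ops):
        free, slots = dict(UNITS_AVAILABLE), ISSUE_WIDTH
        ready = [i for i in preds if i not in issue and all(
            p in issue and issue[p] + lat[p] <= cycle for p in preds[i])]
        for i in sorted(ready, key=lambda n: (-height[n], n)):
            unit = UNIT[ops[i - 1][0]]
            if slots and free[unit]:
                issue[i] = cycle
                free[unit] -= 1
                slots -= 1
        cycle += 1
    return issue


ops = parse(CODE)
edges = raw_edges(ops)
issue = list_schedule(ops, edges)
print(sorted(edges))
for cycle in range(1, max(issue.values()) + 1):
    print(cycle, [i for i in issue if issue[i] == cycle])
```

The script prints the dependence edges and, for each cycle, the instructions issued in it. How the total cycle count is reported depends on the convention used: counting up to the last issue cycle, or up to the cycle in which the last result becomes available. The empty-slot ratio follows from whichever count is chosen, multiplied by the 4 slots per VLIW instruction.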