計算機結構
Midterm Exam
Nov. 11. 2003
1.[14 points] True or False:
a.If all other factors are the same, a computer with a higher throughput on some
workload has better performance compared to a computer with a lower
throughput.
b.To find the execution time of a C++ program which compiles into 1 million
machine instructions, multiply 1 million instructions by CPI over the clock
rate.
c.The best benchmark of a computer consists of real programs that you will
routinely use.
d.In the MIPS ISA each register holds 32 bits.
e.In the MIPS ISA there are 32 programmable general-purpose registers.
f.J-format instructions support conditional branchin
g.
g.The stack in MIPS grows from lower address to higher address.
2.[6 points] Fill in the following blanks:
R-format:
funct op rs rt rd shamt
____ bits ____ bits ____ bits____ bits____ bits ____ bits I-format:
op rs rt Addr/immediate
____ bits ____ bits ____ bits____ bits
J-format:
op address ____ bits ____ bits
3.[5 points] List the five class components of a computer as defined by the
textbook authors.
4.[5 points] Suppose Microsoft Excel takes 2 seconds to recalculate the grades for
Computer Architecture on a 133MHz PC, which exhibits an average CPI of 2.66.
What MIPS (millions of instructions per second) rating does the PC have for this
benchmark situation? (By the way, we are making all of these numbers up.)
5. [5 points] Suppose die area is 0.42 cm 2 and there are 4 defects per cm 2. Calculate
the yield. Then calculate the yield if defects per area can be cut in half.
6. [5 points] Suppose you wrote a program which executes in 100μs that loads a
string into memory and prints it out. You discover a trick to loading the string into memory which will speed up that part of the program by 4 times. If loading the string took 80% of the execution before, what will the execution time be after making this change?
7. [10 points] Performance:
Two machine designs have been proposed, A and B. Assume that Machine A’s clock rate is 500MHz and Machine B’s clock rate is 400MHz . Both machines have three classes of instructions. The cycle counts per instruction for the three classes are as follows:
Machine A
Machine B
Class X 4 2 Class Y 3 4 Class Z
2
1
Two programs of interest that will be run on these machines require the following number of instructions (in billions) for each instruction class:
Program 1
Program 2
Class X 1 1 Class Y 2 2 Class Z
5
10
(a) Which program is faster on Machine A? Please calculate the average CPI for both programs.
(b) Which program will execute faster according to MIPS (million instruction per
second) on Machine A?
(c) How much faster (or slower) is Machine A compared to Machine B if two programs were run successively on these two machines?
8. [5 points] Suppose that we run a benchmark suite using a CPU with the
instruction classes and CPIs from the Appendix A . Given that 70% of those instructions were arithmetic operations, 10% are branches, and the remaining 20% are data transfers, compute the average CPI for the machine running this benchmark suite.
?
?
????
?+=
211
2
DieArea Area DefectsPer Yield
9.[5 points] Branch offset computation problem – Fill in the missing line of MIPS
assembly code so that it branches to the instruction labelled loop when the
contents of the $t0 register are nonzero.
loop:
lbu $t0,0($s0) # get byte pointed to by $s0
addi $s0,$s0,1 # increment pointer to next byte
andi $t0,$t0,127 # mask off most significant bit ___________________
s ub $s2,$s0,$s1 # subtract end pointer from start
. . . . .
10. [10 poins] Here is a C fragment that repeatedly calls sequence:
main() {
int t;
int sum = 0;
for(t = 0; t < 5; t++)
sum = sum + sequence(t);
}
This might translate into the following assembly code:
main: sw $ra, 0($sp)
addi $sp, $sp, -4
# Initialize sum = 0 and t = 0
add $s0, $0, $0 # $s0 = sum
addi $t0, $0, 0 # $t0 = 0
# Call sequence ( t )
loop: move $a0, $t0
sequence
jal
# Increment loop index t
addi $t0, $t0, 1
# Compute sum = sum + sequence ( t )
add $s0, $s0, $v0
# Repeat while ( t < 5 )
slti $t1, $t0, 5
bne $t1, $0, loop
addi $sp, $sp, 4
$ra,
0($sp)
lw
$ra
jr
This code fragment does not correctly follow the MIPS procedure calling
convention. What is wrong? Show the changes required to make this code
observe the calling convention.
11. [15 points] Considering the code fragment given below. Suppose the label
graduation used to be quite close to the bne instruction. Unfortunately, the program that this fragment is from has been rewritten so that it is at an address that is much farther, over 2 megabytes away in fact. Rewrite the code fragment so that the control flow stays the same. add $t0,$t0,$t1 # receive credits s ge $t2,$t0,$t3 # set $t2=1 if enough credits
bne $t2,$0,graduation # get out of here if enough s ub $s 0,$s 0,12000 # pay tuition beq $0,$0,another s eme s ter # keep looping… ...
graduation:
Use comments as way to gain partial credit if you’re not sure of your answer.
12. [5 points]Execute the following MIPS code fragments, showing the changes that
occur in the register file and in memory. You only need to show the changes .
Registers
B EFORE A FTER Memory B EFORE A FTER $16 0x10× 0x20 0X 22 × $17 0x14 0x24 0X 30 × $18 0x16× 0x28 0X 40 $19 0x28 0x2
C 0X 50 × $20 0x1234 0x30 0X 60 × Registers B EFORE A FTER Memory B EFORE A FTER $0 0x00× 0x20 0X 10 × $1 0x14 0x24 0X 30 × $2 0x16 0x28 0X 40 × $3 0x28× 0x2C 0X 50 × $4 0x1234×
0x30 0X 60 × (c) Is the branch taken? (answer YES or NO )
13. [10 points] Carry look-ahead Adder
Please derive c 1, c 2, c 3, c 4 for a four bit carry look-ahead adder in terms of p i, g i , and c 0.
(a)
addi $19, $0, 0x20 lw $17, 0x04($19) add $20, $19, $16 sw $20, 0x08($19) (b)
addi $1, $0, 0x20 lw $2, 0x04($3) add $0, $3, $1 bne $0, $1, loop
Appendix A. Given a machine with the following instruction classes with the given CPIs for those classes:
Instruction Class CPI
Data transfers 2
Arithmetic operations 1
Branches 3