SimpleScalar Example Program. Soonchunhyang University, Department of Computer Science, 2002. 10. 8, Hong Jong-guk (홍종국, siro1hope@hanmail.net)

1 Soonchunhyang University, Department of Computer Science, 2002. 10. 8, Hong Jong-guk (siro1hope@hanmail.net)
SimpleScalar Example Program

2 C example: file factorial.c
#include <stdio.h>

int factorial(int val);

main()
{
    int inval=3, outval;
    outval = factorial(inval);
    printf("%d factorial = %d\n", inval, outval);
}
Advanced Computer Architecture I

3 C example: file
int factorial(int val)
{
    if(val == 1)
        return 1;
    else
        return (val * factorial(val-1));
}
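The recursive factorial above stops only when val reaches exactly 1, so an argument of 0 or a negative number would never hit the base case. A minimal sketch of a guarded variant follows. The names factorial_guarded and print_factorial are hypothetical (they are not in the slides), and returning 1 for any val <= 1 is an assumption about the desired behavior.

```c
#include <stdio.h>

/* Hypothetical variant of the slide's factorial(): the base case is
 * widened from (val == 1) to (val <= 1) so that 0 and negative inputs
 * terminate instead of recursing without end. */
int factorial_guarded(int val)
{
    if (val <= 1)
        return 1;
    return val * factorial_guarded(val - 1);
}

/* Prints one line in the same format as the slide's printf call. */
void print_factorial(int inval)
{
    printf("%d factorial = %d\n", inval, factorial_guarded(inval));
}
```

Calling print_factorial(3) prints "3 factorial = 6", the same line the slide's main() produces.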

4 Compilation
    .file 1 "factorial.c"
# GNU C [AL 1.1, MM 40, tma 0.1] SimpleScalar running sstrix compiled by GNU C
# Cc1 defaults:
# -mgas -mgpOPT
# Cc1 arguments (-G value = 8, Cpu = default, ISA = 1):
# -quiet -dumpbase -o
gcc2_compiled.:
__gnu_compiled_c:
    .rdata
    .align 2
$LC0:
    .ascii "%d factorial = %d\n\000"
    .text
    .globl factorial
    .loc 1 5

5
    .ent main
main:
    .frame $fp,32,$31 # vars= 8, regs= 2/0, args= 16, extra= 0
    .mask 0xc ,-4
    .fmask 0x ,0
    subu $sp,$sp,32
    sw $31,28($sp)
    sw $fp,24($sp)
    move $fp,$sp
    jal __main
    li $2,0x # 3
    sw $2,16($fp)
    lw $4,16($fp)
    jal factorial
    sw $2,20($fp)
    la $4,$LC0
    lw $5,16($fp)
    lw $6,20($fp)
    jal printf

6
$L1:
    move $sp,$fp # sp not trusted here
    lw $31,28($sp)
    lw $fp,24($sp)
    addu $sp,$sp,32
    j $31
    .end main
    .loc 1 12
    .ent factorial
factorial:
    .frame $fp,24,$31 # vars= 0, regs= 2/0, args= 16, extra= 0
    .mask 0xc ,-4
    .fmask 0x ,0
    subu $sp,$sp,24
    sw $31,20($sp)
    sw $fp,16($sp)
    move $fp,$sp
    sw $4,24($fp)
    lw $2,24($fp)

7
    li $3,0x # 1
    bne $2,$3,$L3
    li $2,0x # 1
    j $L2
    j $L4
$L3:
    lw $3,24($fp)
    subu $2,$3,1
    move $4,$2
    jal factorial
    lw $4,24($fp)
    mult $2,$4
    mflo $3
    move $2,$3
$L4:
$L2:
    move $sp,$fp # sp not trusted here
    lw $31,20($sp)
    lw $fp,16($sp)
    addu $sp,$sp,24
    j $31
    .end factoria
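Each call to factorial in the listing reserves a 24-byte frame (".frame $fp,24,$31", "subu $sp,$sp,24") and stores its argument with "sw $4,24($fp)" before recursing. The sketch below imitates that behavior in C with an explicit array in place of the hardware stack. The function name, the 12-entry limit (12! is the largest factorial that fits in a 32-bit int), and the peak-byte report are assumptions made for illustration, not part of the slides.

```c
#include <stddef.h>

/* Frame size taken from ".frame $fp,24,$31" in the listing. */
#define FACTORIAL_FRAME_BYTES 24

/* Hypothetical helper: computes val! for 1 <= val <= 12 using an explicit
 * array of saved arguments, one entry per simulated call frame.
 * If peak_stack_bytes is non-NULL it receives the stack space the
 * recursive version would occupy at its deepest point.
 * Returns 0 (and reports 0 bytes) for arguments outside the range. */
int factorial_explicit_stack(int val, size_t *peak_stack_bytes)
{
    int saved[12];
    int depth = 0;
    int result = 1; /* value returned by the innermost call (val == 1) */

    if (val < 1 || val > 12) {
        if (peak_stack_bytes != NULL)
            *peak_stack_bytes = 0;
        return 0;
    }

    /* Descend: save the argument, then "call" with val - 1. */
    while (val != 1) {
        saved[depth++] = val;
        val = val - 1;
    }

    /* depth saved frames plus the frame of the base-case call. */
    if (peak_stack_bytes != NULL)
        *peak_stack_bytes = (size_t)(depth + 1) * FACTORIAL_FRAME_BYTES;

    /* Unwind: multiply the returned value by each saved argument,
     * as the "mult $2,$4" / "mflo $3" pair does after each return. */
    while (depth > 0)
        result = result * saved[--depth];

    return result;
}
```

For the slide's input of 3, this returns 6 and reports 72 bytes, that is, three 24-byte frames.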

8 Running sim-safe
☞ /edu/simplesim-3.0/sim-saf factorial.ss
sim-safe: SimpleScalar/PISA Tool Set version 3.0 of November, 2000.
…..
# -config # load configuration from a file
# -dumpconfig # dump configuration to a file
# -h false # print help message
# -v false # verbose operation
# -d false # enable debug message
# -i false # start in Dlite debugger
-seed # random number generator seed (0 for timer seed)
# -q false # initialize and terminate immediately
# -chkpt <null> # restore EIO trace execution from <fname>
# -redir:sim <null> # redirect simulator output to file (non-interactive only)
# -redir:prog <null> # redirect simulated program output to file
-nice # simulator scheduling priority
-max:inst # maximum number of inst's to execute

9
sim: ** starting functional simulation **
3 factorial = 6
sim: ** simulation statistics **
sim_num_insn # total number of instructions executed
sim_num_refs # total number of loads and stores executed
sim_elapsed_time # total simulation time in seconds
sim_inst_rate # simulation speed (in insts/sec)
ld_text_base x # program text (code) segment base
ld_text_size # program text (code) size in bytes
ld_data_base x # program initialized data segment base
ld_data_size # program init'ed `.data' and uninit'ed `.bss' size in bytes
ld_stack_base x7fffc000 # program stack segment base (highest address in stack)
ld_stack_size # program initial stack size
ld_prog_entry x # program entry point (initial PC)
ld_environ_base x7fff8000 # program environment base address address
ld_target_big_endian # target executable endian-ness, non-zero if big endian
mem.page_count # total number of pages allocated
mem.page_mem k # total size of memory pages allocated
mem.ptab_misses # total first level page table misses
mem.ptab_accesses # total page table accesses
mem.ptab_miss_rate # first level page table miss rate
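Every statistic above is printed as a name, a value, and a "#" comment. When collecting results from several runs it can help to pull the name and value out of each line. The following is a rough sketch of such a parser. The function parse_stat_line is hypothetical and is not part of the SimpleScalar tool set, and it assumes the value is a plain number that sscanf can read with %lf.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: splits a line of the form
 *     name value # description
 * into its name and numeric value.
 * Returns 1 on success, 0 if the line does not have that shape
 * (for example a banner line, or a line whose value is missing). */
int parse_stat_line(const char *line, char *name, size_t name_len,
                    double *value)
{
    char buf[64];
    double v;

    if (line == NULL || name == NULL || value == NULL || name_len == 0)
        return 0;

    /* First token is the statistic name, second must be a number. */
    if (sscanf(line, "%63s %lf", buf, &v) != 2)
        return 0;

    if (strlen(buf) >= name_len)
        return 0; /* caller's buffer is too small */

    strcpy(name, buf);
    *value = v;
    return 1;
}
```

Lines such as "sim: ** simulation statistics **" are rejected because their second token is not a number, so the function can be applied to the whole simulator output line by line.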

10 Running sim-outorder
sim-outorder: SimpleScalar/PISA Tool Set version 3.0 of November, 2000.
# -config # load configuration from a file
# -dumpconfig # dump configuration to a file
# -h false # print help message
# -v false # verbose operation
# -d false # enable debug message
# -i false # start in Dlite debugger
-seed # random number generator seed (0 for timer seed)
# -q false # initialize and terminate immediately
# -chkpt <null> # restore EIO trace execution from <fname>
# -redir:sim <null> # redirect simulator output to file (non-interactive only)
# -redir:prog <null> # redirect simulated program output to file
-nice # simulator scheduling priority
-max:inst # maximum number of inst's to execute
-fastfwd # number of insts skipped before timing starts
# -ptrace <null> # generate pipetrace, i.e., <fname|stdout|stderr> <range>
-fetch:ifqsize # instruction fetch queue size (in insts)
-fetch:mplat # extra branch mis-prediction latency
-fetch:speed # speed of front-end of machine relative to execution core

11
sim: ** starting performance simulation **
3 factorial = 6
sim: ** simulation statistics **
sim_num_insn # total number of instructions committed
sim_num_refs # total number of loads and stores committed
sim_num_loads # total number of loads committed
sim_num_stores # total number of stores committed
sim_num_branches # total number of branches committed
sim_elapsed_time # total simulation time in seconds
sim_inst_rate # simulation speed (in insts/sec)
sim_total_insn # total number of instructions executed
sim_total_refs # total number of loads and stores executed
sim_total_loads # total number of loads executed
sim_total_stores # total number of stores executed
sim_total_branches # total number of branches executed
sim_cycle # total simulation time in cycles
sim_IPC # instructions per cycle
sim_CPI # cycles per instruction
sim_exec_BW # total instructions (mis-spec + committed) per cycle
sim_IPB # instruction per branch
IFQ_count # cumulative IFQ occupancy
IFQ_fcount # cumulative IFQ full count
ifq_occupancy # avg IFQ occupancy (insn's)
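The derived figures in this list (sim_IPC, sim_CPI, sim_exec_BW, sim_IPB) are ratios of the raw counters printed above them. The sketch below spells those ratios out. The function names are made up for illustration, and the choice of counters (committed instructions for IPC, CPI and IPB, executed instructions for the execution bandwidth) is my reading of the statistic descriptions rather than something stated on the slide.

```c
/* Hypothetical helpers that recompute sim-outorder's derived statistics
 * from its raw counters. A zero denominator yields 0.0 instead of a
 * division by zero. */

/* sim_IPC: committed instructions per cycle. */
double sim_ipc(double num_insn, double cycles)
{
    return cycles > 0.0 ? num_insn / cycles : 0.0;
}

/* sim_CPI: cycles per committed instruction (the inverse of IPC). */
double sim_cpi(double num_insn, double cycles)
{
    return num_insn > 0.0 ? cycles / num_insn : 0.0;
}

/* sim_exec_BW: all executed instructions (mis-speculated + committed)
 * per cycle. */
double sim_exec_bw(double total_insn, double cycles)
{
    return cycles > 0.0 ? total_insn / cycles : 0.0;
}

/* sim_IPB: committed instructions per committed branch. */
double sim_ipb(double num_insn, double num_branches)
{
    return num_branches > 0.0 ? num_insn / num_branches : 0.0;
}
```

As a made-up example, 1000 committed instructions in 2000 cycles give an IPC of 0.5 and a CPI of 2.0. These numbers are illustrative only and are not taken from the run on the slides.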

12
ifq_rate 0.3292 # avg IFQ dispatch rate (insn/cycle)
ifq_latency # avg IFQ occupant latency (cycle's)
ifq_full # fraction of time (cycle's) IFQ was full
RUU_count # cumulative RUU occupancy
RUU_fcount # cumulative RUU full count
ruu_occupancy # avg RUU occupancy (insn's)
ruu_rate # avg RUU dispatch rate (insn/cycle)
ruu_latency # avg RUU occupant latency (cycle's)
ruu_full # fraction of time (cycle's) RUU was full
LSQ_count # cumulative LSQ occupancy
LSQ_fcount # cumulative LSQ full count
lsq_occupancy # avg LSQ occupancy (insn's)
lsq_rate # avg LSQ dispatch rate (insn/cycle)
lsq_latency # avg LSQ occupant latency (cycle's)
lsq_full # fraction of time (cycle's) LSQ was full
sim_slip # total number of slip cycles
avg_sim_slip # the average slip between issue and retirement
bpred_bimod.lookups # total number of bpred lookups
bpred_bimod.updates # total number of updates
bpred_bimod.addr_hits # total number of address-predicted hits
bpred_bimod.dir_hits # total number of direction-predicted hits (includes addr-hits)
bpred_bimod.misses # total number of misses
bpred_bimod.jr_hits # total number of address-predicted hits for JR's

13
bpred_bimod.jr_seen 61 # total number of JR's seen
bpred_bimod.jr_non_ras_hits.PP # total number of address-predicted hits for non-RAS JR's
bpred_bimod.jr_non_ras_seen.PP # total number of non-RAS JR's seen
bpred_bimod.bpred_addr_rate # branch address-prediction rate (i.e., addr-hits/updates)
bpred_bimod.bpred_dir_rate # branch direction-prediction rate (i.e., all-hits/updates)
bpred_bimod.bpred_jr_rate # JR address-prediction rate (i.e., JR addr-hits/JRs seen)
bpred_bimod.bpred_jr_non_ras_rate.PP # non-RAS JR addr-pred rate (ie, non-RAS JR hits/JRs seen)
bpred_bimod.retstack_pushes # total number of address pushed onto ret-addr stack
bpred_bimod.retstack_pops # total number of address popped off of ret-addr stack
bpred_bimod.used_ras.PP # total number of RAS predictions used
bpred_bimod.ras_hits.PP # total number of RAS hits
bpred_bimod.ras_rate.PP # RAS prediction rate (i.e., RAS hits/used RAS)
il1.accesses # total number of accesses
il1.hits # total number of hits
il1.misses # total number of misses
il1.replacements # total number of replacements
il1.writebacks # total number of writebacks
il1.invalidations # total number of invalidations
il1.miss_rate # miss rate (i.e., misses/ref)
il1.repl_rate # replacement rate (i.e., repls/ref)
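The branch predictor rates are defined in the comments themselves: addr-hits/updates, all-hits/updates, and JR addr-hits/JRs seen. A small sketch of those three ratios follows. The struct and function names are hypothetical, and the counter values in the usage note are invented because the slide does not show them.

```c
/* Hypothetical container for the raw bpred_bimod counters used by the
 * three rate formulas quoted in the statistic descriptions. */
struct bpred_counts {
    unsigned long updates;   /* bpred_bimod.updates   */
    unsigned long addr_hits; /* bpred_bimod.addr_hits */
    unsigned long dir_hits;  /* bpred_bimod.dir_hits (includes addr-hits) */
    unsigned long jr_hits;   /* bpred_bimod.jr_hits   */
    unsigned long jr_seen;   /* bpred_bimod.jr_seen   */
};

/* Ratio helper: returns 0.0 when the denominator is zero. */
static double bpred_ratio(unsigned long num, unsigned long den)
{
    return den != 0 ? (double)num / (double)den : 0.0;
}

/* bpred_addr_rate: addr-hits / updates. */
double bpred_addr_rate(const struct bpred_counts *c)
{
    return bpred_ratio(c->addr_hits, c->updates);
}

/* bpred_dir_rate: all-hits / updates. */
double bpred_dir_rate(const struct bpred_counts *c)
{
    return bpred_ratio(c->dir_hits, c->updates);
}

/* bpred_jr_rate: JR addr-hits / JRs seen. */
double bpred_jr_rate(const struct bpred_counts *c)
{
    return bpred_ratio(c->jr_hits, c->jr_seen);
}
```

With invented counts of 200 updates, 150 address hits and 175 direction hits, the address and direction rates come out as 0.75 and 0.875.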

15
il1.wb_rate 0.0000 # writeback rate (i.e., wrbks/ref)
il1.inv_rate # invalidation rate (i.e., invs/ref)
dl1.accesses # total number of accesses
dl1.hits # total number of hits
dl1.misses # total number of misses
dl1.replacements # total number of replacements
dl1.writebacks # total number of writebacks
dl1.invalidations # total number of invalidations
dl1.miss_rate # miss rate (i.e., misses/ref)
dl1.repl_rate # replacement rate (i.e., repls/ref)
dl1.wb_rate # writeback rate (i.e., wrbks/ref)
dl1.inv_rate # invalidation rate (i.e., invs/ref)
ul2.accesses # total number of accesses
ul2.hits # total number of hits
ul2.misses # total number of misses
ul2.replacements # total number of replacements
ul2.writebacks # total number of writebacks
ul2.invalidations # total number of invalidations
ul2.miss_rate # miss rate (i.e., misses/ref)
ul2.repl_rate # replacement rate (i.e., repls/ref)
ul2.wb_rate # writeback rate (i.e., wrbks/ref)
ul2.inv_rate # invalidation rate (i.e., invs/ref)
itlb.accesses # total number of accesses
itlb.hits # total number of hits
itlb.misses # total number of misses
itlb.replacements # total number of replacements
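The cache and TLB blocks (il1, dl1, ul2, itlb, dtlb) all report the same four rates, each a counter divided by the number of references. The sketch below computes them for one block. The struct and function names are hypothetical, and treating "ref" as the accesses counter, with accesses equal to hits plus misses, is an assumption based on the statistic names.

```c
/* Hypothetical container for one cache or TLB block's raw counters. */
struct cache_counts {
    unsigned long hits;
    unsigned long misses;
    unsigned long replacements;
    unsigned long writebacks;
    unsigned long invalidations;
};

/* Assumed relation: every access is either a hit or a miss. */
unsigned long cache_accesses(const struct cache_counts *c)
{
    return c->hits + c->misses;
}

/* Divides a counter by the access count; 0.0 if nothing was accessed. */
static double per_ref(unsigned long count, const struct cache_counts *c)
{
    unsigned long refs = cache_accesses(c);
    return refs != 0 ? (double)count / (double)refs : 0.0;
}

/* miss_rate: misses/ref. */
double cache_miss_rate(const struct cache_counts *c)
{
    return per_ref(c->misses, c);
}

/* repl_rate: repls/ref. */
double cache_repl_rate(const struct cache_counts *c)
{
    return per_ref(c->replacements, c);
}

/* wb_rate: wrbks/ref. */
double cache_wb_rate(const struct cache_counts *c)
{
    return per_ref(c->writebacks, c);
}

/* inv_rate: invs/ref. */
double cache_inv_rate(const struct cache_counts *c)
{
    return per_ref(c->invalidations, c);
}
```

A block with no writebacks gives a writeback rate of 0.0, which is consistent with the "il1.wb_rate 0.0000" line above, since an instruction cache is normally never written.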

16
itlb.writebacks 0 # total number of writebacks
itlb.invalidations # total number of invalidations
itlb.miss_rate # miss rate (i.e., misses/ref)
itlb.repl_rate # replacement rate (i.e., repls/ref)
itlb.wb_rate # writeback rate (i.e., wrbks/ref)
itlb.inv_rate # invalidation rate (i.e., invs/ref)
dtlb.accesses # total number of accesses
dtlb.hits # total number of hits
dtlb.misses # total number of misses
dtlb.replacements # total number of replacements
dtlb.writebacks # total number of writebacks
dtlb.invalidations # total number of invalidations
dtlb.miss_rate # miss rate (i.e., misses/ref)
dtlb.repl_rate # replacement rate (i.e., repls/ref)
dtlb.wb_rate # writeback rate (i.e., wrbks/ref)
dtlb.inv_rate # invalidation rate (i.e., invs/ref)
sim_invalid_addrs # total non-speculative bogus addresses seen (debug var)
ld_text_base x # program text (code) segment base
ld_text_size # program text (code) size in bytes
ld_data_base x # program initialized data segment base
ld_data_size # program init'ed `.data' and uninit'ed `.bss' size in bytes
ld_stack_base x7fffc000 # program stack segment base (highest address in stack)

17
ld_stack_size 16384 # program initial stack size
ld_prog_entry x # program entry point (initial PC)
ld_environ_base x7fff8000 # program environment base address address
ld_target_big_endian # target executable endian-ness, non-zero if big endian
mem.page_count # total number of pages allocated
mem.page_mem k # total size of memory pages allocated
mem.ptab_misses # total first level page table misses
mem.ptab_accesses # total page table accesses
mem.ptab_miss_rate # first level page table miss rate

18 Using the DLite! debugger (see dlitefactorial.txt)
sim: ** starting functional simulation **
[ ] 0x : lw r16,0(r29)
DLite! >
0x004001f0: addiu r29,r29,-32
0x004001f8: sw r31,28(r29)
0x : sw r30,24(r29)
0x : addu r30,r0,r29
0x : jal x400508
0x : addiu r2,r0,3
0x : sw r2,16(r30)
0x : lw r4,16(r30)
0x : jal x400290
0x : sw r2,20(r30)
0x : lui r4,0x1000
0x : addiu r4,r4,0
0x : lw r5,16(r30)
0x : lw r6,20(r30)
0x : jal x400730
0x : addu r29,r0,r30

19
DLite! >
0x : addiu r29,r29,-24
0x : sw r31,20(r29)
0x004002a0: sw r30,16(r29)
0x004002a8: addu r30,r0,r29
0x004002b0: sw r4,24(r30)
0x004002b8: lw r2,24(r30)
0x004002c0: addiu r3,r0,1
0x004002c8: bne r2,r3,0x4002e8
0x004002d0: addiu r2,r0,1
0x004002d8: j x400330
0x004002e0: j x400330
0x004002e8: lw r3,24(r30)
0x004002f0: addiu r2,r3,-1
0x004002f8: addu r4,r0,r2
0x : jal x400290
0x : lw r4,24(r30)
DLite! >
3 factorial = 6
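In the disassembly above, consecutive instruction addresses differ by 8 (for example 0x004002a0, 0x004002a8, 0x004002b0), so the distance between two addresses can be converted into an instruction count. The helper below is a minimal sketch of that conversion. Its name is hypothetical, and the 8-byte step is read off the listing rather than taken from any SimpleScalar header.

```c
#include <stdint.h>

/* Step between consecutive instruction addresses, as observed in the
 * DLite! listing (0x004002a0 -> 0x004002a8 -> 0x004002b0). */
#define LISTING_INSN_BYTES 8

/* Hypothetical helper: number of instructions from address `from` to
 * address `to`. Positive means `to` lies later in the listing, negative
 * means earlier. Both addresses are assumed to be instruction-aligned. */
int64_t insn_distance(uint32_t from, uint32_t to)
{
    return ((int64_t)to - (int64_t)from) / LISTING_INSN_BYTES;
}
```

For the branch "bne r2,r3,0x4002e8" at 0x004002c8, insn_distance(0x004002c8, 0x004002e8) is 4: a taken branch skips the addiu and the two j instructions and lands on "lw r3,24(r30)", the $L3 path of the assembly listing.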

