================================================================
== Vivado HLS Report for 'matrix_mult'
================================================================
* Date:           Mon Mar 19 10:00:42 2018

* Version:        2017.4 (Build 2086221 on Fri Dec 15 21:13:33 MST 2017)
* Project:        matrix_mult
* Solution:       solution1
* Product family: zynq
* Target device:  xc7z020clg484-1


================================================================
== Performance Estimates
================================================================
+ Timing (ns): 
    * Summary: 
    +--------+-------+----------+------------+
    |  Clock | Target| Estimated| Uncertainty|
    +--------+-------+----------+------------+
    |ap_clk  |  10.00|      8.75|        1.25|
    +--------+-------+----------+------------+

+ Latency (clock cycles): 
    * Summary: 
    +-----+-----+-----+-----+---------+
    |  Latency  |  Interval | Pipeline|
    | min | max | min | max |   Type  |
    +-----+-----+-----+-----+---------+
    |  290|  290|  290|  290|   none  |
    +-----+-----+-----+-----+---------+

    + Detail: 
        * Instance: 
        N/A

        * Loop: 
        +--------------------------------+-----+-----+----------+-----------+-----------+------+----------+
        |                                |  Latency  | Iteration|  Initiation Interval  | Trip |          |
        |            Loop Name           | min | max |  Latency |  achieved |   target  | Count| Pipelined|
        +--------------------------------+-----+-----+----------+-----------+-----------+------+----------+
        |- memcpy.tempA.A                |   65|   65|         3|          1|          1|    64|    yes   |
        |- memcpy.tempB.B                |   65|   65|         3|          1|          1|    64|    yes   |
        |- matrix_mult__outer_loop       |   38|   38|        15|          8|          1|     4|    yes   |
        |- memcpy.result.tempResult.gep  |   65|   65|         3|          1|          1|    64|    yes   |
        +--------------------------------+-----+-----+----------+-----------+-----------+------+----------+
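
        * Note on the loop table:
        The four pipelined loops above are all consistent with the relation
        latency = II * (trip count - 1) + iteration latency - 1. This relation
        is inferred from the reported rows, not quoted from the tool, so treat
        it as an assumption. The sketch below recomputes each row under that
        assumption; the helper name `pipelined_loop_latency` and the `LOOPS`
        table are illustrative and not part of the report.

```python
# Minimal sketch: recompute the loop latencies from the columns of the loop table.
# Assumption: latency = II * (trip_count - 1) + iteration_latency - 1,
# a relation inferred from the four reported rows.

def pipelined_loop_latency(trip_count: int, ii: int, iteration_latency: int) -> int:
    """Estimate the latency in cycles of a pipelined loop."""
    return ii * (trip_count - 1) + iteration_latency - 1


# (trip count, achieved II, iteration latency, reported latency), copied from the table.
LOOPS = {
    "memcpy.tempA.A": (64, 1, 3, 65),
    "memcpy.tempB.B": (64, 1, 3, 65),
    "matrix_mult__outer_loop": (4, 8, 15, 38),
    "memcpy.result.tempResult.gep": (64, 1, 3, 65),
}

for name, (trip, ii, depth, reported) in LOOPS.items():
    estimate = pipelined_loop_latency(trip, ii, depth)
    print(f"{name}: estimated {estimate}, reported {reported}")
```

        Under the same assumption, reaching the target II of 1 in
        matrix_mult__outer_loop with an unchanged iteration latency of 15
        would give 1 * (4 - 1) + 15 - 1 = 17 cycles instead of 38.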

============================================================
+ Verbose Summary: Synthesis Manager
============================================================
InlineROM: 1
ExposeGlobal: 0
============================================================
+ Verbose Summary: CDFG Model
============================================================
IsTopModel: 1
ResetActiveHigh: 1
IsCombinational: 0
IsDatapathOnly: 0
HasWiredReturn: 1
HasMFsm: 0
HasVarLatency: 1
IsPipeline: 0
IsRtlPipelined: 0
IsInstanceOverlapped: 0
IsDontTouch: 0
HasImplIP: 0
IsGatedGlobalClock: 0

+ Individual pipeline summary: 
  * Pipeline-0: initiation interval (II) = 1, depth = 3
  * Pipeline-1: initiation interval (II) = 1, depth = 3
  * Pipeline-2: initiation interval (II) = 8, depth = 15
  * Pipeline-3: initiation interval (II) = 1, depth = 3


============================================================
+ Verbose Summary: Schedule
============================================================
* Number of FSM states : 78
* Pipeline : 4
  Pipeline-0 : II = 1, D = 3, States = { 9 10 11 }
  Pipeline-1 : II = 1, D = 3, States = { 19 20 21 }
  Pipeline-2 : II = 8, D = 15, States = { 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 }
  Pipeline-3 : II = 1, D = 3, States = { 71 72 73 }
* Dataflow Pipeline: 0

* FSM state transitions: 
1 --> 
	2  / true
2 --> 
	3  / true
3 --> 
	4  / true
4 --> 
	5  / true
5 --> 
	6  / true
6 --> 
	7  / true
7 --> 
	8  / true
8 --> 
	9  / true
9 --> 
	12  / (exitcond3)
	10  / (!exitcond3)
10 --> 
	11  / true
11 --> 
	9  / true
12 --> 
	13  / true
13 --> 
	14  / true
14 --> 
	15  / true
15 --> 
	16  / true
16 --> 
	17  / true
17 --> 
	18  / true
18 --> 
	19  / true
19 --> 
	22  / (exitcond4)
	20  / (!exitcond4)
20 --> 
	21  / true
21 --> 
	19  / true
22 --> 
	23  / true
23 --> 
	24  / true
24 --> 
	25  / true
25 --> 
	26  / true
26 --> 
	27  / true
27 --> 
	28  / true
28 --> 
	29  / true
29 --> 
	30  / true
30 --> 
	31  / true
31 --> 
	32  / true
32 --> 
	33  / true
33 --> 
	34  / true
34 --> 
	35  / true
35 --> 
	36  / true
36 --> 
	37  / true
37 --> 
	38  / true
38 --> 
	39  / true
39 --> 
	40  / true
40 --> 
	41  / true
41 --> 
	42  / true
42 --> 
	43  / true
43 --> 
	44  / true
44 --> 
	45  / true
45 --> 
	46  / true
46 --> 
	47  / true
47 --> 
	48  / true
48 --> 
	49  / true
49 --> 
	50  / true
50 --> 
	51  / true
51 --> 
	52  / true
52 --> 
	53  / true
53 --> 
	54  / true
54 --> 
	55  / true
55 --> 
	70  / (exitcond2)
	56  / (!exitcond2)
56 --> 
	57  / true
57 --> 
	58  / true
58 --> 
	59  / true
59 --> 
	60  / true
60 --> 
	61  / true
61 --> 
	62  / true
62 --> 
	63  / true
63 --> 
	64  / true
64 --> 
	65  / true
65 --> 
	66  / true
66 --> 
	67  / true
67 --> 
	68  / true
68 --> 
	69  / true
69 --> 
	55  / true
70 --> 
	71  / true
71 --> 
	74  / (exitcond5)
	72  / (!exitcond5)
72 --> 
	73  / true
73 --> 
	71  / true
74 --> 
	75  / true
75 --> 
	76  / true
76 --> 
	77  / true
77 --> 
	78  / true
78 --> 

* FSM state operations: 

 <State 1> : 1.00ns
ST_1 : Operation 79 [1/1] (1.00ns)   --->   "%result_read = call i32 @_ssdm_op_Read.s_axilite.i32(i32 %result)"   --->   Core 10 's_axilite' <Latency = 0> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write'>
ST_1 : Operation 80 [1/1] (1.00ns)   --->   "%B_read = call i32 @_ssdm_op_Read.s_axilite.i32(i32 %B)"   --->   Core 10 's_axilite' <Latency = 0> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write'>
ST_1 : Operation 81 [1/1] (1.00ns)   --->   "%A_read = call i32 @_ssdm_op_Read.s_axilite.i32(i32 %A)"   --->   Core 10 's_axilite' <Latency = 0> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write'>
ST_1 : Operation 82 [1/1] (0.00ns)   --->   "%result5 = call i30 @_ssdm_op_PartSelect.i30.i32.i32.i32(i32 %result_read, i32 2, i32 31)"
ST_1 : Operation 83 [1/1] (0.00ns)   --->   "%B3 = call i30 @_ssdm_op_PartSelect.i30.i32.i32.i32(i32 %B_read, i32 2, i32 31)"
ST_1 : Operation 84 [1/1] (0.00ns)   --->   "%A1 = call i30 @_ssdm_op_PartSelect.i30.i32.i32.i32(i32 %A_read, i32 2, i32 31)"
ST_1 : Operation 85 [1/1] (0.00ns)   --->   "%tempA = alloca [64 x i32], align 16" [matrix_mult/matrix_mult.cpp:5]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_1 : Operation 86 [1/1] (0.00ns)   --->   "%tempB = alloca [64 x i32], align 16" [matrix_mult/matrix_mult.cpp:5]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_1 : Operation 87 [1/1] (0.00ns)   --->   "%tempResult = alloca [64 x i32], align 16" [matrix_mult/matrix_mult.cpp:5]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 2> : 8.75ns
ST_2 : Operation 88 [1/1] (0.00ns)   --->   "%tmp_7 = zext i30 %A1 to i64"
ST_2 : Operation 89 [1/1] (0.00ns)   --->   "%gmem_addr_2 = getelementptr i32* %gmem, i64 %tmp_7"
ST_2 : Operation 90 [7/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 3> : 8.75ns
ST_3 : Operation 91 [6/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 4> : 8.75ns
ST_4 : Operation 92 [5/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 5> : 8.75ns
ST_5 : Operation 93 [4/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 6> : 8.75ns
ST_6 : Operation 94 [3/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 7> : 8.75ns
ST_7 : Operation 95 [2/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 8> : 8.75ns
ST_8 : Operation 96 [1/1] (0.00ns)   --->   "%tmp_4 = zext i30 %result5 to i64"
ST_8 : Operation 97 [1/1] (0.00ns)   --->   "%gmem_addr = getelementptr i32* %gmem, i64 %tmp_4"
ST_8 : Operation 98 [1/1] (0.00ns)   --->   "%tmp_5 = zext i30 %B3 to i64"
ST_8 : Operation 99 [1/1] (0.00ns)   --->   "%gmem_addr_1 = getelementptr i32* %gmem, i64 %tmp_5"
ST_8 : Operation 100 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecBitsMap(i32* %gmem), !map !11"
ST_8 : Operation 101 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecTopModule([12 x i8]* @matrix_mult_str) nounwind"
ST_8 : Operation 102 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecInterface(i32 %result, [10 x i8]* @mode5, i32 0, i32 0, [1 x i8]* @p_str1, i32 0, i32 32, [1 x i8]* @bundle6, [6 x i8]* @p_str2, [1 x i8]* @p_str1, i32 16, i32 16, i32 16, i32 16, [1 x i8]* @p_str1, [1 x i8]* @p_str1)"
ST_8 : Operation 103 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecInterface(i32 %B, [10 x i8]* @mode3, i32 0, i32 0, [1 x i8]* @p_str1, i32 0, i32 32, [1 x i8]* @bundle4, [6 x i8]* @p_str2, [1 x i8]* @p_str1, i32 16, i32 16, i32 16, i32 16, [1 x i8]* @p_str1, [1 x i8]* @p_str1)"
ST_8 : Operation 104 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecInterface(i32* %gmem, [6 x i8]* @p_str, i32 0, i32 0, [1 x i8]* @p_str1, i32 0, i32 32, [1 x i8]* @p_str1, [6 x i8]* @p_str2, [1 x i8]* @p_str1, i32 16, i32 16, i32 16, i32 16, [1 x i8]* @p_str1, [1 x i8]* @p_str1)"
ST_8 : Operation 105 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecInterface(i32 %A, [10 x i8]* @mode, i32 0, i32 0, [1 x i8]* @p_str1, i32 0, i32 32, [1 x i8]* @bundle, [6 x i8]* @p_str2, [1 x i8]* @p_str1, i32 16, i32 16, i32 16, i32 16, [1 x i8]* @p_str1, [1 x i8]* @p_str1)"
ST_8 : Operation 106 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecInterface(i32 0, [10 x i8]* @p_str3, i32 0, i32 0, [1 x i8]* @p_str1, i32 0, i32 0, [1 x i8]* @p_str1, [1 x i8]* @p_str1, [1 x i8]* @p_str1, i32 0, i32 0, i32 0, i32 0, [1 x i8]* @p_str1, [1 x i8]* @p_str1) nounwind" [matrix_mult/matrix_mult.cpp:5]
ST_8 : Operation 107 [1/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>
ST_8 : Operation 108 [1/1] (1.76ns)   --->   "br label %burst.rd.header"

 <State 9> : 1.87ns
ST_9 : Operation 109 [1/1] (0.00ns)   --->   "%indvar = phi i7 [ 0, %0 ], [ %indvar_next, %burst.rd.body ]"
ST_9 : Operation 110 [1/1] (1.48ns)   --->   "%exitcond3 = icmp eq i7 %indvar, -64"   --->   Core 25 'Cmp' <Latency = 0> <II = 1> <Delay = 1.48> <FuncUnit> <Opcode : 'icmp'> <InPorts = 2> <OutPorts = 1>
ST_9 : Operation 111 [1/1] (1.87ns)   --->   "%indvar_next = add i7 %indvar, 1"   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_9 : Operation 112 [1/1] (0.00ns)   --->   "br i1 %exitcond3, label %burst.rd.header7.preheader, label %burst.rd.body"

 <State 10> : 8.75ns
ST_10 : Operation 113 [1/1] (8.75ns)   --->   "%gmem_addr_2_read = call i32 @_ssdm_op_Read.m_axi.i32P(i32* %gmem_addr_2)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 11> : 3.25ns
ST_11 : Operation 114 [1/1] (0.00ns)   --->   "%empty = call i32 (...)* @_ssdm_op_SpecLoopTripCount(i64 64, i64 64, i64 64) nounwind"
ST_11 : Operation 115 [1/1] (0.00ns)   --->   "%burstread_rbegin = call i32 (...)* @_ssdm_op_SpecRegionBegin([17 x i8]* @burstread_OC_region_s) nounwind"
ST_11 : Operation 116 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecPipeline(i32 1, i32 1, i32 1, i32 0, [1 x i8]* @p_str7)"
ST_11 : Operation 117 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecLoopName([15 x i8]* @memcpy_OC_tempA_OC_A)"
ST_11 : Operation 118 [1/1] (0.00ns)   --->   "%tmp = zext i7 %indvar to i64" [matrix_mult/matrix_mult.cpp:6]
ST_11 : Operation 119 [1/1] (0.00ns)   --->   "%tempA_addr = getelementptr [64 x i32]* %tempA, i64 0, i64 %tmp" [matrix_mult/matrix_mult.cpp:6]
ST_11 : Operation 120 [1/1] (3.25ns)   --->   "store i32 %gmem_addr_2_read, i32* %tempA_addr, align 4" [matrix_mult/matrix_mult.cpp:6]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_11 : Operation 121 [1/1] (0.00ns)   --->   "%burstread_rend = call i32 (...)* @_ssdm_op_SpecRegionEnd([17 x i8]* @burstread_OC_region_s, i32 %burstread_rbegin) nounwind"
ST_11 : Operation 122 [1/1] (0.00ns)   --->   "br label %burst.rd.header"

 <State 12> : 8.75ns
ST_12 : Operation 123 [7/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 13> : 8.75ns
ST_13 : Operation 124 [6/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 14> : 8.75ns
ST_14 : Operation 125 [5/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 15> : 8.75ns
ST_15 : Operation 126 [4/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 16> : 8.75ns
ST_16 : Operation 127 [3/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 17> : 8.75ns
ST_17 : Operation 128 [2/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 18> : 8.75ns
ST_18 : Operation 129 [1/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>
ST_18 : Operation 130 [1/1] (1.76ns)   --->   "br label %burst.rd.header7"

 <State 19> : 1.87ns
ST_19 : Operation 131 [1/1] (0.00ns)   --->   "%indvar9 = phi i7 [ %indvar_next1, %burst.rd.body8 ], [ 0, %burst.rd.header7.preheader ]"
ST_19 : Operation 132 [1/1] (1.48ns)   --->   "%exitcond4 = icmp eq i7 %indvar9, -64"   --->   Core 25 'Cmp' <Latency = 0> <II = 1> <Delay = 1.48> <FuncUnit> <Opcode : 'icmp'> <InPorts = 2> <OutPorts = 1>
ST_19 : Operation 133 [1/1] (1.87ns)   --->   "%indvar_next1 = add i7 %indvar9, 1"   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_19 : Operation 134 [1/1] (0.00ns)   --->   "br i1 %exitcond4, label %burst.rd.end6.0.preheader, label %burst.rd.body8"

 <State 20> : 8.75ns
ST_20 : Operation 135 [1/1] (8.75ns)   --->   "%gmem_addr_1_read = call i32 @_ssdm_op_Read.m_axi.i32P(i32* %gmem_addr_1)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 21> : 3.25ns
ST_21 : Operation 136 [1/1] (0.00ns)   --->   "%empty_5 = call i32 (...)* @_ssdm_op_SpecLoopTripCount(i64 64, i64 64, i64 64) nounwind"
ST_21 : Operation 137 [1/1] (0.00ns)   --->   "%burstread_rbegin1 = call i32 (...)* @_ssdm_op_SpecRegionBegin([17 x i8]* @burstread_OC_region_s) nounwind"
ST_21 : Operation 138 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecPipeline(i32 1, i32 1, i32 1, i32 0, [1 x i8]* @p_str8)"
ST_21 : Operation 139 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecLoopName([15 x i8]* @memcpy_OC_tempB_OC_B)"
ST_21 : Operation 140 [1/1] (0.00ns)   --->   "%tmp_1 = zext i7 %indvar9 to i64" [matrix_mult/matrix_mult.cpp:7]
ST_21 : Operation 141 [1/1] (0.00ns)   --->   "%tempB_addr = getelementptr [64 x i32]* %tempB, i64 0, i64 %tmp_1" [matrix_mult/matrix_mult.cpp:7]
ST_21 : Operation 142 [1/1] (3.25ns)   --->   "store i32 %gmem_addr_1_read, i32* %tempB_addr, align 4" [matrix_mult/matrix_mult.cpp:7]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_21 : Operation 143 [1/1] (0.00ns)   --->   "%burstread_rend14 = call i32 (...)* @_ssdm_op_SpecRegionEnd([17 x i8]* @burstread_OC_region_s, i32 %burstread_rbegin1) nounwind"
ST_21 : Operation 144 [1/1] (0.00ns)   --->   "br label %burst.rd.header7"

 <State 22> : 3.25ns
ST_22 : Operation 145 [1/1] (0.00ns)   --->   "%tempB_addr_1 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 0" [matrix_mult/matrix_mult.cpp:16]
ST_22 : Operation 146 [2/2] (3.25ns)   --->   "%tempB_load = load i32* %tempB_addr_1, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_22 : Operation 147 [1/1] (0.00ns)   --->   "%tempB_addr_2 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 8" [matrix_mult/matrix_mult.cpp:16]
ST_22 : Operation 148 [2/2] (3.25ns)   --->   "%tempB_load_1 = load i32* %tempB_addr_2, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 23> : 3.25ns
ST_23 : Operation 149 [1/2] (3.25ns)   --->   "%tempB_load = load i32* %tempB_addr_1, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_23 : Operation 150 [1/2] (3.25ns)   --->   "%tempB_load_1 = load i32* %tempB_addr_2, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_23 : Operation 151 [1/1] (0.00ns)   --->   "%tempB_addr_3 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 16" [matrix_mult/matrix_mult.cpp:16]
ST_23 : Operation 152 [2/2] (3.25ns)   --->   "%tempB_load_2 = load i32* %tempB_addr_3, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_23 : Operation 153 [1/1] (0.00ns)   --->   "%tempB_addr_4 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 24" [matrix_mult/matrix_mult.cpp:16]
ST_23 : Operation 154 [2/2] (3.25ns)   --->   "%tempB_load_3 = load i32* %tempB_addr_4, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 24> : 3.25ns
ST_24 : Operation 155 [1/2] (3.25ns)   --->   "%tempB_load_2 = load i32* %tempB_addr_3, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_24 : Operation 156 [1/2] (3.25ns)   --->   "%tempB_load_3 = load i32* %tempB_addr_4, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_24 : Operation 157 [1/1] (0.00ns)   --->   "%tempB_addr_5 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 32" [matrix_mult/matrix_mult.cpp:16]
ST_24 : Operation 158 [2/2] (3.25ns)   --->   "%tempB_load_4 = load i32* %tempB_addr_5, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_24 : Operation 159 [1/1] (0.00ns)   --->   "%tempB_addr_6 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 40" [matrix_mult/matrix_mult.cpp:16]
ST_24 : Operation 160 [2/2] (3.25ns)   --->   "%tempB_load_5 = load i32* %tempB_addr_6, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 25> : 3.25ns
ST_25 : Operation 161 [1/2] (3.25ns)   --->   "%tempB_load_4 = load i32* %tempB_addr_5, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_25 : Operation 162 [1/2] (3.25ns)   --->   "%tempB_load_5 = load i32* %tempB_addr_6, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_25 : Operation 163 [1/1] (0.00ns)   --->   "%tempB_addr_7 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 48" [matrix_mult/matrix_mult.cpp:16]
ST_25 : Operation 164 [2/2] (3.25ns)   --->   "%tempB_load_6 = load i32* %tempB_addr_7, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_25 : Operation 165 [1/1] (0.00ns)   --->   "%tempB_addr_8 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 56" [matrix_mult/matrix_mult.cpp:16]
ST_25 : Operation 166 [2/2] (3.25ns)   --->   "%tempB_load_7 = load i32* %tempB_addr_8, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 26> : 3.25ns
ST_26 : Operation 167 [1/2] (3.25ns)   --->   "%tempB_load_6 = load i32* %tempB_addr_7, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_26 : Operation 168 [1/2] (3.25ns)   --->   "%tempB_load_7 = load i32* %tempB_addr_8, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_26 : Operation 169 [1/1] (0.00ns)   --->   "%tempB_addr_9 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 1" [matrix_mult/matrix_mult.cpp:16]
ST_26 : Operation 170 [2/2] (3.25ns)   --->   "%tempB_load_8 = load i32* %tempB_addr_9, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_26 : Operation 171 [1/1] (0.00ns)   --->   "%tempB_addr_10 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 9" [matrix_mult/matrix_mult.cpp:16]
ST_26 : Operation 172 [2/2] (3.25ns)   --->   "%tempB_load_9 = load i32* %tempB_addr_10, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 27> : 3.25ns
ST_27 : Operation 173 [1/2] (3.25ns)   --->   "%tempB_load_8 = load i32* %tempB_addr_9, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_27 : Operation 174 [1/2] (3.25ns)   --->   "%tempB_load_9 = load i32* %tempB_addr_10, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_27 : Operation 175 [1/1] (0.00ns)   --->   "%tempB_addr_11 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 17" [matrix_mult/matrix_mult.cpp:16]
ST_27 : Operation 176 [2/2] (3.25ns)   --->   "%tempB_load_10 = load i32* %tempB_addr_11, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_27 : Operation 177 [1/1] (0.00ns)   --->   "%tempB_addr_12 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 25" [matrix_mult/matrix_mult.cpp:16]
ST_27 : Operation 178 [2/2] (3.25ns)   --->   "%tempB_load_11 = load i32* %tempB_addr_12, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 28> : 3.25ns
ST_28 : Operation 179 [1/2] (3.25ns)   --->   "%tempB_load_10 = load i32* %tempB_addr_11, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_28 : Operation 180 [1/2] (3.25ns)   --->   "%tempB_load_11 = load i32* %tempB_addr_12, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_28 : Operation 181 [1/1] (0.00ns)   --->   "%tempB_addr_13 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 33" [matrix_mult/matrix_mult.cpp:16]
ST_28 : Operation 182 [2/2] (3.25ns)   --->   "%tempB_load_12 = load i32* %tempB_addr_13, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_28 : Operation 183 [1/1] (0.00ns)   --->   "%tempB_addr_14 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 41" [matrix_mult/matrix_mult.cpp:16]
ST_28 : Operation 184 [2/2] (3.25ns)   --->   "%tempB_load_13 = load i32* %tempB_addr_14, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 29> : 3.25ns
ST_29 : Operation 185 [1/2] (3.25ns)   --->   "%tempB_load_12 = load i32* %tempB_addr_13, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_29 : Operation 186 [1/2] (3.25ns)   --->   "%tempB_load_13 = load i32* %tempB_addr_14, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_29 : Operation 187 [1/1] (0.00ns)   --->   "%tempB_addr_15 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 49" [matrix_mult/matrix_mult.cpp:16]
ST_29 : Operation 188 [2/2] (3.25ns)   --->   "%tempB_load_14 = load i32* %tempB_addr_15, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_29 : Operation 189 [1/1] (0.00ns)   --->   "%tempB_addr_16 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 57" [matrix_mult/matrix_mult.cpp:16]
ST_29 : Operation 190 [2/2] (3.25ns)   --->   "%tempB_load_15 = load i32* %tempB_addr_16, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 30> : 3.25ns
ST_30 : Operation 191 [1/2] (3.25ns)   --->   "%tempB_load_14 = load i32* %tempB_addr_15, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_30 : Operation 192 [1/2] (3.25ns)   --->   "%tempB_load_15 = load i32* %tempB_addr_16, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_30 : Operation 193 [1/1] (0.00ns)   --->   "%tempB_addr_17 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 2" [matrix_mult/matrix_mult.cpp:16]
ST_30 : Operation 194 [2/2] (3.25ns)   --->   "%tempB_load_16 = load i32* %tempB_addr_17, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_30 : Operation 195 [1/1] (0.00ns)   --->   "%tempB_addr_18 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 10" [matrix_mult/matrix_mult.cpp:16]
ST_30 : Operation 196 [2/2] (3.25ns)   --->   "%tempB_load_17 = load i32* %tempB_addr_18, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 31> : 3.25ns
ST_31 : Operation 197 [1/2] (3.25ns)   --->   "%tempB_load_16 = load i32* %tempB_addr_17, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_31 : Operation 198 [1/2] (3.25ns)   --->   "%tempB_load_17 = load i32* %tempB_addr_18, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_31 : Operation 199 [1/1] (0.00ns)   --->   "%tempB_addr_19 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 18" [matrix_mult/matrix_mult.cpp:16]
ST_31 : Operation 200 [2/2] (3.25ns)   --->   "%tempB_load_18 = load i32* %tempB_addr_19, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_31 : Operation 201 [1/1] (0.00ns)   --->   "%tempB_addr_20 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 26" [matrix_mult/matrix_mult.cpp:16]
ST_31 : Operation 202 [2/2] (3.25ns)   --->   "%tempB_load_19 = load i32* %tempB_addr_20, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 32> : 3.25ns
ST_32 : Operation 203 [1/2] (3.25ns)   --->   "%tempB_load_18 = load i32* %tempB_addr_19, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_32 : Operation 204 [1/2] (3.25ns)   --->   "%tempB_load_19 = load i32* %tempB_addr_20, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_32 : Operation 205 [1/1] (0.00ns)   --->   "%tempB_addr_21 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 34" [matrix_mult/matrix_mult.cpp:16]
ST_32 : Operation 206 [2/2] (3.25ns)   --->   "%tempB_load_20 = load i32* %tempB_addr_21, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_32 : Operation 207 [1/1] (0.00ns)   --->   "%tempB_addr_22 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 42" [matrix_mult/matrix_mult.cpp:16]
ST_32 : Operation 208 [2/2] (3.25ns)   --->   "%tempB_load_21 = load i32* %tempB_addr_22, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 33> : 3.25ns
ST_33 : Operation 209 [1/2] (3.25ns)   --->   "%tempB_load_20 = load i32* %tempB_addr_21, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_33 : Operation 210 [1/2] (3.25ns)   --->   "%tempB_load_21 = load i32* %tempB_addr_22, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_33 : Operation 211 [1/1] (0.00ns)   --->   "%tempB_addr_23 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 50" [matrix_mult/matrix_mult.cpp:16]
ST_33 : Operation 212 [2/2] (3.25ns)   --->   "%tempB_load_22 = load i32* %tempB_addr_23, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_33 : Operation 213 [1/1] (0.00ns)   --->   "%tempB_addr_24 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 58" [matrix_mult/matrix_mult.cpp:16]
ST_33 : Operation 214 [2/2] (3.25ns)   --->   "%tempB_load_23 = load i32* %tempB_addr_24, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 34> : 3.25ns
ST_34 : Operation 215 [1/2] (3.25ns)   --->   "%tempB_load_22 = load i32* %tempB_addr_23, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_34 : Operation 216 [1/2] (3.25ns)   --->   "%tempB_load_23 = load i32* %tempB_addr_24, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_34 : Operation 217 [1/1] (0.00ns)   --->   "%tempB_addr_25 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 3" [matrix_mult/matrix_mult.cpp:16]
ST_34 : Operation 218 [2/2] (3.25ns)   --->   "%tempB_load_24 = load i32* %tempB_addr_25, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_34 : Operation 219 [1/1] (0.00ns)   --->   "%tempB_addr_26 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 11" [matrix_mult/matrix_mult.cpp:16]
ST_34 : Operation 220 [2/2] (3.25ns)   --->   "%tempB_load_25 = load i32* %tempB_addr_26, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 35> : 3.25ns
ST_35 : Operation 221 [1/2] (3.25ns)   --->   "%tempB_load_24 = load i32* %tempB_addr_25, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_35 : Operation 222 [1/2] (3.25ns)   --->   "%tempB_load_25 = load i32* %tempB_addr_26, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_35 : Operation 223 [1/1] (0.00ns)   --->   "%tempB_addr_27 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 19" [matrix_mult/matrix_mult.cpp:16]
ST_35 : Operation 224 [2/2] (3.25ns)   --->   "%tempB_load_26 = load i32* %tempB_addr_27, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_35 : Operation 225 [1/1] (0.00ns)   --->   "%tempB_addr_28 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 27" [matrix_mult/matrix_mult.cpp:16]
ST_35 : Operation 226 [2/2] (3.25ns)   --->   "%tempB_load_27 = load i32* %tempB_addr_28, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 36> : 3.25ns
ST_36 : Operation 227 [1/2] (3.25ns)   --->   "%tempB_load_26 = load i32* %tempB_addr_27, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_36 : Operation 228 [1/2] (3.25ns)   --->   "%tempB_load_27 = load i32* %tempB_addr_28, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_36 : Operation 229 [1/1] (0.00ns)   --->   "%tempB_addr_29 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 35" [matrix_mult/matrix_mult.cpp:16]
ST_36 : Operation 230 [2/2] (3.25ns)   --->   "%tempB_load_28 = load i32* %tempB_addr_29, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_36 : Operation 231 [1/1] (0.00ns)   --->   "%tempB_addr_30 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 43" [matrix_mult/matrix_mult.cpp:16]
ST_36 : Operation 232 [2/2] (3.25ns)   --->   "%tempB_load_29 = load i32* %tempB_addr_30, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 37> : 3.25ns
ST_37 : Operation 233 [1/2] (3.25ns)   --->   "%tempB_load_28 = load i32* %tempB_addr_29, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_37 : Operation 234 [1/2] (3.25ns)   --->   "%tempB_load_29 = load i32* %tempB_addr_30, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_37 : Operation 235 [1/1] (0.00ns)   --->   "%tempB_addr_31 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 51" [matrix_mult/matrix_mult.cpp:16]
ST_37 : Operation 236 [2/2] (3.25ns)   --->   "%tempB_load_30 = load i32* %tempB_addr_31, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_37 : Operation 237 [1/1] (0.00ns)   --->   "%tempB_addr_32 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 59" [matrix_mult/matrix_mult.cpp:16]
ST_37 : Operation 238 [2/2] (3.25ns)   --->   "%tempB_load_31 = load i32* %tempB_addr_32, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 38> : 3.25ns
ST_38 : Operation 239 [1/2] (3.25ns)   --->   "%tempB_load_30 = load i32* %tempB_addr_31, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_38 : Operation 240 [1/2] (3.25ns)   --->   "%tempB_load_31 = load i32* %tempB_addr_32, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_38 : Operation 241 [1/1] (0.00ns)   --->   "%tempB_addr_33 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 4" [matrix_mult/matrix_mult.cpp:16]
ST_38 : Operation 242 [2/2] (3.25ns)   --->   "%tempB_load_32 = load i32* %tempB_addr_33, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_38 : Operation 243 [1/1] (0.00ns)   --->   "%tempB_addr_34 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 12" [matrix_mult/matrix_mult.cpp:16]
ST_38 : Operation 244 [2/2] (3.25ns)   --->   "%tempB_load_33 = load i32* %tempB_addr_34, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 39> : 3.25ns
ST_39 : Operation 245 [1/2] (3.25ns)   --->   "%tempB_load_32 = load i32* %tempB_addr_33, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_39 : Operation 246 [1/2] (3.25ns)   --->   "%tempB_load_33 = load i32* %tempB_addr_34, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_39 : Operation 247 [1/1] (0.00ns)   --->   "%tempB_addr_35 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 20" [matrix_mult/matrix_mult.cpp:16]
ST_39 : Operation 248 [2/2] (3.25ns)   --->   "%tempB_load_34 = load i32* %tempB_addr_35, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_39 : Operation 249 [1/1] (0.00ns)   --->   "%tempB_addr_36 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 28" [matrix_mult/matrix_mult.cpp:16]
ST_39 : Operation 250 [2/2] (3.25ns)   --->   "%tempB_load_35 = load i32* %tempB_addr_36, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 40> : 3.25ns
ST_40 : Operation 251 [1/2] (3.25ns)   --->   "%tempB_load_34 = load i32* %tempB_addr_35, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_40 : Operation 252 [1/2] (3.25ns)   --->   "%tempB_load_35 = load i32* %tempB_addr_36, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_40 : Operation 253 [1/1] (0.00ns)   --->   "%tempB_addr_37 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 36" [matrix_mult/matrix_mult.cpp:16]
ST_40 : Operation 254 [2/2] (3.25ns)   --->   "%tempB_load_36 = load i32* %tempB_addr_37, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_40 : Operation 255 [1/1] (0.00ns)   --->   "%tempB_addr_38 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 44" [matrix_mult/matrix_mult.cpp:16]
ST_40 : Operation 256 [2/2] (3.25ns)   --->   "%tempB_load_37 = load i32* %tempB_addr_38, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 41> : 3.25ns
ST_41 : Operation 257 [1/2] (3.25ns)   --->   "%tempB_load_36 = load i32* %tempB_addr_37, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_41 : Operation 258 [1/2] (3.25ns)   --->   "%tempB_load_37 = load i32* %tempB_addr_38, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_41 : Operation 259 [1/1] (0.00ns)   --->   "%tempB_addr_39 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 52" [matrix_mult/matrix_mult.cpp:16]
ST_41 : Operation 260 [2/2] (3.25ns)   --->   "%tempB_load_38 = load i32* %tempB_addr_39, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_41 : Operation 261 [1/1] (0.00ns)   --->   "%tempB_addr_40 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 60" [matrix_mult/matrix_mult.cpp:16]
ST_41 : Operation 262 [2/2] (3.25ns)   --->   "%tempB_load_39 = load i32* %tempB_addr_40, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 42> : 3.25ns
ST_42 : Operation 263 [1/2] (3.25ns)   --->   "%tempB_load_38 = load i32* %tempB_addr_39, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_42 : Operation 264 [1/2] (3.25ns)   --->   "%tempB_load_39 = load i32* %tempB_addr_40, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_42 : Operation 265 [1/1] (0.00ns)   --->   "%tempB_addr_41 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 5" [matrix_mult/matrix_mult.cpp:16]
ST_42 : Operation 266 [2/2] (3.25ns)   --->   "%tempB_load_40 = load i32* %tempB_addr_41, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_42 : Operation 267 [1/1] (0.00ns)   --->   "%tempB_addr_42 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 13" [matrix_mult/matrix_mult.cpp:16]
ST_42 : Operation 268 [2/2] (3.25ns)   --->   "%tempB_load_41 = load i32* %tempB_addr_42, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 43> : 3.25ns
ST_43 : Operation 269 [1/2] (3.25ns)   --->   "%tempB_load_40 = load i32* %tempB_addr_41, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_43 : Operation 270 [1/2] (3.25ns)   --->   "%tempB_load_41 = load i32* %tempB_addr_42, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_43 : Operation 271 [1/1] (0.00ns)   --->   "%tempB_addr_43 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 21" [matrix_mult/matrix_mult.cpp:16]
ST_43 : Operation 272 [2/2] (3.25ns)   --->   "%tempB_load_42 = load i32* %tempB_addr_43, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_43 : Operation 273 [1/1] (0.00ns)   --->   "%tempB_addr_44 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 29" [matrix_mult/matrix_mult.cpp:16]
ST_43 : Operation 274 [2/2] (3.25ns)   --->   "%tempB_load_43 = load i32* %tempB_addr_44, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 44> : 3.25ns
ST_44 : Operation 275 [1/2] (3.25ns)   --->   "%tempB_load_42 = load i32* %tempB_addr_43, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_44 : Operation 276 [1/2] (3.25ns)   --->   "%tempB_load_43 = load i32* %tempB_addr_44, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_44 : Operation 277 [1/1] (0.00ns)   --->   "%tempB_addr_45 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 37" [matrix_mult/matrix_mult.cpp:16]
ST_44 : Operation 278 [2/2] (3.25ns)   --->   "%tempB_load_44 = load i32* %tempB_addr_45, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_44 : Operation 279 [1/1] (0.00ns)   --->   "%tempB_addr_46 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 45" [matrix_mult/matrix_mult.cpp:16]
ST_44 : Operation 280 [2/2] (3.25ns)   --->   "%tempB_load_45 = load i32* %tempB_addr_46, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 45> : 3.25ns
ST_45 : Operation 281 [1/2] (3.25ns)   --->   "%tempB_load_44 = load i32* %tempB_addr_45, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_45 : Operation 282 [1/2] (3.25ns)   --->   "%tempB_load_45 = load i32* %tempB_addr_46, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_45 : Operation 283 [1/1] (0.00ns)   --->   "%tempB_addr_47 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 53" [matrix_mult/matrix_mult.cpp:16]
ST_45 : Operation 284 [2/2] (3.25ns)   --->   "%tempB_load_46 = load i32* %tempB_addr_47, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_45 : Operation 285 [1/1] (0.00ns)   --->   "%tempB_addr_48 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 61" [matrix_mult/matrix_mult.cpp:16]
ST_45 : Operation 286 [2/2] (3.25ns)   --->   "%tempB_load_47 = load i32* %tempB_addr_48, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 46> : 3.25ns
ST_46 : Operation 287 [1/2] (3.25ns)   --->   "%tempB_load_46 = load i32* %tempB_addr_47, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_46 : Operation 288 [1/2] (3.25ns)   --->   "%tempB_load_47 = load i32* %tempB_addr_48, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_46 : Operation 289 [1/1] (0.00ns)   --->   "%tempB_addr_49 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 6" [matrix_mult/matrix_mult.cpp:16]
ST_46 : Operation 290 [2/2] (3.25ns)   --->   "%tempB_load_48 = load i32* %tempB_addr_49, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_46 : Operation 291 [1/1] (0.00ns)   --->   "%tempB_addr_50 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 14" [matrix_mult/matrix_mult.cpp:16]
ST_46 : Operation 292 [2/2] (3.25ns)   --->   "%tempB_load_49 = load i32* %tempB_addr_50, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 47> : 3.25ns
ST_47 : Operation 293 [1/2] (3.25ns)   --->   "%tempB_load_48 = load i32* %tempB_addr_49, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_47 : Operation 294 [1/2] (3.25ns)   --->   "%tempB_load_49 = load i32* %tempB_addr_50, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_47 : Operation 295 [1/1] (0.00ns)   --->   "%tempB_addr_51 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 22" [matrix_mult/matrix_mult.cpp:16]
ST_47 : Operation 296 [2/2] (3.25ns)   --->   "%tempB_load_50 = load i32* %tempB_addr_51, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_47 : Operation 297 [1/1] (0.00ns)   --->   "%tempB_addr_52 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 30" [matrix_mult/matrix_mult.cpp:16]
ST_47 : Operation 298 [2/2] (3.25ns)   --->   "%tempB_load_51 = load i32* %tempB_addr_52, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 48> : 3.25ns
ST_48 : Operation 299 [1/2] (3.25ns)   --->   "%tempB_load_50 = load i32* %tempB_addr_51, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_48 : Operation 300 [1/2] (3.25ns)   --->   "%tempB_load_51 = load i32* %tempB_addr_52, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_48 : Operation 301 [1/1] (0.00ns)   --->   "%tempB_addr_53 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 38" [matrix_mult/matrix_mult.cpp:16]
ST_48 : Operation 302 [2/2] (3.25ns)   --->   "%tempB_load_52 = load i32* %tempB_addr_53, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_48 : Operation 303 [1/1] (0.00ns)   --->   "%tempB_addr_54 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 46" [matrix_mult/matrix_mult.cpp:16]
ST_48 : Operation 304 [2/2] (3.25ns)   --->   "%tempB_load_53 = load i32* %tempB_addr_54, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 49> : 3.25ns
ST_49 : Operation 305 [1/2] (3.25ns)   --->   "%tempB_load_52 = load i32* %tempB_addr_53, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_49 : Operation 306 [1/2] (3.25ns)   --->   "%tempB_load_53 = load i32* %tempB_addr_54, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_49 : Operation 307 [1/1] (0.00ns)   --->   "%tempB_addr_55 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 54" [matrix_mult/matrix_mult.cpp:16]
ST_49 : Operation 308 [2/2] (3.25ns)   --->   "%tempB_load_54 = load i32* %tempB_addr_55, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_49 : Operation 309 [1/1] (0.00ns)   --->   "%tempB_addr_56 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 62" [matrix_mult/matrix_mult.cpp:16]
ST_49 : Operation 310 [2/2] (3.25ns)   --->   "%tempB_load_55 = load i32* %tempB_addr_56, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 50> : 3.25ns
ST_50 : Operation 311 [1/2] (3.25ns)   --->   "%tempB_load_54 = load i32* %tempB_addr_55, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_50 : Operation 312 [1/2] (3.25ns)   --->   "%tempB_load_55 = load i32* %tempB_addr_56, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_50 : Operation 313 [1/1] (0.00ns)   --->   "%tempB_addr_57 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 7" [matrix_mult/matrix_mult.cpp:16]
ST_50 : Operation 314 [2/2] (3.25ns)   --->   "%tempB_load_56 = load i32* %tempB_addr_57, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_50 : Operation 315 [1/1] (0.00ns)   --->   "%tempB_addr_58 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 15" [matrix_mult/matrix_mult.cpp:16]
ST_50 : Operation 316 [2/2] (3.25ns)   --->   "%tempB_load_57 = load i32* %tempB_addr_58, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 51> : 3.25ns
ST_51 : Operation 317 [1/2] (3.25ns)   --->   "%tempB_load_56 = load i32* %tempB_addr_57, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_51 : Operation 318 [1/2] (3.25ns)   --->   "%tempB_load_57 = load i32* %tempB_addr_58, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_51 : Operation 319 [1/1] (0.00ns)   --->   "%tempB_addr_59 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 23" [matrix_mult/matrix_mult.cpp:16]
ST_51 : Operation 320 [2/2] (3.25ns)   --->   "%tempB_load_58 = load i32* %tempB_addr_59, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_51 : Operation 321 [1/1] (0.00ns)   --->   "%tempB_addr_60 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 31" [matrix_mult/matrix_mult.cpp:16]
ST_51 : Operation 322 [2/2] (3.25ns)   --->   "%tempB_load_59 = load i32* %tempB_addr_60, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 52> : 3.25ns
ST_52 : Operation 323 [1/2] (3.25ns)   --->   "%tempB_load_58 = load i32* %tempB_addr_59, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_52 : Operation 324 [1/2] (3.25ns)   --->   "%tempB_load_59 = load i32* %tempB_addr_60, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_52 : Operation 325 [1/1] (0.00ns)   --->   "%tempB_addr_61 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 39" [matrix_mult/matrix_mult.cpp:16]
ST_52 : Operation 326 [2/2] (3.25ns)   --->   "%tempB_load_60 = load i32* %tempB_addr_61, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_52 : Operation 327 [1/1] (0.00ns)   --->   "%tempB_addr_62 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 47" [matrix_mult/matrix_mult.cpp:16]
ST_52 : Operation 328 [2/2] (3.25ns)   --->   "%tempB_load_61 = load i32* %tempB_addr_62, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 53> : 3.25ns
ST_53 : Operation 329 [1/2] (3.25ns)   --->   "%tempB_load_60 = load i32* %tempB_addr_61, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_53 : Operation 330 [1/2] (3.25ns)   --->   "%tempB_load_61 = load i32* %tempB_addr_62, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_53 : Operation 331 [1/1] (0.00ns)   --->   "%tempB_addr_63 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 55" [matrix_mult/matrix_mult.cpp:16]
ST_53 : Operation 332 [2/2] (3.25ns)   --->   "%tempB_load_62 = load i32* %tempB_addr_63, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_53 : Operation 333 [1/1] (0.00ns)   --->   "%tempB_addr_64 = getelementptr inbounds [64 x i32]* %tempB, i64 0, i64 63" [matrix_mult/matrix_mult.cpp:16]
ST_53 : Operation 334 [2/2] (3.25ns)   --->   "%tempB_load_63 = load i32* %tempB_addr_64, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 54> : 3.25ns
ST_54 : Operation 335 [1/2] (3.25ns)   --->   "%tempB_load_62 = load i32* %tempB_addr_63, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_54 : Operation 336 [1/2] (3.25ns)   --->   "%tempB_load_63 = load i32* %tempB_addr_64, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_54 : Operation 337 [1/1] (1.76ns)   --->   "br label %burst.rd.end6.0"

 <State 55> : 3.25ns
ST_55 : Operation 338 [1/1] (0.00ns)   --->   "%i = phi i4 [ %i_1_1, %burst.rd.end6.1 ], [ 0, %burst.rd.end6.0.preheader ]" [matrix_mult/matrix_mult.cpp:10]
ST_55 : Operation 339 [1/1] (1.30ns)   --->   "%exitcond2 = icmp eq i4 %i, -8" [matrix_mult/matrix_mult.cpp:10]   --->   Core 25 'Cmp' <Latency = 0> <II = 1> <Delay = 1.48> <FuncUnit> <Opcode : 'icmp'> <InPorts = 2> <OutPorts = 1>
ST_55 : Operation 340 [1/1] (0.00ns)   --->   "br i1 %exitcond2, label %burst.wr.header.preheader, label %burst.rd.end6.1" [matrix_mult/matrix_mult.cpp:10]
ST_55 : Operation 341 [1/1] (0.00ns)   --->   "%tmp_8 = trunc i4 %i to i3" [matrix_mult/matrix_mult.cpp:10]
ST_55 : Operation 342 [1/1] (0.00ns)   --->   "%tmp_s = call i6 @_ssdm_op_BitConcatenate.i6.i3.i3(i3 %tmp_8, i3 0)" [matrix_mult/matrix_mult.cpp:13]
ST_55 : Operation 343 [1/1] (0.00ns)   --->   "%tmp_6 = zext i6 %tmp_s to i64" [matrix_mult/matrix_mult.cpp:13]
ST_55 : Operation 344 [1/1] (0.00ns)   --->   "%tempA_addr_1 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_6" [matrix_mult/matrix_mult.cpp:16]
ST_55 : Operation 345 [2/2] (3.25ns)   --->   "%tempA_load = load i32* %tempA_addr_1, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_55 : Operation 346 [1/1] (0.00ns)   --->   "%tmp_8_0_0_s = or i6 %tmp_s, 1" [matrix_mult/matrix_mult.cpp:16]
ST_55 : Operation 347 [1/1] (0.00ns)   --->   "%tmp_9_0_0_1 = zext i6 %tmp_8_0_0_s to i64" [matrix_mult/matrix_mult.cpp:16]
ST_55 : Operation 348 [1/1] (0.00ns)   --->   "%tempA_addr_2 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_0_0_1" [matrix_mult/matrix_mult.cpp:16]
ST_55 : Operation 349 [2/2] (3.25ns)   --->   "%tempA_load_1 = load i32* %tempA_addr_2, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 56> : 3.25ns
ST_56 : Operation 350 [1/2] (3.25ns)   --->   "%tempA_load = load i32* %tempA_addr_1, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_56 : Operation 351 [1/2] (3.25ns)   --->   "%tempA_load_1 = load i32* %tempA_addr_2, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_56 : Operation 352 [1/1] (0.00ns)   --->   "%tmp_8_0_0_1 = or i6 %tmp_s, 2" [matrix_mult/matrix_mult.cpp:16]
ST_56 : Operation 353 [1/1] (0.00ns)   --->   "%tmp_9_0_0_2 = zext i6 %tmp_8_0_0_1 to i64" [matrix_mult/matrix_mult.cpp:16]
ST_56 : Operation 354 [1/1] (0.00ns)   --->   "%tempA_addr_3 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_0_0_2" [matrix_mult/matrix_mult.cpp:16]
ST_56 : Operation 355 [2/2] (3.25ns)   --->   "%tempA_load_2 = load i32* %tempA_addr_3, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_56 : Operation 356 [1/1] (0.00ns)   --->   "%tmp_8_0_0_2 = or i6 %tmp_s, 3" [matrix_mult/matrix_mult.cpp:16]
ST_56 : Operation 357 [1/1] (0.00ns)   --->   "%tmp_9_0_0_3 = zext i6 %tmp_8_0_0_2 to i64" [matrix_mult/matrix_mult.cpp:16]
ST_56 : Operation 358 [1/1] (0.00ns)   --->   "%tempA_addr_4 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_0_0_3" [matrix_mult/matrix_mult.cpp:16]
ST_56 : Operation 359 [2/2] (3.25ns)   --->   "%tempA_load_3 = load i32* %tempA_addr_4, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 57> : 8.51ns
ST_57 : Operation 360 [1/1] (8.51ns)   --->   "%tmp_9 = mul nsw i32 %tempB_load, %tempA_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 361 [1/1] (8.51ns)   --->   "%tmp_10_0_0_1 = mul nsw i32 %tempB_load_1, %tempA_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 362 [1/2] (3.25ns)   --->   "%tempA_load_2 = load i32* %tempA_addr_3, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_57 : Operation 363 [1/2] (3.25ns)   --->   "%tempA_load_3 = load i32* %tempA_addr_4, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_57 : Operation 364 [1/1] (0.00ns)   --->   "%tmp_8_0_0_3 = or i6 %tmp_s, 4" [matrix_mult/matrix_mult.cpp:16]
ST_57 : Operation 365 [1/1] (0.00ns)   --->   "%tmp_9_0_0_4 = zext i6 %tmp_8_0_0_3 to i64" [matrix_mult/matrix_mult.cpp:16]
ST_57 : Operation 366 [1/1] (0.00ns)   --->   "%tempA_addr_5 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_0_0_4" [matrix_mult/matrix_mult.cpp:16]
ST_57 : Operation 367 [2/2] (3.25ns)   --->   "%tempA_load_4 = load i32* %tempA_addr_5, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_57 : Operation 368 [1/1] (0.00ns)   --->   "%tmp_8_0_0_4 = or i6 %tmp_s, 5" [matrix_mult/matrix_mult.cpp:16]
ST_57 : Operation 369 [1/1] (0.00ns)   --->   "%tmp_9_0_0_5 = zext i6 %tmp_8_0_0_4 to i64" [matrix_mult/matrix_mult.cpp:16]
ST_57 : Operation 370 [1/1] (0.00ns)   --->   "%tempA_addr_6 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_0_0_5" [matrix_mult/matrix_mult.cpp:16]
ST_57 : Operation 371 [2/2] (3.25ns)   --->   "%tempA_load_5 = load i32* %tempA_addr_6, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_57 : Operation 372 [1/1] (8.51ns)   --->   "%tmp_10_0_1 = mul nsw i32 %tempB_load_8, %tempA_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 373 [1/1] (8.51ns)   --->   "%tmp_10_0_1_1 = mul nsw i32 %tempB_load_9, %tempA_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 374 [1/1] (8.51ns)   --->   "%tmp_10_0_2 = mul nsw i32 %tempB_load_16, %tempA_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 375 [1/1] (8.51ns)   --->   "%tmp_10_0_2_1 = mul nsw i32 %tempB_load_17, %tempA_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 376 [1/1] (8.51ns)   --->   "%tmp_10_0_3 = mul nsw i32 %tempB_load_24, %tempA_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 377 [1/1] (8.51ns)   --->   "%tmp_10_0_3_1 = mul nsw i32 %tempB_load_25, %tempA_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 378 [1/1] (8.51ns)   --->   "%tmp_10_0_4 = mul nsw i32 %tempB_load_32, %tempA_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 379 [1/1] (8.51ns)   --->   "%tmp_10_0_4_1 = mul nsw i32 %tempB_load_33, %tempA_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 380 [1/1] (8.51ns)   --->   "%tmp_10_0_5 = mul nsw i32 %tempB_load_40, %tempA_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 381 [1/1] (8.51ns)   --->   "%tmp_10_0_5_1 = mul nsw i32 %tempB_load_41, %tempA_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 382 [1/1] (8.51ns)   --->   "%tmp_10_0_6 = mul nsw i32 %tempB_load_48, %tempA_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 383 [1/1] (8.51ns)   --->   "%tmp_10_0_6_1 = mul nsw i32 %tempB_load_49, %tempA_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 384 [1/1] (8.51ns)   --->   "%tmp_10_0_7 = mul nsw i32 %tempB_load_56, %tempA_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_57 : Operation 385 [1/1] (8.51ns)   --->   "%tmp_10_0_7_1 = mul nsw i32 %tempB_load_57, %tempA_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>

 <State 58> : 8.51ns
ST_58 : Operation 386 [1/1] (8.51ns)   --->   "%tmp_10_0_0_2 = mul nsw i32 %tempB_load_2, %tempA_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 387 [1/1] (8.51ns)   --->   "%tmp_10_0_0_3 = mul nsw i32 %tempB_load_3, %tempA_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 388 [1/2] (3.25ns)   --->   "%tempA_load_4 = load i32* %tempA_addr_5, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_58 : Operation 389 [1/2] (3.25ns)   --->   "%tempA_load_5 = load i32* %tempA_addr_6, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_58 : Operation 390 [1/1] (0.00ns)   --->   "%tmp_8_0_0_5 = or i6 %tmp_s, 6" [matrix_mult/matrix_mult.cpp:16]
ST_58 : Operation 391 [1/1] (0.00ns)   --->   "%tmp_9_0_0_6 = zext i6 %tmp_8_0_0_5 to i64" [matrix_mult/matrix_mult.cpp:16]
ST_58 : Operation 392 [1/1] (0.00ns)   --->   "%tempA_addr_7 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_0_0_6" [matrix_mult/matrix_mult.cpp:16]
ST_58 : Operation 393 [2/2] (3.25ns)   --->   "%tempA_load_6 = load i32* %tempA_addr_7, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_58 : Operation 394 [1/1] (0.00ns)   --->   "%tmp_8_0_0_6 = or i6 %tmp_s, 7" [matrix_mult/matrix_mult.cpp:16]
ST_58 : Operation 395 [1/1] (0.00ns)   --->   "%tmp_9_0_0_7 = zext i6 %tmp_8_0_0_6 to i64" [matrix_mult/matrix_mult.cpp:16]
ST_58 : Operation 396 [1/1] (0.00ns)   --->   "%tempA_addr_8 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_0_0_7" [matrix_mult/matrix_mult.cpp:16]
ST_58 : Operation 397 [2/2] (3.25ns)   --->   "%tempA_load_7 = load i32* %tempA_addr_8, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_58 : Operation 398 [1/1] (2.55ns)   --->   "%tmp2 = add i32 %tmp_9, %tmp_10_0_0_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 399 [1/1] (8.51ns)   --->   "%tmp_10_0_1_2 = mul nsw i32 %tempB_load_10, %tempA_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 400 [1/1] (8.51ns)   --->   "%tmp_10_0_1_3 = mul nsw i32 %tempB_load_11, %tempA_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 401 [1/1] (2.55ns)   --->   "%tmp8 = add i32 %tmp_10_0_1, %tmp_10_0_1_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 402 [1/1] (8.51ns)   --->   "%tmp_10_0_2_2 = mul nsw i32 %tempB_load_18, %tempA_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 403 [1/1] (8.51ns)   --->   "%tmp_10_0_2_3 = mul nsw i32 %tempB_load_19, %tempA_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 404 [1/1] (2.55ns)   --->   "%tmp14 = add i32 %tmp_10_0_2, %tmp_10_0_2_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 405 [1/1] (8.51ns)   --->   "%tmp_10_0_3_2 = mul nsw i32 %tempB_load_26, %tempA_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 406 [1/1] (8.51ns)   --->   "%tmp_10_0_3_3 = mul nsw i32 %tempB_load_27, %tempA_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 407 [1/1] (2.55ns)   --->   "%tmp20 = add i32 %tmp_10_0_3, %tmp_10_0_3_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 408 [1/1] (8.51ns)   --->   "%tmp_10_0_4_2 = mul nsw i32 %tempB_load_34, %tempA_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 409 [1/1] (8.51ns)   --->   "%tmp_10_0_4_3 = mul nsw i32 %tempB_load_35, %tempA_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 410 [1/1] (2.55ns)   --->   "%tmp26 = add i32 %tmp_10_0_4, %tmp_10_0_4_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 411 [1/1] (8.51ns)   --->   "%tmp_10_0_5_2 = mul nsw i32 %tempB_load_42, %tempA_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 412 [1/1] (8.51ns)   --->   "%tmp_10_0_5_3 = mul nsw i32 %tempB_load_43, %tempA_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 413 [1/1] (2.55ns)   --->   "%tmp32 = add i32 %tmp_10_0_5, %tmp_10_0_5_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 414 [1/1] (8.51ns)   --->   "%tmp_10_0_6_2 = mul nsw i32 %tempB_load_50, %tempA_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 415 [1/1] (8.51ns)   --->   "%tmp_10_0_6_3 = mul nsw i32 %tempB_load_51, %tempA_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 416 [1/1] (2.55ns)   --->   "%tmp38 = add i32 %tmp_10_0_6, %tmp_10_0_6_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 417 [1/1] (8.51ns)   --->   "%tmp_10_0_7_2 = mul nsw i32 %tempB_load_58, %tempA_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 418 [1/1] (8.51ns)   --->   "%tmp_10_0_7_3 = mul nsw i32 %tempB_load_59, %tempA_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_58 : Operation 419 [1/1] (2.55ns)   --->   "%tmp44 = add i32 %tmp_10_0_7, %tmp_10_0_7_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>

 <State 59> : 8.51ns
ST_59 : Operation 420 [1/1] (8.51ns)   --->   "%tmp_10_0_0_4 = mul nsw i32 %tempB_load_4, %tempA_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 421 [1/1] (8.51ns)   --->   "%tmp_10_0_0_5 = mul nsw i32 %tempB_load_5, %tempA_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 422 [1/2] (3.25ns)   --->   "%tempA_load_6 = load i32* %tempA_addr_7, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_59 : Operation 423 [1/2] (3.25ns)   --->   "%tempA_load_7 = load i32* %tempA_addr_8, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_59 : Operation 424 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp3 = add i32 %tmp_10_0_0_3, %tmp_10_0_0_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 425 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp1 = add i32 %tmp2, %tmp3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 426 [1/1] (8.51ns)   --->   "%tmp_10_0_1_4 = mul nsw i32 %tempB_load_12, %tempA_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 427 [1/1] (8.51ns)   --->   "%tmp_10_0_1_5 = mul nsw i32 %tempB_load_13, %tempA_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 428 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp9 = add i32 %tmp_10_0_1_3, %tmp_10_0_1_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 429 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp7 = add i32 %tmp8, %tmp9" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 430 [1/1] (8.51ns)   --->   "%tmp_10_0_2_4 = mul nsw i32 %tempB_load_20, %tempA_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 431 [1/1] (8.51ns)   --->   "%tmp_10_0_2_5 = mul nsw i32 %tempB_load_21, %tempA_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 432 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp15 = add i32 %tmp_10_0_2_3, %tmp_10_0_2_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 433 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp13 = add i32 %tmp14, %tmp15" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 434 [1/1] (8.51ns)   --->   "%tmp_10_0_3_4 = mul nsw i32 %tempB_load_28, %tempA_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 435 [1/1] (8.51ns)   --->   "%tmp_10_0_3_5 = mul nsw i32 %tempB_load_29, %tempA_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 436 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp21 = add i32 %tmp_10_0_3_3, %tmp_10_0_3_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 437 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp19 = add i32 %tmp20, %tmp21" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 438 [1/1] (8.51ns)   --->   "%tmp_10_0_4_4 = mul nsw i32 %tempB_load_36, %tempA_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 439 [1/1] (8.51ns)   --->   "%tmp_10_0_4_5 = mul nsw i32 %tempB_load_37, %tempA_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 440 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp27 = add i32 %tmp_10_0_4_3, %tmp_10_0_4_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 441 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp25 = add i32 %tmp26, %tmp27" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 442 [1/1] (8.51ns)   --->   "%tmp_10_0_5_4 = mul nsw i32 %tempB_load_44, %tempA_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 443 [1/1] (8.51ns)   --->   "%tmp_10_0_5_5 = mul nsw i32 %tempB_load_45, %tempA_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 444 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp33 = add i32 %tmp_10_0_5_3, %tmp_10_0_5_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 445 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp31 = add i32 %tmp32, %tmp33" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 446 [1/1] (8.51ns)   --->   "%tmp_10_0_6_4 = mul nsw i32 %tempB_load_52, %tempA_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 447 [1/1] (8.51ns)   --->   "%tmp_10_0_6_5 = mul nsw i32 %tempB_load_53, %tempA_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 448 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp39 = add i32 %tmp_10_0_6_3, %tmp_10_0_6_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 449 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp37 = add i32 %tmp38, %tmp39" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 450 [1/1] (8.51ns)   --->   "%tmp_10_0_7_4 = mul nsw i32 %tempB_load_60, %tempA_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 451 [1/1] (8.51ns)   --->   "%tmp_10_0_7_5 = mul nsw i32 %tempB_load_61, %tempA_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_59 : Operation 452 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp45 = add i32 %tmp_10_0_7_3, %tmp_10_0_7_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 453 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp43 = add i32 %tmp44, %tmp45" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_59 : Operation 454 [1/1] (0.00ns)   --->   "%tmp_2_1 = or i6 %tmp_s, 8" [matrix_mult/matrix_mult.cpp:13]
ST_59 : Operation 455 [1/1] (0.00ns)   --->   "%tmp_6_1 = zext i6 %tmp_2_1 to i64" [matrix_mult/matrix_mult.cpp:13]
ST_59 : Operation 456 [1/1] (0.00ns)   --->   "%tempA_addr_9 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_6_1" [matrix_mult/matrix_mult.cpp:16]
ST_59 : Operation 457 [2/2] (3.25ns)   --->   "%tempA_load_8 = load i32* %tempA_addr_9, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_59 : Operation 458 [1/1] (0.00ns)   --->   "%tmp_8_1_0_s = or i6 %tmp_s, 9" [matrix_mult/matrix_mult.cpp:16]
ST_59 : Operation 459 [1/1] (0.00ns)   --->   "%tmp_9_1_0_1 = zext i6 %tmp_8_1_0_s to i64" [matrix_mult/matrix_mult.cpp:16]
ST_59 : Operation 460 [1/1] (0.00ns)   --->   "%tempA_addr_10 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_1_0_1" [matrix_mult/matrix_mult.cpp:16]
ST_59 : Operation 461 [2/2] (3.25ns)   --->   "%tempA_load_9 = load i32* %tempA_addr_10, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 60> : 8.51ns
ST_60 : Operation 462 [1/1] (8.51ns)   --->   "%tmp_10_0_0_6 = mul nsw i32 %tempB_load_6, %tempA_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 463 [1/1] (8.51ns)   --->   "%tmp_10_0_0_7 = mul nsw i32 %tempB_load_7, %tempA_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 464 [1/1] (2.55ns)   --->   "%tmp5 = add i32 %tmp_10_0_0_5, %tmp_10_0_0_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 465 [1/1] (8.51ns)   --->   "%tmp_10_0_1_6 = mul nsw i32 %tempB_load_14, %tempA_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 466 [1/1] (8.51ns)   --->   "%tmp_10_0_1_7 = mul nsw i32 %tempB_load_15, %tempA_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 467 [1/1] (2.55ns)   --->   "%tmp11 = add i32 %tmp_10_0_1_5, %tmp_10_0_1_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 468 [1/1] (8.51ns)   --->   "%tmp_10_0_2_6 = mul nsw i32 %tempB_load_22, %tempA_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 469 [1/1] (8.51ns)   --->   "%tmp_10_0_2_7 = mul nsw i32 %tempB_load_23, %tempA_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 470 [1/1] (2.55ns)   --->   "%tmp17 = add i32 %tmp_10_0_2_5, %tmp_10_0_2_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 471 [1/1] (8.51ns)   --->   "%tmp_10_0_3_6 = mul nsw i32 %tempB_load_30, %tempA_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 472 [1/1] (8.51ns)   --->   "%tmp_10_0_3_7 = mul nsw i32 %tempB_load_31, %tempA_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 473 [1/1] (2.55ns)   --->   "%tmp23 = add i32 %tmp_10_0_3_5, %tmp_10_0_3_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 474 [1/1] (8.51ns)   --->   "%tmp_10_0_4_6 = mul nsw i32 %tempB_load_38, %tempA_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 475 [1/1] (8.51ns)   --->   "%tmp_10_0_4_7 = mul nsw i32 %tempB_load_39, %tempA_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 476 [1/1] (2.55ns)   --->   "%tmp29 = add i32 %tmp_10_0_4_5, %tmp_10_0_4_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 477 [1/1] (8.51ns)   --->   "%tmp_10_0_5_6 = mul nsw i32 %tempB_load_46, %tempA_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 478 [1/1] (8.51ns)   --->   "%tmp_10_0_5_7 = mul nsw i32 %tempB_load_47, %tempA_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 479 [1/1] (2.55ns)   --->   "%tmp35 = add i32 %tmp_10_0_5_5, %tmp_10_0_5_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 480 [1/1] (8.51ns)   --->   "%tmp_10_0_6_6 = mul nsw i32 %tempB_load_54, %tempA_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 481 [1/1] (8.51ns)   --->   "%tmp_10_0_6_7 = mul nsw i32 %tempB_load_55, %tempA_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 482 [1/1] (2.55ns)   --->   "%tmp41 = add i32 %tmp_10_0_6_5, %tmp_10_0_6_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 483 [1/1] (8.51ns)   --->   "%tmp_10_0_7_6 = mul nsw i32 %tempB_load_62, %tempA_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 484 [1/1] (8.51ns)   --->   "%tmp_10_0_7_7 = mul nsw i32 %tempB_load_63, %tempA_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 485 [1/1] (2.55ns)   --->   "%tmp47 = add i32 %tmp_10_0_7_5, %tmp_10_0_7_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_60 : Operation 486 [1/2] (3.25ns)   --->   "%tempA_load_8 = load i32* %tempA_addr_9, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_60 : Operation 487 [1/2] (3.25ns)   --->   "%tempA_load_9 = load i32* %tempA_addr_10, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_60 : Operation 488 [1/1] (0.00ns)   --->   "%tmp_8_1_0_1 = or i6 %tmp_s, 10" [matrix_mult/matrix_mult.cpp:16]
ST_60 : Operation 489 [1/1] (0.00ns)   --->   "%tmp_9_1_0_2 = zext i6 %tmp_8_1_0_1 to i64" [matrix_mult/matrix_mult.cpp:16]
ST_60 : Operation 490 [1/1] (0.00ns)   --->   "%tempA_addr_11 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_1_0_2" [matrix_mult/matrix_mult.cpp:16]
ST_60 : Operation 491 [2/2] (3.25ns)   --->   "%tempA_load_10 = load i32* %tempA_addr_11, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_60 : Operation 492 [1/1] (0.00ns)   --->   "%tmp_8_1_0_2 = or i6 %tmp_s, 11" [matrix_mult/matrix_mult.cpp:16]
ST_60 : Operation 493 [1/1] (0.00ns)   --->   "%tmp_9_1_0_3 = zext i6 %tmp_8_1_0_2 to i64" [matrix_mult/matrix_mult.cpp:16]
ST_60 : Operation 494 [1/1] (0.00ns)   --->   "%tempA_addr_12 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_1_0_3" [matrix_mult/matrix_mult.cpp:16]
ST_60 : Operation 495 [2/2] (3.25ns)   --->   "%tempA_load_11 = load i32* %tempA_addr_12, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 61> : 8.51ns
ST_61 : Operation 496 [1/1] (2.55ns)   --->   "%tmp6 = add i32 %tmp_10_0_0_7, %tmp_10_0_0_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 497 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp4 = add i32 %tmp5, %tmp6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 498 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_0_7 = add nsw i32 %tmp1, %tmp4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 499 [1/1] (2.55ns)   --->   "%tmp12 = add i32 %tmp_10_0_1_7, %tmp_10_0_1_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 500 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp10 = add i32 %tmp11, %tmp12" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 501 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_1_7 = add nsw i32 %tmp7, %tmp10" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 502 [1/1] (2.55ns)   --->   "%tmp18 = add i32 %tmp_10_0_2_7, %tmp_10_0_2_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 503 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp16 = add i32 %tmp17, %tmp18" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 504 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_2_7 = add nsw i32 %tmp13, %tmp16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 505 [1/1] (2.55ns)   --->   "%tmp24 = add i32 %tmp_10_0_3_7, %tmp_10_0_3_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 506 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp22 = add i32 %tmp23, %tmp24" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 507 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_3_7 = add nsw i32 %tmp19, %tmp22" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 508 [1/1] (2.55ns)   --->   "%tmp30 = add i32 %tmp_10_0_4_7, %tmp_10_0_4_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 509 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp28 = add i32 %tmp29, %tmp30" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 510 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_4_7 = add nsw i32 %tmp25, %tmp28" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 511 [1/1] (2.55ns)   --->   "%tmp36 = add i32 %tmp_10_0_5_7, %tmp_10_0_5_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 512 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp34 = add i32 %tmp35, %tmp36" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 513 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_5_7 = add nsw i32 %tmp31, %tmp34" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 514 [1/1] (2.55ns)   --->   "%tmp42 = add i32 %tmp_10_0_6_7, %tmp_10_0_6_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 515 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp40 = add i32 %tmp41, %tmp42" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 516 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_6_7 = add nsw i32 %tmp37, %tmp40" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 517 [1/1] (2.55ns)   --->   "%tmp48 = add i32 %tmp_10_0_7_7, %tmp_10_0_7_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 518 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp46 = add i32 %tmp47, %tmp48" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 519 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_7_7 = add nsw i32 %tmp43, %tmp46" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_61 : Operation 520 [1/1] (8.51ns)   --->   "%tmp_10_1 = mul nsw i32 %tempB_load, %tempA_load_8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 521 [1/1] (8.51ns)   --->   "%tmp_10_1_0_1 = mul nsw i32 %tempB_load_1, %tempA_load_9" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 522 [1/2] (3.25ns)   --->   "%tempA_load_10 = load i32* %tempA_addr_11, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_61 : Operation 523 [1/2] (3.25ns)   --->   "%tempA_load_11 = load i32* %tempA_addr_12, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_61 : Operation 524 [1/1] (0.00ns)   --->   "%tmp_8_1_0_3 = or i6 %tmp_s, 12" [matrix_mult/matrix_mult.cpp:16]
ST_61 : Operation 525 [1/1] (0.00ns)   --->   "%tmp_9_1_0_4 = zext i6 %tmp_8_1_0_3 to i64" [matrix_mult/matrix_mult.cpp:16]
ST_61 : Operation 526 [1/1] (0.00ns)   --->   "%tempA_addr_13 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_1_0_4" [matrix_mult/matrix_mult.cpp:16]
ST_61 : Operation 527 [2/2] (3.25ns)   --->   "%tempA_load_12 = load i32* %tempA_addr_13, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_61 : Operation 528 [1/1] (0.00ns)   --->   "%tmp_8_1_0_4 = or i6 %tmp_s, 13" [matrix_mult/matrix_mult.cpp:16]
ST_61 : Operation 529 [1/1] (0.00ns)   --->   "%tmp_9_1_0_5 = zext i6 %tmp_8_1_0_4 to i64" [matrix_mult/matrix_mult.cpp:16]
ST_61 : Operation 530 [1/1] (0.00ns)   --->   "%tempA_addr_14 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_1_0_5" [matrix_mult/matrix_mult.cpp:16]
ST_61 : Operation 531 [2/2] (3.25ns)   --->   "%tempA_load_13 = load i32* %tempA_addr_14, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_61 : Operation 532 [1/1] (8.51ns)   --->   "%tmp_10_1_1 = mul nsw i32 %tempB_load_8, %tempA_load_8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 533 [1/1] (8.51ns)   --->   "%tmp_10_1_1_1 = mul nsw i32 %tempB_load_9, %tempA_load_9" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 534 [1/1] (8.51ns)   --->   "%tmp_10_1_2 = mul nsw i32 %tempB_load_16, %tempA_load_8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 535 [1/1] (8.51ns)   --->   "%tmp_10_1_2_1 = mul nsw i32 %tempB_load_17, %tempA_load_9" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 536 [1/1] (8.51ns)   --->   "%tmp_10_1_3 = mul nsw i32 %tempB_load_24, %tempA_load_8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 537 [1/1] (8.51ns)   --->   "%tmp_10_1_3_1 = mul nsw i32 %tempB_load_25, %tempA_load_9" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 538 [1/1] (8.51ns)   --->   "%tmp_10_1_4 = mul nsw i32 %tempB_load_32, %tempA_load_8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 539 [1/1] (8.51ns)   --->   "%tmp_10_1_4_1 = mul nsw i32 %tempB_load_33, %tempA_load_9" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 540 [1/1] (8.51ns)   --->   "%tmp_10_1_5 = mul nsw i32 %tempB_load_40, %tempA_load_8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 541 [1/1] (8.51ns)   --->   "%tmp_10_1_5_1 = mul nsw i32 %tempB_load_41, %tempA_load_9" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 542 [1/1] (8.51ns)   --->   "%tmp_10_1_6 = mul nsw i32 %tempB_load_48, %tempA_load_8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 543 [1/1] (8.51ns)   --->   "%tmp_10_1_6_1 = mul nsw i32 %tempB_load_49, %tempA_load_9" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 544 [1/1] (8.51ns)   --->   "%tmp_10_1_7 = mul nsw i32 %tempB_load_56, %tempA_load_8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_61 : Operation 545 [1/1] (8.51ns)   --->   "%tmp_10_1_7_1 = mul nsw i32 %tempB_load_57, %tempA_load_9" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>

 <State 62> : 8.51ns
ST_62 : Operation 546 [1/1] (0.00ns)   --->   "%tempResult_addr_1 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_6" [matrix_mult/matrix_mult.cpp:13]
ST_62 : Operation 547 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_0_7, i32* %tempResult_addr_1, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_62 : Operation 548 [1/1] (0.00ns)   --->   "%tempResult_addr_2 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_0_0_1" [matrix_mult/matrix_mult.cpp:13]
ST_62 : Operation 549 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_1_7, i32* %tempResult_addr_2, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_62 : Operation 550 [1/1] (8.51ns)   --->   "%tmp_10_1_0_2 = mul nsw i32 %tempB_load_2, %tempA_load_10" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 551 [1/1] (8.51ns)   --->   "%tmp_10_1_0_3 = mul nsw i32 %tempB_load_3, %tempA_load_11" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 552 [1/2] (3.25ns)   --->   "%tempA_load_12 = load i32* %tempA_addr_13, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_62 : Operation 553 [1/2] (3.25ns)   --->   "%tempA_load_13 = load i32* %tempA_addr_14, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_62 : Operation 554 [1/1] (0.00ns)   --->   "%tmp_8_1_0_5 = or i6 %tmp_s, 14" [matrix_mult/matrix_mult.cpp:16]
ST_62 : Operation 555 [1/1] (0.00ns)   --->   "%tmp_9_1_0_6 = zext i6 %tmp_8_1_0_5 to i64" [matrix_mult/matrix_mult.cpp:16]
ST_62 : Operation 556 [1/1] (0.00ns)   --->   "%tempA_addr_15 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_1_0_6" [matrix_mult/matrix_mult.cpp:16]
ST_62 : Operation 557 [2/2] (3.25ns)   --->   "%tempA_load_14 = load i32* %tempA_addr_15, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_62 : Operation 558 [1/1] (0.00ns)   --->   "%tmp_8_1_0_6 = or i6 %tmp_s, 15" [matrix_mult/matrix_mult.cpp:16]
ST_62 : Operation 559 [1/1] (0.00ns)   --->   "%tmp_9_1_0_7 = zext i6 %tmp_8_1_0_6 to i64" [matrix_mult/matrix_mult.cpp:16]
ST_62 : Operation 560 [1/1] (0.00ns)   --->   "%tempA_addr_16 = getelementptr inbounds [64 x i32]* %tempA, i64 0, i64 %tmp_9_1_0_7" [matrix_mult/matrix_mult.cpp:16]
ST_62 : Operation 561 [2/2] (3.25ns)   --->   "%tempA_load_15 = load i32* %tempA_addr_16, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_62 : Operation 562 [1/1] (2.55ns)   --->   "%tmp50 = add i32 %tmp_10_1, %tmp_10_1_0_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 563 [1/1] (8.51ns)   --->   "%tmp_10_1_1_2 = mul nsw i32 %tempB_load_10, %tempA_load_10" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 564 [1/1] (8.51ns)   --->   "%tmp_10_1_1_3 = mul nsw i32 %tempB_load_11, %tempA_load_11" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 565 [1/1] (2.55ns)   --->   "%tmp56 = add i32 %tmp_10_1_1, %tmp_10_1_1_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 566 [1/1] (8.51ns)   --->   "%tmp_10_1_2_2 = mul nsw i32 %tempB_load_18, %tempA_load_10" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 567 [1/1] (8.51ns)   --->   "%tmp_10_1_2_3 = mul nsw i32 %tempB_load_19, %tempA_load_11" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 568 [1/1] (2.55ns)   --->   "%tmp62 = add i32 %tmp_10_1_2, %tmp_10_1_2_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 569 [1/1] (8.51ns)   --->   "%tmp_10_1_3_2 = mul nsw i32 %tempB_load_26, %tempA_load_10" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 570 [1/1] (8.51ns)   --->   "%tmp_10_1_3_3 = mul nsw i32 %tempB_load_27, %tempA_load_11" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 571 [1/1] (2.55ns)   --->   "%tmp68 = add i32 %tmp_10_1_3, %tmp_10_1_3_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 572 [1/1] (8.51ns)   --->   "%tmp_10_1_4_2 = mul nsw i32 %tempB_load_34, %tempA_load_10" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 573 [1/1] (8.51ns)   --->   "%tmp_10_1_4_3 = mul nsw i32 %tempB_load_35, %tempA_load_11" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 574 [1/1] (2.55ns)   --->   "%tmp74 = add i32 %tmp_10_1_4, %tmp_10_1_4_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 575 [1/1] (8.51ns)   --->   "%tmp_10_1_5_2 = mul nsw i32 %tempB_load_42, %tempA_load_10" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 576 [1/1] (8.51ns)   --->   "%tmp_10_1_5_3 = mul nsw i32 %tempB_load_43, %tempA_load_11" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 577 [1/1] (2.55ns)   --->   "%tmp80 = add i32 %tmp_10_1_5, %tmp_10_1_5_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 578 [1/1] (8.51ns)   --->   "%tmp_10_1_6_2 = mul nsw i32 %tempB_load_50, %tempA_load_10" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 579 [1/1] (8.51ns)   --->   "%tmp_10_1_6_3 = mul nsw i32 %tempB_load_51, %tempA_load_11" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 580 [1/1] (2.55ns)   --->   "%tmp86 = add i32 %tmp_10_1_6, %tmp_10_1_6_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 581 [1/1] (8.51ns)   --->   "%tmp_10_1_7_2 = mul nsw i32 %tempB_load_58, %tempA_load_10" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 582 [1/1] (8.51ns)   --->   "%tmp_10_1_7_3 = mul nsw i32 %tempB_load_59, %tempA_load_11" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 583 [1/1] (2.55ns)   --->   "%tmp92 = add i32 %tmp_10_1_7, %tmp_10_1_7_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_62 : Operation 584 [1/1] (1.73ns)   --->   "%i_1_1 = add i4 2, %i" [matrix_mult/matrix_mult.cpp:10]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>

 <State 63> : 8.51ns
ST_63 : Operation 585 [1/1] (0.00ns)   --->   "%tempResult_addr_3 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_0_0_2" [matrix_mult/matrix_mult.cpp:13]
ST_63 : Operation 586 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_2_7, i32* %tempResult_addr_3, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_63 : Operation 587 [1/1] (0.00ns)   --->   "%tempResult_addr_4 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_0_0_3" [matrix_mult/matrix_mult.cpp:13]
ST_63 : Operation 588 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_3_7, i32* %tempResult_addr_4, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_63 : Operation 589 [1/1] (8.51ns)   --->   "%tmp_10_1_0_4 = mul nsw i32 %tempB_load_4, %tempA_load_12" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 590 [1/1] (8.51ns)   --->   "%tmp_10_1_0_5 = mul nsw i32 %tempB_load_5, %tempA_load_13" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 591 [1/2] (3.25ns)   --->   "%tempA_load_14 = load i32* %tempA_addr_15, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_63 : Operation 592 [1/2] (3.25ns)   --->   "%tempA_load_15 = load i32* %tempA_addr_16, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_63 : Operation 593 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp51 = add i32 %tmp_10_1_0_3, %tmp_10_1_0_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 594 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp49 = add i32 %tmp50, %tmp51" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 595 [1/1] (8.51ns)   --->   "%tmp_10_1_1_4 = mul nsw i32 %tempB_load_12, %tempA_load_12" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 596 [1/1] (8.51ns)   --->   "%tmp_10_1_1_5 = mul nsw i32 %tempB_load_13, %tempA_load_13" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 597 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp57 = add i32 %tmp_10_1_1_3, %tmp_10_1_1_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 598 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp55 = add i32 %tmp56, %tmp57" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 599 [1/1] (8.51ns)   --->   "%tmp_10_1_2_4 = mul nsw i32 %tempB_load_20, %tempA_load_12" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 600 [1/1] (8.51ns)   --->   "%tmp_10_1_2_5 = mul nsw i32 %tempB_load_21, %tempA_load_13" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 601 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp63 = add i32 %tmp_10_1_2_3, %tmp_10_1_2_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 602 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp61 = add i32 %tmp62, %tmp63" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 603 [1/1] (8.51ns)   --->   "%tmp_10_1_3_4 = mul nsw i32 %tempB_load_28, %tempA_load_12" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 604 [1/1] (8.51ns)   --->   "%tmp_10_1_3_5 = mul nsw i32 %tempB_load_29, %tempA_load_13" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 605 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp69 = add i32 %tmp_10_1_3_3, %tmp_10_1_3_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 606 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp67 = add i32 %tmp68, %tmp69" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 607 [1/1] (8.51ns)   --->   "%tmp_10_1_4_4 = mul nsw i32 %tempB_load_36, %tempA_load_12" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 608 [1/1] (8.51ns)   --->   "%tmp_10_1_4_5 = mul nsw i32 %tempB_load_37, %tempA_load_13" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 609 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp75 = add i32 %tmp_10_1_4_3, %tmp_10_1_4_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 610 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp73 = add i32 %tmp74, %tmp75" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 611 [1/1] (8.51ns)   --->   "%tmp_10_1_5_4 = mul nsw i32 %tempB_load_44, %tempA_load_12" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 612 [1/1] (8.51ns)   --->   "%tmp_10_1_5_5 = mul nsw i32 %tempB_load_45, %tempA_load_13" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 613 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp81 = add i32 %tmp_10_1_5_3, %tmp_10_1_5_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 614 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp79 = add i32 %tmp80, %tmp81" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 615 [1/1] (8.51ns)   --->   "%tmp_10_1_6_4 = mul nsw i32 %tempB_load_52, %tempA_load_12" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 616 [1/1] (8.51ns)   --->   "%tmp_10_1_6_5 = mul nsw i32 %tempB_load_53, %tempA_load_13" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 617 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp87 = add i32 %tmp_10_1_6_3, %tmp_10_1_6_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 618 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp85 = add i32 %tmp86, %tmp87" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 619 [1/1] (8.51ns)   --->   "%tmp_10_1_7_4 = mul nsw i32 %tempB_load_60, %tempA_load_12" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 620 [1/1] (8.51ns)   --->   "%tmp_10_1_7_5 = mul nsw i32 %tempB_load_61, %tempA_load_13" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_63 : Operation 621 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp93 = add i32 %tmp_10_1_7_3, %tmp_10_1_7_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_63 : Operation 622 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp91 = add i32 %tmp92, %tmp93" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>

 <State 64> : 8.51ns
ST_64 : Operation 623 [1/1] (0.00ns)   --->   "%tempResult_addr_5 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_0_0_4" [matrix_mult/matrix_mult.cpp:13]
ST_64 : Operation 624 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_4_7, i32* %tempResult_addr_5, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_64 : Operation 625 [1/1] (0.00ns)   --->   "%tempResult_addr_6 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_0_0_5" [matrix_mult/matrix_mult.cpp:13]
ST_64 : Operation 626 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_5_7, i32* %tempResult_addr_6, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_64 : Operation 627 [1/1] (8.51ns)   --->   "%tmp_10_1_0_6 = mul nsw i32 %tempB_load_6, %tempA_load_14" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 628 [1/1] (8.51ns)   --->   "%tmp_10_1_0_7 = mul nsw i32 %tempB_load_7, %tempA_load_15" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 629 [1/1] (2.55ns)   --->   "%tmp53 = add i32 %tmp_10_1_0_5, %tmp_10_1_0_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 630 [1/1] (8.51ns)   --->   "%tmp_10_1_1_6 = mul nsw i32 %tempB_load_14, %tempA_load_14" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 631 [1/1] (8.51ns)   --->   "%tmp_10_1_1_7 = mul nsw i32 %tempB_load_15, %tempA_load_15" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 632 [1/1] (2.55ns)   --->   "%tmp59 = add i32 %tmp_10_1_1_5, %tmp_10_1_1_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 633 [1/1] (8.51ns)   --->   "%tmp_10_1_2_6 = mul nsw i32 %tempB_load_22, %tempA_load_14" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 634 [1/1] (8.51ns)   --->   "%tmp_10_1_2_7 = mul nsw i32 %tempB_load_23, %tempA_load_15" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 635 [1/1] (2.55ns)   --->   "%tmp65 = add i32 %tmp_10_1_2_5, %tmp_10_1_2_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 636 [1/1] (8.51ns)   --->   "%tmp_10_1_3_6 = mul nsw i32 %tempB_load_30, %tempA_load_14" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 637 [1/1] (8.51ns)   --->   "%tmp_10_1_3_7 = mul nsw i32 %tempB_load_31, %tempA_load_15" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 638 [1/1] (2.55ns)   --->   "%tmp71 = add i32 %tmp_10_1_3_5, %tmp_10_1_3_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 639 [1/1] (8.51ns)   --->   "%tmp_10_1_4_6 = mul nsw i32 %tempB_load_38, %tempA_load_14" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 640 [1/1] (8.51ns)   --->   "%tmp_10_1_4_7 = mul nsw i32 %tempB_load_39, %tempA_load_15" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 641 [1/1] (2.55ns)   --->   "%tmp77 = add i32 %tmp_10_1_4_5, %tmp_10_1_4_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 642 [1/1] (8.51ns)   --->   "%tmp_10_1_5_6 = mul nsw i32 %tempB_load_46, %tempA_load_14" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 643 [1/1] (8.51ns)   --->   "%tmp_10_1_5_7 = mul nsw i32 %tempB_load_47, %tempA_load_15" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 644 [1/1] (2.55ns)   --->   "%tmp83 = add i32 %tmp_10_1_5_5, %tmp_10_1_5_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 645 [1/1] (8.51ns)   --->   "%tmp_10_1_6_6 = mul nsw i32 %tempB_load_54, %tempA_load_14" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 646 [1/1] (8.51ns)   --->   "%tmp_10_1_6_7 = mul nsw i32 %tempB_load_55, %tempA_load_15" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 647 [1/1] (2.55ns)   --->   "%tmp89 = add i32 %tmp_10_1_6_5, %tmp_10_1_6_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 648 [1/1] (8.51ns)   --->   "%tmp_10_1_7_6 = mul nsw i32 %tempB_load_62, %tempA_load_14" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 649 [1/1] (8.51ns)   --->   "%tmp_10_1_7_7 = mul nsw i32 %tempB_load_63, %tempA_load_15" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_64 : Operation 650 [1/1] (2.55ns)   --->   "%tmp95 = add i32 %tmp_10_1_7_5, %tmp_10_1_7_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>

 <State 65> : 6.92ns
ST_65 : Operation 651 [1/1] (0.00ns)   --->   "%tempResult_addr_7 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_0_0_6" [matrix_mult/matrix_mult.cpp:13]
ST_65 : Operation 652 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_6_7, i32* %tempResult_addr_7, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_65 : Operation 653 [1/1] (0.00ns)   --->   "%tempResult_addr_8 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_0_0_7" [matrix_mult/matrix_mult.cpp:13]
ST_65 : Operation 654 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_7_7, i32* %tempResult_addr_8, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_65 : Operation 655 [1/1] (2.55ns)   --->   "%tmp54 = add i32 %tmp_10_1_0_7, %tmp_10_1_0_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_65 : Operation 656 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp52 = add i32 %tmp53, %tmp54" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 657 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_0_7 = add nsw i32 %tmp49, %tmp52" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 658 [1/1] (2.55ns)   --->   "%tmp60 = add i32 %tmp_10_1_1_7, %tmp_10_1_1_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_65 : Operation 659 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp58 = add i32 %tmp59, %tmp60" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 660 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_1_7 = add nsw i32 %tmp55, %tmp58" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 661 [1/1] (2.55ns)   --->   "%tmp66 = add i32 %tmp_10_1_2_7, %tmp_10_1_2_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_65 : Operation 662 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp64 = add i32 %tmp65, %tmp66" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 663 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_2_7 = add nsw i32 %tmp61, %tmp64" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 664 [1/1] (2.55ns)   --->   "%tmp72 = add i32 %tmp_10_1_3_7, %tmp_10_1_3_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_65 : Operation 665 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp70 = add i32 %tmp71, %tmp72" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 666 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_3_7 = add nsw i32 %tmp67, %tmp70" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 667 [1/1] (2.55ns)   --->   "%tmp78 = add i32 %tmp_10_1_4_7, %tmp_10_1_4_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_65 : Operation 668 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp76 = add i32 %tmp77, %tmp78" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 669 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_4_7 = add nsw i32 %tmp73, %tmp76" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 670 [1/1] (2.55ns)   --->   "%tmp84 = add i32 %tmp_10_1_5_7, %tmp_10_1_5_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_65 : Operation 671 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp82 = add i32 %tmp83, %tmp84" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 672 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_5_7 = add nsw i32 %tmp79, %tmp82" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 673 [1/1] (2.55ns)   --->   "%tmp90 = add i32 %tmp_10_1_6_7, %tmp_10_1_6_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_65 : Operation 674 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp88 = add i32 %tmp89, %tmp90" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 675 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_6_7 = add nsw i32 %tmp85, %tmp88" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 676 [1/1] (2.55ns)   --->   "%tmp96 = add i32 %tmp_10_1_7_7, %tmp_10_1_7_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_65 : Operation 677 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp94 = add i32 %tmp95, %tmp96" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_65 : Operation 678 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_7_7 = add nsw i32 %tmp91, %tmp94" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>

 <State 66> : 3.25ns
ST_66 : Operation 679 [1/1] (0.00ns)   --->   "%tempResult_addr_9 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_6_1" [matrix_mult/matrix_mult.cpp:13]
ST_66 : Operation 680 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_0_7, i32* %tempResult_addr_9, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_66 : Operation 681 [1/1] (0.00ns)   --->   "%tempResult_addr_10 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_1_0_1" [matrix_mult/matrix_mult.cpp:13]
ST_66 : Operation 682 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_1_7, i32* %tempResult_addr_10, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 67> : 3.25ns
ST_67 : Operation 683 [1/1] (0.00ns)   --->   "%tempResult_addr_11 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_1_0_2" [matrix_mult/matrix_mult.cpp:13]
ST_67 : Operation 684 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_2_7, i32* %tempResult_addr_11, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_67 : Operation 685 [1/1] (0.00ns)   --->   "%tempResult_addr_12 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_1_0_3" [matrix_mult/matrix_mult.cpp:13]
ST_67 : Operation 686 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_3_7, i32* %tempResult_addr_12, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 68> : 3.25ns
ST_68 : Operation 687 [1/1] (0.00ns)   --->   "%tempResult_addr_13 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_1_0_4" [matrix_mult/matrix_mult.cpp:13]
ST_68 : Operation 688 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_4_7, i32* %tempResult_addr_13, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_68 : Operation 689 [1/1] (0.00ns)   --->   "%tempResult_addr_14 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_1_0_5" [matrix_mult/matrix_mult.cpp:13]
ST_68 : Operation 690 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_5_7, i32* %tempResult_addr_14, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 69> : 3.25ns
ST_69 : Operation 691 [1/1] (0.00ns)   --->   "%empty_6 = call i32 (...)* @_ssdm_op_SpecLoopTripCount(i64 4, i64 4, i64 4) nounwind"
ST_69 : Operation 692 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecLoopName([24 x i8]* @p_str4) nounwind" [matrix_mult/matrix_mult.cpp:12]
ST_69 : Operation 693 [1/1] (0.00ns)   --->   "%tmp_2 = call i32 (...)* @_ssdm_op_SpecRegionBegin([24 x i8]* @p_str4) nounwind" [matrix_mult/matrix_mult.cpp:12]
ST_69 : Operation 694 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecPipeline(i32 -1, i32 1, i32 1, i32 0, [1 x i8]* @p_str1) nounwind" [matrix_mult/matrix_mult.cpp:12]
ST_69 : Operation 695 [1/1] (0.00ns)   --->   "%empty_7 = call i32 (...)* @_ssdm_op_SpecRegionEnd([24 x i8]* @p_str4, i32 %tmp_2) nounwind" [matrix_mult/matrix_mult.cpp:16]
ST_69 : Operation 696 [1/1] (0.00ns)   --->   "%tempResult_addr_15 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_1_0_6" [matrix_mult/matrix_mult.cpp:13]
ST_69 : Operation 697 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_6_7, i32* %tempResult_addr_15, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_69 : Operation 698 [1/1] (0.00ns)   --->   "%tempResult_addr_16 = getelementptr inbounds [64 x i32]* %tempResult, i64 0, i64 %tmp_9_1_0_7" [matrix_mult/matrix_mult.cpp:13]
ST_69 : Operation 699 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_7_7, i32* %tempResult_addr_16, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>
ST_69 : Operation 700 [1/1] (0.00ns)   --->   "br label %burst.rd.end6.0" [matrix_mult/matrix_mult.cpp:10]

 <State 70> : 8.75ns
ST_70 : Operation 701 [1/1] (8.75ns)   --->   "%gmem_addr_wr_req = call i1 @_ssdm_op_WriteReq.m_axi.i32P(i32* %gmem_addr, i32 64)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>
ST_70 : Operation 702 [1/1] (1.76ns)   --->   "br label %burst.wr.header"

 <State 71> : 3.25ns
ST_71 : Operation 703 [1/1] (0.00ns)   --->   "%indvar1 = phi i7 [ %indvar_next2, %burst.wr.body ], [ 0, %burst.wr.header.preheader ]"
ST_71 : Operation 704 [1/1] (1.48ns)   --->   "%exitcond5 = icmp eq i7 %indvar1, -64"   --->   Core 25 'Cmp' <Latency = 0> <II = 1> <Delay = 1.48> <FuncUnit> <Opcode : 'icmp'> <InPorts = 2> <OutPorts = 1>
ST_71 : Operation 705 [1/1] (1.87ns)   --->   "%indvar_next2 = add i7 %indvar1, 1"   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_71 : Operation 706 [1/1] (0.00ns)   --->   "br i1 %exitcond5, label %memcpy.tail, label %burst.wr.body"
ST_71 : Operation 707 [1/1] (0.00ns)   --->   "%tmp_3 = zext i7 %indvar1 to i64" [matrix_mult/matrix_mult.cpp:18]
ST_71 : Operation 708 [1/1] (0.00ns)   --->   "%tempResult_addr = getelementptr [64 x i32]* %tempResult, i64 0, i64 %tmp_3" [matrix_mult/matrix_mult.cpp:18]
ST_71 : Operation 709 [2/2] (3.25ns)   --->   "%tempResult_load = load i32* %tempResult_addr, align 4" [matrix_mult/matrix_mult.cpp:18]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 72> : 3.25ns
ST_72 : Operation 710 [1/2] (3.25ns)   --->   "%tempResult_load = load i32* %tempResult_addr, align 4" [matrix_mult/matrix_mult.cpp:18]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 64> <RAM>

 <State 73> : 8.75ns
ST_73 : Operation 711 [1/1] (0.00ns)   --->   "%empty_8 = call i32 (...)* @_ssdm_op_SpecLoopTripCount(i64 64, i64 64, i64 64) nounwind"
ST_73 : Operation 712 [1/1] (0.00ns)   --->   "%burstwrite_rbegin = call i32 (...)* @_ssdm_op_SpecRegionBegin([18 x i8]* @burstwrite_OC_region) nounwind"
ST_73 : Operation 713 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecPipeline(i32 1, i32 1, i32 1, i32 0, [1 x i8]* @p_str9)"
ST_73 : Operation 714 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecLoopName([29 x i8]* @memcpy_OC_result_OC_s)"
ST_73 : Operation 715 [1/1] (8.75ns)   --->   "call void @_ssdm_op_Write.m_axi.i32P(i32* %gmem_addr, i32 %tempResult_load, i4 -1)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>
ST_73 : Operation 716 [1/1] (0.00ns)   --->   "%burstwrite_rend = call i32 (...)* @_ssdm_op_SpecRegionEnd([18 x i8]* @burstwrite_OC_region, i32 %burstwrite_rbegin) nounwind"
ST_73 : Operation 717 [1/1] (0.00ns)   --->   "br label %burst.wr.header"

 <State 74> : 8.75ns
ST_74 : Operation 718 [5/5] (8.75ns)   --->   "%gmem_addr_wr_resp = call i1 @_ssdm_op_WriteResp.m_axi.i32P(i32* %gmem_addr)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 75> : 8.75ns
ST_75 : Operation 719 [4/5] (8.75ns)   --->   "%gmem_addr_wr_resp = call i1 @_ssdm_op_WriteResp.m_axi.i32P(i32* %gmem_addr)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 76> : 8.75ns
ST_76 : Operation 720 [3/5] (8.75ns)   --->   "%gmem_addr_wr_resp = call i1 @_ssdm_op_WriteResp.m_axi.i32P(i32* %gmem_addr)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 77> : 8.75ns
ST_77 : Operation 721 [2/5] (8.75ns)   --->   "%gmem_addr_wr_resp = call i1 @_ssdm_op_WriteResp.m_axi.i32P(i32* %gmem_addr)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 78> : 8.75ns
ST_78 : Operation 722 [1/5] (8.75ns)   --->   "%gmem_addr_wr_resp = call i1 @_ssdm_op_WriteResp.m_axi.i32P(i32* %gmem_addr)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>
ST_78 : Operation 723 [1/1] (0.00ns)   --->   "ret void" [matrix_mult/matrix_mult.cpp:19]


============================================================
+ Verbose Summary: Timing violations
============================================================
Target clock period: 10ns, clock uncertainty: 1.25ns.

 <State 1>: 1ns
The critical path consists of the following:
	s_axi read on port 'result' [5]  (1 ns)

 <State 2>: 8.75ns
The critical path consists of the following:
	'getelementptr' operation ('gmem_addr_2') [16]  (0 ns)
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [27]  (8.75 ns)

 <State 3>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [27]  (8.75 ns)

 <State 4>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [27]  (8.75 ns)

 <State 5>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [27]  (8.75 ns)

 <State 6>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [27]  (8.75 ns)

 <State 7>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [27]  (8.75 ns)

 <State 8>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [27]  (8.75 ns)

 <State 9>: 1.87ns
The critical path consists of the following:
	'phi' operation ('indvar') with incoming values : ('indvar_next') [30]  (0 ns)
	'add' operation ('indvar_next') [32]  (1.87 ns)

 <State 10>: 8.75ns
The critical path consists of the following:
	bus read on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [40]  (8.75 ns)

 <State 11>: 3.25ns
The critical path consists of the following:
	'getelementptr' operation ('tempA_addr', matrix_mult/matrix_mult.cpp:6) [41]  (0 ns)
	'store' operation (matrix_mult/matrix_mult.cpp:6) of variable 'gmem_addr_2_read', matrix_mult/matrix_mult.cpp:6 on array 'tempA', matrix_mult/matrix_mult.cpp:5 [42]  (3.25 ns)

 <State 12>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [46]  (8.75 ns)

 <State 13>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [46]  (8.75 ns)

 <State 14>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [46]  (8.75 ns)

 <State 15>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [46]  (8.75 ns)

 <State 16>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [46]  (8.75 ns)

 <State 17>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [46]  (8.75 ns)

 <State 18>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [46]  (8.75 ns)

 <State 19>: 1.87ns
The critical path consists of the following:
	'phi' operation ('indvar9') with incoming values : ('indvar_next1') [49]  (0 ns)
	'add' operation ('indvar_next1') [51]  (1.87 ns)

 <State 20>: 8.75ns
The critical path consists of the following:
	bus read on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [59]  (8.75 ns)

 <State 21>: 3.25ns
The critical path consists of the following:
	'getelementptr' operation ('tempB_addr', matrix_mult/matrix_mult.cpp:7) [60]  (0 ns)
	'store' operation (matrix_mult/matrix_mult.cpp:7) of variable 'gmem_addr_1_read', matrix_mult/matrix_mult.cpp:7 on array 'tempB', matrix_mult/matrix_mult.cpp:5 [61]  (3.25 ns)

 <State 22>: 3.25ns
The critical path consists of the following:
	'getelementptr' operation ('tempB_addr_1', matrix_mult/matrix_mult.cpp:16) [65]  (0 ns)
	'load' operation ('tempB_load', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [66]  (3.25 ns)

 <State 23>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [66]  (3.25 ns)

 <State 24>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_2', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [70]  (3.25 ns)

 <State 25>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_4', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [74]  (3.25 ns)

 <State 26>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_6', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [78]  (3.25 ns)

 <State 27>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_8', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [82]  (3.25 ns)

 <State 28>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_10', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [86]  (3.25 ns)

 <State 29>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_12', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [90]  (3.25 ns)

 <State 30>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_14', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [94]  (3.25 ns)

 <State 31>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_16', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [98]  (3.25 ns)

 <State 32>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_18', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [102]  (3.25 ns)

 <State 33>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_20', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [106]  (3.25 ns)

 <State 34>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_22', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [110]  (3.25 ns)

 <State 35>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_24', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [114]  (3.25 ns)

 <State 36>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_26', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [118]  (3.25 ns)

 <State 37>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_28', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [122]  (3.25 ns)

 <State 38>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_30', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [126]  (3.25 ns)

 <State 39>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_32', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [130]  (3.25 ns)

 <State 40>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_34', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [134]  (3.25 ns)

 <State 41>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_36', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [138]  (3.25 ns)

 <State 42>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_38', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [142]  (3.25 ns)

 <State 43>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_40', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [146]  (3.25 ns)

 <State 44>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_42', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [150]  (3.25 ns)

 <State 45>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_44', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [154]  (3.25 ns)

 <State 46>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_46', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [158]  (3.25 ns)

 <State 47>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_48', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [162]  (3.25 ns)

 <State 48>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_50', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [166]  (3.25 ns)

 <State 49>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_52', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [170]  (3.25 ns)

 <State 50>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_54', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [174]  (3.25 ns)

 <State 51>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_56', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [178]  (3.25 ns)

 <State 52>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_58', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [182]  (3.25 ns)

 <State 53>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_60', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [186]  (3.25 ns)

 <State 54>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_load_62', matrix_mult/matrix_mult.cpp:16) on array 'tempB', matrix_mult/matrix_mult.cpp:5 [190]  (3.25 ns)

 <State 55>: 3.25ns
The critical path consists of the following:
	'phi' operation ('i', matrix_mult/matrix_mult.cpp:10) with incoming values : ('i_1_1', matrix_mult/matrix_mult.cpp:10) [195]  (0 ns)
	'getelementptr' operation ('tempA_addr_1', matrix_mult/matrix_mult.cpp:16) [207]  (0 ns)
	'load' operation ('tempA_load', matrix_mult/matrix_mult.cpp:16) on array 'tempA', matrix_mult/matrix_mult.cpp:5 [208]  (3.25 ns)

 <State 56>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempA_load', matrix_mult/matrix_mult.cpp:16) on array 'tempA', matrix_mult/matrix_mult.cpp:5 [208]  (3.25 ns)

 <State 57>: 8.51ns
The critical path consists of the following:
	'mul' operation ('tmp_9', matrix_mult/matrix_mult.cpp:16) [209]  (8.51 ns)

 <State 58>: 8.51ns
The critical path consists of the following:
	'mul' operation ('tmp_10_0_0_2', matrix_mult/matrix_mult.cpp:16) [219]  (8.51 ns)

 <State 59>: 8.51ns
The critical path consists of the following:
	'mul' operation ('tmp_10_0_0_4', matrix_mult/matrix_mult.cpp:16) [229]  (8.51 ns)

 <State 60>: 8.51ns
The critical path consists of the following:
	'mul' operation ('tmp_10_0_0_6', matrix_mult/matrix_mult.cpp:16) [239]  (8.51 ns)

 <State 61>: 8.51ns
The critical path consists of the following:
	'mul' operation ('tmp_10_1', matrix_mult/matrix_mult.cpp:16) [378]  (8.51 ns)

 <State 62>: 8.51ns
The critical path consists of the following:
	'mul' operation ('tmp_10_1_0_2', matrix_mult/matrix_mult.cpp:16) [388]  (8.51 ns)

 <State 63>: 8.51ns
The critical path consists of the following:
	'mul' operation ('tmp_10_1_0_4', matrix_mult/matrix_mult.cpp:16) [398]  (8.51 ns)

 <State 64>: 8.51ns
The critical path consists of the following:
	'mul' operation ('tmp_10_1_0_6', matrix_mult/matrix_mult.cpp:16) [408]  (8.51 ns)

 <State 65>: 6.92ns
The critical path consists of the following:
	'add' operation ('tmp54', matrix_mult/matrix_mult.cpp:16) [418]  (2.55 ns)
	'add' operation ('tmp52', matrix_mult/matrix_mult.cpp:16) [419]  (0 ns)
	'add' operation ('tmp_11_1_0_7', matrix_mult/matrix_mult.cpp:16) [420]  (4.37 ns)

 <State 66>: 3.25ns
The critical path consists of the following:
	'getelementptr' operation ('tempResult_addr_9', matrix_mult/matrix_mult.cpp:13) [375]  (0 ns)
	'store' operation (matrix_mult/matrix_mult.cpp:16) of variable 'tmp_11_1_0_7', matrix_mult/matrix_mult.cpp:16 on array 'tempResult', matrix_mult/matrix_mult.cpp:5 [421]  (3.25 ns)

 <State 67>: 3.25ns
The critical path consists of the following:
	'getelementptr' operation ('tempResult_addr_11', matrix_mult/matrix_mult.cpp:13) [439]  (0 ns)
	'store' operation (matrix_mult/matrix_mult.cpp:16) of variable 'tmp_11_1_2_7', matrix_mult/matrix_mult.cpp:16 on array 'tempResult', matrix_mult/matrix_mult.cpp:5 [455]  (3.25 ns)

 <State 68>: 3.25ns
The critical path consists of the following:
	'getelementptr' operation ('tempResult_addr_13', matrix_mult/matrix_mult.cpp:13) [473]  (0 ns)
	'store' operation (matrix_mult/matrix_mult.cpp:16) of variable 'tmp_11_1_4_7', matrix_mult/matrix_mult.cpp:16 on array 'tempResult', matrix_mult/matrix_mult.cpp:5 [489]  (3.25 ns)

 <State 69>: 3.25ns
The critical path consists of the following:
	'getelementptr' operation ('tempResult_addr_15', matrix_mult/matrix_mult.cpp:13) [507]  (0 ns)
	'store' operation (matrix_mult/matrix_mult.cpp:16) of variable 'tmp_11_1_6_7', matrix_mult/matrix_mult.cpp:16 on array 'tempResult', matrix_mult/matrix_mult.cpp:5 [523]  (3.25 ns)

 <State 70>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [544]  (8.75 ns)

 <State 71>: 3.25ns
The critical path consists of the following:
	'phi' operation ('indvar1') with incoming values : ('indvar_next2') [547]  (0 ns)
	'getelementptr' operation ('tempResult_addr', matrix_mult/matrix_mult.cpp:18) [557]  (0 ns)
	'load' operation ('tempResult_load', matrix_mult/matrix_mult.cpp:18) on array 'tempResult', matrix_mult/matrix_mult.cpp:5 [558]  (3.25 ns)

 <State 72>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempResult_load', matrix_mult/matrix_mult.cpp:18) on array 'tempResult', matrix_mult/matrix_mult.cpp:5 [558]  (3.25 ns)

 <State 73>: 8.75ns
The critical path consists of the following:
	bus write on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [559]  (8.75 ns)

 <State 74>: 8.75ns
The critical path consists of the following:
	bus access on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [563]  (8.75 ns)

 <State 75>: 8.75ns
The critical path consists of the following:
	bus access on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [563]  (8.75 ns)

 <State 76>: 8.75ns
The critical path consists of the following:
	bus access on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [563]  (8.75 ns)

 <State 77>: 8.75ns
The critical path consists of the following:
	bus access on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [563]  (8.75 ns)

 <State 78>: 8.75ns
The critical path consists of the following:
	bus access on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [563]  (8.75 ns)


============================================================
+ Verbose Summary: Binding
============================================================
N/A
* FSMD analyzer results:
  - Output states:
 - Input state : 
  - Chain level:
	State 1
	State 2
	State 3
	State 4
	State 5
	State 6
	State 7
	State 8
	State 9
	State 10
	State 11
	State 12
	State 13
	State 14
	State 15
	State 16
	State 17
	State 18
	State 19
	State 20
	State 21
	State 22
	State 23
	State 24
	State 25
	State 26
	State 27
	State 28
	State 29
	State 30
	State 31
	State 32
	State 33
	State 34
	State 35
	State 36
	State 37
	State 38
	State 39
	State 40
	State 41
	State 42
	State 43
	State 44
	State 45
	State 46
	State 47
	State 48
	State 49
	State 50
	State 51
	State 52
	State 53
	State 54
	State 55
	State 56
	State 57
	State 58
	State 59
	State 60
	State 61
	State 62
	State 63
	State 64
	State 65
	State 66
	State 67
	State 68
	State 69
	State 70
	State 71
	State 72
	State 73
	State 74
	State 75
	State 76
	State 77
	State 78


============================================================
+ Verbose Summary: Datapath Resource usage 
============================================================
N/A