================================================================
== Vivado HLS Report for 'matrix_mult'
================================================================
* Date:           Mon Mar 19 10:00:08 2018

* Version:        2017.4 (Build 2086221 on Fri Dec 15 21:13:33 MST 2017)
* Project:        matrix_mult
* Solution:       solution3
* Product family: zynq
* Target device:  xc7z020clg484-1


================================================================
== Performance Estimates
================================================================
+ Timing (ns): 
    * Summary: 
    +--------+-------+----------+------------+
    |  Clock | Target| Estimated| Uncertainty|
    +--------+-------+----------+------------+
    |ap_clk  |  10.00|      8.75|        1.25|
    +--------+-------+----------+------------+

+ Latency (clock cycles): 
    * Summary: 
    +-----+-----+-----+-----+---------+
    |  Latency  |  Interval | Pipeline|
    | min | max | min | max |   Type  |
    +-----+-----+-----+-----+---------+
    |  256|  256|  256|  256|   none  |
    +-----+-----+-----+-----+---------+

    + Detail: 
        * Instance: 
        N/A

        * Loop: 
        +--------------------------------+-----+-----+----------+-----------+-----------+------+----------+
        |                                |  Latency  | Iteration|  Initiation Interval  | Trip |          |
        |            Loop Name           | min | max |  Latency |  achieved |   target  | Count| Pipelined|
        +--------------------------------+-----+-----+----------+-----------+-----------+------+----------+
        |- memcpy.tempA.A                |   65|   65|         3|          1|          1|    64|    yes   |
        |- memcpy.tempB.B                |   65|   65|         3|          1|          1|    64|    yes   |
        |- matrix_mult__outer_loop       |   20|   20|         9|          4|          1|     4|    yes   |
        |- memcpy.result.tempResult.gep  |   65|   65|         3|          1|          1|    64|    yes   |
        +--------------------------------+-----+-----+----------+-----------+-----------+------+----------+
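
Note on the loop table above: all four rows are consistent with the relation latency = II * (trip count - 1) + iteration latency - 1. This is inferred from these rows only and is not taken from the tool's documentation. The following is a minimal sketch of that arithmetic; the function name and the dictionary layout are illustrative and not part of the report.

```python
def pipelined_loop_latency(trip_count, ii, iteration_latency):
    """Latency relation that matches the four loop rows of this report.

    Assumption: inferred from the table, not from Vivado HLS documentation.
    """
    return ii * (trip_count - 1) + iteration_latency - 1


# (trip count, achieved II, iteration latency, latency reported in the table)
loops = {
    "memcpy.tempA.A": (64, 1, 3, 65),
    "memcpy.tempB.B": (64, 1, 3, 65),
    "matrix_mult__outer_loop": (4, 4, 9, 20),
    "memcpy.result.tempResult.gep": (64, 1, 3, 65),
}

for name, (trip, ii, depth, reported) in loops.items():
    computed = pipelined_loop_latency(trip, ii, depth)
    print(f"{name}: computed={computed} reported={reported}")
```

The four loops sum to 215 of the 256 reported cycles. The remainder falls in the non-loop FSM states, such as the 7-cycle m_axi read requests.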

============================================================
+ Verbose Summary: Synthesis Manager
============================================================
InlineROM: 1
ExposeGlobal: 0
============================================================
+ Verbose Summary: CDFG Model
============================================================
IsTopModel: 1
ResetActiveHigh: 1
IsCombinational: 0
IsDatapathOnly: 0
HasWiredReturn: 1
HasMFsm: 0
HasVarLatency: 1
IsPipeline: 0
IsRtlPipelined: 0
IsInstanceOverlapped: 0
IsDontTouch: 0
HasImplIP: 0
IsGatedGlobalClock: 0

+ Individual pipeline summary: 
  * Pipeline-0: initiation interval (II) = 1, depth = 3
  * Pipeline-1: initiation interval (II) = 1, depth = 3
  * Pipeline-2: initiation interval (II) = 4, depth = 9
  * Pipeline-3: initiation interval (II) = 1, depth = 3


============================================================
+ Verbose Summary: Schedule
============================================================
* Number of FSM states : 56
* Pipeline : 4
  Pipeline-0 : II = 1, D = 3, States = { 9 10 11 }
  Pipeline-1 : II = 1, D = 3, States = { 19 20 21 }
  Pipeline-2 : II = 4, D = 9, States = { 39 40 41 42 43 44 45 46 47 }
  Pipeline-3 : II = 1, D = 3, States = { 49 50 51 }
* Dataflow Pipeline: 0

* FSM state transitions: 
1 --> 
	2  / true
2 --> 
	3  / true
3 --> 
	4  / true
4 --> 
	5  / true
5 --> 
	6  / true
6 --> 
	7  / true
7 --> 
	8  / true
8 --> 
	9  / true
9 --> 
	12  / (exitcond3)
	10  / (!exitcond3)
10 --> 
	11  / true
11 --> 
	9  / true
12 --> 
	13  / true
13 --> 
	14  / true
14 --> 
	15  / true
15 --> 
	16  / true
16 --> 
	17  / true
17 --> 
	18  / true
18 --> 
	19  / true
19 --> 
	22  / (exitcond4)
	20  / (!exitcond4)
20 --> 
	21  / true
21 --> 
	19  / true
22 --> 
	23  / true
23 --> 
	24  / true
24 --> 
	25  / true
25 --> 
	26  / true
26 --> 
	27  / true
27 --> 
	28  / true
28 --> 
	29  / true
29 --> 
	30  / true
30 --> 
	31  / true
31 --> 
	32  / true
32 --> 
	33  / true
33 --> 
	34  / true
34 --> 
	35  / true
35 --> 
	36  / true
36 --> 
	37  / true
37 --> 
	38  / true
38 --> 
	39  / true
39 --> 
	48  / (exitcond2)
	40  / (!exitcond2)
40 --> 
	41  / true
41 --> 
	42  / true
42 --> 
	43  / true
43 --> 
	44  / true
44 --> 
	45  / true
45 --> 
	46  / true
46 --> 
	47  / true
47 --> 
	39  / true
48 --> 
	49  / true
49 --> 
	52  / (exitcond5)
	50  / (!exitcond5)
50 --> 
	51  / true
51 --> 
	49  / true
52 --> 
	53  / true
53 --> 
	54  / true
54 --> 
	55  / true
55 --> 
	56  / true
56 --> 

* FSM state operations: 

 <State 1> : 1.00ns
ST_1 : Operation 57 [1/1] (1.00ns)   --->   "%result_read = call i32 @_ssdm_op_Read.s_axilite.i32(i32 %result)"   --->   Core 10 's_axilite' <Latency = 0> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write'>
ST_1 : Operation 58 [1/1] (1.00ns)   --->   "%B_read = call i32 @_ssdm_op_Read.s_axilite.i32(i32 %B)"   --->   Core 10 's_axilite' <Latency = 0> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write'>
ST_1 : Operation 59 [1/1] (1.00ns)   --->   "%A_read = call i32 @_ssdm_op_Read.s_axilite.i32(i32 %A)"   --->   Core 10 's_axilite' <Latency = 0> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write'>
ST_1 : Operation 60 [1/1] (0.00ns)   --->   "%result5 = call i30 @_ssdm_op_PartSelect.i30.i32.i32.i32(i32 %result_read, i32 2, i32 31)"
ST_1 : Operation 61 [1/1] (0.00ns)   --->   "%B3 = call i30 @_ssdm_op_PartSelect.i30.i32.i32.i32(i32 %B_read, i32 2, i32 31)"
ST_1 : Operation 62 [1/1] (0.00ns)   --->   "%A1 = call i30 @_ssdm_op_PartSelect.i30.i32.i32.i32(i32 %A_read, i32 2, i32 31)"
ST_1 : Operation 63 [1/1] (0.00ns)   --->   "%tempA_0 = alloca [32 x i32], align 4" [matrix_mult/matrix_mult.cpp:5]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_1 : Operation 64 [1/1] (0.00ns)   --->   "%tempA_1 = alloca [32 x i32], align 4" [matrix_mult/matrix_mult.cpp:5]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_1 : Operation 65 [1/1] (0.00ns)   --->   "%tempB_0 = alloca [32 x i32], align 4" [matrix_mult/matrix_mult.cpp:5]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_1 : Operation 66 [1/1] (0.00ns)   --->   "%tempB_1 = alloca [32 x i32], align 4" [matrix_mult/matrix_mult.cpp:5]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_1 : Operation 67 [1/1] (0.00ns)   --->   "%tempResult_0 = alloca [32 x i32], align 4" [matrix_mult/matrix_mult.cpp:5]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_1 : Operation 68 [1/1] (0.00ns)   --->   "%tempResult_1 = alloca [32 x i32], align 4" [matrix_mult/matrix_mult.cpp:5]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 2> : 8.75ns
ST_2 : Operation 69 [1/1] (0.00ns)   --->   "%tmp_7 = zext i30 %A1 to i64"
ST_2 : Operation 70 [1/1] (0.00ns)   --->   "%gmem_addr_2 = getelementptr i32* %gmem, i64 %tmp_7"
ST_2 : Operation 71 [7/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 3> : 8.75ns
ST_3 : Operation 72 [6/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 4> : 8.75ns
ST_4 : Operation 73 [5/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 5> : 8.75ns
ST_5 : Operation 74 [4/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 6> : 8.75ns
ST_6 : Operation 75 [3/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 7> : 8.75ns
ST_7 : Operation 76 [2/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 8> : 8.75ns
ST_8 : Operation 77 [1/1] (0.00ns)   --->   "%tmp_5 = zext i30 %result5 to i64"
ST_8 : Operation 78 [1/1] (0.00ns)   --->   "%gmem_addr = getelementptr i32* %gmem, i64 %tmp_5"
ST_8 : Operation 79 [1/1] (0.00ns)   --->   "%tmp_6 = zext i30 %B3 to i64"
ST_8 : Operation 80 [1/1] (0.00ns)   --->   "%gmem_addr_1 = getelementptr i32* %gmem, i64 %tmp_6"
ST_8 : Operation 81 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecBitsMap(i32* %gmem), !map !11"
ST_8 : Operation 82 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecTopModule([12 x i8]* @matrix_mult_str) nounwind"
ST_8 : Operation 83 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecInterface(i32 %result, [10 x i8]* @mode5, i32 0, i32 0, [1 x i8]* @p_str1, i32 0, i32 32, [1 x i8]* @bundle6, [6 x i8]* @p_str2, [1 x i8]* @p_str1, i32 16, i32 16, i32 16, i32 16, [1 x i8]* @p_str1, [1 x i8]* @p_str1)"
ST_8 : Operation 84 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecInterface(i32 %B, [10 x i8]* @mode3, i32 0, i32 0, [1 x i8]* @p_str1, i32 0, i32 32, [1 x i8]* @bundle4, [6 x i8]* @p_str2, [1 x i8]* @p_str1, i32 16, i32 16, i32 16, i32 16, [1 x i8]* @p_str1, [1 x i8]* @p_str1)"
ST_8 : Operation 85 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecInterface(i32* %gmem, [6 x i8]* @p_str, i32 0, i32 0, [1 x i8]* @p_str1, i32 0, i32 32, [1 x i8]* @p_str1, [6 x i8]* @p_str2, [1 x i8]* @p_str1, i32 16, i32 16, i32 16, i32 16, [1 x i8]* @p_str1, [1 x i8]* @p_str1)"
ST_8 : Operation 86 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecInterface(i32 %A, [10 x i8]* @mode, i32 0, i32 0, [1 x i8]* @p_str1, i32 0, i32 32, [1 x i8]* @bundle, [6 x i8]* @p_str2, [1 x i8]* @p_str1, i32 16, i32 16, i32 16, i32 16, [1 x i8]* @p_str1, [1 x i8]* @p_str1)"
ST_8 : Operation 87 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecInterface(i32 0, [10 x i8]* @p_str3, i32 0, i32 0, [1 x i8]* @p_str1, i32 0, i32 0, [1 x i8]* @p_str1, [1 x i8]* @p_str1, [1 x i8]* @p_str1, i32 0, i32 0, i32 0, i32 0, [1 x i8]* @p_str1, [1 x i8]* @p_str1) nounwind" [matrix_mult/matrix_mult.cpp:5]
ST_8 : Operation 88 [1/7] (8.75ns)   --->   "%gmem_addr_2_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_2, i32 64)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>
ST_8 : Operation 89 [1/1] (1.76ns)   --->   "br label %burst.rd.header"

 <State 9> : 2.42ns
ST_9 : Operation 90 [1/1] (0.00ns)   --->   "%indvar = phi i7 [ 0, %0 ], [ %indvar_next, %burst.rd.body506 ]"
ST_9 : Operation 91 [1/1] (1.48ns)   --->   "%exitcond3 = icmp eq i7 %indvar, -64"   --->   Core 25 'Cmp' <Latency = 0> <II = 1> <Delay = 1.48> <FuncUnit> <Opcode : 'icmp'> <InPorts = 2> <OutPorts = 1>
ST_9 : Operation 92 [1/1] (1.87ns)   --->   "%indvar_next = add i7 %indvar, 1"   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_9 : Operation 93 [1/1] (0.00ns)   --->   "br i1 %exitcond3, label %burst.rd.header7.preheader, label %burst.rd.body"
ST_9 : Operation 94 [1/1] (0.00ns)   --->   "%burstread_rbegin = call i32 (...)* @_ssdm_op_SpecRegionBegin([17 x i8]* @burstread_OC_region_s) nounwind"
ST_9 : Operation 95 [1/1] (0.00ns)   --->   "%tmp = trunc i7 %indvar to i1"
ST_9 : Operation 96 [1/1] (0.00ns)   --->   "%newIndex = call i6 @_ssdm_op_PartSelect.i6.i7.i32.i32(i7 %indvar, i32 1, i32 6)"
ST_9 : Operation 97 [1/1] (0.00ns)   --->   "br i1 %tmp, label %branch5, label %branch4" [matrix_mult/matrix_mult.cpp:6]
ST_9 : Operation 98 [1/1] (0.00ns)   --->   "%burstread_rend = call i32 (...)* @_ssdm_op_SpecRegionEnd([17 x i8]* @burstread_OC_region_s, i32 %burstread_rbegin) nounwind"
ST_9 : Operation 99 [1/1] (0.00ns)   --->   "br label %burst.rd.header"

 <State 10> : 8.75ns
ST_10 : Operation 100 [1/1] (8.75ns)   --->   "%gmem_addr_2_read = call i32 @_ssdm_op_Read.m_axi.i32P(i32* %gmem_addr_2)" [matrix_mult/matrix_mult.cpp:6]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 11> : 3.25ns
ST_11 : Operation 101 [1/1] (0.00ns)   --->   "%empty = call i32 (...)* @_ssdm_op_SpecLoopTripCount(i64 64, i64 64, i64 64) nounwind"
ST_11 : Operation 102 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecPipeline(i32 1, i32 1, i32 1, i32 0, [1 x i8]* @p_str8)"
ST_11 : Operation 103 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecLoopName([15 x i8]* @memcpy_OC_tempA_OC_A)"
ST_11 : Operation 104 [1/1] (0.00ns)   --->   "%newIndex1 = zext i6 %newIndex to i64"
ST_11 : Operation 105 [1/1] (0.00ns)   --->   "%tempA_0_addr = getelementptr [32 x i32]* %tempA_0, i64 0, i64 %newIndex1" [matrix_mult/matrix_mult.cpp:6]
ST_11 : Operation 106 [1/1] (0.00ns)   --->   "%tempA_1_addr = getelementptr [32 x i32]* %tempA_1, i64 0, i64 %newIndex1" [matrix_mult/matrix_mult.cpp:6]
ST_11 : Operation 107 [1/1] (3.25ns)   --->   "store i32 %gmem_addr_2_read, i32* %tempA_0_addr, align 4" [matrix_mult/matrix_mult.cpp:6]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_11 : Operation 108 [1/1] (0.00ns)   --->   "br label %burst.rd.body506" [matrix_mult/matrix_mult.cpp:6]
ST_11 : Operation 109 [1/1] (3.25ns)   --->   "store i32 %gmem_addr_2_read, i32* %tempA_1_addr, align 4" [matrix_mult/matrix_mult.cpp:6]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_11 : Operation 110 [1/1] (0.00ns)   --->   "br label %burst.rd.body506" [matrix_mult/matrix_mult.cpp:6]
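
Note on States 9 to 11: the 7-bit counter %indvar is split into its lowest bit (%tmp, which selects the branch) and its upper six bits (%newIndex, which addresses the 32-entry RAM). This looks like a two-way cyclic split of the 64-element buffer into tempA_0 and tempA_1. The sketch below reproduces that index arithmetic. The function name is illustrative, and mapping bank 0 to tempA_0 is an assumption: the report shows only the branch labels, not which store each one reaches.

```python
def split_index(indvar):
    """Split a flat 0..63 index the way Operations 95 and 96 do.

    bank   = trunc i7 -> i1       (lowest bit)
    offset = PartSelect bits 1..6 (upper six bits)
    Assumption: bank 0 corresponds to tempA_0 and bank 1 to tempA_1.
    """
    bank = indvar & 1
    offset = (indvar >> 1) & 0x3F
    return bank, offset


# The exit test "icmp eq i7 %indvar, -64" compares against the 7-bit
# pattern 1000000, which is 64 when read as an unsigned value.
exit_value = (-64) & 0x7F

for i in (0, 1, 2, 63):
    print(i, split_index(i))
print("loop exits when indvar ==", exit_value)
```

The same pattern appears again in States 19 to 21 for tempB_0 and tempB_1.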

 <State 12> : 8.75ns
ST_12 : Operation 111 [7/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 13> : 8.75ns
ST_13 : Operation 112 [6/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 14> : 8.75ns
ST_14 : Operation 113 [5/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 15> : 8.75ns
ST_15 : Operation 114 [4/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 16> : 8.75ns
ST_16 : Operation 115 [3/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 17> : 8.75ns
ST_17 : Operation 116 [2/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 18> : 8.75ns
ST_18 : Operation 117 [1/7] (8.75ns)   --->   "%gmem_addr_1_rd_req = call i1 @_ssdm_op_ReadReq.m_axi.i32P(i32* %gmem_addr_1, i32 64)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>
ST_18 : Operation 118 [1/1] (1.76ns)   --->   "br label %burst.rd.header7"

 <State 19> : 2.42ns
ST_19 : Operation 119 [1/1] (0.00ns)   --->   "%indvar9 = phi i7 [ %indvar_next1, %burst.rd.body8420 ], [ 0, %burst.rd.header7.preheader ]"
ST_19 : Operation 120 [1/1] (1.48ns)   --->   "%exitcond4 = icmp eq i7 %indvar9, -64"   --->   Core 25 'Cmp' <Latency = 0> <II = 1> <Delay = 1.48> <FuncUnit> <Opcode : 'icmp'> <InPorts = 2> <OutPorts = 1>
ST_19 : Operation 121 [1/1] (1.87ns)   --->   "%indvar_next1 = add i7 %indvar9, 1"   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_19 : Operation 122 [1/1] (0.00ns)   --->   "br i1 %exitcond4, label %burst.rd.end6.0.preheader, label %burst.rd.body8"
ST_19 : Operation 123 [1/1] (0.00ns)   --->   "%burstread_rbegin1 = call i32 (...)* @_ssdm_op_SpecRegionBegin([17 x i8]* @burstread_OC_region_s) nounwind"
ST_19 : Operation 124 [1/1] (0.00ns)   --->   "%tmp_1 = trunc i7 %indvar9 to i1"
ST_19 : Operation 125 [1/1] (0.00ns)   --->   "%newIndex2 = call i6 @_ssdm_op_PartSelect.i6.i7.i32.i32(i7 %indvar9, i32 1, i32 6)"
ST_19 : Operation 126 [1/1] (0.00ns)   --->   "br i1 %tmp_1, label %branch3, label %branch2" [matrix_mult/matrix_mult.cpp:7]
ST_19 : Operation 127 [1/1] (0.00ns)   --->   "%burstread_rend14 = call i32 (...)* @_ssdm_op_SpecRegionEnd([17 x i8]* @burstread_OC_region_s, i32 %burstread_rbegin1) nounwind"
ST_19 : Operation 128 [1/1] (0.00ns)   --->   "br label %burst.rd.header7"

 <State 20> : 8.75ns
ST_20 : Operation 129 [1/1] (8.75ns)   --->   "%gmem_addr_1_read = call i32 @_ssdm_op_Read.m_axi.i32P(i32* %gmem_addr_1)" [matrix_mult/matrix_mult.cpp:7]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 21> : 3.25ns
ST_21 : Operation 130 [1/1] (0.00ns)   --->   "%empty_7 = call i32 (...)* @_ssdm_op_SpecLoopTripCount(i64 64, i64 64, i64 64) nounwind"
ST_21 : Operation 131 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecPipeline(i32 1, i32 1, i32 1, i32 0, [1 x i8]* @p_str9)"
ST_21 : Operation 132 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecLoopName([15 x i8]* @memcpy_OC_tempB_OC_B)"
ST_21 : Operation 133 [1/1] (0.00ns)   --->   "%newIndex3 = zext i6 %newIndex2 to i64"
ST_21 : Operation 134 [1/1] (0.00ns)   --->   "%tempB_0_addr = getelementptr [32 x i32]* %tempB_0, i64 0, i64 %newIndex3" [matrix_mult/matrix_mult.cpp:7]
ST_21 : Operation 135 [1/1] (0.00ns)   --->   "%tempB_1_addr = getelementptr [32 x i32]* %tempB_1, i64 0, i64 %newIndex3" [matrix_mult/matrix_mult.cpp:7]
ST_21 : Operation 136 [1/1] (3.25ns)   --->   "store i32 %gmem_addr_1_read, i32* %tempB_0_addr, align 4" [matrix_mult/matrix_mult.cpp:7]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_21 : Operation 137 [1/1] (0.00ns)   --->   "br label %burst.rd.body8420" [matrix_mult/matrix_mult.cpp:7]
ST_21 : Operation 138 [1/1] (3.25ns)   --->   "store i32 %gmem_addr_1_read, i32* %tempB_1_addr, align 4" [matrix_mult/matrix_mult.cpp:7]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_21 : Operation 139 [1/1] (0.00ns)   --->   "br label %burst.rd.body8420" [matrix_mult/matrix_mult.cpp:7]

 <State 22> : 3.25ns
ST_22 : Operation 140 [1/1] (0.00ns)   --->   "%tempB_0_addr_1 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 0"
ST_22 : Operation 141 [2/2] (3.25ns)   --->   "%tempB_0_load = load i32* %tempB_0_addr_1, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_22 : Operation 142 [1/1] (0.00ns)   --->   "%tempB_0_addr_2 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 4"
ST_22 : Operation 143 [2/2] (3.25ns)   --->   "%tempB_0_load_1 = load i32* %tempB_0_addr_2, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_22 : Operation 144 [1/1] (0.00ns)   --->   "%tempB_1_addr_1 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 0"
ST_22 : Operation 145 [2/2] (3.25ns)   --->   "%tempB_1_load = load i32* %tempB_1_addr_1, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_22 : Operation 146 [1/1] (0.00ns)   --->   "%tempB_1_addr_2 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 4"
ST_22 : Operation 147 [2/2] (3.25ns)   --->   "%tempB_1_load_1 = load i32* %tempB_1_addr_2, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 23> : 3.25ns
ST_23 : Operation 148 [1/2] (3.25ns)   --->   "%tempB_0_load = load i32* %tempB_0_addr_1, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_23 : Operation 149 [1/2] (3.25ns)   --->   "%tempB_0_load_1 = load i32* %tempB_0_addr_2, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_23 : Operation 150 [1/1] (0.00ns)   --->   "%tempB_0_addr_3 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 8"
ST_23 : Operation 151 [2/2] (3.25ns)   --->   "%tempB_0_load_2 = load i32* %tempB_0_addr_3, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_23 : Operation 152 [1/1] (0.00ns)   --->   "%tempB_0_addr_4 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 12"
ST_23 : Operation 153 [2/2] (3.25ns)   --->   "%tempB_0_load_3 = load i32* %tempB_0_addr_4, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_23 : Operation 154 [1/2] (3.25ns)   --->   "%tempB_1_load = load i32* %tempB_1_addr_1, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_23 : Operation 155 [1/2] (3.25ns)   --->   "%tempB_1_load_1 = load i32* %tempB_1_addr_2, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_23 : Operation 156 [1/1] (0.00ns)   --->   "%tempB_1_addr_3 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 8"
ST_23 : Operation 157 [2/2] (3.25ns)   --->   "%tempB_1_load_2 = load i32* %tempB_1_addr_3, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_23 : Operation 158 [1/1] (0.00ns)   --->   "%tempB_1_addr_4 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 12"
ST_23 : Operation 159 [2/2] (3.25ns)   --->   "%tempB_1_load_3 = load i32* %tempB_1_addr_4, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 24> : 3.25ns
ST_24 : Operation 160 [1/2] (3.25ns)   --->   "%tempB_0_load_2 = load i32* %tempB_0_addr_3, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_24 : Operation 161 [1/2] (3.25ns)   --->   "%tempB_0_load_3 = load i32* %tempB_0_addr_4, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_24 : Operation 162 [1/1] (0.00ns)   --->   "%tempB_0_addr_5 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 16"
ST_24 : Operation 163 [2/2] (3.25ns)   --->   "%tempB_0_load_4 = load i32* %tempB_0_addr_5, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_24 : Operation 164 [1/1] (0.00ns)   --->   "%tempB_0_addr_6 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 20"
ST_24 : Operation 165 [2/2] (3.25ns)   --->   "%tempB_0_load_5 = load i32* %tempB_0_addr_6, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_24 : Operation 166 [1/2] (3.25ns)   --->   "%tempB_1_load_2 = load i32* %tempB_1_addr_3, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_24 : Operation 167 [1/2] (3.25ns)   --->   "%tempB_1_load_3 = load i32* %tempB_1_addr_4, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_24 : Operation 168 [1/1] (0.00ns)   --->   "%tempB_1_addr_5 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 16"
ST_24 : Operation 169 [2/2] (3.25ns)   --->   "%tempB_1_load_4 = load i32* %tempB_1_addr_5, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_24 : Operation 170 [1/1] (0.00ns)   --->   "%tempB_1_addr_6 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 20"
ST_24 : Operation 171 [2/2] (3.25ns)   --->   "%tempB_1_load_5 = load i32* %tempB_1_addr_6, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 25> : 3.25ns
ST_25 : Operation 172 [1/2] (3.25ns)   --->   "%tempB_0_load_4 = load i32* %tempB_0_addr_5, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_25 : Operation 173 [1/2] (3.25ns)   --->   "%tempB_0_load_5 = load i32* %tempB_0_addr_6, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_25 : Operation 174 [1/1] (0.00ns)   --->   "%tempB_0_addr_7 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 24"
ST_25 : Operation 175 [2/2] (3.25ns)   --->   "%tempB_0_load_6 = load i32* %tempB_0_addr_7, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_25 : Operation 176 [1/1] (0.00ns)   --->   "%tempB_0_addr_8 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 28"
ST_25 : Operation 177 [2/2] (3.25ns)   --->   "%tempB_0_load_7 = load i32* %tempB_0_addr_8, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_25 : Operation 178 [1/2] (3.25ns)   --->   "%tempB_1_load_4 = load i32* %tempB_1_addr_5, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_25 : Operation 179 [1/2] (3.25ns)   --->   "%tempB_1_load_5 = load i32* %tempB_1_addr_6, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_25 : Operation 180 [1/1] (0.00ns)   --->   "%tempB_1_addr_7 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 24"
ST_25 : Operation 181 [2/2] (3.25ns)   --->   "%tempB_1_load_6 = load i32* %tempB_1_addr_7, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_25 : Operation 182 [1/1] (0.00ns)   --->   "%tempB_1_addr_8 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 28"
ST_25 : Operation 183 [2/2] (3.25ns)   --->   "%tempB_1_load_7 = load i32* %tempB_1_addr_8, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 26> : 3.25ns
ST_26 : Operation 184 [1/2] (3.25ns)   --->   "%tempB_0_load_6 = load i32* %tempB_0_addr_7, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_26 : Operation 185 [1/2] (3.25ns)   --->   "%tempB_0_load_7 = load i32* %tempB_0_addr_8, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_26 : Operation 186 [1/2] (3.25ns)   --->   "%tempB_1_load_6 = load i32* %tempB_1_addr_7, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_26 : Operation 187 [1/2] (3.25ns)   --->   "%tempB_1_load_7 = load i32* %tempB_1_addr_8, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_26 : Operation 188 [1/1] (0.00ns)   --->   "%tempB_0_addr_9 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 1"
ST_26 : Operation 189 [2/2] (3.25ns)   --->   "%tempB_0_load_8 = load i32* %tempB_0_addr_9, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_26 : Operation 190 [1/1] (0.00ns)   --->   "%tempB_0_addr_10 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 5"
ST_26 : Operation 191 [2/2] (3.25ns)   --->   "%tempB_0_load_9 = load i32* %tempB_0_addr_10, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_26 : Operation 192 [1/1] (0.00ns)   --->   "%tempB_1_addr_9 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 1"
ST_26 : Operation 193 [2/2] (3.25ns)   --->   "%tempB_1_load_8 = load i32* %tempB_1_addr_9, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_26 : Operation 194 [1/1] (0.00ns)   --->   "%tempB_1_addr_10 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 5"
ST_26 : Operation 195 [2/2] (3.25ns)   --->   "%tempB_1_load_9 = load i32* %tempB_1_addr_10, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 27> : 3.25ns
ST_27 : Operation 196 [1/2] (3.25ns)   --->   "%tempB_0_load_8 = load i32* %tempB_0_addr_9, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_27 : Operation 197 [1/2] (3.25ns)   --->   "%tempB_0_load_9 = load i32* %tempB_0_addr_10, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_27 : Operation 198 [1/1] (0.00ns)   --->   "%tempB_0_addr_11 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 9"
ST_27 : Operation 199 [2/2] (3.25ns)   --->   "%tempB_0_load_10 = load i32* %tempB_0_addr_11, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_27 : Operation 200 [1/1] (0.00ns)   --->   "%tempB_0_addr_12 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 13"
ST_27 : Operation 201 [2/2] (3.25ns)   --->   "%tempB_0_load_11 = load i32* %tempB_0_addr_12, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_27 : Operation 202 [1/2] (3.25ns)   --->   "%tempB_1_load_8 = load i32* %tempB_1_addr_9, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_27 : Operation 203 [1/2] (3.25ns)   --->   "%tempB_1_load_9 = load i32* %tempB_1_addr_10, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_27 : Operation 204 [1/1] (0.00ns)   --->   "%tempB_1_addr_11 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 9"
ST_27 : Operation 205 [2/2] (3.25ns)   --->   "%tempB_1_load_10 = load i32* %tempB_1_addr_11, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_27 : Operation 206 [1/1] (0.00ns)   --->   "%tempB_1_addr_12 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 13"
ST_27 : Operation 207 [2/2] (3.25ns)   --->   "%tempB_1_load_11 = load i32* %tempB_1_addr_12, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 28> : 3.25ns
ST_28 : Operation 208 [1/2] (3.25ns)   --->   "%tempB_0_load_10 = load i32* %tempB_0_addr_11, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_28 : Operation 209 [1/2] (3.25ns)   --->   "%tempB_0_load_11 = load i32* %tempB_0_addr_12, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_28 : Operation 210 [1/1] (0.00ns)   --->   "%tempB_0_addr_13 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 17"
ST_28 : Operation 211 [2/2] (3.25ns)   --->   "%tempB_0_load_12 = load i32* %tempB_0_addr_13, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_28 : Operation 212 [1/1] (0.00ns)   --->   "%tempB_0_addr_14 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 21"
ST_28 : Operation 213 [2/2] (3.25ns)   --->   "%tempB_0_load_13 = load i32* %tempB_0_addr_14, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_28 : Operation 214 [1/2] (3.25ns)   --->   "%tempB_1_load_10 = load i32* %tempB_1_addr_11, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_28 : Operation 215 [1/2] (3.25ns)   --->   "%tempB_1_load_11 = load i32* %tempB_1_addr_12, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_28 : Operation 216 [1/1] (0.00ns)   --->   "%tempB_1_addr_13 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 17"
ST_28 : Operation 217 [2/2] (3.25ns)   --->   "%tempB_1_load_12 = load i32* %tempB_1_addr_13, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_28 : Operation 218 [1/1] (0.00ns)   --->   "%tempB_1_addr_14 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 21"
ST_28 : Operation 219 [2/2] (3.25ns)   --->   "%tempB_1_load_13 = load i32* %tempB_1_addr_14, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 29> : 3.25ns
ST_29 : Operation 220 [1/2] (3.25ns)   --->   "%tempB_0_load_12 = load i32* %tempB_0_addr_13, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_29 : Operation 221 [1/2] (3.25ns)   --->   "%tempB_0_load_13 = load i32* %tempB_0_addr_14, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_29 : Operation 222 [1/1] (0.00ns)   --->   "%tempB_0_addr_15 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 25"
ST_29 : Operation 223 [2/2] (3.25ns)   --->   "%tempB_0_load_14 = load i32* %tempB_0_addr_15, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_29 : Operation 224 [1/1] (0.00ns)   --->   "%tempB_0_addr_16 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 29"
ST_29 : Operation 225 [2/2] (3.25ns)   --->   "%tempB_0_load_15 = load i32* %tempB_0_addr_16, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_29 : Operation 226 [1/2] (3.25ns)   --->   "%tempB_1_load_12 = load i32* %tempB_1_addr_13, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_29 : Operation 227 [1/2] (3.25ns)   --->   "%tempB_1_load_13 = load i32* %tempB_1_addr_14, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_29 : Operation 228 [1/1] (0.00ns)   --->   "%tempB_1_addr_15 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 25"
ST_29 : Operation 229 [2/2] (3.25ns)   --->   "%tempB_1_load_14 = load i32* %tempB_1_addr_15, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_29 : Operation 230 [1/1] (0.00ns)   --->   "%tempB_1_addr_16 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 29"
ST_29 : Operation 231 [2/2] (3.25ns)   --->   "%tempB_1_load_15 = load i32* %tempB_1_addr_16, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 30> : 3.25ns
ST_30 : Operation 232 [1/2] (3.25ns)   --->   "%tempB_0_load_14 = load i32* %tempB_0_addr_15, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_30 : Operation 233 [1/2] (3.25ns)   --->   "%tempB_0_load_15 = load i32* %tempB_0_addr_16, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_30 : Operation 234 [1/2] (3.25ns)   --->   "%tempB_1_load_14 = load i32* %tempB_1_addr_15, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_30 : Operation 235 [1/2] (3.25ns)   --->   "%tempB_1_load_15 = load i32* %tempB_1_addr_16, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_30 : Operation 236 [1/1] (0.00ns)   --->   "%tempB_0_addr_17 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 2"
ST_30 : Operation 237 [2/2] (3.25ns)   --->   "%tempB_0_load_16 = load i32* %tempB_0_addr_17, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_30 : Operation 238 [1/1] (0.00ns)   --->   "%tempB_0_addr_18 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 6"
ST_30 : Operation 239 [2/2] (3.25ns)   --->   "%tempB_0_load_17 = load i32* %tempB_0_addr_18, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_30 : Operation 240 [1/1] (0.00ns)   --->   "%tempB_1_addr_17 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 2"
ST_30 : Operation 241 [2/2] (3.25ns)   --->   "%tempB_1_load_16 = load i32* %tempB_1_addr_17, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_30 : Operation 242 [1/1] (0.00ns)   --->   "%tempB_1_addr_18 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 6"
ST_30 : Operation 243 [2/2] (3.25ns)   --->   "%tempB_1_load_17 = load i32* %tempB_1_addr_18, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 31> : 3.25ns
ST_31 : Operation 244 [1/2] (3.25ns)   --->   "%tempB_0_load_16 = load i32* %tempB_0_addr_17, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_31 : Operation 245 [1/2] (3.25ns)   --->   "%tempB_0_load_17 = load i32* %tempB_0_addr_18, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_31 : Operation 246 [1/1] (0.00ns)   --->   "%tempB_0_addr_19 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 10"
ST_31 : Operation 247 [2/2] (3.25ns)   --->   "%tempB_0_load_18 = load i32* %tempB_0_addr_19, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_31 : Operation 248 [1/1] (0.00ns)   --->   "%tempB_0_addr_20 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 14"
ST_31 : Operation 249 [2/2] (3.25ns)   --->   "%tempB_0_load_19 = load i32* %tempB_0_addr_20, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_31 : Operation 250 [1/2] (3.25ns)   --->   "%tempB_1_load_16 = load i32* %tempB_1_addr_17, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_31 : Operation 251 [1/2] (3.25ns)   --->   "%tempB_1_load_17 = load i32* %tempB_1_addr_18, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_31 : Operation 252 [1/1] (0.00ns)   --->   "%tempB_1_addr_19 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 10"
ST_31 : Operation 253 [2/2] (3.25ns)   --->   "%tempB_1_load_18 = load i32* %tempB_1_addr_19, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_31 : Operation 254 [1/1] (0.00ns)   --->   "%tempB_1_addr_20 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 14"
ST_31 : Operation 255 [2/2] (3.25ns)   --->   "%tempB_1_load_19 = load i32* %tempB_1_addr_20, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 32> : 3.25ns
ST_32 : Operation 256 [1/2] (3.25ns)   --->   "%tempB_0_load_18 = load i32* %tempB_0_addr_19, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_32 : Operation 257 [1/2] (3.25ns)   --->   "%tempB_0_load_19 = load i32* %tempB_0_addr_20, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_32 : Operation 258 [1/1] (0.00ns)   --->   "%tempB_0_addr_21 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 18"
ST_32 : Operation 259 [2/2] (3.25ns)   --->   "%tempB_0_load_20 = load i32* %tempB_0_addr_21, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_32 : Operation 260 [1/1] (0.00ns)   --->   "%tempB_0_addr_22 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 22"
ST_32 : Operation 261 [2/2] (3.25ns)   --->   "%tempB_0_load_21 = load i32* %tempB_0_addr_22, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_32 : Operation 262 [1/2] (3.25ns)   --->   "%tempB_1_load_18 = load i32* %tempB_1_addr_19, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_32 : Operation 263 [1/2] (3.25ns)   --->   "%tempB_1_load_19 = load i32* %tempB_1_addr_20, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_32 : Operation 264 [1/1] (0.00ns)   --->   "%tempB_1_addr_21 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 18"
ST_32 : Operation 265 [2/2] (3.25ns)   --->   "%tempB_1_load_20 = load i32* %tempB_1_addr_21, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_32 : Operation 266 [1/1] (0.00ns)   --->   "%tempB_1_addr_22 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 22"
ST_32 : Operation 267 [2/2] (3.25ns)   --->   "%tempB_1_load_21 = load i32* %tempB_1_addr_22, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 33> : 3.25ns
ST_33 : Operation 268 [1/2] (3.25ns)   --->   "%tempB_0_load_20 = load i32* %tempB_0_addr_21, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_33 : Operation 269 [1/2] (3.25ns)   --->   "%tempB_0_load_21 = load i32* %tempB_0_addr_22, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_33 : Operation 270 [1/1] (0.00ns)   --->   "%tempB_0_addr_23 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 26"
ST_33 : Operation 271 [2/2] (3.25ns)   --->   "%tempB_0_load_22 = load i32* %tempB_0_addr_23, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_33 : Operation 272 [1/1] (0.00ns)   --->   "%tempB_0_addr_24 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 30"
ST_33 : Operation 273 [2/2] (3.25ns)   --->   "%tempB_0_load_23 = load i32* %tempB_0_addr_24, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_33 : Operation 274 [1/2] (3.25ns)   --->   "%tempB_1_load_20 = load i32* %tempB_1_addr_21, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_33 : Operation 275 [1/2] (3.25ns)   --->   "%tempB_1_load_21 = load i32* %tempB_1_addr_22, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_33 : Operation 276 [1/1] (0.00ns)   --->   "%tempB_1_addr_23 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 26"
ST_33 : Operation 277 [2/2] (3.25ns)   --->   "%tempB_1_load_22 = load i32* %tempB_1_addr_23, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_33 : Operation 278 [1/1] (0.00ns)   --->   "%tempB_1_addr_24 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 30"
ST_33 : Operation 279 [2/2] (3.25ns)   --->   "%tempB_1_load_23 = load i32* %tempB_1_addr_24, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 34> : 3.25ns
ST_34 : Operation 280 [1/2] (3.25ns)   --->   "%tempB_0_load_22 = load i32* %tempB_0_addr_23, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_34 : Operation 281 [1/2] (3.25ns)   --->   "%tempB_0_load_23 = load i32* %tempB_0_addr_24, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_34 : Operation 282 [1/2] (3.25ns)   --->   "%tempB_1_load_22 = load i32* %tempB_1_addr_23, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_34 : Operation 283 [1/2] (3.25ns)   --->   "%tempB_1_load_23 = load i32* %tempB_1_addr_24, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_34 : Operation 284 [1/1] (0.00ns)   --->   "%tempB_0_addr_25 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 3"
ST_34 : Operation 285 [2/2] (3.25ns)   --->   "%tempB_0_load_24 = load i32* %tempB_0_addr_25, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_34 : Operation 286 [1/1] (0.00ns)   --->   "%tempB_0_addr_26 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 7"
ST_34 : Operation 287 [2/2] (3.25ns)   --->   "%tempB_0_load_25 = load i32* %tempB_0_addr_26, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_34 : Operation 288 [1/1] (0.00ns)   --->   "%tempB_1_addr_25 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 3"
ST_34 : Operation 289 [2/2] (3.25ns)   --->   "%tempB_1_load_24 = load i32* %tempB_1_addr_25, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_34 : Operation 290 [1/1] (0.00ns)   --->   "%tempB_1_addr_26 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 7"
ST_34 : Operation 291 [2/2] (3.25ns)   --->   "%tempB_1_load_25 = load i32* %tempB_1_addr_26, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 35> : 3.25ns
ST_35 : Operation 292 [1/2] (3.25ns)   --->   "%tempB_0_load_24 = load i32* %tempB_0_addr_25, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_35 : Operation 293 [1/2] (3.25ns)   --->   "%tempB_0_load_25 = load i32* %tempB_0_addr_26, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_35 : Operation 294 [1/1] (0.00ns)   --->   "%tempB_0_addr_27 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 11"
ST_35 : Operation 295 [2/2] (3.25ns)   --->   "%tempB_0_load_26 = load i32* %tempB_0_addr_27, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_35 : Operation 296 [1/1] (0.00ns)   --->   "%tempB_0_addr_28 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 15"
ST_35 : Operation 297 [2/2] (3.25ns)   --->   "%tempB_0_load_27 = load i32* %tempB_0_addr_28, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_35 : Operation 298 [1/2] (3.25ns)   --->   "%tempB_1_load_24 = load i32* %tempB_1_addr_25, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_35 : Operation 299 [1/2] (3.25ns)   --->   "%tempB_1_load_25 = load i32* %tempB_1_addr_26, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_35 : Operation 300 [1/1] (0.00ns)   --->   "%tempB_1_addr_27 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 11"
ST_35 : Operation 301 [2/2] (3.25ns)   --->   "%tempB_1_load_26 = load i32* %tempB_1_addr_27, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_35 : Operation 302 [1/1] (0.00ns)   --->   "%tempB_1_addr_28 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 15"
ST_35 : Operation 303 [2/2] (3.25ns)   --->   "%tempB_1_load_27 = load i32* %tempB_1_addr_28, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 36> : 3.25ns
ST_36 : Operation 304 [1/2] (3.25ns)   --->   "%tempB_0_load_26 = load i32* %tempB_0_addr_27, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_36 : Operation 305 [1/2] (3.25ns)   --->   "%tempB_0_load_27 = load i32* %tempB_0_addr_28, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_36 : Operation 306 [1/1] (0.00ns)   --->   "%tempB_0_addr_29 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 19"
ST_36 : Operation 307 [2/2] (3.25ns)   --->   "%tempB_0_load_28 = load i32* %tempB_0_addr_29, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_36 : Operation 308 [1/1] (0.00ns)   --->   "%tempB_0_addr_30 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 23"
ST_36 : Operation 309 [2/2] (3.25ns)   --->   "%tempB_0_load_29 = load i32* %tempB_0_addr_30, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_36 : Operation 310 [1/2] (3.25ns)   --->   "%tempB_1_load_26 = load i32* %tempB_1_addr_27, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_36 : Operation 311 [1/2] (3.25ns)   --->   "%tempB_1_load_27 = load i32* %tempB_1_addr_28, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_36 : Operation 312 [1/1] (0.00ns)   --->   "%tempB_1_addr_29 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 19"
ST_36 : Operation 313 [2/2] (3.25ns)   --->   "%tempB_1_load_28 = load i32* %tempB_1_addr_29, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_36 : Operation 314 [1/1] (0.00ns)   --->   "%tempB_1_addr_30 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 23"
ST_36 : Operation 315 [2/2] (3.25ns)   --->   "%tempB_1_load_29 = load i32* %tempB_1_addr_30, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 37> : 3.25ns
ST_37 : Operation 316 [1/2] (3.25ns)   --->   "%tempB_0_load_28 = load i32* %tempB_0_addr_29, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_37 : Operation 317 [1/2] (3.25ns)   --->   "%tempB_0_load_29 = load i32* %tempB_0_addr_30, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_37 : Operation 318 [1/1] (0.00ns)   --->   "%tempB_0_addr_31 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 27"
ST_37 : Operation 319 [2/2] (3.25ns)   --->   "%tempB_0_load_30 = load i32* %tempB_0_addr_31, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_37 : Operation 320 [1/1] (0.00ns)   --->   "%tempB_0_addr_32 = getelementptr [32 x i32]* %tempB_0, i64 0, i64 31"
ST_37 : Operation 321 [2/2] (3.25ns)   --->   "%tempB_0_load_31 = load i32* %tempB_0_addr_32, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_37 : Operation 322 [1/2] (3.25ns)   --->   "%tempB_1_load_28 = load i32* %tempB_1_addr_29, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_37 : Operation 323 [1/2] (3.25ns)   --->   "%tempB_1_load_29 = load i32* %tempB_1_addr_30, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_37 : Operation 324 [1/1] (0.00ns)   --->   "%tempB_1_addr_31 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 27"
ST_37 : Operation 325 [2/2] (3.25ns)   --->   "%tempB_1_load_30 = load i32* %tempB_1_addr_31, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_37 : Operation 326 [1/1] (0.00ns)   --->   "%tempB_1_addr_32 = getelementptr [32 x i32]* %tempB_1, i64 0, i64 31"
ST_37 : Operation 327 [2/2] (3.25ns)   --->   "%tempB_1_load_31 = load i32* %tempB_1_addr_32, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 38> : 3.25ns
ST_38 : Operation 328 [1/2] (3.25ns)   --->   "%tempB_0_load_30 = load i32* %tempB_0_addr_31, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_38 : Operation 329 [1/2] (3.25ns)   --->   "%tempB_0_load_31 = load i32* %tempB_0_addr_32, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_38 : Operation 330 [1/2] (3.25ns)   --->   "%tempB_1_load_30 = load i32* %tempB_1_addr_31, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_38 : Operation 331 [1/2] (3.25ns)   --->   "%tempB_1_load_31 = load i32* %tempB_1_addr_32, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_38 : Operation 332 [1/1] (1.76ns)   --->   "br label %burst.rd.end6.0"

 <State 39> : 3.25ns
ST_39 : Operation 333 [1/1] (0.00ns)   --->   "%i = phi i4 [ %i_1_1, %burst.rd.end6.1 ], [ 0, %burst.rd.end6.0.preheader ]" [matrix_mult/matrix_mult.cpp:10]
ST_39 : Operation 334 [1/1] (1.30ns)   --->   "%exitcond2 = icmp eq i4 %i, -8" [matrix_mult/matrix_mult.cpp:10]   --->   Core 25 'Cmp' <Latency = 0> <II = 1> <Delay = 1.48> <FuncUnit> <Opcode : 'icmp'> <InPorts = 2> <OutPorts = 1>
ST_39 : Operation 335 [1/1] (0.00ns)   --->   "br i1 %exitcond2, label %burst.wr.header.preheader, label %burst.rd.end6.1" [matrix_mult/matrix_mult.cpp:10]
ST_39 : Operation 336 [1/1] (0.00ns)   --->   "%tmp_3 = trunc i4 %i to i3" [matrix_mult/matrix_mult.cpp:10]
ST_39 : Operation 337 [1/1] (0.00ns)   --->   "%tmp_4 = call i5 @_ssdm_op_BitConcatenate.i5.i3.i2(i3 %tmp_3, i2 0)" [matrix_mult/matrix_mult.cpp:10]
ST_39 : Operation 338 [1/1] (0.00ns)   --->   "%newIndex5 = zext i5 %tmp_4 to i64" [matrix_mult/matrix_mult.cpp:10]
ST_39 : Operation 339 [1/1] (0.00ns)   --->   "%tempA_0_addr_1 = getelementptr [32 x i32]* %tempA_0, i64 0, i64 %newIndex5" [matrix_mult/matrix_mult.cpp:10]
ST_39 : Operation 340 [2/2] (3.25ns)   --->   "%tempA_0_load = load i32* %tempA_0_addr_1, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_39 : Operation 341 [1/1] (0.00ns)   --->   "%tempA_1_addr_1 = getelementptr [32 x i32]* %tempA_1, i64 0, i64 %newIndex5" [matrix_mult/matrix_mult.cpp:10]
ST_39 : Operation 342 [2/2] (3.25ns)   --->   "%tempA_1_load = load i32* %tempA_1_addr_1, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_39 : Operation 343 [1/1] (0.00ns)   --->   "%newIndex6 = or i5 %tmp_4, 1" [matrix_mult/matrix_mult.cpp:10]
ST_39 : Operation 344 [1/1] (0.00ns)   --->   "%newIndex7 = zext i5 %newIndex6 to i64" [matrix_mult/matrix_mult.cpp:10]
ST_39 : Operation 345 [1/1] (0.00ns)   --->   "%tempA_0_addr_2 = getelementptr [32 x i32]* %tempA_0, i64 0, i64 %newIndex7" [matrix_mult/matrix_mult.cpp:10]
ST_39 : Operation 346 [2/2] (3.25ns)   --->   "%tempA_0_load_1 = load i32* %tempA_0_addr_2, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_39 : Operation 347 [1/1] (0.00ns)   --->   "%tempA_1_addr_2 = getelementptr [32 x i32]* %tempA_1, i64 0, i64 %newIndex7" [matrix_mult/matrix_mult.cpp:10]
ST_39 : Operation 348 [2/2] (3.25ns)   --->   "%tempA_1_load_1 = load i32* %tempA_1_addr_2, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 40> : 3.25ns
ST_40 : Operation 349 [1/2] (3.25ns)   --->   "%tempA_0_load = load i32* %tempA_0_addr_1, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_40 : Operation 350 [1/2] (3.25ns)   --->   "%tempA_1_load = load i32* %tempA_1_addr_1, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_40 : Operation 351 [1/2] (3.25ns)   --->   "%tempA_0_load_1 = load i32* %tempA_0_addr_2, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_40 : Operation 352 [1/2] (3.25ns)   --->   "%tempA_1_load_1 = load i32* %tempA_1_addr_2, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_40 : Operation 353 [1/1] (0.00ns)   --->   "%newIndex8 = or i5 %tmp_4, 2" [matrix_mult/matrix_mult.cpp:10]
ST_40 : Operation 354 [1/1] (0.00ns)   --->   "%newIndex9 = zext i5 %newIndex8 to i64" [matrix_mult/matrix_mult.cpp:10]
ST_40 : Operation 355 [1/1] (0.00ns)   --->   "%tempA_0_addr_3 = getelementptr [32 x i32]* %tempA_0, i64 0, i64 %newIndex9" [matrix_mult/matrix_mult.cpp:10]
ST_40 : Operation 356 [2/2] (3.25ns)   --->   "%tempA_0_load_2 = load i32* %tempA_0_addr_3, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_40 : Operation 357 [1/1] (0.00ns)   --->   "%tempA_1_addr_3 = getelementptr [32 x i32]* %tempA_1, i64 0, i64 %newIndex9" [matrix_mult/matrix_mult.cpp:10]
ST_40 : Operation 358 [2/2] (3.25ns)   --->   "%tempA_1_load_2 = load i32* %tempA_1_addr_3, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_40 : Operation 359 [1/1] (0.00ns)   --->   "%newIndex4 = or i5 %tmp_4, 3" [matrix_mult/matrix_mult.cpp:10]
ST_40 : Operation 360 [1/1] (0.00ns)   --->   "%newIndex10 = zext i5 %newIndex4 to i64" [matrix_mult/matrix_mult.cpp:10]
ST_40 : Operation 361 [1/1] (0.00ns)   --->   "%tempA_0_addr_4 = getelementptr [32 x i32]* %tempA_0, i64 0, i64 %newIndex10" [matrix_mult/matrix_mult.cpp:10]
ST_40 : Operation 362 [2/2] (3.25ns)   --->   "%tempA_0_load_3 = load i32* %tempA_0_addr_4, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_40 : Operation 363 [1/1] (0.00ns)   --->   "%tempA_1_addr_4 = getelementptr [32 x i32]* %tempA_1, i64 0, i64 %newIndex10" [matrix_mult/matrix_mult.cpp:10]
ST_40 : Operation 364 [2/2] (3.25ns)   --->   "%tempA_1_load_3 = load i32* %tempA_1_addr_4, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 41> : 8.51ns
ST_41 : Operation 365 [1/1] (8.51ns)   --->   "%tmp_s = mul nsw i32 %tempB_0_load, %tempA_0_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 366 [1/1] (8.51ns)   --->   "%tmp_10_0_0_1 = mul nsw i32 %tempB_0_load_1, %tempA_1_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 367 [1/1] (8.51ns)   --->   "%tmp_10_0_0_2 = mul nsw i32 %tempB_0_load_2, %tempA_0_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 368 [1/1] (8.51ns)   --->   "%tmp_10_0_0_3 = mul nsw i32 %tempB_0_load_3, %tempA_1_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 369 [1/2] (3.25ns)   --->   "%tempA_0_load_2 = load i32* %tempA_0_addr_3, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_41 : Operation 370 [1/2] (3.25ns)   --->   "%tempA_1_load_2 = load i32* %tempA_1_addr_3, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_41 : Operation 371 [1/2] (3.25ns)   --->   "%tempA_0_load_3 = load i32* %tempA_0_addr_4, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_41 : Operation 372 [1/2] (3.25ns)   --->   "%tempA_1_load_3 = load i32* %tempA_1_addr_4, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_41 : Operation 373 [1/1] (8.51ns)   --->   "%tmp_10_0_1 = mul nsw i32 %tempB_1_load, %tempA_0_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 374 [1/1] (8.51ns)   --->   "%tmp_10_0_1_1 = mul nsw i32 %tempB_1_load_1, %tempA_1_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 375 [1/1] (8.51ns)   --->   "%tmp_10_0_1_2 = mul nsw i32 %tempB_1_load_2, %tempA_0_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 376 [1/1] (8.51ns)   --->   "%tmp_10_0_1_3 = mul nsw i32 %tempB_1_load_3, %tempA_1_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 377 [1/1] (8.51ns)   --->   "%tmp_10_0_2 = mul nsw i32 %tempB_0_load_8, %tempA_0_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 378 [1/1] (8.51ns)   --->   "%tmp_10_0_2_1 = mul nsw i32 %tempB_0_load_9, %tempA_1_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 379 [1/1] (8.51ns)   --->   "%tmp_10_0_2_2 = mul nsw i32 %tempB_0_load_10, %tempA_0_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 380 [1/1] (8.51ns)   --->   "%tmp_10_0_2_3 = mul nsw i32 %tempB_0_load_11, %tempA_1_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 381 [1/1] (8.51ns)   --->   "%tmp_10_0_3 = mul nsw i32 %tempB_1_load_8, %tempA_0_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 382 [1/1] (8.51ns)   --->   "%tmp_10_0_3_1 = mul nsw i32 %tempB_1_load_9, %tempA_1_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 383 [1/1] (8.51ns)   --->   "%tmp_10_0_3_2 = mul nsw i32 %tempB_1_load_10, %tempA_0_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 384 [1/1] (8.51ns)   --->   "%tmp_10_0_3_3 = mul nsw i32 %tempB_1_load_11, %tempA_1_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 385 [1/1] (8.51ns)   --->   "%tmp_10_0_4 = mul nsw i32 %tempB_0_load_16, %tempA_0_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 386 [1/1] (8.51ns)   --->   "%tmp_10_0_4_1 = mul nsw i32 %tempB_0_load_17, %tempA_1_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 387 [1/1] (8.51ns)   --->   "%tmp_10_0_4_2 = mul nsw i32 %tempB_0_load_18, %tempA_0_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 388 [1/1] (8.51ns)   --->   "%tmp_10_0_4_3 = mul nsw i32 %tempB_0_load_19, %tempA_1_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 389 [1/1] (8.51ns)   --->   "%tmp_10_0_5 = mul nsw i32 %tempB_1_load_16, %tempA_0_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 390 [1/1] (8.51ns)   --->   "%tmp_10_0_5_1 = mul nsw i32 %tempB_1_load_17, %tempA_1_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 391 [1/1] (8.51ns)   --->   "%tmp_10_0_5_2 = mul nsw i32 %tempB_1_load_18, %tempA_0_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 392 [1/1] (8.51ns)   --->   "%tmp_10_0_5_3 = mul nsw i32 %tempB_1_load_19, %tempA_1_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 393 [1/1] (8.51ns)   --->   "%tmp_10_0_6 = mul nsw i32 %tempB_0_load_24, %tempA_0_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 394 [1/1] (8.51ns)   --->   "%tmp_10_0_6_1 = mul nsw i32 %tempB_0_load_25, %tempA_1_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 395 [1/1] (8.51ns)   --->   "%tmp_10_0_6_2 = mul nsw i32 %tempB_0_load_26, %tempA_0_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 396 [1/1] (8.51ns)   --->   "%tmp_10_0_6_3 = mul nsw i32 %tempB_0_load_27, %tempA_1_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 397 [1/1] (8.51ns)   --->   "%tmp_10_0_7 = mul nsw i32 %tempB_1_load_24, %tempA_0_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 398 [1/1] (8.51ns)   --->   "%tmp_10_0_7_1 = mul nsw i32 %tempB_1_load_25, %tempA_1_load" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 399 [1/1] (8.51ns)   --->   "%tmp_10_0_7_2 = mul nsw i32 %tempB_1_load_26, %tempA_0_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 400 [1/1] (8.51ns)   --->   "%tmp_10_0_7_3 = mul nsw i32 %tempB_1_load_27, %tempA_1_load_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_41 : Operation 401 [1/1] (0.00ns)   --->   "%newIndex11 = or i5 %tmp_4, 4" [matrix_mult/matrix_mult.cpp:10]
ST_41 : Operation 402 [1/1] (0.00ns)   --->   "%newIndex12 = zext i5 %newIndex11 to i64" [matrix_mult/matrix_mult.cpp:10]
ST_41 : Operation 403 [1/1] (0.00ns)   --->   "%tempA_0_addr_5 = getelementptr [32 x i32]* %tempA_0, i64 0, i64 %newIndex12" [matrix_mult/matrix_mult.cpp:10]
ST_41 : Operation 404 [2/2] (3.25ns)   --->   "%tempA_0_load_4 = load i32* %tempA_0_addr_5, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_41 : Operation 405 [1/1] (0.00ns)   --->   "%tempA_1_addr_5 = getelementptr [32 x i32]* %tempA_1, i64 0, i64 %newIndex12" [matrix_mult/matrix_mult.cpp:10]
ST_41 : Operation 406 [2/2] (3.25ns)   --->   "%tempA_1_load_4 = load i32* %tempA_1_addr_5, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_41 : Operation 407 [1/1] (0.00ns)   --->   "%newIndex13 = or i5 %tmp_4, 5" [matrix_mult/matrix_mult.cpp:10]
ST_41 : Operation 408 [1/1] (0.00ns)   --->   "%newIndex14 = zext i5 %newIndex13 to i64" [matrix_mult/matrix_mult.cpp:10]
ST_41 : Operation 409 [1/1] (0.00ns)   --->   "%tempA_0_addr_6 = getelementptr [32 x i32]* %tempA_0, i64 0, i64 %newIndex14" [matrix_mult/matrix_mult.cpp:10]
ST_41 : Operation 410 [2/2] (3.25ns)   --->   "%tempA_0_load_5 = load i32* %tempA_0_addr_6, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_41 : Operation 411 [1/1] (0.00ns)   --->   "%tempA_1_addr_6 = getelementptr [32 x i32]* %tempA_1, i64 0, i64 %newIndex14" [matrix_mult/matrix_mult.cpp:10]
ST_41 : Operation 412 [2/2] (3.25ns)   --->   "%tempA_1_load_5 = load i32* %tempA_1_addr_6, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 42> : 8.51ns
ST_42 : Operation 413 [1/1] (8.51ns)   --->   "%tmp_10_0_0_4 = mul nsw i32 %tempB_0_load_4, %tempA_0_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 414 [1/1] (8.51ns)   --->   "%tmp_10_0_0_5 = mul nsw i32 %tempB_0_load_5, %tempA_1_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 415 [1/1] (8.51ns)   --->   "%tmp_10_0_0_6 = mul nsw i32 %tempB_0_load_6, %tempA_0_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 416 [1/1] (8.51ns)   --->   "%tmp_10_0_0_7 = mul nsw i32 %tempB_0_load_7, %tempA_1_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 417 [1/1] (2.55ns)   --->   "%tmp2 = add i32 %tmp_s, %tmp_10_0_0_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 418 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp3 = add i32 %tmp_10_0_0_3, %tmp_10_0_0_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 419 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp1 = add i32 %tmp2, %tmp3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 420 [1/1] (8.51ns)   --->   "%tmp_10_0_1_4 = mul nsw i32 %tempB_1_load_4, %tempA_0_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 421 [1/1] (8.51ns)   --->   "%tmp_10_0_1_5 = mul nsw i32 %tempB_1_load_5, %tempA_1_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 422 [1/1] (8.51ns)   --->   "%tmp_10_0_1_6 = mul nsw i32 %tempB_1_load_6, %tempA_0_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 423 [1/1] (8.51ns)   --->   "%tmp_10_0_1_7 = mul nsw i32 %tempB_1_load_7, %tempA_1_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 424 [1/1] (2.55ns)   --->   "%tmp8 = add i32 %tmp_10_0_1, %tmp_10_0_1_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 425 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp9 = add i32 %tmp_10_0_1_3, %tmp_10_0_1_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 426 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp7 = add i32 %tmp8, %tmp9" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 427 [1/1] (8.51ns)   --->   "%tmp_10_0_2_4 = mul nsw i32 %tempB_0_load_12, %tempA_0_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 428 [1/1] (8.51ns)   --->   "%tmp_10_0_2_5 = mul nsw i32 %tempB_0_load_13, %tempA_1_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 429 [1/1] (8.51ns)   --->   "%tmp_10_0_2_6 = mul nsw i32 %tempB_0_load_14, %tempA_0_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 430 [1/1] (8.51ns)   --->   "%tmp_10_0_2_7 = mul nsw i32 %tempB_0_load_15, %tempA_1_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 431 [1/1] (2.55ns)   --->   "%tmp14 = add i32 %tmp_10_0_2, %tmp_10_0_2_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 432 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp15 = add i32 %tmp_10_0_2_3, %tmp_10_0_2_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 433 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp13 = add i32 %tmp14, %tmp15" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 434 [1/1] (8.51ns)   --->   "%tmp_10_0_3_4 = mul nsw i32 %tempB_1_load_12, %tempA_0_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 435 [1/1] (8.51ns)   --->   "%tmp_10_0_3_5 = mul nsw i32 %tempB_1_load_13, %tempA_1_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 436 [1/1] (8.51ns)   --->   "%tmp_10_0_3_6 = mul nsw i32 %tempB_1_load_14, %tempA_0_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 437 [1/1] (8.51ns)   --->   "%tmp_10_0_3_7 = mul nsw i32 %tempB_1_load_15, %tempA_1_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 438 [1/1] (2.55ns)   --->   "%tmp20 = add i32 %tmp_10_0_3, %tmp_10_0_3_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 439 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp21 = add i32 %tmp_10_0_3_3, %tmp_10_0_3_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 440 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp19 = add i32 %tmp20, %tmp21" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 441 [1/1] (8.51ns)   --->   "%tmp_10_0_4_4 = mul nsw i32 %tempB_0_load_20, %tempA_0_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 442 [1/1] (8.51ns)   --->   "%tmp_10_0_4_5 = mul nsw i32 %tempB_0_load_21, %tempA_1_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 443 [1/1] (8.51ns)   --->   "%tmp_10_0_4_6 = mul nsw i32 %tempB_0_load_22, %tempA_0_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 444 [1/1] (8.51ns)   --->   "%tmp_10_0_4_7 = mul nsw i32 %tempB_0_load_23, %tempA_1_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 445 [1/1] (2.55ns)   --->   "%tmp26 = add i32 %tmp_10_0_4, %tmp_10_0_4_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 446 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp27 = add i32 %tmp_10_0_4_3, %tmp_10_0_4_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 447 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp25 = add i32 %tmp26, %tmp27" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 448 [1/1] (8.51ns)   --->   "%tmp_10_0_5_4 = mul nsw i32 %tempB_1_load_20, %tempA_0_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 449 [1/1] (8.51ns)   --->   "%tmp_10_0_5_5 = mul nsw i32 %tempB_1_load_21, %tempA_1_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 450 [1/1] (8.51ns)   --->   "%tmp_10_0_5_6 = mul nsw i32 %tempB_1_load_22, %tempA_0_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 451 [1/1] (8.51ns)   --->   "%tmp_10_0_5_7 = mul nsw i32 %tempB_1_load_23, %tempA_1_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 452 [1/1] (2.55ns)   --->   "%tmp32 = add i32 %tmp_10_0_5, %tmp_10_0_5_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 453 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp33 = add i32 %tmp_10_0_5_3, %tmp_10_0_5_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 454 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp31 = add i32 %tmp32, %tmp33" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 455 [1/1] (8.51ns)   --->   "%tmp_10_0_6_4 = mul nsw i32 %tempB_0_load_28, %tempA_0_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 456 [1/1] (8.51ns)   --->   "%tmp_10_0_6_5 = mul nsw i32 %tempB_0_load_29, %tempA_1_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 457 [1/1] (8.51ns)   --->   "%tmp_10_0_6_6 = mul nsw i32 %tempB_0_load_30, %tempA_0_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 458 [1/1] (8.51ns)   --->   "%tmp_10_0_6_7 = mul nsw i32 %tempB_0_load_31, %tempA_1_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 459 [1/1] (2.55ns)   --->   "%tmp38 = add i32 %tmp_10_0_6, %tmp_10_0_6_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 460 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp39 = add i32 %tmp_10_0_6_3, %tmp_10_0_6_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 461 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp37 = add i32 %tmp38, %tmp39" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 462 [1/1] (8.51ns)   --->   "%tmp_10_0_7_4 = mul nsw i32 %tempB_1_load_28, %tempA_0_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 463 [1/1] (8.51ns)   --->   "%tmp_10_0_7_5 = mul nsw i32 %tempB_1_load_29, %tempA_1_load_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 464 [1/1] (8.51ns)   --->   "%tmp_10_0_7_6 = mul nsw i32 %tempB_1_load_30, %tempA_0_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 465 [1/1] (8.51ns)   --->   "%tmp_10_0_7_7 = mul nsw i32 %tempB_1_load_31, %tempA_1_load_3" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 466 [1/1] (2.55ns)   --->   "%tmp44 = add i32 %tmp_10_0_7, %tmp_10_0_7_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_42 : Operation 467 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp45 = add i32 %tmp_10_0_7_3, %tmp_10_0_7_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 468 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp43 = add i32 %tmp44, %tmp45" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_42 : Operation 469 [1/2] (3.25ns)   --->   "%tempA_0_load_4 = load i32* %tempA_0_addr_5, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_42 : Operation 470 [1/2] (3.25ns)   --->   "%tempA_1_load_4 = load i32* %tempA_1_addr_5, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_42 : Operation 471 [1/2] (3.25ns)   --->   "%tempA_0_load_5 = load i32* %tempA_0_addr_6, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_42 : Operation 472 [1/2] (3.25ns)   --->   "%tempA_1_load_5 = load i32* %tempA_1_addr_6, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_42 : Operation 473 [1/1] (0.00ns)   --->   "%newIndex15 = or i5 %tmp_4, 6" [matrix_mult/matrix_mult.cpp:10]
ST_42 : Operation 474 [1/1] (0.00ns)   --->   "%newIndex16 = zext i5 %newIndex15 to i64" [matrix_mult/matrix_mult.cpp:10]
ST_42 : Operation 475 [1/1] (0.00ns)   --->   "%tempA_0_addr_7 = getelementptr [32 x i32]* %tempA_0, i64 0, i64 %newIndex16" [matrix_mult/matrix_mult.cpp:10]
ST_42 : Operation 476 [2/2] (3.25ns)   --->   "%tempA_0_load_6 = load i32* %tempA_0_addr_7, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_42 : Operation 477 [1/1] (0.00ns)   --->   "%tempA_1_addr_7 = getelementptr [32 x i32]* %tempA_1, i64 0, i64 %newIndex16" [matrix_mult/matrix_mult.cpp:10]
ST_42 : Operation 478 [2/2] (3.25ns)   --->   "%tempA_1_load_6 = load i32* %tempA_1_addr_7, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_42 : Operation 479 [1/1] (0.00ns)   --->   "%newIndex17 = or i5 %tmp_4, 7" [matrix_mult/matrix_mult.cpp:10]
ST_42 : Operation 480 [1/1] (0.00ns)   --->   "%newIndex18 = zext i5 %newIndex17 to i64" [matrix_mult/matrix_mult.cpp:10]
ST_42 : Operation 481 [1/1] (0.00ns)   --->   "%tempA_0_addr_8 = getelementptr [32 x i32]* %tempA_0, i64 0, i64 %newIndex18" [matrix_mult/matrix_mult.cpp:10]
ST_42 : Operation 482 [2/2] (3.25ns)   --->   "%tempA_0_load_7 = load i32* %tempA_0_addr_8, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_42 : Operation 483 [1/1] (0.00ns)   --->   "%tempA_1_addr_8 = getelementptr [32 x i32]* %tempA_1, i64 0, i64 %newIndex18" [matrix_mult/matrix_mult.cpp:10]
ST_42 : Operation 484 [2/2] (3.25ns)   --->   "%tempA_1_load_7 = load i32* %tempA_1_addr_8, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_42 : Operation 485 [1/1] (1.73ns)   --->   "%i_1_1 = add i4 2, %i" [matrix_mult/matrix_mult.cpp:10]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>

 <State 43> : 8.51ns
ST_43 : Operation 486 [1/1] (2.55ns)   --->   "%tmp5 = add i32 %tmp_10_0_0_5, %tmp_10_0_0_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 487 [1/1] (2.55ns)   --->   "%tmp6 = add i32 %tmp_10_0_0_7, %tmp_10_0_0_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 488 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp4 = add i32 %tmp5, %tmp6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 489 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_0_7 = add nsw i32 %tmp1, %tmp4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 490 [1/1] (2.55ns)   --->   "%tmp11 = add i32 %tmp_10_0_1_5, %tmp_10_0_1_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 491 [1/1] (2.55ns)   --->   "%tmp12 = add i32 %tmp_10_0_1_7, %tmp_10_0_1_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 492 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp10 = add i32 %tmp11, %tmp12" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 493 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_1_7 = add nsw i32 %tmp7, %tmp10" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 494 [1/1] (2.55ns)   --->   "%tmp17 = add i32 %tmp_10_0_2_5, %tmp_10_0_2_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 495 [1/1] (2.55ns)   --->   "%tmp18 = add i32 %tmp_10_0_2_7, %tmp_10_0_2_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 496 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp16 = add i32 %tmp17, %tmp18" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 497 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_2_7 = add nsw i32 %tmp13, %tmp16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 498 [1/1] (2.55ns)   --->   "%tmp23 = add i32 %tmp_10_0_3_5, %tmp_10_0_3_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 499 [1/1] (2.55ns)   --->   "%tmp24 = add i32 %tmp_10_0_3_7, %tmp_10_0_3_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 500 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp22 = add i32 %tmp23, %tmp24" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 501 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_3_7 = add nsw i32 %tmp19, %tmp22" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 502 [1/1] (2.55ns)   --->   "%tmp29 = add i32 %tmp_10_0_4_5, %tmp_10_0_4_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 503 [1/1] (2.55ns)   --->   "%tmp30 = add i32 %tmp_10_0_4_7, %tmp_10_0_4_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 504 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp28 = add i32 %tmp29, %tmp30" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 505 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_4_7 = add nsw i32 %tmp25, %tmp28" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 506 [1/1] (2.55ns)   --->   "%tmp35 = add i32 %tmp_10_0_5_5, %tmp_10_0_5_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 507 [1/1] (2.55ns)   --->   "%tmp36 = add i32 %tmp_10_0_5_7, %tmp_10_0_5_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 508 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp34 = add i32 %tmp35, %tmp36" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 509 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_5_7 = add nsw i32 %tmp31, %tmp34" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 510 [1/1] (2.55ns)   --->   "%tmp41 = add i32 %tmp_10_0_6_5, %tmp_10_0_6_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 511 [1/1] (2.55ns)   --->   "%tmp42 = add i32 %tmp_10_0_6_7, %tmp_10_0_6_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 512 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp40 = add i32 %tmp41, %tmp42" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 513 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_6_7 = add nsw i32 %tmp37, %tmp40" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 514 [1/1] (2.55ns)   --->   "%tmp47 = add i32 %tmp_10_0_7_5, %tmp_10_0_7_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 515 [1/1] (2.55ns)   --->   "%tmp48 = add i32 %tmp_10_0_7_7, %tmp_10_0_7_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 516 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp46 = add i32 %tmp47, %tmp48" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 517 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_0_7_7 = add nsw i32 %tmp43, %tmp46" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_43 : Operation 518 [1/1] (8.51ns)   --->   "%tmp_10_1 = mul nsw i32 %tempB_0_load, %tempA_0_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 519 [1/1] (8.51ns)   --->   "%tmp_10_1_0_1 = mul nsw i32 %tempB_0_load_1, %tempA_1_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 520 [1/1] (8.51ns)   --->   "%tmp_10_1_0_2 = mul nsw i32 %tempB_0_load_2, %tempA_0_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 521 [1/1] (8.51ns)   --->   "%tmp_10_1_0_3 = mul nsw i32 %tempB_0_load_3, %tempA_1_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 522 [1/2] (3.25ns)   --->   "%tempA_0_load_6 = load i32* %tempA_0_addr_7, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_43 : Operation 523 [1/2] (3.25ns)   --->   "%tempA_1_load_6 = load i32* %tempA_1_addr_7, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_43 : Operation 524 [1/2] (3.25ns)   --->   "%tempA_0_load_7 = load i32* %tempA_0_addr_8, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_43 : Operation 525 [1/2] (3.25ns)   --->   "%tempA_1_load_7 = load i32* %tempA_1_addr_8, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_43 : Operation 526 [1/1] (8.51ns)   --->   "%tmp_10_1_1 = mul nsw i32 %tempB_1_load, %tempA_0_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 527 [1/1] (8.51ns)   --->   "%tmp_10_1_1_1 = mul nsw i32 %tempB_1_load_1, %tempA_1_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 528 [1/1] (8.51ns)   --->   "%tmp_10_1_1_2 = mul nsw i32 %tempB_1_load_2, %tempA_0_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 529 [1/1] (8.51ns)   --->   "%tmp_10_1_1_3 = mul nsw i32 %tempB_1_load_3, %tempA_1_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 530 [1/1] (8.51ns)   --->   "%tmp_10_1_2 = mul nsw i32 %tempB_0_load_8, %tempA_0_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 531 [1/1] (8.51ns)   --->   "%tmp_10_1_2_1 = mul nsw i32 %tempB_0_load_9, %tempA_1_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 532 [1/1] (8.51ns)   --->   "%tmp_10_1_2_2 = mul nsw i32 %tempB_0_load_10, %tempA_0_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 533 [1/1] (8.51ns)   --->   "%tmp_10_1_2_3 = mul nsw i32 %tempB_0_load_11, %tempA_1_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 534 [1/1] (8.51ns)   --->   "%tmp_10_1_3 = mul nsw i32 %tempB_1_load_8, %tempA_0_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 535 [1/1] (8.51ns)   --->   "%tmp_10_1_3_1 = mul nsw i32 %tempB_1_load_9, %tempA_1_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 536 [1/1] (8.51ns)   --->   "%tmp_10_1_3_2 = mul nsw i32 %tempB_1_load_10, %tempA_0_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 537 [1/1] (8.51ns)   --->   "%tmp_10_1_3_3 = mul nsw i32 %tempB_1_load_11, %tempA_1_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 538 [1/1] (8.51ns)   --->   "%tmp_10_1_4 = mul nsw i32 %tempB_0_load_16, %tempA_0_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 539 [1/1] (8.51ns)   --->   "%tmp_10_1_4_1 = mul nsw i32 %tempB_0_load_17, %tempA_1_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 540 [1/1] (8.51ns)   --->   "%tmp_10_1_4_2 = mul nsw i32 %tempB_0_load_18, %tempA_0_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 541 [1/1] (8.51ns)   --->   "%tmp_10_1_4_3 = mul nsw i32 %tempB_0_load_19, %tempA_1_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 542 [1/1] (8.51ns)   --->   "%tmp_10_1_5 = mul nsw i32 %tempB_1_load_16, %tempA_0_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 543 [1/1] (8.51ns)   --->   "%tmp_10_1_5_1 = mul nsw i32 %tempB_1_load_17, %tempA_1_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 544 [1/1] (8.51ns)   --->   "%tmp_10_1_5_2 = mul nsw i32 %tempB_1_load_18, %tempA_0_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 545 [1/1] (8.51ns)   --->   "%tmp_10_1_5_3 = mul nsw i32 %tempB_1_load_19, %tempA_1_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 546 [1/1] (8.51ns)   --->   "%tmp_10_1_6 = mul nsw i32 %tempB_0_load_24, %tempA_0_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 547 [1/1] (8.51ns)   --->   "%tmp_10_1_6_1 = mul nsw i32 %tempB_0_load_25, %tempA_1_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 548 [1/1] (8.51ns)   --->   "%tmp_10_1_6_2 = mul nsw i32 %tempB_0_load_26, %tempA_0_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 549 [1/1] (8.51ns)   --->   "%tmp_10_1_6_3 = mul nsw i32 %tempB_0_load_27, %tempA_1_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 550 [1/1] (8.51ns)   --->   "%tmp_10_1_7 = mul nsw i32 %tempB_1_load_24, %tempA_0_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 551 [1/1] (8.51ns)   --->   "%tmp_10_1_7_1 = mul nsw i32 %tempB_1_load_25, %tempA_1_load_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 552 [1/1] (8.51ns)   --->   "%tmp_10_1_7_2 = mul nsw i32 %tempB_1_load_26, %tempA_0_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_43 : Operation 553 [1/1] (8.51ns)   --->   "%tmp_10_1_7_3 = mul nsw i32 %tempB_1_load_27, %tempA_1_load_5" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>

 <State 44> : 8.51ns
ST_44 : Operation 554 [1/1] (0.00ns)   --->   "%tempResult_0_addr = getelementptr [32 x i32]* %tempResult_0, i64 0, i64 %newIndex5" [matrix_mult/matrix_mult.cpp:10]
ST_44 : Operation 555 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_0_7, i32* %tempResult_0_addr, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_44 : Operation 556 [1/1] (0.00ns)   --->   "%tempResult_1_addr = getelementptr [32 x i32]* %tempResult_1, i64 0, i64 %newIndex5" [matrix_mult/matrix_mult.cpp:10]
ST_44 : Operation 557 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_1_7, i32* %tempResult_1_addr, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_44 : Operation 558 [1/1] (0.00ns)   --->   "%tempResult_0_addr_1 = getelementptr [32 x i32]* %tempResult_0, i64 0, i64 %newIndex7" [matrix_mult/matrix_mult.cpp:10]
ST_44 : Operation 559 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_2_7, i32* %tempResult_0_addr_1, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_44 : Operation 560 [1/1] (0.00ns)   --->   "%tempResult_1_addr_1 = getelementptr [32 x i32]* %tempResult_1, i64 0, i64 %newIndex7" [matrix_mult/matrix_mult.cpp:10]
ST_44 : Operation 561 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_3_7, i32* %tempResult_1_addr_1, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_44 : Operation 562 [1/1] (8.51ns)   --->   "%tmp_10_1_0_4 = mul nsw i32 %tempB_0_load_4, %tempA_0_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 563 [1/1] (8.51ns)   --->   "%tmp_10_1_0_5 = mul nsw i32 %tempB_0_load_5, %tempA_1_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 564 [1/1] (8.51ns)   --->   "%tmp_10_1_0_6 = mul nsw i32 %tempB_0_load_6, %tempA_0_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 565 [1/1] (8.51ns)   --->   "%tmp_10_1_0_7 = mul nsw i32 %tempB_0_load_7, %tempA_1_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 566 [1/1] (2.55ns)   --->   "%tmp50 = add i32 %tmp_10_1, %tmp_10_1_0_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 567 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp51 = add i32 %tmp_10_1_0_3, %tmp_10_1_0_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 568 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp49 = add i32 %tmp50, %tmp51" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 569 [1/1] (8.51ns)   --->   "%tmp_10_1_1_4 = mul nsw i32 %tempB_1_load_4, %tempA_0_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 570 [1/1] (8.51ns)   --->   "%tmp_10_1_1_5 = mul nsw i32 %tempB_1_load_5, %tempA_1_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 571 [1/1] (8.51ns)   --->   "%tmp_10_1_1_6 = mul nsw i32 %tempB_1_load_6, %tempA_0_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 572 [1/1] (8.51ns)   --->   "%tmp_10_1_1_7 = mul nsw i32 %tempB_1_load_7, %tempA_1_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 573 [1/1] (2.55ns)   --->   "%tmp56 = add i32 %tmp_10_1_1, %tmp_10_1_1_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 574 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp57 = add i32 %tmp_10_1_1_3, %tmp_10_1_1_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 575 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp55 = add i32 %tmp56, %tmp57" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 576 [1/1] (8.51ns)   --->   "%tmp_10_1_2_4 = mul nsw i32 %tempB_0_load_12, %tempA_0_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 577 [1/1] (8.51ns)   --->   "%tmp_10_1_2_5 = mul nsw i32 %tempB_0_load_13, %tempA_1_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 578 [1/1] (8.51ns)   --->   "%tmp_10_1_2_6 = mul nsw i32 %tempB_0_load_14, %tempA_0_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 579 [1/1] (8.51ns)   --->   "%tmp_10_1_2_7 = mul nsw i32 %tempB_0_load_15, %tempA_1_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 580 [1/1] (2.55ns)   --->   "%tmp62 = add i32 %tmp_10_1_2, %tmp_10_1_2_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 581 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp63 = add i32 %tmp_10_1_2_3, %tmp_10_1_2_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 582 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp61 = add i32 %tmp62, %tmp63" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 583 [1/1] (8.51ns)   --->   "%tmp_10_1_3_4 = mul nsw i32 %tempB_1_load_12, %tempA_0_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 584 [1/1] (8.51ns)   --->   "%tmp_10_1_3_5 = mul nsw i32 %tempB_1_load_13, %tempA_1_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 585 [1/1] (8.51ns)   --->   "%tmp_10_1_3_6 = mul nsw i32 %tempB_1_load_14, %tempA_0_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 586 [1/1] (8.51ns)   --->   "%tmp_10_1_3_7 = mul nsw i32 %tempB_1_load_15, %tempA_1_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 587 [1/1] (2.55ns)   --->   "%tmp68 = add i32 %tmp_10_1_3, %tmp_10_1_3_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 588 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp69 = add i32 %tmp_10_1_3_3, %tmp_10_1_3_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 589 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp67 = add i32 %tmp68, %tmp69" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 590 [1/1] (8.51ns)   --->   "%tmp_10_1_4_4 = mul nsw i32 %tempB_0_load_20, %tempA_0_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 591 [1/1] (8.51ns)   --->   "%tmp_10_1_4_5 = mul nsw i32 %tempB_0_load_21, %tempA_1_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 592 [1/1] (8.51ns)   --->   "%tmp_10_1_4_6 = mul nsw i32 %tempB_0_load_22, %tempA_0_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 593 [1/1] (8.51ns)   --->   "%tmp_10_1_4_7 = mul nsw i32 %tempB_0_load_23, %tempA_1_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 594 [1/1] (2.55ns)   --->   "%tmp74 = add i32 %tmp_10_1_4, %tmp_10_1_4_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 595 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp75 = add i32 %tmp_10_1_4_3, %tmp_10_1_4_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 596 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp73 = add i32 %tmp74, %tmp75" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 597 [1/1] (8.51ns)   --->   "%tmp_10_1_5_4 = mul nsw i32 %tempB_1_load_20, %tempA_0_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 598 [1/1] (8.51ns)   --->   "%tmp_10_1_5_5 = mul nsw i32 %tempB_1_load_21, %tempA_1_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 599 [1/1] (8.51ns)   --->   "%tmp_10_1_5_6 = mul nsw i32 %tempB_1_load_22, %tempA_0_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 600 [1/1] (8.51ns)   --->   "%tmp_10_1_5_7 = mul nsw i32 %tempB_1_load_23, %tempA_1_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 601 [1/1] (2.55ns)   --->   "%tmp80 = add i32 %tmp_10_1_5, %tmp_10_1_5_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 602 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp81 = add i32 %tmp_10_1_5_3, %tmp_10_1_5_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 603 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp79 = add i32 %tmp80, %tmp81" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 604 [1/1] (8.51ns)   --->   "%tmp_10_1_6_4 = mul nsw i32 %tempB_0_load_28, %tempA_0_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 605 [1/1] (8.51ns)   --->   "%tmp_10_1_6_5 = mul nsw i32 %tempB_0_load_29, %tempA_1_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 606 [1/1] (8.51ns)   --->   "%tmp_10_1_6_6 = mul nsw i32 %tempB_0_load_30, %tempA_0_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 607 [1/1] (8.51ns)   --->   "%tmp_10_1_6_7 = mul nsw i32 %tempB_0_load_31, %tempA_1_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 608 [1/1] (2.55ns)   --->   "%tmp86 = add i32 %tmp_10_1_6, %tmp_10_1_6_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 609 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp87 = add i32 %tmp_10_1_6_3, %tmp_10_1_6_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 610 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp85 = add i32 %tmp86, %tmp87" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 611 [1/1] (8.51ns)   --->   "%tmp_10_1_7_4 = mul nsw i32 %tempB_1_load_28, %tempA_0_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 612 [1/1] (8.51ns)   --->   "%tmp_10_1_7_5 = mul nsw i32 %tempB_1_load_29, %tempA_1_load_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 613 [1/1] (8.51ns)   --->   "%tmp_10_1_7_6 = mul nsw i32 %tempB_1_load_30, %tempA_0_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 614 [1/1] (8.51ns)   --->   "%tmp_10_1_7_7 = mul nsw i32 %tempB_1_load_31, %tempA_1_load_7" [matrix_mult/matrix_mult.cpp:16]   --->   Core 16 'Mul' <Latency = 0> <II = 1> <Delay = 8.51> <FuncUnit> <Opcode : 'mul'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 615 [1/1] (2.55ns)   --->   "%tmp92 = add i32 %tmp_10_1_7, %tmp_10_1_7_1" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_44 : Operation 616 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp93 = add i32 %tmp_10_1_7_3, %tmp_10_1_7_2" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_44 : Operation 617 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp91 = add i32 %tmp92, %tmp93" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>

 <State 45> : 6.92ns
ST_45 : Operation 618 [1/1] (0.00ns)   --->   "%tempResult_0_addr_2 = getelementptr [32 x i32]* %tempResult_0, i64 0, i64 %newIndex9" [matrix_mult/matrix_mult.cpp:10]
ST_45 : Operation 619 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_4_7, i32* %tempResult_0_addr_2, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_45 : Operation 620 [1/1] (0.00ns)   --->   "%tempResult_1_addr_2 = getelementptr [32 x i32]* %tempResult_1, i64 0, i64 %newIndex9" [matrix_mult/matrix_mult.cpp:10]
ST_45 : Operation 621 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_5_7, i32* %tempResult_1_addr_2, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_45 : Operation 622 [1/1] (0.00ns)   --->   "%tempResult_0_addr_3 = getelementptr [32 x i32]* %tempResult_0, i64 0, i64 %newIndex10" [matrix_mult/matrix_mult.cpp:10]
ST_45 : Operation 623 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_6_7, i32* %tempResult_0_addr_3, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_45 : Operation 624 [1/1] (0.00ns)   --->   "%tempResult_1_addr_3 = getelementptr [32 x i32]* %tempResult_1, i64 0, i64 %newIndex10" [matrix_mult/matrix_mult.cpp:10]
ST_45 : Operation 625 [1/1] (3.25ns)   --->   "store i32 %tmp_11_0_7_7, i32* %tempResult_1_addr_3, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_45 : Operation 626 [1/1] (2.55ns)   --->   "%tmp53 = add i32 %tmp_10_1_0_5, %tmp_10_1_0_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 627 [1/1] (2.55ns)   --->   "%tmp54 = add i32 %tmp_10_1_0_7, %tmp_10_1_0_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 628 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp52 = add i32 %tmp53, %tmp54" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 629 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_0_7 = add nsw i32 %tmp49, %tmp52" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 630 [1/1] (2.55ns)   --->   "%tmp59 = add i32 %tmp_10_1_1_5, %tmp_10_1_1_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 631 [1/1] (2.55ns)   --->   "%tmp60 = add i32 %tmp_10_1_1_7, %tmp_10_1_1_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 632 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp58 = add i32 %tmp59, %tmp60" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 633 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_1_7 = add nsw i32 %tmp55, %tmp58" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 634 [1/1] (2.55ns)   --->   "%tmp65 = add i32 %tmp_10_1_2_5, %tmp_10_1_2_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 635 [1/1] (2.55ns)   --->   "%tmp66 = add i32 %tmp_10_1_2_7, %tmp_10_1_2_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 636 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp64 = add i32 %tmp65, %tmp66" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 637 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_2_7 = add nsw i32 %tmp61, %tmp64" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 638 [1/1] (2.55ns)   --->   "%tmp71 = add i32 %tmp_10_1_3_5, %tmp_10_1_3_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 639 [1/1] (2.55ns)   --->   "%tmp72 = add i32 %tmp_10_1_3_7, %tmp_10_1_3_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 640 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp70 = add i32 %tmp71, %tmp72" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 641 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_3_7 = add nsw i32 %tmp67, %tmp70" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 642 [1/1] (2.55ns)   --->   "%tmp77 = add i32 %tmp_10_1_4_5, %tmp_10_1_4_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 643 [1/1] (2.55ns)   --->   "%tmp78 = add i32 %tmp_10_1_4_7, %tmp_10_1_4_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 644 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp76 = add i32 %tmp77, %tmp78" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 645 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_4_7 = add nsw i32 %tmp73, %tmp76" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 646 [1/1] (2.55ns)   --->   "%tmp83 = add i32 %tmp_10_1_5_5, %tmp_10_1_5_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 647 [1/1] (2.55ns)   --->   "%tmp84 = add i32 %tmp_10_1_5_7, %tmp_10_1_5_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 648 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp82 = add i32 %tmp83, %tmp84" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 649 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_5_7 = add nsw i32 %tmp79, %tmp82" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 650 [1/1] (2.55ns)   --->   "%tmp89 = add i32 %tmp_10_1_6_5, %tmp_10_1_6_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 651 [1/1] (2.55ns)   --->   "%tmp90 = add i32 %tmp_10_1_6_7, %tmp_10_1_6_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 652 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp88 = add i32 %tmp89, %tmp90" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 653 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_6_7 = add nsw i32 %tmp85, %tmp88" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 654 [1/1] (2.55ns)   --->   "%tmp95 = add i32 %tmp_10_1_7_5, %tmp_10_1_7_4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 655 [1/1] (2.55ns)   --->   "%tmp96 = add i32 %tmp_10_1_7_7, %tmp_10_1_7_6" [matrix_mult/matrix_mult.cpp:16]   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_45 : Operation 656 [1/1] (0.00ns) (grouped into TernaryAdder)   --->   "%tmp94 = add i32 %tmp95, %tmp96" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>
ST_45 : Operation 657 [1/1] (4.37ns) (root node of TernaryAdder)   --->   "%tmp_11_1_7_7 = add nsw i32 %tmp91, %tmp94" [matrix_mult/matrix_mult.cpp:16]   --->   Core 80 'TAddSub' <Latency = 0> <II = 1> <Delay = 2.18> <IPBlock> <Opcode : 'add' 'sub'> <InPorts = 3> <OutPorts = 1> <Sync> <CReg>

 <State 46> : 3.25ns
ST_46 : Operation 658 [1/1] (0.00ns)   --->   "%tempResult_0_addr_4 = getelementptr [32 x i32]* %tempResult_0, i64 0, i64 %newIndex12" [matrix_mult/matrix_mult.cpp:10]
ST_46 : Operation 659 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_0_7, i32* %tempResult_0_addr_4, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_46 : Operation 660 [1/1] (0.00ns)   --->   "%tempResult_1_addr_4 = getelementptr [32 x i32]* %tempResult_1, i64 0, i64 %newIndex12" [matrix_mult/matrix_mult.cpp:10]
ST_46 : Operation 661 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_1_7, i32* %tempResult_1_addr_4, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_46 : Operation 662 [1/1] (0.00ns)   --->   "%tempResult_0_addr_5 = getelementptr [32 x i32]* %tempResult_0, i64 0, i64 %newIndex14" [matrix_mult/matrix_mult.cpp:10]
ST_46 : Operation 663 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_2_7, i32* %tempResult_0_addr_5, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_46 : Operation 664 [1/1] (0.00ns)   --->   "%tempResult_1_addr_5 = getelementptr [32 x i32]* %tempResult_1, i64 0, i64 %newIndex14" [matrix_mult/matrix_mult.cpp:10]
ST_46 : Operation 665 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_3_7, i32* %tempResult_1_addr_5, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 47> : 3.25ns
ST_47 : Operation 666 [1/1] (0.00ns)   --->   "%empty_8 = call i32 (...)* @_ssdm_op_SpecLoopTripCount(i64 4, i64 4, i64 4) nounwind"
ST_47 : Operation 667 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecLoopName([24 x i8]* @p_str5) nounwind" [matrix_mult/matrix_mult.cpp:12]
ST_47 : Operation 668 [1/1] (0.00ns)   --->   "%tmp_2 = call i32 (...)* @_ssdm_op_SpecRegionBegin([24 x i8]* @p_str5) nounwind" [matrix_mult/matrix_mult.cpp:12]
ST_47 : Operation 669 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecPipeline(i32 -1, i32 1, i32 1, i32 0, [1 x i8]* @p_str1) nounwind" [matrix_mult/matrix_mult.cpp:12]
ST_47 : Operation 670 [1/1] (0.00ns)   --->   "%empty_9 = call i32 (...)* @_ssdm_op_SpecRegionEnd([24 x i8]* @p_str5, i32 %tmp_2) nounwind" [matrix_mult/matrix_mult.cpp:16]
ST_47 : Operation 671 [1/1] (0.00ns)   --->   "%tempResult_0_addr_6 = getelementptr [32 x i32]* %tempResult_0, i64 0, i64 %newIndex16" [matrix_mult/matrix_mult.cpp:10]
ST_47 : Operation 672 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_4_7, i32* %tempResult_0_addr_6, align 16" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_47 : Operation 673 [1/1] (0.00ns)   --->   "%tempResult_1_addr_6 = getelementptr [32 x i32]* %tempResult_1, i64 0, i64 %newIndex16" [matrix_mult/matrix_mult.cpp:10]
ST_47 : Operation 674 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_5_7, i32* %tempResult_1_addr_6, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_47 : Operation 675 [1/1] (0.00ns)   --->   "%tempResult_0_addr_7 = getelementptr [32 x i32]* %tempResult_0, i64 0, i64 %newIndex18" [matrix_mult/matrix_mult.cpp:10]
ST_47 : Operation 676 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_6_7, i32* %tempResult_0_addr_7, align 8" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_47 : Operation 677 [1/1] (0.00ns)   --->   "%tempResult_1_addr_7 = getelementptr [32 x i32]* %tempResult_1, i64 0, i64 %newIndex18" [matrix_mult/matrix_mult.cpp:10]
ST_47 : Operation 678 [1/1] (3.25ns)   --->   "store i32 %tmp_11_1_7_7, i32* %tempResult_1_addr_7, align 4" [matrix_mult/matrix_mult.cpp:16]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_47 : Operation 679 [1/1] (0.00ns)   --->   "br label %burst.rd.end6.0" [matrix_mult/matrix_mult.cpp:10]

 <State 48> : 8.75ns
ST_48 : Operation 680 [1/1] (8.75ns)   --->   "%gmem_addr_wr_req = call i1 @_ssdm_op_WriteReq.m_axi.i32P(i32* %gmem_addr, i32 64)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>
ST_48 : Operation 681 [1/1] (1.76ns)   --->   "br label %burst.wr.header"

 <State 49> : 3.25ns
ST_49 : Operation 682 [1/1] (0.00ns)   --->   "%indvar1 = phi i7 [ %indvar_next2, %burst.wr.body_ifconv ], [ 0, %burst.wr.header.preheader ]"
ST_49 : Operation 683 [1/1] (1.48ns)   --->   "%exitcond5 = icmp eq i7 %indvar1, -64"   --->   Core 25 'Cmp' <Latency = 0> <II = 1> <Delay = 1.48> <FuncUnit> <Opcode : 'icmp'> <InPorts = 2> <OutPorts = 1>
ST_49 : Operation 684 [1/1] (1.87ns)   --->   "%indvar_next2 = add i7 %indvar1, 1"   --->   Core 14 'AddSub' <Latency = 0> <II = 1> <Delay = 1.87> <FuncUnit> <Opcode : 'add' 'sub'> <InPorts = 2> <OutPorts = 1>
ST_49 : Operation 685 [1/1] (0.00ns)   --->   "br i1 %exitcond5, label %memcpy.tail, label %burst.wr.body_ifconv"
ST_49 : Operation 686 [1/1] (0.00ns)   --->   "%tmp_97 = trunc i7 %indvar1 to i1"
ST_49 : Operation 687 [1/1] (0.00ns)   --->   "%newIndex19 = call i6 @_ssdm_op_PartSelect.i6.i7.i32.i32(i7 %indvar1, i32 1, i32 6)"
ST_49 : Operation 688 [1/1] (0.00ns)   --->   "%newIndex20 = zext i6 %newIndex19 to i64"
ST_49 : Operation 689 [1/1] (0.00ns)   --->   "%tempResult_0_addr_8 = getelementptr [32 x i32]* %tempResult_0, i64 0, i64 %newIndex20" [matrix_mult/matrix_mult.cpp:18]
ST_49 : Operation 690 [1/1] (0.00ns)   --->   "%tempResult_1_addr_8 = getelementptr [32 x i32]* %tempResult_1, i64 0, i64 %newIndex20" [matrix_mult/matrix_mult.cpp:18]
ST_49 : Operation 691 [2/2] (3.25ns)   --->   "%tempResult_1_load = load i32* %tempResult_1_addr_8, align 4" [matrix_mult/matrix_mult.cpp:18]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_49 : Operation 692 [2/2] (3.25ns)   --->   "%tempResult_0_load = load i32* %tempResult_0_addr_8, align 4" [matrix_mult/matrix_mult.cpp:18]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>

 <State 50> : 4.62ns
ST_50 : Operation 693 [1/2] (3.25ns)   --->   "%tempResult_1_load = load i32* %tempResult_1_addr_8, align 4" [matrix_mult/matrix_mult.cpp:18]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_50 : Operation 694 [1/2] (3.25ns)   --->   "%tempResult_0_load = load i32* %tempResult_0_addr_8, align 4" [matrix_mult/matrix_mult.cpp:18]   --->   Core 37 'RAM' <Latency = 1> <II = 1> <Delay = 3.25> <Storage> <Opcode : 'load' 'store'> <Ports = 2> <Width = 32> <Depth = 32> <RAM>
ST_50 : Operation 695 [1/1] (1.37ns)   --->   "%tempResult_load_phi = select i1 %tmp_97, i32 %tempResult_1_load, i32 %tempResult_0_load" [matrix_mult/matrix_mult.cpp:18]   --->   Core 26 'Sel' <Latency = 0> <II = 1> <Delay = 1.37> <FuncUnit> <Opcode : 'select'> <InPorts = 3> <OutPorts = 1>

 <State 51> : 8.75ns
ST_51 : Operation 696 [1/1] (0.00ns)   --->   "%empty_10 = call i32 (...)* @_ssdm_op_SpecLoopTripCount(i64 64, i64 64, i64 64) nounwind"
ST_51 : Operation 697 [1/1] (0.00ns)   --->   "%burstwrite_rbegin = call i32 (...)* @_ssdm_op_SpecRegionBegin([18 x i8]* @burstwrite_OC_region) nounwind"
ST_51 : Operation 698 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecPipeline(i32 1, i32 1, i32 1, i32 0, [1 x i8]* @p_str10)"
ST_51 : Operation 699 [1/1] (0.00ns)   --->   "call void (...)* @_ssdm_op_SpecLoopName([29 x i8]* @memcpy_OC_result_OC_s)"
ST_51 : Operation 700 [1/1] (8.75ns)   --->   "call void @_ssdm_op_Write.m_axi.i32P(i32* %gmem_addr, i32 %tempResult_load_phi, i4 -1)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>
ST_51 : Operation 701 [1/1] (0.00ns)   --->   "%burstwrite_rend = call i32 (...)* @_ssdm_op_SpecRegionEnd([18 x i8]* @burstwrite_OC_region, i32 %burstwrite_rbegin) nounwind"
ST_51 : Operation 702 [1/1] (0.00ns)   --->   "br label %burst.wr.header"

 <State 52> : 8.75ns
ST_52 : Operation 703 [5/5] (8.75ns)   --->   "%gmem_addr_wr_resp = call i1 @_ssdm_op_WriteResp.m_axi.i32P(i32* %gmem_addr)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 53> : 8.75ns
ST_53 : Operation 704 [4/5] (8.75ns)   --->   "%gmem_addr_wr_resp = call i1 @_ssdm_op_WriteResp.m_axi.i32P(i32* %gmem_addr)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 54> : 8.75ns
ST_54 : Operation 705 [3/5] (8.75ns)   --->   "%gmem_addr_wr_resp = call i1 @_ssdm_op_WriteResp.m_axi.i32P(i32* %gmem_addr)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 55> : 8.75ns
ST_55 : Operation 706 [2/5] (8.75ns)   --->   "%gmem_addr_wr_resp = call i1 @_ssdm_op_WriteResp.m_axi.i32P(i32* %gmem_addr)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>

 <State 56> : 8.75ns
ST_56 : Operation 707 [1/5] (8.75ns)   --->   "%gmem_addr_wr_resp = call i1 @_ssdm_op_WriteResp.m_axi.i32P(i32* %gmem_addr)" [matrix_mult/matrix_mult.cpp:18]   --->   Core 9 'm_axi' <Latency = 6> <II = 1> <Delay = 1.00> <Adapter> <Opcode : 'read' 'write' 'readreq' 'writereq' 'writeresp'>
ST_56 : Operation 708 [1/1] (0.00ns)   --->   "ret void" [matrix_mult/matrix_mult.cpp:19]


============================================================
+ Verbose Summary: Timing violations
============================================================
Target clock period: 10ns, clock uncertainty: 1.25ns.

 <State 1>: 1ns
The critical path consists of the following:
	s_axi read on port 'result' [5]  (1 ns)

 <State 2>: 8.75ns
The critical path consists of the following:
	'getelementptr' operation ('gmem_addr_2') [16]  (0 ns)
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [30]  (8.75 ns)

 <State 3>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [30]  (8.75 ns)

 <State 4>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [30]  (8.75 ns)

 <State 5>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [30]  (8.75 ns)

 <State 6>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [30]  (8.75 ns)

 <State 7>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [30]  (8.75 ns)

 <State 8>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [30]  (8.75 ns)

 <State 9>: 2.42ns
The critical path consists of the following:
	'icmp' operation ('exitcond3') [34]  (1.49 ns)
	blocking operation (0.931 ns on control path)

 <State 10>: 8.75ns
The critical path consists of the following:
	bus read on port 'gmem' (matrix_mult/matrix_mult.cpp:6) [42]  (8.75 ns)

 <State 11>: 3.25ns
The critical path consists of the following:
	'getelementptr' operation ('tempA_1_addr', matrix_mult/matrix_mult.cpp:6) [47]  (0 ns)
	'store' operation (matrix_mult/matrix_mult.cpp:6) of variable 'gmem_addr_2_read', matrix_mult/matrix_mult.cpp:6 on array 'tempA[1]', matrix_mult/matrix_mult.cpp:5 [53]  (3.25 ns)

 <State 12>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [59]  (8.75 ns)

 <State 13>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [59]  (8.75 ns)

 <State 14>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [59]  (8.75 ns)

 <State 15>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [59]  (8.75 ns)

 <State 16>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [59]  (8.75 ns)

 <State 17>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [59]  (8.75 ns)

 <State 18>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [59]  (8.75 ns)

 <State 19>: 2.42ns
The critical path consists of the following:
	'icmp' operation ('exitcond4') [63]  (1.49 ns)
	blocking operation (0.931 ns on control path)

 <State 20>: 8.75ns
The critical path consists of the following:
	bus read on port 'gmem' (matrix_mult/matrix_mult.cpp:7) [71]  (8.75 ns)

 <State 21>: 3.25ns
The critical path consists of the following:
	'getelementptr' operation ('tempB_0_addr', matrix_mult/matrix_mult.cpp:7) [75]  (0 ns)
	'store' operation (matrix_mult/matrix_mult.cpp:7) of variable 'gmem_addr_1_read', matrix_mult/matrix_mult.cpp:7 on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [79]  (3.25 ns)

 <State 22>: 3.25ns
The critical path consists of the following:
	'getelementptr' operation ('tempB_0_addr_1') [88]  (0 ns)
	'load' operation ('tempB_0_load', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [89]  (3.25 ns)

 <State 23>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [89]  (3.25 ns)

 <State 24>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_2', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [93]  (3.25 ns)

 <State 25>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_4', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [97]  (3.25 ns)

 <State 26>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_6', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [101]  (3.25 ns)

 <State 27>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_8', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [121]  (3.25 ns)

 <State 28>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_10', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [125]  (3.25 ns)

 <State 29>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_12', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [129]  (3.25 ns)

 <State 30>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_14', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [133]  (3.25 ns)

 <State 31>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_16', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [153]  (3.25 ns)

 <State 32>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_18', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [157]  (3.25 ns)

 <State 33>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_20', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [161]  (3.25 ns)

 <State 34>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_22', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [165]  (3.25 ns)

 <State 35>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_24', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [185]  (3.25 ns)

 <State 36>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_26', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [189]  (3.25 ns)

 <State 37>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_28', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [193]  (3.25 ns)

 <State 38>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempB_0_load_30', matrix_mult/matrix_mult.cpp:16) on array 'tempB[0]', matrix_mult/matrix_mult.cpp:5 [197]  (3.25 ns)

 <State 39>: 3.25ns
The critical path consists of the following:
	'phi' operation ('i', matrix_mult/matrix_mult.cpp:10) with incoming values : ('i_1_1', matrix_mult/matrix_mult.cpp:10) [218]  (0 ns)
	'or' operation ('newIndex6', matrix_mult/matrix_mult.cpp:10) [236]  (0 ns)
	'getelementptr' operation ('tempA_1_addr_2', matrix_mult/matrix_mult.cpp:10) [241]  (0 ns)
	'load' operation ('tempA_1_load_1', matrix_mult/matrix_mult.cpp:16) on array 'tempA[1]', matrix_mult/matrix_mult.cpp:5 [242]  (3.25 ns)

 <State 40>: 3.25ns
The critical path consists of the following:
	'load' operation ('tempA_0_load', matrix_mult/matrix_mult.cpp:16) on array 'tempA[0]', matrix_mult/matrix_mult.cpp:5 [231]  (3.25 ns)

 <State 41>: 8.51ns
The critical path consists of the following:
	'mul' operation ('tmp_s', matrix_mult/matrix_mult.cpp:16) [232]  (8.51 ns)

 <State 42>: 8.51ns
The critical path consists of the following:
	'mul' operation ('tmp_10_0_0_4', matrix_mult/matrix_mult.cpp:16) [248]  (8.51 ns)

 <State 43>: 8.51ns
The critical path consists of the following:
	'mul' operation ('tmp_10_1', matrix_mult/matrix_mult.cpp:16) [393]  (8.51 ns)

 <State 44>: 8.51ns
The critical path consists of the following:
	'mul' operation ('tmp_10_1_0_4', matrix_mult/matrix_mult.cpp:16) [409]  (8.51 ns)

 <State 45>: 6.92ns
The critical path consists of the following:
	'add' operation ('tmp53', matrix_mult/matrix_mult.cpp:16) [424]  (2.55 ns)
	'add' operation ('tmp52', matrix_mult/matrix_mult.cpp:16) [426]  (0 ns)
	'add' operation ('tmp_11_1_0_7', matrix_mult/matrix_mult.cpp:16) [427]  (4.37 ns)

 <State 46>: 3.25ns
The critical path consists of the following:
	'getelementptr' operation ('tempResult_0_addr_4', matrix_mult/matrix_mult.cpp:10) [390]  (0 ns)
	'store' operation (matrix_mult/matrix_mult.cpp:16) of variable 'tmp_11_1_0_7', matrix_mult/matrix_mult.cpp:16 on array 'tempResult[0]', matrix_mult/matrix_mult.cpp:5 [428]  (3.25 ns)

 <State 47>: 3.25ns
The critical path consists of the following:
	'getelementptr' operation ('tempResult_0_addr_6', matrix_mult/matrix_mult.cpp:10) [480]  (0 ns)
	'store' operation (matrix_mult/matrix_mult.cpp:16) of variable 'tmp_11_1_4_7', matrix_mult/matrix_mult.cpp:16 on array 'tempResult[0]', matrix_mult/matrix_mult.cpp:5 [496]  (3.25 ns)

 <State 48>: 8.75ns
The critical path consists of the following:
	bus request on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [551]  (8.75 ns)

 <State 49>: 3.25ns
The critical path consists of the following:
	'phi' operation ('indvar1') with incoming values : ('indvar_next2') [554]  (0 ns)
	'getelementptr' operation ('tempResult_1_addr_8', matrix_mult/matrix_mult.cpp:18) [567]  (0 ns)
	'load' operation ('tempResult_1_load', matrix_mult/matrix_mult.cpp:18) on array 'tempResult[1]', matrix_mult/matrix_mult.cpp:5 [568]  (3.25 ns)

 <State 50>: 4.62ns
The critical path consists of the following:
	'load' operation ('tempResult_1_load', matrix_mult/matrix_mult.cpp:18) on array 'tempResult[1]', matrix_mult/matrix_mult.cpp:5 [568]  (3.25 ns)
	'select' operation ('tempResult_load_phi', matrix_mult/matrix_mult.cpp:18) [570]  (1.37 ns)

 <State 51>: 8.75ns
The critical path consists of the following:
	bus write on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [571]  (8.75 ns)

 <State 52>: 8.75ns
The critical path consists of the following:
	bus access on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [575]  (8.75 ns)

 <State 53>: 8.75ns
The critical path consists of the following:
	bus access on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [575]  (8.75 ns)

 <State 54>: 8.75ns
The critical path consists of the following:
	bus access on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [575]  (8.75 ns)

 <State 55>: 8.75ns
The critical path consists of the following:
	bus access on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [575]  (8.75 ns)

 <State 56>: 8.75ns
The critical path consists of the following:
	bus access on port 'gmem' (matrix_mult/matrix_mult.cpp:18) [575]  (8.75 ns)


============================================================
+ Verbose Summary: Binding
============================================================
N/A
* FSMD analyzer results:
  - Output states:
  - Input state:
  - Chain level:
	State 1
	State 2
	State 3
	State 4
	State 5
	State 6
	State 7
	State 8
	State 9
	State 10
	State 11
	State 12
	State 13
	State 14
	State 15
	State 16
	State 17
	State 18
	State 19
	State 20
	State 21
	State 22
	State 23
	State 24
	State 25
	State 26
	State 27
	State 28
	State 29
	State 30
	State 31
	State 32
	State 33
	State 34
	State 35
	State 36
	State 37
	State 38
	State 39
	State 40
	State 41
	State 42
	State 43
	State 44
	State 45
	State 46
	State 47
	State 48
	State 49
	State 50
	State 51
	State 52
	State 53
	State 54
	State 55
	State 56


============================================================
+ Verbose Summary: Datapath Resource usage 
============================================================
N/A