Instruction Set Architecture
Programming interface for SRAM-based CIM accelerators
The CIMFlow ISA defines the programming interface for SRAM-based Compute-in-Memory accelerators. It maps neural network operations to hardware through a hierarchical abstraction model, enabling efficient compilation and execution of deep learning workloads.
Hardware Hierarchy
The ISA models CIM hardware at three abstraction levels.
Each core executes its own instruction stream and coordinates with others through explicit synchronization primitives.
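The coordination pattern described above can be pictured with a small host-side analogy. This is only a sketch in Python and not CIMFlow code: two threads stand in for two cores, a queue stands in for a SEND/RECV link, and `threading.Barrier` stands in for BARRIER. The function name `run_two_cores`, the `None` end-of-stream marker, and the blocking behaviour of the queue are assumptions made for illustration, not statements about the hardware.

```python
import queue
import threading


def run_two_cores(values):
    """Analogy only: two threads play the role of two cores."""
    channel = queue.Queue()          # stands in for a SEND/RECV link (assumption)
    barrier = threading.Barrier(2)   # stands in for BARRIER across both "cores"
    results = {}

    def core0():
        # Producer "core": sends every value, then an end-of-stream marker.
        for v in values:
            channel.put(v)           # analogous to SEND
        channel.put(None)            # hypothetical end marker, not part of the ISA
        barrier.wait()               # both "cores" meet here before finishing

    def core1():
        # Consumer "core": receives values until the end marker arrives.
        total = 0
        while True:
            v = channel.get()        # analogous to RECV (blocking in this sketch)
            if v is None:
                break
            total += v
        results["sum"] = total
        barrier.wait()

    threads = [threading.Thread(target=f) for f in (core0, core1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results["sum"]
```

Each thread runs its own sequence of steps and only interacts with the other through the explicit channel and barrier, which is the property the ISA text describes for cores.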
Instruction Categories
Instructions follow RISC design principles with a fixed 32-bit encoding. They fall into three categories:
Compute
Matrix-vector multiplication, vector element-wise operations, and scalar arithmetic.
e.g. CIM_MVM · VEC_OP · REDUCE · SC_RR · SC_RI
Communication
Memory load/store, data movement, and inter-core send/receive.
e.g. SC_LD · SC_ST · MEM_CPY · SEND · RECV
Control Flow
Conditional branches, jumps, barriers, and synchronization tags.
e.g. BRANCH · JMP · WAIT · BARRIER · TAG
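As a quick reference, the three categories and their example mnemonics can be written as a lookup table. The sketch below is a Python illustration, not part of any CIMFlow tooling. It contains only the mnemonics listed above as examples, so it should not be read as the complete instruction set. The names `Category`, `EXAMPLE_MNEMONICS`, and `categorize` are hypothetical.

```python
from enum import Enum


class Category(Enum):
    COMPUTE = "compute"
    COMMUNICATION = "communication"
    CONTROL_FLOW = "control flow"


# Only the example mnemonics named in the text; the real ISA may define more.
EXAMPLE_MNEMONICS = {
    Category.COMPUTE: ("CIM_MVM", "VEC_OP", "REDUCE", "SC_RR", "SC_RI"),
    Category.COMMUNICATION: ("SC_LD", "SC_ST", "MEM_CPY", "SEND", "RECV"),
    Category.CONTROL_FLOW: ("BRANCH", "JMP", "WAIT", "BARRIER", "TAG"),
}


def categorize(mnemonic):
    """Return the Category of a listed mnemonic, or None if it is not listed."""
    name = mnemonic.upper()
    for category, names in EXAMPLE_MNEMONICS.items():
        if name in names:
            return category
    return None
```

For instance, `categorize("SEND")` returns `Category.COMMUNICATION`, while a mnemonic that is not among the listed examples returns `None`.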
Register Files
Each core maintains two register files—see Register Files for the complete map:
- General Register File (GRF): 32 registers (r0–r31) for addresses, counters, and arithmetic
- Special Register File (SRF): 32 registers for CIM configuration (IDs 0–15) and vector parameters (IDs 16–31)
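The register ranges above can be expressed as a pair of small validation helpers. This is a minimal Python sketch that encodes only the ranges stated in the list. The function names `grf_index` and `srf_group` and the group labels `"cim_config"` and `"vector_params"` are hypothetical and do not come from the CIMFlow toolchain.

```python
GRF_SIZE = 32  # r0..r31, as stated above
SRF_SIZE = 32  # SRF IDs 0..31, as stated above


def grf_index(name):
    """Parse a general register name such as 'r7' into its index (0..31)."""
    if not (name.startswith("r") and name[1:].isdigit()):
        raise ValueError(f"not a GRF register name: {name!r}")
    index = int(name[1:])
    if not 0 <= index < GRF_SIZE:
        raise ValueError(f"GRF index out of range: {index}")
    return index


def srf_group(reg_id):
    """Classify an SRF ID by the two ranges given in the text."""
    if 0 <= reg_id <= 15:
        return "cim_config"      # hypothetical label for IDs 0-15
    if 16 <= reg_id < SRF_SIZE:
        return "vector_params"   # hypothetical label for IDs 16-31
    raise ValueError(f"SRF ID out of range: {reg_id}")
```

An assembler or simulator would typically run checks of this kind before encoding an operand, so that out-of-range register references fail early.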
Design Principles
CIM-Native Operations
CIM_MVM directly drives the in-memory compute array for matrix-vector multiplication.
Explicit Parallelism
SEND/RECV for inter-core data transfer, WAIT/BARRIER/TAG for synchronization.
Uniform Encoding
32-bit instructions with a 6-bit opcode, across five encoding types (R, I-A, I-B, I-C, J).
Configurable Precision
SRF controls input, output, and weight bit widths for mixed-precision compute.
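To make the Uniform Encoding principle concrete, the sketch below splits a 32-bit instruction word into a 6-bit opcode and the remaining 26 bits. The position of the opcode is an assumption: this page gives only its width, so the code places it in the most significant bits purely for illustration. The layout of the remaining bits depends on the encoding type (R, I-A, I-B, I-C, J) and is deliberately not modelled. The names `split_word` and `join_word` are hypothetical.

```python
WORD_BITS = 32
OPCODE_BITS = 6
# ASSUMPTION: the opcode sits in the top 6 bits (31..26); the page gives
# only the opcode width, not its position within the word.
OPCODE_SHIFT = WORD_BITS - OPCODE_BITS
PAYLOAD_MASK = (1 << OPCODE_SHIFT) - 1


def split_word(word):
    """Split a 32-bit word into (opcode, payload) under the assumption above."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("instruction word must fit in 32 bits")
    opcode = word >> OPCODE_SHIFT
    payload = word & PAYLOAD_MASK   # type-specific fields, not decoded here
    return opcode, payload


def join_word(opcode, payload):
    """Inverse of split_word: pack an opcode and a 26-bit payload into a word."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode must fit in 6 bits")
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload must fit in 26 bits")
    return (opcode << OPCODE_SHIFT) | payload
```

A 6-bit opcode allows at most 64 distinct opcode values. Because every instruction has the same width and the same opcode field, a decoder can read the opcode first and choose the field layout afterwards.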