

# SystemVerilogCSP: Modeling Digital Asynchronous Circuits Using SystemVerilog Interfaces

Arash Saifhashemi<sup>1</sup> Peter A. Beerel<sup>1,2</sup>

<sup>1</sup>Ming Hsieh Dept. of Electrical Engineering, University of Southern California <sup>2</sup>Fulcrum Microsystems, Calabasas CA, 91302.

CPA 2011 University of Limerick, Ireland



#### **Outline**



- Introduction
  - Why asynchronous circuit design?
  - Hardware description languages
- SystemVerilog Abstract Communication: Basic Features
  - Channels: Send, Receive
  - Channel status and mixed-level simulation
- SystemVerilog Abstract Communication: Extended Features
  - Peek and Probe
  - Split and synchronized communication
  - One-to-many and one-to-any channels
- Results and Conclusion

# Why Asynchronous Circuit Design?



- Systems on chip: chips are becoming distributed systems
  - Communication-dominated
  - No globally synchronous clock
- Asynchronous alternative
  - Local handshaking: CSP-like communication
- Benefits
  - Higher speed
    - Shorter globally critical paths
  - Lower power consumption
    - Remove power-hungry global clock
    - Modules active only when necessary
  - Robustness to variations
    - Process, voltage, and temperature

60 IP blocks, 350 RAMs: **Communication Bottleneck**





## Asynchronous Circuit Design - Today



#### Applications

- Ethernet Switches (Fulcrum Microsystems)
- Ultra high-speed FPGAs (Achronix)
- Multi-core network on Chip
- Ultra low-power chip design

#### Basic challenges

- Lack of CAD tools to automate designs
- Proteus design flow (USC)
  - Leverages available synchronous CAD tools
  - Starts from a high-level specification written in SystemVerilogCSP

Figure: Proteus design flow. Proprietary CSP or SystemVerilog input, then standard synthesis (commercially available CAD tools), asynchronization and optimization, standard physical design, and final layout.

# Hardware Description Languages



- Desirable features of an HDL
  - Concurrency, e.g. `A=B || (C=D; E=F)`
  - Timing, e.g. `A=B after 5ns`
  - Support for various levels of abstraction
  - Support by commercial CAD tools
  - Support for both synchronous & asynchronous
- Communication abstraction
  - Ease of design
  - Design usability: protocols evolve and change
  - Architecture evaluation before implementation
  - Ease of adoption by synchronous designers
- CSP as a basis for a hardware description language
  - Suitable for modeling abstract communication
  - Lacks some desirable features
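
As a rough illustration of the concurrency and timing bullets above, the two examples could be written in SystemVerilog along the following lines. The module name `hdl_features_sketch` and the values assigned to `B`, `D`, and `F` are assumptions made only for this sketch.

```systemverilog
module hdl_features_sketch;
  timeunit 1ns;
  timeprecision 1ps;

  logic [7:0] A, B, C, D, E, F;

  initial begin
    B = 8'd1; D = 8'd2; F = 8'd3;   // arbitrary values for the sketch

    // Concurrency: A=B || (C=D; E=F)
    fork
      A = B;                        // first parallel branch
      begin                         // second branch: C=D, then E=F
        C = D;
        E = F;
      end
    join                            // continue once both branches finish

    // Timing: A=B after 5ns
    B = 8'd7;
    #5ns A = B;                     // wait 5 ns, then assign
    $display("t=%0t A=%0d C=%0d E=%0d", $time, A, C, E);
  end
endmodule
```

The `fork`/`join` pair plays the role of the parallel operator, and `begin`/`end` the role of sequential composition.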


#### **Previous Work**



#### New languages inspired by CSP

- Limited CAD tool support: LARD [Edwards et al.], Tangram [Berkel et al.], CHP [Martin]

#### Software languages

- No inherent support for timing, limited CAD tool support: JCSP [Welch et al.]

#### VHDL

- Fine-grained concurrency is cumbersome [Frankild et al., Renaudin et al., Myers et al.]

#### VerilogCSP

- Verilog Programming Language Interface (PLI): very slow; cannot handle multichannel modules [Saifhashemi et al.]
- Verilog macros are cumbersome and do not support extensions

#### SystemVerilog (superset of Verilog)

- Initial implementations are promising but do not address extensions [Tiempo]

## **CSP Communication Channels**



Abstract communication between processes



$$SENDER = (mid!v \rightarrow SENDER)$$

$$RECEIVER = (mid?x \rightarrow RECEIVER)$$

- No notion of hardware implementation details
- Semantics based on events on channels between independent processes [Hoare '04]


# Abstract SystemVerilog Channels



- Our approach
  - Use SystemVerilog interface to abstract channel wires as well as Send/Receive tasks

Abstract communication: `s:SENDER` and `r:RECEIVER` are connected by channel `mid`, a SystemVerilog interface.

```
module Sender (interface R);
  parameter WIDTH = 8;
  logic [WIDTH-1:0] v;
  always
  begin
    v = {$random()} % (2**(WIDTH-1));  // random non-negative value
    R.Send(v);                         // blocks until the handshake completes
    #10ns;
  end
endmodule

module Receiver (interface L);
  parameter WIDTH = 8;
  logic [WIDTH-1:0] x;
  always
  begin
    L.Receive(x);                      // blocks until data arrives
    #15ns;
  end
endmodule
```

#### Behind The Scenes: Channel Interface



- Channel details encapsulated within an "Interface"
- Implementation details (below) hidden from user
  - Greatly simplifies debugging and evaluation of the design

# Interface Send and Receive Tasks



#### Arbitrary handshaking protocols; the most commonly used ones are supported

```
task Send (input logic [WIDTH-1:0] d);
  begin
    data = d;
    req = 1;
    status = s_pend;
    wait (ack == 1);
    req = 0;
    wait (ack == 0);
    status = idle;
  end
endtask

task Receive (output logic [WIDTH-1:0] d);
  begin
    status = r_pend;
    wait (req == 1);
    d = data;
    ack = 1;
    wait (req == 0);
    ack = 0;
    status = idle;
  end
endtask
```

- Send/Receive tasks are analogous to CSP's `!` (output) and `?` (input)
- Semantics are based on synchronization of concurrent processes using SystemVerilog's notion of update and evaluation events
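
To show how these tasks might sit inside an interface, here is a minimal sketch of a four-phase channel with a small testbench. The `Send` and `Receive` bodies follow the slide; the interface name `Channel`, the type name `ChannelStatus`, the initial signal values, and the testbench `channel_sketch_tb` are assumptions for illustration and are not taken from the SystemVerilogCSP package itself.

```systemverilog
interface Channel;
  parameter WIDTH = 8;

  // Enumerated status, as used by the Send/Receive tasks
  typedef enum {idle, r_pend, s_pend} ChannelStatus;
  ChannelStatus status = idle;

  // Handshake wires (initial values are an assumption of this sketch)
  logic req = 0, ack = 0;
  logic [WIDTH-1:0] data;

  task Send (input logic [WIDTH-1:0] d);
    begin
      data = d;
      req = 1;
      status = s_pend;
      wait (ack == 1);
      req = 0;
      wait (ack == 0);
      status = idle;
    end
  endtask

  task Receive (output logic [WIDTH-1:0] d);
    begin
      status = r_pend;
      wait (req == 1);
      d = data;
      ack = 1;
      wait (req == 0);
      ack = 0;
      status = idle;
    end
  endtask
endinterface

// Hypothetical testbench: one sender and one receiver on channel mid
module channel_sketch_tb;
  Channel #(.WIDTH(8)) mid ();
  logic [7:0] x;

  initial begin
    fork
      mid.Send(8'h5A);     // sender side
      mid.Receive(x);      // receiver side
    join
    $display("received %h", x);
  end
endmodule
```

Both tasks block on `wait`, so neither process proceeds until its partner has reached the matching point of the handshake.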

# **Viewing Channel Status**



- Enumerated types make viewing channel status inherent to all standard SystemVerilog simulators
- The designer can monitor if and what processes are engaged in the communication over time
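
Besides waveform viewers, the enumerated status can also be logged as text. The sketch below prints each status change by name; the module name `status_monitor_sketch`, the type name `ChannelStatus`, and the stimulus sequence are assumptions used only to make the example self-contained.

```systemverilog
module status_monitor_sketch;
  typedef enum {idle, r_pend, s_pend} ChannelStatus;
  ChannelStatus status = idle;

  // Report every status change using the enumeration label
  always @(status)
    $display("%0t: channel status = %s", $time, status.name());

  // Stand-in stimulus; in a real model Send/Receive would drive status
  initial begin
    #1 status = s_pend;
    #1 status = idle;
    #1 status = r_pend;
    #1 status = idle;
  end
endmodule
```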



## Supports Mixed Levels of Abstraction



**Completed** blocks can be simulated with others still at behavioral level



```
// Behavioral (CSP-level) description of the buffer
module mp_fb_csp (interface L, interface R);
  logic data;
  always
  begin
    L.Receive(data);
    R.Send(data);
  end
endmodule

// Gate-level description of the buffer (after synthesis)
module mp_fb_gate (interface L, interface R);
  celement ce  (L.req, pd_bar, c);
  not      inv (pd_bar, pd);
  cap_pass cp  (c, L.ack, R.ack, pd, L.data, R.data);
endmodule
```

# Supports Design Verification



- Co-simulation: Implemented circuit vs. original circuit
- It is important to use the same Testbench
  - Sometimes very complicated
  - Verifies correct implementation
- No need for Shims [Saifhashemi'05]




## Peek and Probe



- Peek
  - Sample data without committing to communication
- Probe
  - Is the channel idle?
  - Usually used for arbitration


```
task Peek (output logic[WIDTH-1:0] d);
  wait (status == s_pend );
  d = data;
endtask
```



```
wait(ch0.status!=idle && ch1.status!= idle);
winner = Arbitrate (ch0.status, ch1.status);

if(winner == 0)
   ch0.Receive(d);
if(winner ==1)
   ch1.Receive(d);
```
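
The slide does not show the body of `Arbitrate`. One possible version is sketched below, under the assumptions that a channel counts as requesting when its status is `s_pend`, that ties are broken randomly, and that channel 0 is returned when neither channel is pending. The package name `arbiter_sketch_pkg`, the type name `ChannelStatus`, and the testbench are hypothetical.

```systemverilog
package arbiter_sketch_pkg;
  typedef enum {idle, r_pend, s_pend} ChannelStatus;

  // Returns the index (0 or 1) of the channel to serve next.
  function automatic int Arbitrate (ChannelStatus s0, ChannelStatus s1);
    if (s0 == s_pend && s1 == s_pend)
      return $urandom_range(1);   // both senders pending: pick 0 or 1
    else if (s1 == s_pend)
      return 1;                   // only channel 1 has a pending sender
    else
      return 0;                   // channel 0 pending, or default choice
  endfunction
endpackage

module arbiter_sketch_tb;
  import arbiter_sketch_pkg::*;
  int winner;

  initial begin
    winner = Arbitrate(idle, s_pend);   // only channel 1 is pending
    $display("winner = %0d", winner);
  end
endmodule
```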

# **Split Communication**



- Handshaking of different channels might be interleaved in implementation
- Modeling interleaved behavior at high level is important for early system evaluation



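```
// Named "buffer" here because "buf" is a reserved Verilog keyword
module buffer (interface L, interface R);
  logic data;
  always
  begin
    L.Receive(data);
    R.Send(data);
  end
endmodule

module buf_split (interface L, interface R);
  logic data;
  always
  begin
    L.SplitReceive(data, 1);   // first part of the handshake on L
    R.Send(data);              // complete handshake on R in between
    L.SplitReceive(data, 2);   // second part of the handshake on L
  end
endmodule
```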
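
The slides do not give the body of `SplitReceive`. A minimal sketch of how a two-part receive could be written for a four-phase handshake follows. It assumes that part 1 waits for the request, latches the data, and raises `ack`, and that part 2 waits for the request to be withdrawn and lowers `ack`. The interface name `SplitChannelSketch`, the `held` register, and the testbench are all hypothetical.

```systemverilog
interface SplitChannelSketch;
  parameter WIDTH = 8;
  logic req = 0, ack = 0;
  logic [WIDTH-1:0] data;
  logic [WIDTH-1:0] held;        // data latched during part 1

  task SplitReceive (output logic [WIDTH-1:0] d, input int part);
    begin
      case (part)
        1: begin                 // first half: accept the data
             wait (req == 1);
             held = data;
             ack = 1;
           end
        2: begin                 // second half: return to idle
             wait (req == 0);
             ack = 0;
           end
      endcase
      d = held;                  // both parts return the latched value
    end
  endtask
endinterface

module split_sketch_tb;
  SplitChannelSketch L ();
  logic [7:0] x;

  initial fork
    begin                        // hand-written sender side
      L.data = 8'h3C;
      L.req = 1;
      wait (L.ack == 1);
      L.req = 0;
      wait (L.ack == 0);
    end
    begin                        // receiver using the split handshake
      L.SplitReceive(x, 1);
      // communication on another channel could be interleaved here
      L.SplitReceive(x, 2);
      $display("received %h", x);
    end
  join
endmodule
```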

# **Synchronized Communications**



- Sometimes the implementation forces correlation of communication on multiple channels
  - Synchronized start
  - Synchronized finish
- Early performance evaluation of the system requires modeling such behavior

```
always
begin
  fork                   // concurrent body starts
    A.Receive(a, 1);
    B.Receive(b, 1);
  join                   // acts like a barrier
  fork
    A.Receive(a, 2);
    B.Receive(b, 2);
  join
  sum = a + b;
  SUM.Send(sum);
end
```

# **One-To-Many Channels**



- One sender to multiple receivers
  - Option 1: Use a copy block
    - Makes the design cumbersome
  - Option 2: Shared channels [JCSP, Welch et al.]
    - Sender and receivers send and receive as if the channel were a normal one-to-one channel
    - Top-level module specifies that the channel is broadcast
- Shared channels are closer to the hardware implementation
  - A shared data bus between sender and receivers
  - Separate req and ack signals for the receiving processes

# One-To-Any Channel



- One sender to multiple receivers: JCSP [Welch et al.]
- Only one of the receivers participates in the communication

```
task Receive (output logic [WIDTH-1:0] d);
  status = r_pend;
  wait (req == hsPhase);
  if (ONE2ANY)
    req = 'z;              // Inhibits other receivers
  d = data;
  ack = hsPhase;
  status = idle;
endtask

always
begin : main
  wait (L.status != idle);
  randValue = {$random()} % 3;
  if (randValue == 1)
    L.Receive(x);
  else
    begin
      #0;
      disable main;
    end
end
```


## Results – Simulation Run-Times



- Comparison to VerilogCSP [Saifhashemi'05]
  - Simulation time of a linear pipeline with depth of 10
  - Platform: Sun UltraSPARC, Modelsim SE 6.6 simulator
  - 12%-20% improvement

| Number of data items                            | 100K  | 200K  | 300K   | 400K   | 500K   |
|-------------------------------------------------|-------|-------|--------|--------|--------|
| Simulation time in seconds (VerilogCSP)         | 45.14 | 76.38 | 107.60 | 139.57 | 170.62 |
| Simulation time in seconds (SystemVerilogCSP)   | 40.12 | 65    | 89.70  | 115.52 | 141.99 |
| Ratio (VerilogCSP / SystemVerilogCSP)           | 1.12  | 1.17  | 1.19   | 1.20   | 1.20   |
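
As a quick check of the ratio row, assuming it is the VerilogCSP run time divided by the SystemVerilogCSP run time (which matches the tabulated values), the first and last columns work out as:

$$\text{Ratio} = \frac{T_{\text{VerilogCSP}}}{T_{\text{SystemVerilogCSP}}}, \qquad \frac{45.14}{40.12} \approx 1.125, \qquad \frac{170.62}{141.99} \approx 1.202$$

These end points correspond to the 12%-20% range quoted above.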

#### Conclusions



- CSP-like communication and extensions can be modeled using SystemVerilog interfaces
- Features and advantages
  - Ease of design: abstract communication, channel status
  - Mixed asynchronous and synchronous designs can be modeled in same language and simulation environment
  - Extensions: more accurate modeling of implemented hardware
  - Make adoption of asynchronous technology easier
- Currently being used to teach the course EE-552 Asynchronous VLSI at the University of Southern California
- Future work
  - Automated synthesis from SystemVerilogCSP

# Supports Design Verification



- Testing DUT:
  - Initially, modeled in SystemVerilogCSP
  - Later implemented in gates
- It is important that the testbench does not change
  - Sometimes very complicated
  - Communicates with other blocks
  - Verifies correct implementation
- No need for Shims [Saifhashemi'05]

