DFiant: First Look

Your first encounter with the DFiant syntax, semantics and language features

In this section we provide simple examples to demonstrate various DFiant syntax, semantics and language features. If you wish to understand how to run these examples yourself, please refer to the Getting Started chapter of this documentation.

Main Feature Overview

Concise and simple syntax
Write portable code: target and timing agnostic dataflow hardware description
Strong bit-accurate type-safety
Simplified port connections
Automatic latency path balancing
Automatic/manual pipelining
Meta hardware description via rich Scala language constructs

Basic Example: An Identity Function

Let's begin with a basic example. The dataflow design ID has a signed 16-bit input port x and a signed 16-bit output port y. We implemented an identity function between the input and output, meaning that for an input series \(x_k\) the output series shall be \(y_k=x_k\). Fig. 1a depicts a functional drawing of the design and Fig. 1b contains five tabs: the ID.scala DFiant dataflow design ID class and its compiled RTL files in VHDL (v2008) and Verilog (v2001).

Fig. 1a: Functional drawing of the dataflow design 'ID' with an input port 'x' and an output port 'y'

ID.scala

import DFiant._

@df class ID extends DFDesign { //This is our `ID` dataflow design
  val x = DFSInt(16) <> IN  //The input port is a signed 16-bit integer
  val y = DFSInt(16) <> OUT //The output port is a signed 16-bit integer
  y := x //trivial direct input-to-output assignment
}

ID.vhdl

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.ID_pkg.all;

entity ID is
port (
  x   : in  signed(15 downto 0);
  y   : out signed(15 downto 0)
);
end ID;

architecture ID_arch of ID is
begin
  async_proc : process (all)
  begin
    y <= x;
  end process;
end ID_arch;

ID_pkg.vhdl

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package ID_pkg is
  function bit_reverse(s : std_logic_vector) return std_logic_vector;
  function resize(arg : std_logic_vector; size : integer) return std_logic_vector;
  function to_sl(b : boolean) return std_logic;
  function to_sl(arg : std_logic_vector) return std_logic;
  function to_slv(arg : std_logic) return std_logic_vector;
  function to_slv(arg : unsigned) return std_logic_vector;
  function to_slv(arg : signed) return std_logic_vector;
  function to_slv(arg : boolean) return std_logic_vector;
end package ID_pkg;

package body ID_pkg is
  function bit_reverse(s : std_logic_vector) return std_logic_vector is
     variable v_s : std_logic_vector(s'high downto s'low);
  begin
    for i in s'high downto s'low loop
      v_s(i) := s(s'high - i);
    end loop;
    return v_s;
  end bit_reverse;
  function resize(arg : std_logic_vector; size : integer) return std_logic_vector is
  begin
    return to_slv(resize(unsigned(arg), size));
  end resize;
  function to_sl(b : boolean) return std_logic is
  begin
    if (b) then
      return '1';
    else
      return '0';
    end if;
  end to_sl;
  function to_sl(arg : std_logic_vector) return std_logic is
  begin
    return arg(arg'low);
  end to_sl;
  function to_slv(arg : std_logic) return std_logic_vector is
  begin
    if (arg = '1') then
      return "1";
    else
      return "0";
    end if;
  end to_slv;
  function to_slv(arg : unsigned) return std_logic_vector is
    variable slv : std_logic_vector(arg'length-1 downto 0);
  begin
    slv := std_logic_vector(arg);
    return slv;
  end to_slv;
  function to_slv(arg : signed) return std_logic_vector is
    variable slv : std_logic_vector(arg'length-1 downto 0);
  begin
    slv := std_logic_vector(arg);
    return slv;
  end to_slv;
  function to_slv(arg : boolean) return std_logic_vector is
  begin
    if (arg) then
      return "1";
    else
      return "0";
    end if;
  end to_slv;
end package body ID_pkg;

ID.v

`default_nettype               none
`timescale 1ns/1ps
`include "ID_defs.v"


module ID(
  input  wire signed [15:0] x,
  output reg  signed [15:0] y
);
  always @(*)
  begin
    y                       = x;
  end
endmodule

ID_defs.v

`ifndef ID_DEFS_H
`define ID_DEFS_H
`endif

Fig. 1b: A DFiant implementation of the identity function as a toplevel design and the generated VHDL/Verilog files

The Scala code in Fig. 1b describes our ID design as a Scala class. To compile this further to RTL or simulate it we need to create a program that instantiates the class and invokes additional commands. See the getting started guide for further details.

Defining a new dataflow design

import DFiant._ once per source file.
@df class _design_name_ extends DFDesign {} to define your dataflow design. Populate your design with the required dataflow functionality.

ID.scala line-by-line breakdown

Line 1: The import DFiant._ statement summons all the DFiant classes, types and objects into the current scope. This is a must for every dataflow design source file.
Lines 3-7: The ID Scala class extends the (abstract) DFDesign class, which declares it as a dataflow design. We must also annotate the class with the @df dataflow context annotation, which provides an implicit context that the DFiant compilation requires. If this annotation is missing, you will get a missing context error. Note: currently, in Scala 2.xx, we populate a class within braces {}. For those of you who dislike braces, a braceless syntax is expected to be available in Scala 3, to which DFiant will migrate in the future.
- Lines 4-5: Here we construct the input port x and output port y. Both were set as a 16-bit signed integer dataflow variable via the DFSInt(width) constructor, where width is any positive integer. DFiant also supports various types such as DFBits, DFUInt, and DFBool. All these dataflow variable construction options and more are discussed later in this documentation.
  The syntax val _name_ = _dftype_ <> _direction_ is used to construct a port and give it a named Scala reference. The Scala reference name will affect the name of this port when compiled to the required backend representation.
- Line 6: The assignment operator := sets the dataflow output port to consume all input port tokens as they are.

ID RTL files observations

The ID.vhdl/ID.v files are readable and maintain the names set in the DFiant design. The generated files follow various writing conventions such as lowercase keywords and proper code alignment.
The ID_pkg.vhdl is a package file that is shared between all VHDL files generated by DFiant and contains common conversion functions that may be required. Additionally it may contain other definitions like enumeration types.

ID demo

import DFiant._

@df class ID extends DFDesign { //This is our `ID` dataflow design
  val x = DFSInt(16) <> IN  //The input port is a signed 16-bit integer
  val y = DFSInt(16) <> OUT //The output port is a signed 16-bit integer
  y := x //trivial direct input-to-output assignment
}


object IDApp extends App {
  import DFiant.compiler.backend.verilog.v2001
  val id = new ID
  id.compile.printGenFiles(colored = false)
}

Hierarchy and Connection Example

One of the most defining characteristics of hardware design is the composition of modules/entities via hierarchies and IO port connections. DFiant is no exception and easily enables dataflow design compositions. Fig. 2a demonstrates such a composition that creates yet another identity function, but this time as a chained composition of two identity function designs. The top-level design IDTop introduces two instances of the ID design we saw in the previous example and connects them accordingly.

Fig. 2a: Functional drawing of the dataflow design 'IDTop' with an input port 'x' and an output port 'y'

IDTop.scala

import DFiant._

@df class IDTop extends DFDesign { //This is our `IDTop` dataflow design
  val x = DFSInt(16) <> IN  //The input port is a signed 16-bit integer
  val y = DFSInt(16) <> OUT //The output port is a signed 16-bit integer
  val id1 = new ID //First instance of the `ID` design
  val id2 = new ID //Second instance of the `ID` design
  id1.x <> x       //Connecting parent input port to child input port
  id1.y <> id2.x   //Connecting sibling instance ports
  id2.y <> y       //Connecting parent output port to child output port
}

IDTop.vhdl

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.IDTop_pkg.all;

entity IDTop is
port (
  x            : in  signed(15 downto 0);
  y            : out signed(15 downto 0)
);
end IDTop;

architecture IDTop_arch of IDTop is  
  signal id1_x : signed(15 downto 0);
  signal id1_y : signed(15 downto 0);
  signal id2_x : signed(15 downto 0);
  signal id2_y : signed(15 downto 0);
begin
  id1 : entity work.ID(ID_arch) port map (
    x          => id1_x,
    y          => id1_y
  );
  id2 : entity work.ID(ID_arch) port map (
    x          => id2_x,
    y          => id2_y
  );
  async_proc : process (all)
  begin
    id1_x      <= x;
    id2_x      <= id1_y;
    y          <= id2_y;
  end process;
end IDTop_arch;

IDTop.v

`default_nettype               none
`timescale 1ns/1ps
`include "IDTop_defs.v"


module IDTop(
  input  wire signed [15:0] x,
  output reg  signed [15:0] y
);
  wire        signed [15:0] id1_y;
  wire        signed [15:0] id2_y;
  reg         signed [15:0] id1_x;
  reg         signed [15:0] id2_x;
  ID id1(
    .x                      (id1_x),
    .y                      (id1_y)
  );
  ID id2(
    .x                      (id2_x),
    .y                      (id2_y)
  );
  always @(*)
  begin
    id1_x                   = x;
    id2_x                   = id1_y;
    y                       = id2_y;
  end
endmodule

Fig. 2b: A DFiant implementation of IDTop as a toplevel design and the generated VHDL/Verilog files

IDTop.scala observations

Lines 6-7: Instantiating and naming the two internal ID designs (by constructing a Scala class).
Lines 8-10: Connecting the design ports as can be seen in the functional diagram. The <> connection operator differs from the := assignment operator we saw earlier in several ways:
1. Directionality and Commutativity: The connection operation is commutative and the dataflow direction, from producer to consumer, is set according to the context in which it is used. Assignments always set the dataflow direction from right to left of the operator.
2. Number of Applications: A connection to any bit can be made only once, while assignments are unlimited. Also, a bit cannot receive both a connection and an assignment.
3. Initialization: A connection propagates initialization from the producer to the consumer if the consumer is not explicitly initialized (via init). Assignments have no effect over initialization.
Notice that connections can be made between sibling design ports as well as between parent ports and child ports.
For more information access the connectivity section.
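The "number of applications" rule above can be pictured with a small bookkeeping model. This is only a plain Scala sketch with no DFiant dependency: `ConnRules`, `Bit`, `connect` and `assign` are hypothetical names, and the model tracks a single consumer bit without attempting to capture directionality or initialization.

```scala
object ConnRules {
  // How a single consumer bit has been driven so far (hypothetical model).
  sealed trait Driver
  case object Unset extends Driver
  case object Connected extends Driver
  case object Assigned extends Driver

  final case class Bit(driver: Driver = Unset) {
    // A connection is allowed only once, and never on an assigned bit.
    def connect: Either[String, Bit] = driver match {
      case Unset     => Right(Bit(Connected))
      case Connected => Left("bit is already connected")
      case Assigned  => Left("bit is already assigned")
    }

    // Assignments may repeat, but never on a connected bit.
    def assign: Either[String, Bit] = driver match {
      case Connected => Left("bit is already connected")
      case _         => Right(Bit(Assigned))
    }
  }
}
```

Chaining two `connect` calls, or mixing `connect` with `assign`, yields a `Left`, while repeated `assign` calls keep returning a `Right`.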

IDTop RTL files observations

Unlike DFiant, RTLs do not support direct sibling module/component port connections and therefore require intermediate wires/signals to connect through. For consistency and brevity the DFiant backend compiler always creates signals for all ports of all modules and connects them accordingly.

IDTop demo

import DFiant._

@df class IDTop extends DFDesign { //This is our `IDTop` dataflow design
  val x = DFSInt(16) <> IN  //The input port is a signed 16-bit integer
  val y = DFSInt(16) <> OUT //The output port is a signed 16-bit integer
  val id1 = new ID //First instance of the `ID` design
  val id2 = new ID //Second instance of the `ID` design
  id1.x <> x       //Connecting parent input port to child input port
  id1.y <> id2.x   //Connecting sibling instance ports
  id2.y <> y       //Connecting parent output port to child output port
}

@df class ID extends DFDesign {
  val x = DFSInt(16) <> IN  
  val y = DFSInt(16) <> OUT 
  y := x 
}

object IDTopApp extends App {
  import DFiant.compiler.backend.verilog.v2001
  val idTop = new IDTop
  idTop.compile.printGenFiles(colored = false)
}

Concurrency Abstraction

Concurrency and data scheduling abstractions rely heavily on language semantics. DFiant code is expressed in a sequential manner yet employs an asynchronous dataflow programming model to enable implicit and intuitive concurrent hardware description. This is achieved by setting the data scheduling order, or token-flow, according to the data dependency: all independent dataflow expressions are scheduled concurrently, while dependent operations are synthesized into a guarded FIFO-styled pipeline.

\[\begin{aligned} &f:(i_{k},j_{k})_{k\in \mathbb{N}}\rightarrow (a_k,b_k,c_k,d_k,e_k)_{k\in \mathbb{N}}\\ &\triangleq\left\{ \begin{split} a_k & = i_k + 5 \\ b_k & = a_k * 3 \\ c_k & = a_k + b_k \\ d_k & = i_k - 1 \\ e_k & = j_k / 4 \\ \end{split}\right.~~~~~k\geq 0 \\ \\ \end{aligned}\]
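To make the dependency argument concrete, the following plain Scala sketch groups the five expressions of f by dependency depth. It is not how the DFiant compiler schedules anything; `ConcSchedule`, `deps`, `level` and `levels` are hypothetical names, and the dependency table is simply transcribed by hand from the definition of f.

```scala
object ConcSchedule {
  // Each output and the names it reads, transcribed from the definition of f.
  val deps: Map[String, Set[String]] = Map(
    "a" -> Set("i"),
    "b" -> Set("a"),
    "c" -> Set("a", "b"),
    "d" -> Set("i"),
    "e" -> Set("j")
  )

  // Inputs sit at level 0; an expression sits one level above its deepest operand.
  def level(name: String): Int =
    deps.get(name) match {
      case None      => 0
      case Some(ops) => 1 + ops.map(level).max
    }

  // Expressions that share a level do not depend on one another.
  def levels: Map[Int, List[String]] =
    deps.keys.toList.sorted.groupBy(level)
}
```

Under this model a, d and e share the first level and are mutually independent, while b and c form a dependent chain behind a.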

Fig. 4a: Functional drawing of the dataflow design 'Conc' with input ports 'i' and 'j' and output ports 'a', 'b', 'c', 'd' and 'e'

Conc.scala

import DFiant._

@df class Conc extends DFDesign {
  val i, j      = DFUInt(32) <> IN
  val a,b,c,d,e = DFUInt(32) <> OUT
  a := i + 5
  b := a * 3
  c := a + b
  d := i - 1
  e := j / 4
}

Conc.vhdl

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.Conc_pkg.all;

entity Conc is
port (
  i   : in  unsigned(31 downto 0);
  j   : in  unsigned(31 downto 0);
  a   : out unsigned(31 downto 0);
  b   : out unsigned(31 downto 0);
  c   : out unsigned(31 downto 0);
  d   : out unsigned(31 downto 0);
  e   : out unsigned(31 downto 0)
);
end Conc;

architecture Conc_arch of Conc is
begin
  async_proc : process (all)
  begin
    a <= i + 5;
    b <= resize(a * 3, 32);
    c <= a + b;
    d <= i - 1;
    e <= j / 4;
  end process;
end Conc_arch;

Conc.v

`default_nettype        none
`timescale 1ns/1ps
`include "Conc_defs.v"


module Conc(
  input  wire [31:0] i,
  input  wire [31:0] j,
  output reg  [31:0] a,
  output reg  [31:0] b,
  output reg  [31:0] c,
  output reg  [31:0] d,
  output reg  [31:0] e
);
  always @(*)
  begin
    a                = i + 5;
    b                = a * 3;
    c                = a + b;
    d                = i - 1;
    e                = j / 4;
  end
endmodule

Fig. 4b: A DFiant implementation of Conc as a toplevel design and the generated VHDL/Verilog files

Conc.scala observations

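Lines 6-10: The five assignments are written one after the other, yet only their data dependencies constrain the schedule: b depends on a, and c depends on both a and b, while d and e depend only on the inputs and can run concurrently with that chain.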
For more information access the state section.

Conc RTL files observations

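Because Conc is stateless, both generated files contain a single combinational block (async_proc in VHDL, always @(*) in Verilog) and no clk or rst ports.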

Conc demo

import DFiant._

@df class Conc extends DFDesign {
  val i, j      = DFUInt(32) <> IN
  val a,b,c,d,e = DFUInt(32) <> OUT
  a := i + 5
  b := a * 3
  c := a + b
  d := i - 1
  e := j / 4
}

object ConcApp extends App {
  import DFiant.compiler.backend.verilog.v2001
  val conc = new Conc
  conc.compile.printGenFiles(colored = false)
}

State Abstraction

So far, all the examples were pure (stateless) functions, whereas frequently in hardware we need to express a state. A state is needed when a design must access (previous) values that are no longer (or never were) available on its input. DFiant assumes every dataflow variable is a token stream and provides constructs to initialize the token history via the init construct, reuse tokens via the .prev construct, and update the state via the assignment := construct.

Here we provide various implementations of a simple moving average (SMA); all have a 4-tap average window of a 16-bit integer input and output a 16-bit integer average. With regard to overflow avoidance and precision loss, DFiant is no different from any other HDL, and we took those into account when we selected our operators and declared the variable widths. Via the SMA examples we can differentiate between two kinds of state: a derived state and a commit state.

Derived State SMA

Derived State

A derived (feedforward) state is a state whose current output value is independent of its previous value. For example, checking if a dataflow stream value has changed requires reusing the previous token and comparing to the current token.
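The change-detection example can be sketched in plain Scala, treating a token stream as a finite sequence. This is only a software model with no DFiant dependency; `ChangeDetect` and `changed` are hypothetical names, and the token preceding the start of the stream is assumed to be `init`.

```scala
object ChangeDetect {
  // changed_k = (x_k != x_{k-1}); the token before the stream start is `init`.
  def changed(xs: Seq[Int], init: Int = 0): Seq[Boolean] = {
    val prev = init +: xs.dropRight(1) // the stream delayed by one token
    xs.zip(prev).map { case (cur, old) => cur != old }
  }
}
```

The output at step k depends only on the current and previous input tokens, never on an earlier output, which is what makes the state feedforward.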

Trivial three-adder SMA implementation

The trivial derived state SMA implementation comes from the basic SMA formula:

\[ y_k=\left(x_k+x_{k-1}+x_{k-2}+x_{k-3}\right)/4~~~~x_{i<0}=0 \]

As can be seen from the formula, we need 3 state elements to match the maximum x history access. Fortunately, state creation is implicit in DFiant. Just by calling x.prev(_step_) to access the history of x we construct _step_ number of states and chain them, as can be seen in Fig. 3 (DFiant automatically merges the same states constructed from several calls).
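As a reference point for the formula, here is a plain Scala model of the same computation on a finite stream. It does not use DFiant; `SmaModel`, `prev` and `sma` are hypothetical names, and `prev` merely imitates history access by delaying the stream and padding it with the initial value 0.

```scala
object SmaModel {
  // Models x.prev(step): the stream delayed by `step` tokens, padded with `init`.
  def prev(xs: Vector[Int], step: Int, init: Int = 0): Vector[Int] =
    (Vector.fill(step)(init) ++ xs).take(xs.length)

  // y_k = (x_k + x_{k-1} + x_{k-2} + x_{k-3}) / 4, with x_{i<0} = 0.
  def sma(xs: Vector[Int]): Vector[Int] = {
    val p1 = prev(xs, 1)
    val p2 = prev(xs, 2)
    val p3 = prev(xs, 3)
    xs.indices.toVector.map(k => (xs(k) + p1(k) + p2(k) + p3(k)) / 4)
  }
}
```

The three delayed copies p1, p2 and p3 correspond to the three state elements the formula calls for.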

Fig. 3a: Functional drawing of the dataflow design 'SMA_DS' with an input port 'x' and an output port 'y'

SMA_DS.scala

import DFiant._

@df class SMA_DS extends DFDesign {
  val x   = DFSInt(16) <> IN init 0
  val y   = DFSInt(16) <> OUT
  val s0  = x +^ x.prev
  val s2  = x.prev(2) +^ x.prev(3)
  val sum = s0 +^ s2
  y       := (sum / 4).resize(16)
}

SMA_DS.vhdl

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.SMA_DS_pkg.all;

entity SMA_DS is
port (
  clk                : in  std_logic;
  rst                : in  std_logic;
  x                  : in  signed(15 downto 0) := 16d"0";
  y                  : out signed(15 downto 0)
);
end SMA_DS;

architecture SMA_DS_arch of SMA_DS is  
  signal x_prev1     : signed(15 downto 0) := 16d"0";
  signal x_prev2     : signed(15 downto 0) := 16d"0";
  signal x_prev3     : signed(15 downto 0) := 16d"0";
  signal x_prev1_sig : signed(15 downto 0);
  signal x_prev2_sig : signed(15 downto 0);
begin
  async_proc : process (all)  
    variable s0      : signed(16 downto 0);
    variable s2      : signed(16 downto 0);
    variable sum     : signed(17 downto 0);
  begin
    s0               := resize(x, 17) + x_prev1;
    s2               := resize(x_prev2, 17) + x_prev3;
    sum              := resize(s0, 18) + s2;
    x_prev1_sig      <= x_prev1;
    x_prev2_sig      <= x_prev2;
    y                <= resize(sum / 4, 16);
  end process;
  sync_proc : process (rst, clk)
  begin
    if rst = '0' then
      x_prev1        <= 16d"0";
      x_prev2        <= 16d"0";
      x_prev3        <= 16d"0";
    elsif rising_edge(clk) then
      x_prev1        <= x;
      x_prev2        <= x_prev1_sig;
      x_prev3        <= x_prev2_sig;
    end if;
  end process;
end SMA_DS_arch;

SMA_DS.v

`default_nettype               none
`timescale 1ns/1ps
`include "SMA_DS_defs.v"


module SMA_DS(
  input  wire               clk,
  input  wire               rst,
  input  wire signed [15:0] x,
  output reg  signed [15:0] y
);
  reg         signed [15:0] x_prev1 = 16'sd0;
  reg         signed [15:0] x_prev2 = 16'sd0;
  reg         signed [15:0] x_prev3 = 16'sd0;
  reg         signed [16:0] s0;
  reg         signed [16:0] s2;
  reg         signed [17:0] sum;
  reg         signed [17:0] y_part;
  reg         signed [15:0] x_prev1_sig;
  reg         signed [15:0] x_prev2_sig;
  always @(*)
  begin
    s0                      = ({x[15], x[15:0]}) + x_prev1;
    s2                      = ({x_prev2[15], x_prev2[15:0]}) + x_prev3;
    sum                     = ({s0[16], s0[16:0]}) + s2;
    y_part                  = sum / 4;
    x_prev1_sig             = x_prev1;
    x_prev2_sig             = x_prev2;
    y                       = {y_part[17], y_part[14:0]};
  end
  always @(negedge rst or posedge clk)
  begin
    if (rst == 1'b0) 
    begin
      x_prev1               <= 16'sd0;
      x_prev2               <= 16'sd0;
      x_prev3               <= 16'sd0;
    end
    else 
    begin
      x_prev1               <= x;
      x_prev2               <= x_prev1_sig;
      x_prev3               <= x_prev2_sig;
    end
  end
endmodule

Fig. 3b: A DFiant implementation of SMA_DS as a toplevel design and the generated VHDL/Verilog files

SMA_DS.scala observations

Line 4: The SMA formula defines the history of x at the start of the system: all values are considered to be 0. We apply this information by initializing the x history via init 0.
Lines 6-7: Accessing the history of x is done via .prev(_step_), where _step_ is a constant positive integer that defines the number of steps into history we require to retrieve the proper value.
Lines 6-8: To avoid overflow we chose the +^ carry-addition operator, meaning that s0 and s2 are 17-bit wide, and sum is 18-bit wide.
Line 9: The sum/4 division result keeps the LHS 18-bit width. To assign this value to the output y which is 16-bit wide, we must resize it first, via .resize. DFiant has strong bit-accurate type-safety, and it does not allow assigning a wider value to a narrower value without explicit resizing. In the following animated figure we show what happens if we did not resize the value.

The Scala presentation compiler is able to interact with the editor and a custom message is presented due to the DFiant type-safe checks.
The various dataflow type inference and operator safety rules are discussed in the type-system section.
For more information on state and initialization, see this section.
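The width choices above (17 bits for s0 and s2, 18 bits for sum, 16 bits after the division) can be checked with a little arithmetic. The sketch below is plain Scala, independent of DFiant; `WidthCheck` and its members are hypothetical names, and it assumes ordinary two's-complement ranges.

```scala
object WidthCheck {
  // Range of a two's-complement signed value of the given width.
  def minSigned(width: Int): Long = -(1L << (width - 1))
  def maxSigned(width: Int): Long = (1L << (width - 1)) - 1

  // Smallest signed width that can hold every value in [lo, hi].
  def widthFor(lo: Long, hi: Long): Int =
    Iterator.from(1).find(w => minSigned(w) <= lo && hi <= maxSigned(w)).get

  // Sum of two 16-bit operands (s0, s2).
  val s0Width: Int = widthFor(2 * minSigned(16), 2 * maxSigned(16))
  // Sum of four 16-bit operands (sum).
  val sumWidth: Int = widthFor(4 * minSigned(16), 4 * maxSigned(16))
  // The same range after dividing by 4.
  val avgWidth: Int = widthFor(4 * minSigned(16) / 4, 4 * maxSigned(16) / 4)
}
```

Since the average of four 16-bit values always fits back into 16 bits, the explicit resize to 16 bits discards no information in this design.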

SMA_DS RTL files observations

Width mismatches like the one above are often where a language like Verilog falls short and relies on external linting.

SMA_DS demo

import DFiant._

@df class SMA_DS extends DFDesign {
  val x   = DFSInt(16) <> IN init 0
  val y   = DFSInt(16) <> OUT
  val s0  = x +^ x.prev
  val s2  = x.prev(2) +^ x.prev(3)
  val sum = s0 +^ s2
  y       := (sum / 4).resize(16)
}

object SMA_DSApp extends App {
  import DFiant.compiler.backend.verilog.v2001
  val sma = new SMA_DS
  sma.compile.printGenFiles(colored = false)
}

Two-adder SMA implementation

The following algebraic manipulation reveals how we can achieve the same function with just two adders.

\[\begin{eqnarray} s_{0,k} &=& x_k+x_{k-1} \\ s_{2,k} &=& x_{k-2}+x_{k-3} = \left.\left (x_t+x_{t-1} \right )\right|_{t=k-2} = s_{0,k-2} \\ y_k &=& \left(s_{0,k}+s_{2,k}\right)/4~~~~x_{i<0}=0 \end{eqnarray}\]

Instead of relying only on the history of x, we can utilize the history of s0 to produce s2. DFiant has time-invariant history access through basic operators like addition, so (x +^ x.prev).prev(2) is equivalent to x.prev(2) +^ x.prev(3).
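The equivalence can be checked on a sample stream with a plain Scala model. This sketch does not use DFiant; `PrevShift`, `prev`, `add`, `viaX` and `viaS0` are hypothetical names, streams are finite vectors, and history before the first token is assumed to be 0 for both x and s0.

```scala
object PrevShift {
  // Models .prev(step): the stream delayed by `step` tokens, padded with `init`.
  def prev(xs: Vector[Int], step: Int, init: Int = 0): Vector[Int] =
    (Vector.fill(step)(init) ++ xs).take(xs.length)

  // Token-wise addition of two equally long streams.
  def add(a: Vector[Int], b: Vector[Int]): Vector[Int] =
    a.zip(b).map { case (p, q) => p + q }

  // Models x.prev(2) +^ x.prev(3)
  def viaX(xs: Vector[Int]): Vector[Int] = add(prev(xs, 2), prev(xs, 3))

  // Models (x +^ x.prev).prev(2)
  def viaS0(xs: Vector[Int]): Vector[Int] = prev(add(xs, prev(xs, 1)), 2)
}
```

In this model the two forms agree because the padding value of the delayed sum (0) equals the sum of the padding values of x (0 + 0).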

Fig. 4a: Functional drawing of the dataflow design 'SMA_DS2' with an input port 'x' and an output port 'y'

SMA_DS2.scala

import DFiant._

@df class SMA_DS2 extends DFDesign {
  val x   = DFSInt(16) <> IN init 0
  val y   = DFSInt(16) <> OUT
  val s0  = x +^ x.prev
  val s2  = s0.prev(2)
  val sum = s0 +^ s2
  y       := (sum / 4).resize(16)
}

SMA_DS2.vhdl

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.SMA_DS2_pkg.all;

entity SMA_DS2 is
port (
  clk                 : in  std_logic;
  rst                 : in  std_logic;
  x                   : in  signed(15 downto 0) := 16d"0";
  y                   : out signed(15 downto 0)
);
end SMA_DS2;

architecture SMA_DS2_arch of SMA_DS2 is  
  signal x_prev1      : signed(15 downto 0) := 16d"0";
  signal s0_prev1     : signed(16 downto 0) := 17d"0";
  signal s2           : signed(16 downto 0) := 17d"0";
  signal s0_sig       : signed(16 downto 0);
  signal s0_prev1_sig : signed(16 downto 0);
begin
  async_proc : process (all)  
    variable s0       : signed(16 downto 0);
    variable sum      : signed(17 downto 0);
  begin
    s0                := resize(x, 17) + x_prev1;
    sum               := resize(s0, 18) + s2;
    s0_sig            <= s0;
    s0_prev1_sig      <= s0_prev1;
    y                 <= resize(sum / 4, 16);
  end process;
  sync_proc : process (rst, clk)
  begin
    if rst = '0' then
      x_prev1         <= 16d"0";
      s0_prev1        <= 17d"0";
      s2              <= 17d"0";
    elsif rising_edge(clk) then
      x_prev1         <= x;
      s0_prev1        <= s0_sig;
      s2              <= s0_prev1_sig;
    end if;
  end process;
end SMA_DS2_arch;

SMA_DS2.v

`default_nettype               none
`timescale 1ns/1ps
`include "SMA_DS2_defs.v"


module SMA_DS2(
  input  wire               clk,
  input  wire               rst,
  input  wire signed [15:0] x,
  output reg  signed [15:0] y
);
  reg         signed [15:0] x_prev1 = 16'sd0;
  reg         signed [16:0] s0;
  reg         signed [16:0] s0_prev1 = 17'sd0;
  reg         signed [16:0] s2 = 17'sd0;
  reg         signed [17:0] sum;
  reg         signed [17:0] y_part;
  reg         signed [16:0] s0_sig;
  reg         signed [16:0] s0_prev1_sig;
  always @(*)
  begin
    s0                      = ({x[15], x[15:0]}) + x_prev1;
    sum                     = ({s0[16], s0[16:0]}) + s2;
    y_part                  = sum / 4;
    s0_sig                  = s0;
    s0_prev1_sig            = s0_prev1;
    y                       = {y_part[17], y_part[14:0]};
  end
  always @(negedge rst or posedge clk)
  begin
    if (rst == 1'b0) 
    begin
      x_prev1               <= 16'sd0;
      s0_prev1              <= 17'sd0;
      s2                    <= 17'sd0;
    end
    else 
    begin
      x_prev1               <= x;
      s0_prev1              <= s0_sig;
      s2                    <= s0_prev1_sig;
    end
  end
endmodule

Fig. 4b: A DFiant implementation of SMA_DS2 as a toplevel design and the generated VHDL/Verilog files

SMA_DS2.scala observations

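Lines 6-7: s2 is no longer computed from x.prev(2) and x.prev(3); it reuses the history of s0 via s0.prev(2), which removes one adder.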
For more information access the state section.

SMA_DS2 RTL files observations

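The generated RTL registers x_prev1, s0_prev1 and s2, the last two being 17 bits wide, and the combinational block contains only two additions (s0 and sum).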

SMA_DS2 demo

import DFiant._

@df class SMA_DS2 extends DFDesign {
  val x   = DFSInt(16) <> IN init 0
  val y   = DFSInt(16) <> OUT
  val s0  = x +^ x.prev
  val s2  = s0.prev(2)
  val sum = s0 +^ s2
  y       := (sum / 4).resize(16)
}

object SMA_DS2App extends App {
  import DFiant.compiler.backend.verilog.v2001
  val sma = new SMA_DS2
  sma.compile.printGenFiles(colored = false)
}

Commit State SMA

Commit State

A commit (feedback) state is a state whose current output value is dependent on its previous state value. For example, a cumulative sum function output value is dependent on its previous sum output value.

\[\begin{eqnarray} a_{-1} &=& 0 \\ a_k &=& a_{k-1} - x_{k-4}+x_k \\ y_k &=& a_k/4~~~~x_{i<0}=0 \end{eqnarray}\]
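The recurrence can be cross-checked against the direct four-term formula with a plain Scala model. This is a sketch without any DFiant dependency; `SmaCommit`, `prev`, `sma` and `direct` are hypothetical names, and it assumes the accumulator holds 0 before the first token and that x history before the stream start is 0.

```scala
object SmaCommit {
  // Models x.prev(step): the stream delayed by `step` tokens, padded with `init`.
  def prev(xs: Vector[Int], step: Int, init: Int = 0): Vector[Int] =
    (Vector.fill(step)(init) ++ xs).take(xs.length)

  // a_k = a_{k-1} - x_{k-4} + x_k, accumulator starting at 0; y_k = a_k / 4.
  def sma(xs: Vector[Int]): Vector[Int] = {
    val leaving = prev(xs, 4) // the sample that drops out of the window
    xs.zip(leaving)
      .scanLeft(0) { case (acc, (in, out)) => acc - out + in }
      .tail // drop the initial accumulator value
      .map(_ / 4)
  }

  // Direct four-term formula, used only as a cross-check.
  def direct(xs: Vector[Int]): Vector[Int] =
    xs.indices.toVector.map { k =>
      (k - 3 to k).filter(_ >= 0).map(i => xs(i)).sum / 4
    }
}
```

Here each accumulator value feeds the next one, which is the feedback that distinguishes a commit state from a derived state.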

SMA_CS.scala

import DFiant._

@df class SMA_CS extends DFDesign {
  val x   = DFSInt(16) <> IN init 0
  val y   = DFSInt(16) <> OUT
  val acc = DFSInt(18) <> VAR init 0
  acc := acc - x.prev(4) + x
  y   := (acc / 4).resize(16)
}

SMA_CS.vhdl

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.SMA_CS_pkg.all;

entity SMA_CS is
port (
  clk                : in  std_logic;
  rst                : in  std_logic;
  x                  : in  signed(15 downto 0) := 16d"0";
  y                  : out signed(15 downto 0)
);
end SMA_CS;

architecture SMA_CS_arch of SMA_CS is  
  signal x_prev1     : signed(15 downto 0) := 16d"0";
  signal x_prev2     : signed(15 downto 0) := 16d"0";
  signal x_prev3     : signed(15 downto 0) := 16d"0";
  signal x_prev4     : signed(15 downto 0) := 16d"0";
  signal acc_prev1   : signed(17 downto 0) := 18d"0";
  signal x_prev1_sig : signed(15 downto 0);
  signal x_prev2_sig : signed(15 downto 0);
  signal x_prev3_sig : signed(15 downto 0);
  signal acc_sig     : signed(17 downto 0);
begin
  async_proc : process (all)  
    variable acc     : signed(17 downto 0);
  begin
    acc              := acc_prev1;
    acc              := (acc - x_prev4) + x;
    x_prev1_sig      <= x_prev1;
    x_prev2_sig      <= x_prev2;
    x_prev3_sig      <= x_prev3;
    acc_sig          <= acc;
    y                <= resize(acc / 4, 16);
  end process;
  sync_proc : process (rst, clk)
  begin
    if rst = '0' then
      x_prev1        <= 16d"0";
      x_prev2        <= 16d"0";
      x_prev3        <= 16d"0";
      x_prev4        <= 16d"0";
      acc_prev1      <= 18d"0";
    elsif rising_edge(clk) then
      x_prev1        <= x;
      x_prev2        <= x_prev1_sig;
      x_prev3        <= x_prev2_sig;
      x_prev4        <= x_prev3_sig;
      acc_prev1      <= acc_sig;
    end if;
  end process;
end SMA_CS_arch;

SMA_CS.v

`default_nettype               none
`timescale 1ns/1ps
`include "SMA_CS_defs.v"


module SMA_CS(
  input  wire               clk,
  input  wire               rst,
  input  wire signed [15:0] x,
  output reg  signed [15:0] y
);
  reg         signed [15:0] x_prev1 = 16'sd0;
  reg         signed [15:0] x_prev2 = 16'sd0;
  reg         signed [15:0] x_prev3 = 16'sd0;
  reg         signed [15:0] x_prev4 = 16'sd0;
  reg         signed [17:0] acc;
  reg         signed [17:0] acc_prev1 = 18'sd0;
  reg         signed [17:0] y_part;
  reg         signed [15:0] x_prev1_sig;
  reg         signed [15:0] x_prev2_sig;
  reg         signed [15:0] x_prev3_sig;
  reg         signed [17:0] acc_sig;
  always @(*)
  begin
    acc                     = acc_prev1;
    acc                     = (acc - x_prev4) + x;
    y_part                  = acc / 4;
    x_prev1_sig             = x_prev1;
    x_prev2_sig             = x_prev2;
    x_prev3_sig             = x_prev3;
    acc_sig                 = acc;
    y                       = {y_part[17], y_part[14:0]};
  end
  always @(negedge rst or posedge clk)
  begin
    if (rst == 1'b0) 
    begin
      x_prev1               <= 16'sd0;
      x_prev2               <= 16'sd0;
      x_prev3               <= 16'sd0;
      x_prev4               <= 16'sd0;
      acc_prev1             <= 18'sd0;
    end
    else 
    begin
      x_prev1               <= x;
      x_prev2               <= x_prev1_sig;
      x_prev3               <= x_prev2_sig;
      x_prev4               <= x_prev3_sig;
      acc_prev1             <= acc_sig;
    end
  end
endmodule

Finite Step (State) Machine (FSM) Example

SeqDet.scala

import DFiant._

@df class SeqDet extends DFDesign {
  val seqIn  = DFBit <> IN
  val detOut = DFBit <> OUT
  @df def detStep(
    out : Int, trueNS : => FSM, falseNS : => FSM
  ) : FSM = FSM {
    detOut := out
    ifdf(seqIn){
      trueNS.goto()
    }.elsedf {
      falseNS.goto()
    }
  }
  val S0     : FSM = detStep(0, S1, S0)
  val S1     : FSM = detStep(0, S1, S10)
  val S10    : FSM = detStep(0, S1, S100)
  val S100   : FSM = detStep(0, S1001, S0)
  val S1001  : FSM = detStep(1, S1, S10)
}
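As a reading aid for the detStep calls, the transitions can be transcribed into a plain Scala model and run on a sample bit stream. This is a sketch with no DFiant dependency; `SeqDetModel`, `step`, `detOut` and `run` are hypothetical names, and the model assumes that detOut at each step reflects the state held at the start of that step.

```scala
object SeqDetModel {
  sealed trait State
  case object S0 extends State
  case object S1 extends State
  case object S10 extends State
  case object S100 extends State
  case object S1001 extends State

  // Next state, transcribed from the (trueNS, falseNS) arguments of detStep.
  def step(s: State, in: Boolean): State = s match {
    case S0    => if (in) S1 else S0
    case S1    => if (in) S1 else S10
    case S10   => if (in) S1 else S100
    case S100  => if (in) S1001 else S0
    case S1001 => if (in) S1 else S10
  }

  // Only S1001 was built with out = 1.
  def detOut(s: State): Boolean = s == S1001

  // Output at each step, computed from the state entering that step.
  def run(bits: Seq[Boolean]): Seq[Boolean] =
    bits.scanLeft(S0: State)(step).init.map(detOut)
}
```

Under this model the output is raised on the step after the bits 1, 0, 0, 1 have been consumed, and overlapping occurrences such as 1, 0, 0, 1, 0, 0, 1 are detected twice.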

SeqDet.vhdl

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.SeqDet_pkg.all;

entity SeqDet is
port (
  clk                    : in  std_logic;
  rst                    : in  std_logic;
  seqIn                  : in  std_logic;
  detOut                 : out std_logic
);
end SeqDet;

architecture SeqDet_arch of SeqDet is  
  type E_fsm_states is (
    E_fsm_states_S0,
    E_fsm_states_S1,
    E_fsm_states_S10,
    E_fsm_states_S100,
    E_fsm_states_S1001
  );
  signal fsm_state_prev1 : E_fsm_states := E_fsm_states_S0;
  signal fsm_state_sig   : E_fsm_states;
begin
  async_proc : process (all)  
    variable fsm_state   : E_fsm_states;
  begin
    fsm_state            := fsm_state_prev1;
    case fsm_state is
      when E_fsm_states_S0 =>
        detOut           <= '0';
        if seqIn = '1' then
          fsm_state      := E_fsm_states_S1;
        else
          fsm_state      := E_fsm_states_S0;
        end if;
      when E_fsm_states_S1 =>
        detOut           <= '0';
        if seqIn = '1' then
          fsm_state      := E_fsm_states_S1;
        else
          fsm_state      := E_fsm_states_S10;
        end if;
      when E_fsm_states_S10 =>
        detOut           <= '0';
        if seqIn = '1' then
          fsm_state      := E_fsm_states_S1;
        else
          fsm_state      := E_fsm_states_S100;
        end if;
      when E_fsm_states_S100 =>
        detOut           <= '0';
        if seqIn = '1' then
          fsm_state      := E_fsm_states_S1001;
        else
          fsm_state      := E_fsm_states_S0;
        end if;
      when E_fsm_states_S1001 =>
        detOut           <= '1';
        if seqIn = '1' then
          fsm_state      := E_fsm_states_S1;
        else
          fsm_state      := E_fsm_states_S10;
        end if;
    end case;
    fsm_state_sig        <= fsm_state;
  end process;
  sync_proc : process (rst, clk)
  begin
    if rst = '0' then
      fsm_state_prev1    <= E_fsm_states_S0;
    elsif rising_edge(clk) then
      fsm_state_prev1    <= fsm_state_sig;
    end if;
  end process;
end SeqDet_arch;

SeqDet.v

`default_nettype                none
`timescale 1ns/1ps
`include "SeqDet_defs.v"


module SeqDet(
  input  wire                clk,
  input  wire                rst,
  input  wire                seqIn,
  output reg                 detOut
);
  `define E_fsm_states_S0    0
  `define E_fsm_states_S1    1
  `define E_fsm_states_S10   2
  `define E_fsm_states_S100  3
  `define E_fsm_states_S1001 4
  reg         [2:0]          fsm_state;
  reg         [2:0]          fsm_state_prev1 = `E_fsm_states_S0;
  reg         [2:0]          fsm_state_sig;
  always @(*)
  begin
    fsm_state                = fsm_state_prev1;
    case (fsm_state)
      `E_fsm_states_S0 : begin
        detOut               = 1'b0;
        if (seqIn) fsm_state = `E_fsm_states_S1;
        else fsm_state = `E_fsm_states_S0;
      end
      `E_fsm_states_S1 : begin
        detOut               = 1'b0;
        if (seqIn) fsm_state = `E_fsm_states_S1;
        else fsm_state = `E_fsm_states_S10;
      end
      `E_fsm_states_S10 : begin
        detOut               = 1'b0;
        if (seqIn) fsm_state = `E_fsm_states_S1;
        else fsm_state = `E_fsm_states_S100;
      end
      `E_fsm_states_S100 : begin
        detOut               = 1'b0;
        if (seqIn) fsm_state = `E_fsm_states_S1001;
        else fsm_state = `E_fsm_states_S0;
      end
      `E_fsm_states_S1001 : begin
        detOut               = 1'b1;
        if (seqIn) fsm_state = `E_fsm_states_S1;
        else fsm_state = `E_fsm_states_S10;
      end
      default : begin
        fsm_state            = 3'b???;
        detOut               = 1'b?;
      end
    endcase
    fsm_state_sig            = fsm_state;
  end
  always @(negedge rst or posedge clk)
  begin
    if (rst == 1'b0) fsm_state_prev1 <= `E_fsm_states_S0;
    else fsm_state_prev1 <= fsm_state_sig;
  end
endmodule