DSP API

class DSP

Base class for all DSP models.

DSP provides the common interface for all neural network-based audio processing models. It handles:

  • Input/output channel management

  • Sample rate tracking

  • Level management (input/output levels and loudness)

  • Prewarm functionality for settling initial conditions

  • Buffer size management

Subclasses should override process() to implement the actual processing algorithm.

Subclassed by nam::Buffer, nam::container::ContainerModel, nam::lstm::LSTM

Public Functions

DSP(const int in_channels, const int out_channels, const double expected_sample_rate)

Constructor.

Parameters:
  • in_channels – Number of input channels

  • out_channels – Number of output channels

  • expected_sample_rate – Expected sample rate in Hz (-1.0 if unknown)

virtual ~DSP() = default

Virtual destructor.

virtual void prewarm()

Prewarm the model to settle initial conditions.

This can be somewhat expensive, so it should not be called during real-time audio processing. Important: do not expect the model to output zeros after this. Neural networks do not treat “zero” as special, and forcing a zero output would rule out some possibilities (e.g. models that “are noisy”).

virtual void process(NAM_SAMPLE **input, NAM_SAMPLE **output, const int num_frames)

Process audio frames.

Parameters:
  • input – Input audio buffers. Double pointer where the first pointer indexes channels and the second indexes frames: input[channel][frame]

  • output – Output audio buffers. Same structure as input.

  • num_frames – Number of frames to process
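The input[channel][frame] layout described above can be illustrated with a small stand-in. This is only a sketch: GainSketch, ChannelPointers, and ProcessMonoBlock are hypothetical names that are not part of the library, and NAM_SAMPLE is aliased to double here as an assumption.

```cpp
#include <cassert>
#include <vector>

// Assumption: NAM_SAMPLE is aliased to double for this sketch.
using NAM_SAMPLE = double;

// Hypothetical stand-in with the same process() signature as nam::DSP.
// It applies a fixed gain so the buffer layout is easy to follow.
struct GainSketch
{
  double gain = 0.5;
  int channels = 1;

  void process(NAM_SAMPLE** input, NAM_SAMPLE** output, const int num_frames)
  {
    assert(num_frames >= 0);
    for (int c = 0; c < channels; c++)
      for (int f = 0; f < num_frames; f++)
        output[c][f] = gain * input[c][f]; // first index: channel, second index: frame
  }
};

// Hypothetical helper: build the array of per-channel pointers that process() expects.
inline std::vector<NAM_SAMPLE*> ChannelPointers(std::vector<std::vector<NAM_SAMPLE>>& buffers)
{
  std::vector<NAM_SAMPLE*> pointers;
  pointers.reserve(buffers.size());
  for (auto& channel : buffers)
    pointers.push_back(channel.data());
  return pointers;
}

// Runs the stand-in over one mono block and returns the processed frames.
inline std::vector<NAM_SAMPLE> ProcessMonoBlock(const std::vector<NAM_SAMPLE>& block)
{
  std::vector<std::vector<NAM_SAMPLE>> in = {block};
  std::vector<std::vector<NAM_SAMPLE>> out = {std::vector<NAM_SAMPLE>(block.size(), 0.0)};
  std::vector<NAM_SAMPLE*> inPointers = ChannelPointers(in);
  std::vector<NAM_SAMPLE*> outPointers = ChannelPointers(out);

  GainSketch model;
  model.process(inPointers.data(), outPointers.data(), static_cast<int>(block.size()));
  return out[0];
}
```

In a real-time callback the pointer arrays and buffers would normally be allocated ahead of time; the helper above allocates only to keep the example short.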

inline double GetExpectedSampleRate() const

Get the expected sample rate.

Returns:

Expected sample rate in Hz (-1.0 if unknown)
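Because -1.0 signals an unknown rate, a caller may want to distinguish three cases before processing. The following is a minimal sketch of such a check; SampleRateCheck and CheckSampleRate are hypothetical names, and the tolerance is an arbitrary choice.

```cpp
#include <cmath>

enum class SampleRateCheck
{
  Unknown, // the model did not declare a rate (-1.0)
  Match, // host rate equals the expected rate
  Mismatch // host rate differs; the caller may want to resample
};

// Compare the value returned by GetExpectedSampleRate() with the host's rate.
inline SampleRateCheck CheckSampleRate(const double expectedSampleRate, const double hostSampleRate)
{
  if (expectedSampleRate < 0.0)
    return SampleRateCheck::Unknown;
  // Tolerance is an arbitrary choice for this sketch.
  const bool same = std::fabs(expectedSampleRate - hostSampleRate) < 1.0e-6;
  return same ? SampleRateCheck::Match : SampleRateCheck::Mismatch;
}
```

How to react to Unknown or Mismatch (process anyway, resample, or warn the user) is left to the host and is not specified by this reference.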

inline int NumInputChannels() const

Get the number of input channels.

Returns:

Number of input channels

inline int NumOutputChannels() const

Get the number of output channels.

Returns:

Number of output channels

double GetInputLevel()

Get the input level.

Input level is in dBu RMS, corresponding to 0 dBFS peak for a 1 kHz sine wave. Call HasInputLevel() first to confirm that the level is known. Note: input level is assumed global over all inputs.

Returns:

Input level in dBu
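One way the input level might be used is to calibrate the signal coming from an audio interface. If the interface maps some dBu level to 0 dBFS and the model was captured with a different mapping, the difference between the two is the gain (in dB) that puts the signal on the scale the model expects. The sketch below assumes the amplitude convention of 20 dB per decade; InputCalibrationGainDb and DbToAmplitude are hypothetical helpers, not library functions.

```cpp
#include <cmath>

// Gain in dB that maps a signal recorded through an interface whose 0 dBFS peak
// corresponds to interfaceLevelDbu onto the scale the model expects
// (0 dBFS peak corresponding to modelInputLevelDbu).
// A signal at x dBu reads (x - interfaceLevelDbu) dBFS on the interface, while the
// model expects it at (x - modelInputLevelDbu) dBFS; the difference is the gain.
inline double InputCalibrationGainDb(const double interfaceLevelDbu, const double modelInputLevelDbu)
{
  return interfaceLevelDbu - modelInputLevelDbu;
}

// Convert a gain in dB to a linear amplitude factor.
inline double DbToAmplitude(const double db)
{
  return std::pow(10.0, db / 20.0);
}
```

The value passed as modelInputLevelDbu would come from GetInputLevel(), and only after HasInputLevel() returned true.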

double GetLoudness() const

Get how loud this model’s output is, in dB, if a “typical” input is processed.

This can be used to normalize the output level of the object.

Throws a std::runtime_error if the model doesn’t know how loud it is, so check HasLoudness() first. Note: loudness is assumed global over all outputs.

Throws:

std::runtime_error – If the model doesn’t know its loudness

Returns:

Loudness in dB
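Output normalization with the loudness value can be sketched as follows. LoudnessSketch is a hypothetical stand-in that mirrors only HasLoudness(), GetLoudness(), and SetLoudness(); NormalizationGain and the target loudness are assumptions for illustration, as is the 20 dB per decade amplitude convention.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical stand-in mirroring the loudness accessors of nam::DSP.
class LoudnessSketch
{
public:
  bool HasLoudness() const { return mHasLoudness; }

  double GetLoudness() const
  {
    if (!mHasLoudness)
      throw std::runtime_error("Loudness is unknown for this model");
    return mLoudness;
  }

  void SetLoudness(const double loudness)
  {
    mLoudness = loudness;
    mHasLoudness = true;
  }

private:
  bool mHasLoudness = false;
  double mLoudness = 0.0;
};

// Linear gain that brings the model's output to targetLoudnessDb.
// Returns 1.0 (no change) when the loudness is unknown, so it never throws.
inline double NormalizationGain(const LoudnessSketch& model, const double targetLoudnessDb)
{
  if (!model.HasLoudness())
    return 1.0;
  const double gainDb = targetLoudnessDb - model.GetLoudness();
  return std::pow(10.0, gainDb / 20.0);
}
```

Checking HasLoudness() before GetLoudness() keeps the exception path out of normal operation.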

double GetOutputLevel()

Get the output level.

Output level is in dBu RMS, corresponding to 0 dBFS peak for a 1 kHz sine wave. Call HasOutputLevel() first to confirm that the level is known. Note: output level is assumed global over all outputs.

Returns:

Output level in dBu

bool HasInputLevel()

Check if this model knows its input level.

Note: input level is assumed global over all inputs.

Returns:

true if input level is known, false otherwise

inline bool HasLoudness() const

Check if the model knows how loud it is.

Returns:

true if loudness is known, false otherwise

bool HasOutputLevel()

Check if this model knows its output level.

Note: output level is assumed global over all outputs.

Returns:

true if output level is known, false otherwise

virtual void Reset(const double sampleRate, const int maxBufferSize)

General function for resetting the DSP unit.

This does not call prewarm(). If you need both, use ResetAndPrewarm(). See https://github.com/sdatkinson/NeuralAmpModelerCore/issues/96 for the reasoning.

Parameters:
  • sampleRate – Current sample rate

  • maxBufferSize – Maximum buffer size to process

inline void ResetAndPrewarm(const double sampleRate, const int maxBufferSize)

Reset the DSP unit, then prewarm.

Parameters:
  • sampleRate – Current sample rate

  • maxBufferSize – Maximum buffer size to process
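The relationship between Reset(), prewarm(), and ResetAndPrewarm() can be sketched with a stand-in that records the order of calls. LifecycleSketch and its members are hypothetical; only the ordering (reset first, then prewarm) is taken from the descriptions above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in that records the order of lifecycle calls.
class LifecycleSketch
{
public:
  virtual ~LifecycleSketch() = default;

  // Potentially expensive; keep it off the real-time audio thread.
  virtual void prewarm() { calls.push_back("prewarm"); }

  // Stores the new settings but does not prewarm.
  virtual void Reset(const double sampleRate, const int maxBufferSize)
  {
    mSampleRate = sampleRate;
    mMaxBufferSize = maxBufferSize;
    calls.push_back("Reset");
  }

  // Convenience wrapper: reset first, then prewarm.
  void ResetAndPrewarm(const double sampleRate, const int maxBufferSize)
  {
    Reset(sampleRate, maxBufferSize);
    prewarm();
  }

  std::vector<std::string> calls;
  double mSampleRate = -1.0;
  int mMaxBufferSize = 0;
};
```

A host would typically call ResetAndPrewarm() when the sample rate or block size changes, outside the audio callback.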

void SetInputLevel(const double inputLevel)

Set the input level.

Parameters:

inputLevel – Input level in dBu

void SetLoudness(const double loudness)

Set the loudness.

This is usually defined as the loudness of the output for a standardized input. The trainer has its own definition, but you can use this to set a different one if you prefer. Note: loudness is assumed global over all outputs.

Parameters:

loudness – Loudness in dB

void SetOutputLevel(const double outputLevel)

Set the output level.

Parameters:

outputLevel – Output level in dBu

Friends

friend class wavenet::WaveNet

class Buffer : public nam::DSP

Base class for DSP models that require input buffering. This class is deprecated and will be removed in a future version.

Keeps an input buffer so that long-time effects can be captured (e.g. conv nets or impulse responses, where the required history is longer than the incoming sample buffer).

Subclassed by nam::Linear, nam::convnet::ConvNet

Public Functions

Buffer(const int in_channels, const int out_channels, const int receptive_field, const double expected_sample_rate = -1.0)

Constructor.

Parameters:
  • in_channels – Number of input channels

  • out_channels – Number of output channels

  • receptive_field – Size of the receptive field (buffer size needed)

  • expected_sample_rate – Expected sample rate in Hz (-1.0 if unknown)

class Linear : public nam::Buffer

Basic linear model.

Implements a simple linear convolution (i.e. an impulse response).

Public Functions

Linear(const int in_channels, const int out_channels, const int receptive_field, const bool _bias, const std::vector<float> &weights, const double expected_sample_rate = -1.0)

Constructor.

Parameters:
  • in_channels – Number of input channels

  • out_channels – Number of output channels

  • receptive_field – Size of the impulse response

  • _bias – Whether to use bias

  • weights – Model weights (impulse response coefficients)

  • expected_sample_rate – Expected sample rate in Hz (-1.0 if unknown)

virtual void process(NAM_SAMPLE **input, NAM_SAMPLE **output, const int num_frames) override

Process audio frames.

Parameters:
  • input – Input audio buffers

  • output – Output audio buffers

  • num_frames – Number of frames to process
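The idea behind an input buffer and a linear convolution can be sketched with a mono FIR filter that carries its history from one call to the next. FirSketch is a hypothetical class, not the library's implementation; the tap ordering (h[k] multiplies x[n - k]) is this sketch's own convention, and the per-call allocations would not be acceptable in real-time code.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical mono FIR filter that keeps input history between calls.
class FirSketch
{
public:
  explicit FirSketch(std::vector<float> impulseResponse)
  : mImpulseResponse(std::move(impulseResponse))
  , mHistory(mImpulseResponse.empty() ? 0 : mImpulseResponse.size() - 1, 0.0f)
  {
  }

  // y[n] = sum_k h[k] * x[n - k], with input before the first call taken as zero.
  std::vector<float> process(const std::vector<float>& input)
  {
    // Prepend the saved history so every output frame sees a full window.
    std::vector<float> extended(mHistory);
    extended.insert(extended.end(), input.begin(), input.end());

    const std::size_t taps = mImpulseResponse.size();
    std::vector<float> output(input.size(), 0.0f);
    for (std::size_t n = 0; n < input.size(); n++)
    {
      const std::size_t newest = n + mHistory.size(); // index of x[n] in extended
      float sum = 0.0f;
      for (std::size_t k = 0; k < taps; k++)
        sum += mImpulseResponse[k] * extended[newest - k];
      output[n] = sum;
    }

    // Keep the last (taps - 1) samples for the next call.
    if (!mHistory.empty())
      mHistory.assign(extended.end() - static_cast<std::ptrdiff_t>(mHistory.size()), extended.end());
    return output;
  }

private:
  std::vector<float> mImpulseResponse;
  std::vector<float> mHistory;
};
```

Without the saved history, the first frames of each block would be computed as if silence preceded them, which is why the required history can exceed the size of the incoming buffer.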

class Conv1x1

1x1 convolution (really just a fully-connected linear layer operating per-sample).

Performs a pointwise convolution, which is equivalent to a fully connected layer applied independently to each time step. Supports grouped convolution for efficiency.

Public Functions

Conv1x1(const int in_channels, const int out_channels, const bool _bias, const int groups = 1)

Constructor.

Parameters:
  • in_channels – Number of input channels

  • out_channels – Number of output channels

  • _bias – Whether to use bias

  • groups – Number of groups for grouped convolution (default: 1)
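What a pointwise convolution computes, and how grouping restricts it, can be sketched with plain vectors instead of Eigen. PointwiseConv and the weight layout used here (one row per output channel, one column per input channel within the group) are assumptions for illustration and may differ from the library's internal layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>; // [row][column]

// Pointwise (1x1) convolution: out[o][t] = bias[o] + sum_i weight[o][i] * in[i][t].
// Every frame t is handled independently of the others.
// With groups > 1, output channel o only reads the input channels of its own group,
// so weight has out_channels rows and (in_channels / groups) columns.
// An empty bias vector means "no bias".
inline Matrix PointwiseConv(const Matrix& weight, const std::vector<float>& bias, const Matrix& input,
                            const int groups)
{
  const std::size_t outChannels = weight.size();
  const std::size_t inChannels = input.size();
  const std::size_t g = static_cast<std::size_t>(groups);
  assert(groups > 0 && inChannels % g == 0 && outChannels % g == 0);
  const std::size_t inPerGroup = inChannels / g;
  const std::size_t outPerGroup = outChannels / g;
  const std::size_t numFrames = input.empty() ? 0 : input[0].size();

  Matrix output(outChannels, std::vector<float>(numFrames, 0.0f));
  for (std::size_t o = 0; o < outChannels; o++)
  {
    const std::size_t firstIn = (o / outPerGroup) * inPerGroup; // first input channel of this group
    for (std::size_t t = 0; t < numFrames; t++)
    {
      float sum = bias.empty() ? 0.0f : bias[o];
      for (std::size_t i = 0; i < inPerGroup; i++)
        sum += weight[o][i] * input[firstIn + i][t];
      output[o][t] = sum;
    }
  }
  return output;
}
```

With groups equal to the channel count, each output channel depends on a single input channel, which reduces the number of multiplications per frame by a factor of groups.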

inline Eigen::MatrixXf &GetOutput()

Get the entire internal output buffer.

This is intended for internal wiring between layers/arrays; callers should treat the buffer as pre-allocated storage and only consider the first num_frames columns valid for a given processing call. Slice with .leftCols(num_frames) as needed.

Returns:

Reference to the output buffer

inline const Eigen::MatrixXf &GetOutput() const

Get the entire internal output buffer (const version).

Returns:

Const reference to the output buffer

void SetMaxBufferSize(const int maxBufferSize)

Resize the output buffer to handle maxBufferSize frames.

Parameters:

maxBufferSize – Maximum number of frames to process in a single call

void set_weights_(std::vector<float>::iterator &weights)

Set the parameters (weights) of this module.

Parameters:

weights – Iterator to the weights vector. Will be advanced as weights are consumed.
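Passing the iterator by reference lets several modules read their parameters from one flat weights vector in sequence. The sketch below shows that pattern; LayerSketch and LoadTwoLayers are hypothetical, and the size check is an added safeguard rather than documented library behavior.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical layer that reads a fixed number of weights from a shared iterator.
struct LayerSketch
{
  explicit LayerSketch(const std::size_t count)
  : params(count, 0.0f)
  {
  }

  // Consumes params.size() values and leaves the iterator just past them.
  void set_weights_(std::vector<float>::iterator& weights)
  {
    for (auto& p : params)
      p = *(weights++);
  }

  std::vector<float> params;
};

// Loads two layers back to back from one flat vector.
// Returns true only if the vector held exactly as many values as the layers need.
inline bool LoadTwoLayers(std::vector<float>& weights, LayerSketch& first, LayerSketch& second)
{
  if (weights.size() != first.params.size() + second.params.size())
    return false; // avoid reading past the end of the vector
  auto it = weights.begin();
  first.set_weights_(it); // advances it past the first layer's values
  second.set_weights_(it); // continues where the first layer stopped
  return it == weights.end();
}
```

Checking that the iterator ends exactly at weights.end() is a cheap way to catch a mismatch between a model's configuration and its weights.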

inline Eigen::MatrixXf process(const Eigen::MatrixXf &input) const

Process input and return output matrix.

Parameters:

input – Input matrix (channels x num_frames) or (channels,)

Returns:

Output matrix (channels x num_frames) or (channels,), respectively

Eigen::MatrixXf process(const Eigen::MatrixXf &input, const int num_frames) const

Process input and return output matrix.

Parameters:
  • input – Input matrix (channels x num_frames)

  • num_frames – Number of frames to process

Returns:

Output matrix (channels x num_frames)

void process_(const Eigen::Ref<const Eigen::MatrixXf> &input, const int num_frames)

Process input and store output to pre-allocated buffer.

Uses Eigen::Ref to accept matrices and block expressions without creating temporaries (real-time safe). Access output via GetOutput().

Parameters:
  • input – Input matrix (channels x num_frames)

  • num_frames – Number of frames to process

long get_out_channels() const

Get the number of output channels.

long get_in_channels() const

Get the number of input channels.

struct dspData

Data structure for a DSP object.

Contains all information needed to instantiate and configure a DSP model.

Public Members

std::string version

Data version. Follows conventions established in trainer code.

std::string architecture

High-level architecture. Supported: “ConvNet”, “LSTM”, “Linear”, “WaveNet”.

nlohmann::json config

Model configuration JSON.

nlohmann::json metadata

Model metadata JSON.

std::vector<float> weights

Model weights.

double expected_sample_rate

Expected sample rate in Hz.

Most NAM models implicitly assume data at some sample rate. Use -1.0 for “I don’t know”.
