LSTM API

namespace lstm

Functions

LSTMConfig parse_config_json(const nlohmann::json &config)

Parse LSTM configuration from JSON.

Parameters: config – JSON configuration object
Returns: LSTMConfig

std::unique_ptr<ModelConfig> create_config(const nlohmann::json &config, double sampleRate)

Config parser for ConfigParserRegistry.

class LSTM : public nam::DSP

#include <lstm.h>

A multi-layer LSTM model.

The model processes audio one frame at a time and keeps a hidden state for every layer. Each layer takes the hidden state of the previous layer as its input.
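
The layer stacking described above can be sketched with a toy recurrence. This is only an illustration of the data flow, not the library's implementation: ToyLayer, its gain field, and run_stack are hypothetical names, and each layer keeps a single scalar of state instead of a full LSTM cell.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for one recurrent layer (NOT nam::lstm::LSTMCell).
// It keeps one scalar of state so the flow between layers is easy to follow.
struct ToyLayer
{
  float state = 0.0f;
  float gain;

  explicit ToyLayer(float g)
  : gain(g)
  {
  }

  // Update the persistent state from the input and return the new hidden value.
  float step(float x)
  {
    state = std::tanh(gain * x + 0.5f * state);
    return state;
  }
};

// Push each frame through the whole stack before moving on to the next frame.
std::vector<float> run_stack(std::vector<ToyLayer>& layers, const std::vector<float>& frames)
{
  std::vector<float> out;
  out.reserve(frames.size());
  for (float x : frames)
  {
    float h = x;
    for (ToyLayer& layer : layers)
      h = layer.step(h); // the hidden value of one layer is the input of the next
    out.push_back(h);
  }
  return out;
}
```

Because every layer holds its state between calls, a silent frame that follows a loud one still produces non-zero output in this sketch.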

Public Functions

LSTM(const int in_channels, const int out_channels, const int num_layers, const int input_size, const int hidden_size, std::vector<float> &weights, const double expected_sample_rate = -1.0)

Constructor.

Parameters:

in_channels – Number of input channels
out_channels – Number of output channels
num_layers – Number of LSTM layers
input_size – Size of the input to each LSTM cell
hidden_size – Size of the hidden state in each LSTM cell
weights – Model weights vector
expected_sample_rate – Expected sample rate in Hz (-1.0 if unknown)

~LSTM() = default

Destructor.

virtual void process(NAM_SAMPLE **input, NAM_SAMPLE **output, const int num_frames) override

Process audio frames.

Parameters:

input – Input audio buffers
output – Output audio buffers
num_frames – Number of frames to process
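
A sketch of how buffers with this pointer-to-pointer shape might be prepared by a caller. It assumes the buffers are indexed as buffer[channel][frame] and that NAM_SAMPLE is a floating-point type; neither is stated on this page. Sample, process_passthrough, and channel_pointers are hypothetical names, and the pass-through function only stands in for a real process() call.

```cpp
#include <vector>

// Assumption: NAM_SAMPLE is a floating-point sample type; double is a stand-in here.
using Sample = double;

// Hypothetical pass-through with the same buffer shape as process().
// Assumes buffers are indexed as buffer[channel][frame].
void process_passthrough(Sample** input, Sample** output, const int num_frames, const int num_channels)
{
  for (int ch = 0; ch < num_channels; ++ch)
    for (int i = 0; i < num_frames; ++i)
      output[ch][i] = input[ch][i];
}

// Build the array of per-channel pointers over per-channel storage.
// The returned pointers stay valid only while `buffers` is not resized.
std::vector<Sample*> channel_pointers(std::vector<std::vector<Sample>>& buffers)
{
  std::vector<Sample*> ptrs;
  ptrs.reserve(buffers.size());
  for (auto& channel : buffers)
    ptrs.push_back(channel.data());
  return ptrs;
}
```

Keeping the sample storage in vectors and handing out raw pointers only at the call site avoids manual allocation while still matching the `**` parameter shape.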

class LSTMCell

#include <lstm.h>

A single LSTM cell.

Public Functions

LSTMCell(const int input_size, const int hidden_size, std::vector<float>::iterator &weights)

Constructor.

Parameters:

input_size – Size of the input vector
hidden_size – Size of the hidden state
weights – Iterator to the weights vector. Will be advanced as weights are consumed.
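
Passing the iterator by reference lets several components read consecutive slices of one flat weights vector. The following is a minimal sketch of that pattern only: Block is a hypothetical type, and the number of values a real LSTMCell consumes is not given on this page.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical component that copies `count` values starting at `it`
// and leaves `it` pointing at the first value it did not use.
struct Block
{
  std::vector<float> params;

  Block(std::size_t count, std::vector<float>::iterator& it)
  : params(it, it + static_cast<std::vector<float>::difference_type>(count))
  {
    // Advance the caller's iterator past the values just copied.
    it += static_cast<std::vector<float>::difference_type>(count);
  }
};
```

Constructing two blocks in a row from the same iterator gives each one its own slice, and comparing the iterator with end() afterwards is a cheap check that the vector held exactly as many weights as were consumed.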

inline Eigen::VectorXf get_hidden_state() const

Get the current hidden state.

Returns: Hidden state vector

void process_(const Eigen::VectorXf &x)

Process a single input vector.

Parameters: x – Input vector
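
For orientation, one step of a textbook LSTM cell can be written in plain C++ as below. This is a sketch of the standard equations, not the library's code: ToyLSTMCell is a hypothetical type, it uses std::vector instead of Eigen, and the row-major weight layout and the gate order (input, forget, candidate, output) are assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct ToyLSTMCell
{
  std::size_t input_size;
  std::size_t hidden_size;
  // Row-major matrix with 4 * hidden_size rows and (input_size + hidden_size) columns.
  // Assumed gate order by row block: input, forget, candidate, output.
  std::vector<float> w;
  std::vector<float> b; // one bias per row
  std::vector<float> h; // hidden state
  std::vector<float> c; // cell state

  ToyLSTMCell(std::size_t in, std::size_t hid)
  : input_size(in)
  , hidden_size(hid)
  , w(4 * hid * (in + hid), 0.0f)
  , b(4 * hid, 0.0f)
  , h(hid, 0.0f)
  , c(hid, 0.0f)
  {
  }

  static float sigmoid(float v) { return 1.0f / (1.0f + std::exp(-v)); }

  // Dot product of one weight row with the concatenation [x, h], plus bias.
  float preactivation(std::size_t row, const std::vector<float>& x) const
  {
    const std::size_t cols = input_size + hidden_size;
    float acc = b[row];
    for (std::size_t j = 0; j < input_size; ++j)
      acc += w[row * cols + j] * x[j];
    for (std::size_t j = 0; j < hidden_size; ++j)
      acc += w[row * cols + input_size + j] * h[j];
    return acc;
  }

  // Consume one input vector and update the hidden and cell states.
  void step(const std::vector<float>& x)
  {
    std::vector<float> h_next(hidden_size), c_next(hidden_size);
    for (std::size_t k = 0; k < hidden_size; ++k)
    {
      const float i = sigmoid(preactivation(k, x));
      const float f = sigmoid(preactivation(hidden_size + k, x));
      const float g = std::tanh(preactivation(2 * hidden_size + k, x));
      const float o = sigmoid(preactivation(3 * hidden_size + k, x));
      c_next[k] = f * c[k] + i * g;
      h_next[k] = o * std::tanh(c_next[k]);
    }
    h = h_next;
    c = c_next;
  }
};
```

The next states are written to temporaries first so that every gate in a step sees the same previous hidden state.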

struct LSTMConfig : public nam::ModelConfig

#include <lstm.h>

Configuration for an LSTM model.

Public Functions

virtual std::unique_ptr<DSP> create(std::vector<float> weights, double sampleRate) override

Construct a DSP object from this configuration.

Parameters:

weights – Model weights (taken by value to allow move for WaveNet)
sampleRate – Expected sample rate in Hz

Returns:

Unique pointer to a DSP object

Public Members

int num_layers

int input_size

int hidden_size

int in_channels

int out_channels
