[docs]class AudioSignal(object):
"""
**Overview**
:class:`AudioSignal` is the main entry and exit point for all source separation algorithms
in *nussl*. The :class:`AudioSignal` class is a general container for all things related to
audio data. It contains utilities for:
* Input and output from an array or from a file,
* Time-series and frequency domain manipulation,
* Plotting and visualizing,
* Playing audio within a terminal or jupyter notebook,
* Applying a mask to estimate signals
and more. The :class:`AudioSignal` class is used in all source separation objects in *nussl*.
:class:`AudioSignal` object stores time-series audio data as a 2D ``numpy`` array in
:attr:`audio_data` (see :attr:`audio_data` for details) and stores Short-Time Fourier Transform
data as 3D ``numpy`` array in :ref:`stft_data` (see :attr:`stft_data` for details).
**Initialization**
There are a few options for initializing an :class:`AudioSignal` object. The first is to
initialize an empty :class:`AudioSignal` object, with no parameters:
>>> import nussl
>>> signal = nussl.AudioSignal()
In this case, there is no data stored in :attr:`audio_data` or in :attr:`stft_data`, though
these attributes can be updated at any time after the object has been created.
Additionally, an :class:`AudioSignal` object can be loaded with exactly one of the following:
1. A path to an input audio file (see :func:`load_audio_from_file` for details).
2. A `numpy` array of 1D or 2D real-valued time-series audio data.
3. A `numpy` array of 2D or 3D complex-valued time-frequency STFT data.
:class:`AudioSignal` will throw an error if it is initialized with more than one of the
previous at once.
Here are examples of all three of these cases:
.. code-block:: python
:linenos:
import numpy as np
import nussl
# Initializing an empty AudioSignal object:
sig_empty = nussl.AudioSignal()
# Initializing from a path:
file_path = 'my/awesome/mixture.wav'
sig_path = nussl.AudioSignal(file_path)
# Initializing with a 1D or 2D numpy array containing audio data:
aud_1d = np.sin(np.linspace(0.0, 1.0, 48000))
sig_1d = nussl.AudioSignal(audio_data_array=aud_1d, sample_rate=48000)
# FYI: The shape doesn't matter, nussl will correct for it
aud_2d = np.array([aud_1d, -2 * aud_1d])
sig_2d = nussl.AudioSignal(audio_data_array=aud_2d)
# Initializing with a 2D or 3D numpy array containing STFT data:
stft_2d = np.random.rand((513, 300)) + 1j * np.random.rand((513, 300))
sig_stft_2d = nussl.AudioSignal(stft=stft_2d)
# Two channels of STFT data:
stft_3d = nussl.utils.complex_randn((513, 300, 2))
sig_stft_3d = nussl.AudioSignal(stft=stft_3d)
# Initializing with more than one of the above methods will raise an exception:
sig_exception = nussl.AudioSignal(audio_data_array=aud_2d, stft=stft_2d)
When initializing from a path, :class:`AudioSignal` can read many types of audio files,
provided that your computer has the backends installed to understand the corresponding codecs.
*nussl* uses ``librosa``'s `load` function to read in audio data. See librosa's documentation
for details: https://github.com/librosa/librosa#audioread
Once initialized with a single type of data (time-series or time-frequency), there are methods
to compute an STFT from time-series data (:func:`stft`) and vice versa (:func:`istft`).
**Sample Rate**
The sample rate of an :class:`AudioSignal` object is set upon initialization. If initializing
from a path, the sample rate of the :class:`AudioSignal` object inherits the native sample
rate from the file. If initialized with an audio or stft data array, the sample rate is passed
in as an optional argument. In these cases, with no sample rate explicitly defined, the default
sample rate is 44.1 kHz (CD quality). If this argument is provided when reading from a file
and the provided sample rate does not match the native sample rate of the file,
:class:`AudioSignal` will resample the data from the file so that it matches the provided
sample rate.
Notes:
There is no guarantee that data in :attr:`audio_data` corresponds to data in
:attr:`stft_data`. E.g., when an :class:`AudioSignal` object is initialized with
:attr:`audio_data` of an audio mixture, its :attr:`stft_data` is ``None`` until :func:`stft`
is called. Once :func:`stft` is called and a mask is applied to :attr:`stft_data` (via some
algorithm), the :attr:`audio_data` in this :class:`AudioSignal` object still contains data
from the original mixture that it was initialized with even though :attr:`stft_data`
contains altered data. (To hear the results, simply call :func:`istft` on the
:class:`AudioSignal` object.) It is up to the user to keep track of the contents of
:attr:`audio_data` and :attr:`stft_data`.
See Also:
For a walk-through of AudioSignal features, see :ref:`audio_signal_basics` and
:ref:`audio_signal_stft`.
Arguments:
path_to_input_file (``str``): Path to an input file to load upon initialization. Audio
gets loaded into :attr:`audio_data`.
audio_data_array (:obj:`np.ndarray`): 1D or 2D numpy array containing a real-valued,
time-series representation of the audio.
stft (:obj:`np.ndarray`): 2D or 3D numpy array containing pre-computed complex-valued STFT
data.
label (``str``): A label for this :class:`AudioSignal` object.
offset (``float``): Starting point of the section to be extracted (in seconds) if
initializing from a file.
duration (``float``): Length of the signal to read from the file (in seconds). Defaults to
full length of the signal (i.e., ``None``).
sample_rate (``int``): Sampling rate of this :class:`AudioSignal` object.
Attributes:
path_to_input_file (``str``): Path to the input file. ``None`` if this AudioSignal never
loaded a file, i.e., initialized with a ``np.ndarray``.
label (``str``): A user-definable label for this :class:`AudioSignal` object.
"""
def __init__(self, path_to_input_file=None, audio_data_array=None, stft=None, label=None,
sample_rate=None, stft_params=None, offset=0, duration=None):
self.path_to_input_file = path_to_input_file
self._audio_data = None
self.original_signal_length = None
self._stft_data = None
self._sample_rate = None
self._active_start = None
self._active_end = None
self.label = label
# Assert that this object was only initialized in one way
got_path = path_to_input_file is not None
got_audio_array = audio_data_array is not None
got_stft = stft is not None
init_inputs = np.array([got_path, got_audio_array, got_stft])
# noinspection PyPep8
if len(init_inputs[init_inputs == True]) > 1: # ignore inspection for clarity
raise AudioSignalException('Can only initialize AudioSignal object with one and only '
'one of {path, audio, stft}!')
if path_to_input_file is not None:
self.load_audio_from_file(self.path_to_input_file, offset, duration, sample_rate)
elif audio_data_array is not None:
self.load_audio_from_array(audio_data_array, sample_rate)
if self._sample_rate is None:
self._sample_rate = constants.DEFAULT_SAMPLE_RATE \
if sample_rate is None else sample_rate
self.stft_data = stft # complex spectrogram data
self.stft_params = stft_params
def __str__(self):
dur = f'{self.signal_duration:0.3f}' if self.signal_duration else '[unknown]'
return (
f"{self.__class__.__name__} "
f"({self.label if self.label else 'unlabeled'}): "
f"{dur} sec @ "
f"{self.path_to_input_file if self.path_to_input_file else 'path unknown'}, "
f"{self.sample_rate if self.sample_rate else '[unknown]'} Hz, "
f"{self.num_channels if self.num_channels else '[unknown]'} ch."
)
##################################################
# Properties
##################################################
@property
def signal_length(self):
"""
``int``
Number of samples in the active region of :attr:`audio_data`.
The length of the audio signal represented by this object in samples.
See Also:
* :func:`signal_duration` for the signal duration in seconds.
* :func:`set_active_region_to_default` for information about active regions.
"""
if self.audio_data is None:
return self.original_signal_length
return self.audio_data.shape[constants.LEN_INDEX]
@property
def signal_duration(self):
"""
``float``
Duration of the active region of :attr:`audio_data` in seconds.
The length of the audio signal represented by this object in seconds.
See Also:
* :func:`signal_length` for the signal length in samples.
* :func:`set_active_region_to_default` for information about active regions.
"""
if self.signal_length is None:
return None
return self.signal_length / self.sample_rate
@property
def num_channels(self):
"""
``int``
Number of channels this :class:`AudioSignal` has.
Defaults to returning number of channels in :attr:`audio_data`. If that is ``None``,
returns number of channels in :attr:`stft_data`. If both are ``None`` then returns
``None``.
See Also:
* :func:`is_mono`
* :func:`is_stereo`
"""
# TODO: what about a mismatch between audio_data and stft_data??
if self.audio_data is not None:
return self.audio_data.shape[constants.CHAN_INDEX]
if self.stft_data is not None:
return self.stft_data.shape[constants.STFT_CHAN_INDEX]
return None
@property
def is_mono(self):
"""
``bool``
Whether or not this signal is mono (i.e., has exactly **one** channel). First
looks at :attr:`audio_data`, then (if that's ``None``) looks at :attr:`stft_data`.
See Also:
* :func:`num_channels`
* :func:`is_stereo`
"""
return self.num_channels == 1
@property
def is_stereo(self):
"""
``bool``
Whether or not this signal is stereo (i.e., has exactly **two** channels). First
looks at :attr:`audio_data`, then (if that's ``None``) looks at :attr:`stft_data`.
See Also:
* :func:`num_channels`
* :func:`is_mono`
"""
return self.num_channels == 2
@property
def audio_data(self):
"""
``np.ndarray``
Stored as a ``numpy`` :obj:`np.ndarray`, :attr:`audio_data` houses the raw, uncompressed
time-domain audio data in the :class:`AudioSignal`. Audio data is stored with shape
``(n_channels, n_samples)`` as an array of floats.
``None`` by default, can be initialized upon object instantiation or set at any time by
accessing this attribute or calling :func:`load_audio_from_array`. It is recommended to
set :attr:`audio_data` by using :func:`load_audio_from_array` if this
:class:`AudioSignal` has been initialized without any audio or STFT data.
Raises:
:class:`AudioSignalException`
If set incorrectly, will raise an error. Expects a real, finite-valued 1D or 2D
``numpy`` :obj:`np.ndarray`-typed array.
Warnings:
:attr:`audio_data` and :attr:`stft_data` are not automatically synchronized, meaning
that if one of them is changed, those changes are not instantly reflected in the other.
To propagate changes, either call :func:`stft` or :func:`istft`.
Notes:
* This attribute only returns values within the active region. For more information
see :func:`set_active_region_to_default`. When setting this attribute, the active
region are reset to default.
* If :attr:`audio_data` is set with an improperly transposed array, it will
automatically transpose it so that it is set the expected way. A warning will be
displayed on the console.
See Also:
* :func:`load_audio_from_file` to load audio into :attr:`audio_data` after
initialization.
* :func:`load_audio_from_array` to safely load audio into :attr:`audio_data` after
initialization.
* :func:`set_active_region_to_default` for more information about the active region.
* :attr:`signal_duration` and :attr:`signal_length` for length of audio data in seconds
and samples, respectively.
* :func:`stft` to calculate an STFT from this data,
and :func:`istft` to calculate the inverse STFT and put it in :attr:`audio_data`.
* :func:`plot_time_domain` to create a plot of audio data stored in this attribute.
* :func:`peak_normalize` to apply gain such that to the absolute max value is exactly
``1.0``.
* :func:`rms` to calculate the root-mean-square of :attr:`audio_data`
* :func:`apply_gain` to apply a gain.
* :func:`get_channel` to safely retrieve a single channel in :attr:`audio_data`.
"""
if self._audio_data is None:
return None
start = 0
end = self._audio_data.shape[constants.LEN_INDEX]
if self._active_end is not None and self._active_end < end:
end = self._active_end
if self._active_start is not None and self._active_start > 0:
start = self._active_start
return self._audio_data[:, start:end]
@audio_data.setter
def audio_data(self, value):
if value is None:
self._audio_data = None
return
elif not isinstance(value, np.ndarray):
raise AudioSignalException('Type of self.audio_data must be of type np.ndarray!')
if not np.isfinite(value).all():
raise AudioSignalException('Not all values of audio_data are finite!')
if value.ndim > 1 and value.shape[constants.CHAN_INDEX] > value.shape[constants.LEN_INDEX]:
value = value.T
if value.ndim > 2:
raise AudioSignalException('self.audio_data cannot have more than 2 dimensions!')
if value.ndim < 2:
value = np.expand_dims(value, axis=constants.CHAN_INDEX)
self._audio_data = value
self.set_active_region_to_default()
@property
def stft_data(self):
"""
``np.ndarray``
Stored as a ``numpy`` :obj:`np.ndarray`, :attr:`stft_data` houses complex-valued data
computed from a Short-time Fourier Transform (STFT) of audio data in the
:class:`AudioSignal`. ``None`` by default, this :class:`AudioSignal` object can be
initialized with STFT data upon initialization or it can be set at any time.
The STFT data is stored with shape ``(n_frequency_bins, n_hops, n_channels)`` as
a complex-valued ``numpy`` array.
Raises:
:class:`AudioSignalException`
if set with an :obj:`np.ndarray` with one dimension or more than three dimensions.
See Also:
* :func:`stft` to calculate an STFT from :attr:`audio_data`, and :func:`istft` to
calculate the inverse STFT from this attribute and put it in :attr:`audio_data`.
* :func:`magnitude_spectrogram` to calculate and get the magnitude spectrogram from
:attr:`stft_data`. :func:`power_spectrogram` to calculate and get the power
spectrogram from :attr:`stft_data`.
* :func:`get_stft_channel` to safely get a specific channel in :attr:`stft_data`.
Notes:
* :attr:`audio_data` and :attr:`stft_data` are not automatically synchronized, meaning
that if one of them is changed, those changes are not instantly reflected in the other.
To propagate changes, either call :func:`stft` or :func:`istft`.
* :attr:`stft_data` will expand a two dimensional array so that it has the expected
shape `(n_frequency_bins, n_hops, n_channels)`.
"""
return self._stft_data
@stft_data.setter
def stft_data(self, value):
if value is None:
self._stft_data = None
return
elif not isinstance(value, np.ndarray):
raise AudioSignalException('Type of self.stft_data must be of type np.ndarray!')
if value.ndim == 1:
raise AudioSignalException('Cannot support arrays with less than 2 dimensions!')
if value.ndim == 2:
value = np.expand_dims(value, axis=constants.STFT_CHAN_INDEX)
if value.ndim > 3:
raise AudioSignalException('Cannot support arrays with more than 3 dimensions!')
if not np.iscomplexobj(value):
warnings.warn('Initializing STFT with data that is non-complex. '
'This might lead to weird results!')
self._stft_data = value
@property
def stft_params(self):
"""
``STFTParams``
STFT parameters are kept in this property. STFT parameters are a ``namedtuple``
called ``STFTParams`` with the following signature:
.. code-block:: python
STFTParams(
window_length=2048,
hop_length=512,
window_type='hann'
)
The defaults are 32ms windows, 8ms hop, and a hann window.
"""
return self._stft_params
@stft_params.setter
def stft_params(self, value):
if value and not isinstance(value, STFTParams):
raise ValueError("stft_params must be of type STFTParams or None!")
default_win_len = int(
2 ** (np.ceil(np.log2(constants.DEFAULT_WIN_LEN_PARAM * self.sample_rate)))
)
default_hop_len = default_win_len // 4
default_win_type = constants.WINDOW_DEFAULT
default_stft_params = STFTParams(
window_length=default_win_len,
hop_length=default_hop_len,
window_type=default_win_type
)._asdict()
value = value._asdict() if value else default_stft_params
for key in default_stft_params:
if value[key] is None:
value[key] = default_stft_params[key]
self._stft_params = STFTParams(**value)
if self._stft_params.window_type == 'sqrt_hann':
window_type = constants.WINDOW_HANN
else:
window_type = self._stft_params.window_type
check_COLA(window_type, self._stft_params.window_length, self._stft_params.hop_length)
@property
def has_data(self):
"""
``bool``
Returns ``False`` if :attr:`audio_data` and :attr:`stft_data` are empty. Else,
returns ``True``.
"""
has_audio_data = self.audio_data is not None and self.audio_data.size != 0
has_stft_data = self.stft_data is not None and self.stft_data.size != 0
return has_audio_data or has_stft_data
@property
def file_name(self):
"""
``str``
The name of the file associated with this object. Includes extension, but not the full
path.
Notes:
This will return ``None`` if this :class:`AudioSignal` object was not
loaded from a file.
See Also:
:attr:`path_to_input_file` for the full path.
"""
if self.path_to_input_file is not None:
return os.path.basename(self.path_to_input_file)
return None
@property
def sample_rate(self):
"""
``int``
Sample rate associated with this object. If audio was read from a file, the sample
rate will be set to the sample rate associated with the file. If this object was
initialized from an array then the sample rate is set upon init. This property is
read-only. To change the sample rate, use :func:`resample`.
Notes:
This property is read-only and cannot be set directly. To change
See Also:
* :func:`resample` to change the sample rate and resample data in :attr:`sample_rate`.
* :func:`load_audio_from_array` to read audio from an array and set the sample rate.
* :var:`nussl.constants.DEFAULT_SAMPLE_RATE` the default sample rate for *nussl*
if not specified
"""
return self._sample_rate
@property
def time_vector(self):
"""
``np.ndarray``
A 1D :obj:`np.ndarray` with timestamps (in seconds) for each sample in
:attr:`audio_data`.
"""
if self.signal_duration is None:
return None
return np.linspace(0.0, self.signal_duration, num=self.signal_length)
@property
def freq_vector(self):
"""
``np.ndarray``
A 1D numpy array with frequency values (in Hz) that correspond
to each frequency bin (vertical axis) in :attr:`stft_data`. Assumes
linearly spaced frequency bins.
Raises:
:class:`AudioSignalException`: If :attr:`stft_data` is ``None``.
Run :func:`stft` before accessing this.
"""
if self.stft_data is None:
raise AudioSignalException(
'Cannot calculate freq_vector until self.stft() is run')
return np.linspace(
0.0, self.sample_rate // 2,
num=self.stft_data.shape[constants.STFT_VERT_INDEX])
@property
def time_bins_vector(self):
"""
``np.ndarray``
A 1D numpy array with time values (in seconds) that correspond
to each time bin (horizontal/time axis) in :attr:`stft_data`.
Raises:
:class:`AudioSignalException`: If :attr:`stft_data` is ``None``. Run :func:`stft`
before accessing this.
"""
if self.stft_data is None:
raise AudioSignalException(
'Cannot calculate time_bins_vector until self.stft() is run')
return np.linspace(0.0, self.signal_duration,
num=self.stft_data.shape[constants.STFT_LEN_INDEX])
@property
def stft_length(self):
"""
``int``
The length of :attr:`stft_data` along the time axis. In units of hops.
Raises:
:class:`AudioSignalException`: If ``self.stft_dat``a is ``None``. Run :func:`stft`
before accessing this.
"""
if self.stft_data is None:
raise AudioSignalException('Cannot calculate stft_length until self.stft() is run')
return self.stft_data.shape[constants.STFT_LEN_INDEX]
@property
def active_region_is_default(self):
"""
``bool``
``True`` if active region is the full length of :attr:`audio_data`. ``False`` otherwise.
See Also:
* :func:`set_active_region` for a description of active regions in :class:`AudioSignal`
* :func:`set_active_region_to_default`
"""
return self._active_start == 0 and self._active_end == self._signal_length
@property
def _signal_length(self):
"""
``int``
This is the length of the full signal, not just the active region.
"""
if self._audio_data is None:
return None
return self._audio_data.shape[constants.LEN_INDEX]
@property
def power_spectrogram_data(self):
"""
``np.ndarray``
Returns a real valued :obj:`np.ndarray` with power
spectrogram data. The power spectrogram is defined as ``(STFT)^2``, where ``^2`` is
element-wise squaring of entries of the STFT. Same shape as :attr:`stft_data`.
Raises:
:class:`AudioSignalException`: if :attr:`stft_data` is ``None``. Run :func:`stft`
before accessing this.
See Also:
* :func:`stft` to calculate the STFT before accessing this attribute.
* :attr:`stft_data` complex-valued Short-time Fourier Transform data.
* :attr:`magnitude_spectrogram_data` to get magnitude spectrogram data.
* :func:`get_power_spectrogram_channel` to get a specific channel
"""
if self.stft_data is None:
raise AudioSignalException('Cannot calculate power_spectrogram_data '
'because self.stft_data is None')
return np.abs(self.stft_data) ** 2
@property
def magnitude_spectrogram_data(self):
"""
``np.ndarray``
Returns a real valued ``np.array`` with magnitude spectrogram data. The magnitude
spectrogram is defined as ``abs(STFT)``, the element-wise absolute value of every item
in the STFT. Same shape as :attr:`stft_data`.
Raises:
AudioSignalException: if :attr:`stft_data` is ``None``. Run :func:`stft` before
accessing this.
See Also:
* :func:`stft` to calculate the STFT before accessing this attribute.
* :attr:`stft_data` complex-valued Short-time Fourier Transform data.
* :attr:`power_spectrogram_data`
* :func:`get_magnitude_spectrogram_channel`
"""
if self.stft_data is None:
raise AudioSignalException('Cannot calculate magnitude_spectrogram_data '
'because self.stft_data is None')
return np.abs(self.stft_data)
@property
def log_magnitude_spectrogram_data(self):
"""
(:obj:`np.ndarray`): Returns a real valued ``np.array`` with log magnitude spectrogram data.
The log magnitude spectrogram is defined as 20*log10(Abs(STFT)). Same shape as :attr:`stft_data`.
Raises:
AudioSignalException: if :attr:`stft_data` is ``None``. Run :func:`stft` before
accessing this.
See Also:
* :func:`stft` to calculate the STFT before accessing this attribute.
* :attr:`stft_data` complex-valued Short-time Fourier Transform data.
* :attr:`power_spectrogram_data`
* :func:`get_magnitude_spectrogram_channel`
"""
if self.stft_data is None:
raise AudioSignalException('Cannot calculate log_magnitude_spectrogram_data '
'because self.stft_data is None')
return 20 * np.log10(np.abs(self.stft_data) + 1e-8)
##################################################
# I/O
##################################################
[docs] def load_audio_from_file(self, input_file_path, offset=0, duration=None, new_sample_rate=None):
# type: (str, float, float, int) -> None
"""
Loads an audio signal into memory from a file on disc. The audio is stored in
:class:`AudioSignal` as a :obj:`np.ndarray` of `float` s. The sample rate is read from
the file, and this :class:`AudioSignal` object's sample rate is set from it. If
:param:`new_sample_rate` is not ``None`` nor the same as the sample rate of the file,
the audio will be resampled to the sample rate provided in the :param:`new_sample_rate`
parameter. After reading the audio data into memory, the active region is set to default.
:param:`offset` and :param:`duration` allow the user to determine how much of the audio is
read from the file. If those are non-default, then only the values provided will be stored
in :attr:`audio_data` (unlike with the active region, which has the entire audio data stored
in memory but only allows access to a subset of the audio).
See Also:
* :func:`load_audio_from_array` to read audio data from a :obj:`np.ndarray`.
Args:
input_file_path (str): Path to input file.
offset (float,): The starting point of the section to be extracted (seconds).
Defaults to 0 seconds (i.e., the very beginning of the file).
duration (float): Length of signal to load in second.
signal_length of 0 means read the whole file. Defaults to the full
length of the signal.
new_sample_rate (int): If this parameter is not ``None`` or the same sample rate as
provided by the input file, then the audio data will be resampled to the new
sample rate dictated by this parameter.
"""
assert offset >= 0, 'Parameter `offset` must be >= 0!'
if duration is not None:
assert duration >= 0, 'Parameter `duration` must be >= 0!'
try:
# try reading headers with soundfile for speed
audio_info = sf.info(input_file_path)
file_length = audio_info.duration
except:
# if that doesn't work try audioread
with audioread.audio_open(os.path.realpath(input_file_path)) as input_file:
file_length = input_file.duration
if offset > file_length:
raise AudioSignalException('offset is longer than signal!')
if duration is not None and offset + duration >= file_length:
warnings.warn('offset + duration are longer than the signal.'
' Reading until end of signal...',
UserWarning)
audio_input, self._sample_rate = librosa.load(input_file_path,
sr=None,
offset=offset,
duration=duration,
mono=False)
self.audio_data = audio_input
self.original_signal_length = self.signal_length
if new_sample_rate is not None and new_sample_rate != self._sample_rate:
warnings.warn('Input sample rate is different than the sample rate'
' read from the file! Resampling...',
UserWarning)
self.resample(new_sample_rate)
self.path_to_input_file = input_file_path
self.set_active_region_to_default()
[docs] def load_audio_from_array(self, signal, sample_rate=constants.DEFAULT_SAMPLE_RATE):
"""
Loads an audio signal from a :obj:`np.ndarray`. :param:`sample_rate` is the sample
of the signal.
See Also:
* :func:`load_audio_from_file` to read in an audio file from disc.
Notes:
Only accepts float arrays and int arrays of depth 16-bits.
Parameters:
signal (:obj:`np.ndarray`): Array containing the audio signal sampled at
:param:`sample_rate`.
sample_rate (int): The sample rate of signal.
Default is :ref:`constants.DEFAULT_SAMPLE_RATE` (44.1kHz)
"""
assert (type(signal) == np.ndarray)
self.path_to_input_file = None
# Change from fixed point to floating point
if not np.issubdtype(signal.dtype, np.floating):
signal = signal.astype('float') / (np.iinfo(np.dtype('int16')).max + 1.0)
self.audio_data = signal
self.original_signal_length = self.signal_length
self._sample_rate = sample_rate if sample_rate is not None \
else constants.DEFAULT_SAMPLE_RATE
self.set_active_region_to_default()
[docs] def write_audio_to_file(self, output_file_path, sample_rate=None):
"""
Outputs the audio signal data in :attr:`audio_data` to a file at :param:`output_file_path`
with sample rate of :param:`sample_rate`.
Parameters:
output_file_path (str): Filename where output file will be saved.
sample_rate (int): The sample rate to write the file at. Default is
:attr:`sample_rate`.
"""
if self.audio_data is None:
raise AudioSignalException("Cannot write audio file because there is no audio data.")
if sample_rate is None:
sample_rate = self.sample_rate
audio_output = np.copy(self.audio_data)
# TODO: better fix
# convert to fixed point again
if not np.issubdtype(audio_output.dtype, np.dtype(int).type):
audio_output = np.multiply(
audio_output,
2 ** (constants.DEFAULT_BIT_DEPTH - 1)).astype('int16')
wav.write(output_file_path, sample_rate, audio_output.T)
##################################################
# Active Region
##################################################
[docs] def set_active_region(self, start, end):
"""
Determines the bounds of what gets returned when you access :attr:`audio_data`.
None of the data in :attr:`audio_data` is discarded when you set the active region, it
merely becomes inaccessible until the active region is set back to default (i.e., the full
length of the signal).
This is useful for reusing a single :class:`AudioSignal` object to do multiple operations on
only select parts of the audio data.
Warnings:
Many functions will raise exceptions while the active region is not default. Be aware
that adding, subtracting, concatenating, truncating, and other utilities are not
available when the active region is not default.
See Also:
* :func:`set_active_region_to_default`
* :attr:`active_region_is_default`
Examples:
>>> import nussl
>>> import numpy as np
>>> n = nussl.constants.DEFAULT_SAMPLE_RATE # 1 second of audio at 44.1kHz
>>> np_sin = np.sin(np.linspace(0, 100 * 2 * np.pi, n)) # sine wave @ 100 Hz
>>> sig = nussl.AudioSignal(audio_data_array=np_sin)
>>> sig.signal_duration
1.0
>>> sig.set_active_region(0, n // 2)
>>> sig.signal_duration
0.5
Args:
start (int): Beginning of active region (in samples). Cannot be less than 0.
end (int): End of active region (in samples). Cannot be larger than
:attr:`signal_length`.
"""
start, end = int(start), int(end)
self._active_start = start if start >= 0 else 0
self._active_end = end if end < self._signal_length else self._signal_length
[docs] def set_active_region_to_default(self):
"""
Resets the active region of this :class:`AudioSignal` object to its default value of the
entire :attr:`audio_data` array.
See Also:
* :func:`set_active_region` for an explanation of active regions within the
:class:`AudioSignal`.
"""
self._active_start = 0
self._active_end = self._signal_length
##################################################
# STFT Utilities
##################################################
[docs] @staticmethod
def get_window(window_type, window_length):
"""
Wrapper around scipy.signal.get_window so one can also get the
popular sqrt-hann window.
Args:
window_type (str): Type of window to get (see constants.ALL_WINDOW).
window_length (int): Length of the window
Returns:
np.ndarray: Window returned by scipy.signa.get_window
"""
if window_type == constants.WINDOW_SQRT_HANN:
window = np.sqrt(scipy.signal.get_window(
'hann', window_length
))
else:
window = scipy.signal.get_window(
window_type, window_length)
return window
[docs] def stft(self, window_length=None, hop_length=None, window_type=None, overwrite=True):
"""
Computes the Short Time Fourier Transform (STFT) of :attr:`audio_data`.
The results of the STFT calculation can be accessed from :attr:`stft_data`
if :attr:`stft_data` is ``None`` prior to running this function or ``overwrite == True``
Warning:
If overwrite=True (default) this will overwrite any data in :attr:`stft_data`!
Args:
window_length (int): Amount of time (in samples) to do an FFT on
hop_length (int): Amount of time (in samples) to skip ahead for the new FFT
window_type (str): Type of scaling to apply to the window.
overwrite (bool): Overwrite :attr:`stft_data` with current calculation
Returns:
(:obj:`np.ndarray`) Calculated, complex-valued STFT from :attr:`audio_data`, 3D numpy
array with shape `(n_frequency_bins, n_hops, n_channels)`.
"""
if self.audio_data is None or self.audio_data.size == 0:
raise AudioSignalException(
"No time domain signal (self.audio_data) to make STFT from!")
window_length = (
self.stft_params.window_length
if window_length is None
else int(window_length)
)
hop_length = (
self.stft_params.hop_length
if hop_length is None
else int(hop_length)
)
window_type = (
self.stft_params.window_type
if window_type is None
else window_type
)
stft_data = []
window = self.get_window(window_type, window_length)
for chan in self.get_channels():
_, _, _stft = scipy.signal.stft(
chan, fs=self.sample_rate, window=window,
nperseg=window_length, noverlap=window_length - hop_length)
stft_data.append(_stft)
stft_data = np.array(stft_data).transpose((1, 2, 0))
if overwrite:
self.stft_data = stft_data
return stft_data
[docs] def istft(self, window_length=None, hop_length=None, window_type=None, overwrite=True,
truncate_to_length=None):
""" Computes and returns the inverse Short Time Fourier Transform (iSTFT).
The results of the iSTFT calculation can be accessed from :attr:`audio_data`
if :attr:`audio_data` is ``None`` prior to running this function or ``overwrite == True``
Warning:
If overwrite=True (default) this will overwrite any data in :attr:`audio_data`!
Args:
window_length (int): Amount of time (in samples) to do an FFT on
hop_length (int): Amount of time (in samples) to skip ahead for the new FFT
window_type (str): Type of scaling to apply to the window.
overwrite (bool): Overwrite :attr:`stft_data` with current calculation
truncate_to_length (int): truncate resultant signal to specified length. Default ``None``.
Returns:
(:obj:`np.ndarray`) Calculated, real-valued iSTFT from :attr:`stft_data`, 2D numpy array
with shape `(n_channels, n_samples)`.
"""
if self.stft_data is None or self.stft_data.size == 0:
raise AudioSignalException('Cannot do inverse STFT without self.stft_data!')
window_length = (
self.stft_params.window_length
if window_length is None
else int(window_length)
)
hop_length = (
self.stft_params.hop_length
if hop_length is None
else int(hop_length)
)
window_type = (
self.stft_params.window_type
if window_type is None
else window_type
)
signals = []
window = self.get_window(window_type, window_length)
for stft in self.get_stft_channels():
_, _signal = scipy.signal.istft(
stft, fs=self.sample_rate, window=window,
nperseg=window_length, noverlap=window_length - hop_length)
signals.append(_signal)
calculated_signal = np.array(signals)
# Make sure it's shaped correctly
calculated_signal = np.expand_dims(calculated_signal, -1) \
if calculated_signal.ndim == 1 else calculated_signal
# if truncate_to_length isn't provided
if truncate_to_length is None:
truncate_to_length = self.original_signal_length
if self.signal_length is not None:
truncate_to_length = self.signal_length
if truncate_to_length is not None and truncate_to_length > 0:
calculated_signal = calculated_signal[:, :truncate_to_length]
if overwrite or self.audio_data is None:
self.audio_data = calculated_signal
return calculated_signal
[docs] def apply_mask(self, mask, overwrite=False):
"""
Applies the input mask to the time-frequency representation in this :class:`AudioSignal`
object and returns a new :class:`AudioSignal` object with the mask applied. The mask
is applied to the magnitude of audio signal. The phase of the original audio
signal is then applied to construct the masked STFT.
Args:
mask (:obj:`MaskBase`-derived object): A ``MaskBase``-derived object
containing a mask.
overwrite (bool): If ``True``, this will alter ``stft_data`` in self.
If ``False``, this function will create a new ``AudioSignal`` object
with the mask applied.
Returns:
A new :class:`AudioSignal`` object with the input mask applied to the STFT,
iff ``overwrite`` is False.
"""
if not isinstance(mask, masks.MaskBase):
raise AudioSignalException(f'Expected MaskBase-derived object, given {type(mask)}')
if self.stft_data is None:
raise AudioSignalException('There is no STFT data to apply a mask to!')
if mask.shape != self.stft_data.shape:
if not mask.shape[:-1] == self.stft_data.shape[:-1]:
raise AudioSignalException(
'Input mask and self.stft_data are not the same shape! mask:'
f' {mask.shape}, self.stft_data: {self.stft_data.shape}'
)
magnitude, phase = np.abs(self.stft_data), np.angle(self.stft_data)
masked_abs = magnitude * mask.mask
masked_stft = masked_abs * np.exp(1j * phase)
if overwrite:
self.stft_data = masked_stft
else:
return self.make_copy_with_stft_data(masked_stft, verbose=False)
[docs] def ipd_ild_features(self, ch_one=0, ch_two=1):
"""
Computes interphase difference (IPD) and interlevel difference (ILD) for a
stereo spectrogram. If more than two channels, this by default computes IPD/ILD
between the first two channels. This can be specified by the arguments ch_one
and ch_two. If only one channel, this raises an error.
Args:
ch_one (``int``): index of first channel to compute IPD/ILD.
ch_two (``int``): index of second channel to compute IPD/ILD.
Returns:
ipd (``np.ndarray``): Interphase difference between selected channels
ild (``np.ndarray``): Interlevel difference between selected channels
"""
if self.stft_data is None:
raise AudioSignalException("Cannot compute ipd/ild features without stft_data!")
if self.is_mono:
raise AudioSignalException("Cannot compute ipd/ild features on mono input!")
stft_ch_one = self.get_stft_channel(ch_one)
stft_ch_two = self.get_stft_channel(ch_two)
ild = np.abs(stft_ch_one) / (np.abs(stft_ch_two) + 1e-4)
ild = 20 * np.log10(ild + 1e-8)
frequencies = self.freq_vector
ipd = np.angle(stft_ch_two * np.conj(stft_ch_one))
ipd /= (frequencies + 1.0)[:, None]
ipd = ipd % np.pi
return ipd, ild
##################################################
# Utilities
##################################################
[docs] def concat(self, other):
""" Concatenate two :class:`AudioSignal` objects (by concatenating :attr:`audio_data`).
Puts ``other.audio_data`` after :attr:`audio_data`.
Raises:
AudioSignalException: If ``self.sample_rate != other.sample_rate``,
``self.num_channels != other.num_channels``, or ``!self.active_region_is_default``
is ``False``.
Args:
other (:class:`AudioSignal`): :class:`AudioSignal` to concatenate with the current one.
"""
self._verify_audio(other)
self.audio_data = np.concatenate((self.audio_data, other.audio_data),
axis=constants.LEN_INDEX)
[docs] def truncate_samples(self, n_samples):
""" Truncates the signal leaving only the first ``n_samples`` samples.
This can only be done if ``self.active_region_is_default`` is True. If
``n_samples > self.signal_length``, then `n_samples = self.signal_length`
(no truncation happens).
Raises:
AudioSignalException: If ``self.active_region_is_default`` is ``False``.
Args:
n_samples: (int) number of samples that will be left.
"""
if not self.active_region_is_default:
raise AudioSignalException('Cannot truncate while active region is not set as default!')
n_samples = int(n_samples)
if n_samples > self.signal_length:
n_samples = self.signal_length
self.audio_data = self.audio_data[:, 0: n_samples]
[docs] def truncate_seconds(self, n_seconds):
""" Truncates the signal leaving only the first n_seconds.
This can only be done if self.active_region_is_default is True.
Args:
n_seconds: (float) number of seconds to truncate :attr:`audio_data`.
"""
n_samples = int(n_seconds * self.sample_rate)
self.truncate_samples(n_samples)
[docs] def crop_signal(self, before, after):
"""
Get rid of samples before and after the signal on all channels. Contracts the length
of :attr:`audio_data` by before + after. Useful to get rid of zero padding after the fact.
Args:
before: (int) number of samples to remove at beginning of self.audio_data
after: (int) number of samples to remove at end of self.audio_data
"""
if not self.active_region_is_default:
raise AudioSignalException('Cannot crop signal while active region '
'is not set as default!')
num_samples = self.signal_length
self.audio_data = self.audio_data[:, before:num_samples - after]
self.set_active_region_to_default()
[docs] def zero_pad(self, before, after):
""" Adds zeros before and after the signal to all channels.
Extends the length of self.audio_data by before + after.
Raises:
Exception: If `self.active_region_is_default`` is ``False``.
Args:
before: (int) number of zeros to be put before the current contents of self.audio_data
after: (int) number of zeros to be put after the current contents fo self.audio_data
"""
if not self.active_region_is_default:
raise AudioSignalException('Cannot zero-pad while active region is not set as default!')
self.audio_data = np.pad(self.audio_data, ((0, 0), (before, after)), 'constant')
[docs] def add(self, other):
"""Adds two audio signal objects.
This does element-wise addition on the :attr:`audio_data` array.
Raises:
AudioSignalException: If ``self.sample_rate != other.sample_rate``,
``self.num_channels != other.num_channels``, or
``self.active_region_is_default`` is ``False``.
Parameters:
other (:class:`AudioSignal`): Other :class:`AudioSignal` to add.
Returns:
(:class:`AudioSignal`): New :class:`AudioSignal` object with the sum of
``self`` and ``other``.
"""
if isinstance(other, int):
# this is so that sum(list of audio_signals) works.
# when sum is called on a list it's evaluated as 0 + elem1 + elem2 + ...
# so the 0 case needs to be taken care of (by doing nothing)
return self
self._verify_audio_arithmetic(other)
new_signal = copy.deepcopy(self)
new_signal.audio_data = self.audio_data + other.audio_data
return new_signal
[docs] def subtract(self, other):
"""Subtracts two audio signal objects.
This does element-wise subtraction on the :attr:`audio_data` array.
Raises:
AudioSignalException: If ``self.sample_rate != other.sample_rate``,
``self.num_channels != other.num_channels``, or
``self.active_region_is_default`` is ``False``.
Parameters:
other (:class:`AudioSignal`): Other :class:`AudioSignal` to subtract.
Returns:
(:class:`AudioSignal`): New :class:`AudioSignal` object with the difference
between ``self`` and ``other``.
"""
self._verify_audio_arithmetic(other)
other_copy = copy.deepcopy(other)
other_copy *= -1
return self.add(other_copy)
[docs] def make_copy_with_audio_data(self, audio_data, verbose=True):
""" Makes a copy of this :class:`AudioSignal` object with :attr:`audio_data` initialized to
the input :param:`audio_data` numpy array. The :attr:`stft_data` of the new :class:`AudioSignal`
object is ``None``.
Args:
audio_data (:obj:`np.ndarray`): Audio data to be put into the new :class:`AudioSignal` object.
verbose (bool): If ``True`` prints warnings. If ``False``, outputs nothing.
Returns:
(:class:`AudioSignal`): A copy of this :class:`AudioSignal` object with :attr:`audio_data`
initialized to the input :param:`audio_data` numpy array.
"""
if verbose:
if not self.active_region_is_default:
warnings.warn('Making a copy when active region is not default.')
if audio_data.shape != self.audio_data.shape:
warnings.warn('Shape of new audio_data does not match current audio_data.')
new_signal = copy.deepcopy(self)
new_signal.audio_data = audio_data
new_signal.stft_data = None
return new_signal
[docs] def make_copy_with_stft_data(self, stft_data, verbose=True):
""" Makes a copy of this :class:`AudioSignal` object with :attr:`stft_data` initialized to the
input :param:`stft_data` numpy array. The :attr:`audio_data` of the new :class:`AudioSignal`
object is ``None``.
Args:
stft_data (:obj:`np.ndarray`): STFT data to be put into the new :class:`AudioSignal` object.
Returns:
(:class:`AudioSignal`): A copy of this :class:`AudioSignal` object with :attr:`stft_data`
initialized to the input :param:`stft_data` numpy array.
"""
if verbose:
if not self.active_region_is_default:
warnings.warn('Making a copy when active region is not default.')
if stft_data.shape != self.stft_data.shape:
warnings.warn('Shape of new stft_data does not match current stft_data.')
new_signal = copy.deepcopy(self)
new_signal.stft_data = stft_data
new_signal.original_signal_length = self.original_signal_length
new_signal.audio_data = None
return new_signal
[docs] def rms(self, win_len=None, hop_len=None):
""" Calculates the root-mean-square of :attr:`audio_data`.
Returns:
(float): Root-mean-square of :attr:`audio_data`.
"""
if win_len is not None:
hop_len = win_len // 2 if hop_len is None else hop_len
rms_func = lambda arr: librosa.feature.rms(arr, frame_length=win_len,
hop_length=hop_len)[0, :]
else:
rms_func = lambda arr: np.sqrt(np.mean(np.square(arr)))
result = []
for ch in self.get_channels():
result.append(rms_func(ch))
return np.squeeze(result)
[docs] def peak_normalize(self):
"""
Peak normalizes the audio signal.
"""
self.apply_gain(1 / self.audio_data.max())
[docs] def apply_gain(self, value):
"""
Apply a gain to :attr:`audio_data`
Args:
value (float): amount to multiply self.audio_data by
Returns:
(:class:`AudioSignal`): This :class:`AudioSignal` object with the gain applied.
"""
if not isinstance(value, numbers.Real):
raise AudioSignalException('Can only multiply/divide by a scalar!')
self.audio_data = self.audio_data * value
return self
[docs] def resample(self, new_sample_rate, **kwargs):
"""
Resample the data in :attr:`audio_data` to the new sample rate provided by
:param:`new_sample_rate`. If the :param:`new_sample_rate` is the same as :attr:`sample_rate`
then nothing happens.
Args:
new_sample_rate (int): The new sample rate of :attr:`audio_data`.
kwargs: Keyword arguments to librosa.resample.
"""
if new_sample_rate == self.sample_rate:
warnings.warn('Cannot resample to the same sample rate.')
return
resampled_signal = []
for channel in self.get_channels():
resampled_channel = librosa.resample(
channel, self.sample_rate, new_sample_rate, **kwargs)
resampled_signal.append(resampled_channel)
self.audio_data = np.array(resampled_signal)
self.original_signal_length = self.signal_length
self._sample_rate = new_sample_rate
##################################################
# Channel Utilities
##################################################
def _verify_get_channel(self, n):
if n >= self.num_channels:
raise AudioSignalException(
f'Cannot get channel {n} when this object only has {self.num_channels}'
' channels! (0-based)'
)
if n < 0:
raise AudioSignalException(
f'Cannot get channel {n}. This will cause unexpected results.'
)
[docs] def get_channel(self, n):
"""Gets audio data of n-th channel from :attr:`audio_data` as a 1D :obj:`np.ndarray`
of shape ``(n_samples,)``.
Parameters:
n (int): index of channel to get. **0-based**
See Also:
* :func:`get_channels`: Generator for looping through channels of :attr:`audio_data`.
* :func:`get_stft_channel`: Gets stft data from a specific channel.
* :func:`get_stft_channels`: Generator for looping through channels from
:attr:`stft_data`.
Raises:
:class:`AudioSignalException`: If not ``0 <= n < self.num_channels``.
Returns:
(:obj:`np.array`): The audio data in the n-th channel of the signal, 1D
"""
self._verify_get_channel(n)
return np.asfortranarray(utils._get_axis(self.audio_data, constants.CHAN_INDEX, n))
[docs] def get_channels(self):
"""Generator that will loop through channels of :attr:`audio_data`.
See Also:
* :func:`get_channel`: Gets audio data from a specific channel.
* :func:`get_stft_channel`: Gets stft data from a specific channel.
* :func:`get_stft_channels`: Generator to loop through channels of :attr:`stft_data`.
Yields:
(:obj:`np.array`): The audio data in the next channel of this signal as a
1D ``np.ndarray``.
"""
for i in range(self.num_channels):
yield self.get_channel(i)
[docs] def get_stft_channel(self, n):
"""Returns STFT data of n-th channel from :attr:`stft_data` as a 2D ``np.ndarray``.
Args:
n: (int) index of stft channel to get. **0-based**
See Also:
* :func:`get_stft_channels`: Generator to loop through channels from :attr:`stft_data`.
* :func:`get_channel`: Gets audio data from a specific channel.
* :func:`get_channels`: Generator to loop through channels of :attr:`audio_data`.
Raises:
:class:`AudioSignalException`: If not ``0 <= n < self.num_channels``.
Returns:
(:obj:`np.array`): the STFT data in the n-th channel of the signal, 2D
"""
if self.stft_data is None:
raise AudioSignalException('Cannot get STFT data before STFT is calculated!')
self._verify_get_channel(n)
return utils._get_axis(self.stft_data, constants.STFT_CHAN_INDEX, n)
[docs] def get_stft_channels(self):
"""Generator that will loop through channels of :attr:`stft_data`.
See Also:
* :func:`get_stft_channel`: Gets stft data from a specific channel.
* :func:`get_channel`: Gets audio data from a specific channel.
* :func:`get_channels`: Generator to loop through channels of :attr:`audio_data`.
Yields:
(:obj:`np.array`): The STFT data in the next channel of this signal as a
2D ``np.ndarray``.
"""
for i in range(self.num_channels):
yield self.get_stft_channel(i)
[docs] def make_audio_signal_from_channel(self, n):
"""
Makes a new :class:`AudioSignal` object from with data from channel ``n``.
Args:
n (int): index of channel to make a new signal from. **0-based**
Returns:
(:class:`AudioSignal`) new :class:`AudioSignal` object with only data from
channel ``n``.
"""
new_signal = copy.copy(self)
new_signal.audio_data = self.get_channel(n)
return new_signal
[docs] def get_power_spectrogram_channel(self, n):
""" Returns the n-th channel from ``self.power_spectrogram_data``.
Raises:
Exception: If not ``0 <= n < self.num_channels``.
Args:
n: (int) index of power spectrogram channel to get **0-based**
Returns:
(:obj:`np.array`): the power spectrogram data in the n-th channel of the signal, 1D
"""
self._verify_get_channel(n)
# np.array helps with duck typing
return utils._get_axis(np.array(self.power_spectrogram_data),
constants.STFT_CHAN_INDEX, n)
[docs] def get_magnitude_spectrogram_channel(self, n):
""" Returns the n-th channel from ``self.magnitude_spectrogram_data``.
Raises:
Exception: If not ``0 <= n < self.num_channels``.
Args:
n: (int) index of magnitude spectrogram channel to get **0-based**
Returns:
(:obj:`np.array`): the magnitude spectrogram data in the n-th channel of the signal, 1D
"""
self._verify_get_channel(n)
# np.array helps with duck typing
return utils._get_axis(np.array(self.magnitude_spectrogram_data),
constants.STFT_CHAN_INDEX, n)
[docs] def to_mono(self, overwrite=True, keep_dims=False):
""" Converts :attr:`audio_data` to mono by averaging every sample.
Args:
overwrite (bool): If ``True`` this function will overwrite :attr:`audio_data`.
keep_dims (bool): If ``False`` this function will return a 1D array,
else will return array with shape `(1, n_samples)`.
Warning:
If ``overwrite=True`` (default) this will overwrite any data in :attr:`audio_data`!
Returns:
(:obj:`AudioSignal`): Mono-ed version of AudioSignal, either in place or not.
"""
mono = np.mean(self.audio_data, axis=constants.CHAN_INDEX, keepdims=keep_dims)
if overwrite:
self.audio_data = mono
return self
else:
mono_signal = self.make_copy_with_audio_data(mono)
return mono_signal
##################################################
# Utility hooks #
##################################################
[docs] def play(self):
"""
Plays this audio signal, using `nussl.play_utils.play`.
Plays an audio signal if ffplay from the ffmpeg suite of tools is installed.
Otherwise, will fail. The audio signal is written to a temporary file
and then played with ffplay.
"""
# lazy load
from . import play_utils
play_utils.play(self)
[docs] def embed_audio(self, ext='.mp3', display=True):
"""
Embeds the audio signal into a notebook, using `nussl.play_utils.embed_audio`.
Write a numpy array to a temporary mp3 file using ffmpy, then embeds the mp3
into the notebook.
Args:
ext (str): What extension to use when embedding. '.mp3' is more lightweight
leading to smaller notebook sizes.
Example:
>>> import nussl
>>> audio_file = nussl.efz_utils.download_audio_file('schoolboy_fascination_excerpt.wav')
>>> audio_signal = nussl.AudioSignal(audio_file)
>>> audio_signal.embed_audio()
This will show a little audio player where you can play the audio inline in
the notebook.
"""
# lazy load
from . import play_utils
return play_utils.embed_audio(self, ext=ext, display=display)
##################################################
# Operator overloading #
##################################################
def __add__(self, other):
return self.add(other)
def __radd__(self, other):
return self.add(other)
def __sub__(self, other):
return self.subtract(other)
def _verify_audio(self, other):
if self.num_channels != other.num_channels:
raise AudioSignalException('Cannot do operation with two signals that have '
'a different number of channels!')
if self.sample_rate != other.sample_rate:
raise AudioSignalException('Cannot do operation with two signals that have '
'different sample rates!')
def _verify_audio_arithmetic(self, other):
self._verify_audio(other)
if self.signal_length != other.signal_length:
raise AudioSignalException('Cannot do arithmetic with signals of different length!')
def __iadd__(self, other):
return self + other
def __isub__(self, other):
return self - other
def __mul__(self, value):
if not isinstance(value, numbers.Real):
raise AudioSignalException('Can only multiply/divide by a scalar!')
return self.make_copy_with_audio_data(np.multiply(self.audio_data, value), verbose=False)
def __div__(self, value):
if not isinstance(value, numbers.Real):
raise AudioSignalException('Can only multiply/divide by a scalar!')
return self.make_copy_with_audio_data(np.divide(self.audio_data, float(value)),
verbose=False)
def __truediv__(self, value):
return self.__div__(value)
def __itruediv__(self, value):
return self.__idiv__(value)
def __imul__(self, value):
return self.apply_gain(value)
def __idiv__(self, value):
return self.apply_gain(1 / float(value))
def __len__(self):
return self.signal_length
def __eq__(self, other):
for k, v in list(self.__dict__.items()):
if isinstance(v, np.ndarray):
if not np.array_equal(v, other.__dict__[k]):
return False
elif v != other.__dict__[k]:
return False
return True
def __ne__(self, other):
return not self == other
class AudioSignalException(Exception):
"""
Exception class for :class:`AudioSignal`.
"""
pass