Audacity 3.2.0
|
Namespaces | |
namespace | anonymous_namespace{GetMeterUsingTatumQuantizationFit.cpp} |
namespace | anonymous_namespace{MirDsp.cpp} |
namespace | anonymous_namespace{MirUtils.cpp} |
namespace | anonymous_namespace{MusicInformationRetrieval.cpp} |
namespace | anonymous_namespace{MusicInformationRetrievalTests.cpp} |
namespace | anonymous_namespace{StftFrameProvider.cpp} |
namespace | anonymous_namespace{StftFrameProviderTests.cpp} |
namespace | anonymous_namespace{TatumQuantizationFitBenchmarking.cpp} |
Classes | |
class | AnalyzedAudioClip |
class | DecimatingMirAudioReader |
Our MIR operations do not need the full 44.1 or 48 kHz resolution typical of audio files. This may change in the future, for example if we start looking at chromagrams, but for now even a certain amount of aliasing isn't an issue. In fact, for onset detection it may even be beneficial, since it preserves a trace of the highest frequency components by folding them down below the Nyquist frequency. We can therefore decimate the audio signal to a certain extent. This is fast and easy to implement, and it dramatically reduces the amount of data and the number of operations. More... | |
class | EmptyMirAudioReader |
class | FakeAnalyzedAudioClip |
class | FakeProjectInterface |
struct | LoopClassifierSettings |
class | MirAudioReader |
struct | MusicalMeter |
struct | OctaveError |
struct | OnsetQuantization |
class | ProjectInterface |
struct | ProjectSyncInfo |
struct | ProjectSyncInfoInput |
struct | QuantizationFitDebugOutput |
struct | RocInfo |
class | SquareWaveMirAudioReader |
class | StftFrameProvider |
class | WavMirAudioReader |
Enumerations | |
enum class | FalsePositiveTolerance { Strict , Lenient } |
enum class | TimeSignature { TwoTwo , FourFour , ThreeFour , SixEight , _count } |
enum class | TempoObtainedFrom { Header , Title , Signal } |
How the tempo was obtained: More... | |
Functions | |
std::optional< MusicalMeter > | GetMeterUsingTatumQuantizationFit (const MirAudioReader &audio, FalsePositiveTolerance tolerance, const std::function< void(double)> &progressCallback, QuantizationFitDebugOutput *debugOutput) |
Get the BPM of the given audio file, using the Tatum Quantization Fit method. More... | |
std::vector< float > | GetNormalizedCircularAutocorr (const std::vector< float > &x) |
Get the normalized, circular auto-correlation for a signal x whose length is already a power of two. Since the output is symmetric, only the left-hand side is returned, i.e., of size N/2 + 1 , where N is the power of two the input was upsampled to. More... | |
std::vector< float > | GetOnsetDetectionFunction (const MirAudioReader &audio, const std::function< void(double)> &progressCallback, QuantizationFitDebugOutput *debugOutput) |
int | GetNumerator (TimeSignature ts) |
int | GetDenominator (TimeSignature ts) |
std::vector< int > | GetPossibleBarDivisors (int lower, int upper) |
Function to generate numbers whose prime factorization contains only twos or threes. More... | |
std::vector< int > | GetPeakIndices (const std::vector< float > &x) |
std::vector< float > | GetNormalizedHann (int size) |
constexpr auto | IsPowOfTwo (int x) |
std::optional< ProjectSyncInfo > | GetProjectSyncInfo (const ProjectSyncInfoInput &in) |
std::optional< double > | GetBpmFromFilename (const std::string &filename) |
std::optional< MusicalMeter > | GetMusicalMeterFromSignal (const MirAudioReader &audio, FalsePositiveTolerance tolerance, const std::function< void(double)> &progressCallback, QuantizationFitDebugOutput *debugOutput) |
void | SynchronizeProject (const std::vector< std::shared_ptr< AnalyzedAudioClip > > &clips, ProjectInterface &project, bool projectWasEmpty) |
void | ProgressBar (int width, int percent) |
OctaveError | GetOctaveError (double expected, double actual) |
Gets the tempo detection octave error, as defined in section 5. of Schreiber, H., Urbano, J. and Müller, M., 2020. Music Tempo Estimation: Are We Done Yet?. Transactions of the International Society for Music Information Retrieval, 3(1), p.111–125. DOI: https://doi.org/10.5334/tismir.43 In short, with an example: two bars of a fast 3/4 can in some cases be interpreted as one bar of 6/8. However, there are 6 beats in the former, against 2 in the latter, leading to an "octave error" of 3. In that case, the returned factor would be 3, and the remainder, log2(3 * actual / expected) More... | |
template<typename Result > | |
RocInfo | GetRocInfo (std::vector< Result > results, double allowedFalsePositiveRate=0.) |
template<typename T > | |
void | PrintPythonVector (std::ofstream &ofs, const std::vector< T > &v, const char *name) |
template<int bufferSize = 1024> | |
float | GetChecksum (const MirAudioReader &source) |
TEST_CASE ("GetBpmFromFilename") | |
TEST_CASE ("GetProjectSyncInfo") | |
TEST_CASE ("SynchronizeProject") | |
TEST_CASE ("StftFrameProvider") | |
TEST_CASE ("GetRocInfo") | |
TEST_CASE ("GetChecksum") | |
auto | ToString (const std::optional< TimeSignature > &ts) |
TEST_CASE ("TatumQuantizationFitBenchmarking") | |
TEST_CASE ("TatumQuantizationFitVisualization") | |
Variables | |
static const std::unordered_map< FalsePositiveTolerance, LoopClassifierSettings > | loopClassifierSettings |
static constexpr auto | runLocally = false |
Audacity: A Digital Audio Editor
Matthieu Hodgkinson
GetMeterUsingTatumQuantizationFit.cpp
GetMeterUsingTatumQuantizationFit.h
A method to classify audio recordings in loops and non-loops, with a confidence score, together with a BPM estimate.
The method evaluates the assumption that the given audio is a loop. Based on this assumption, and on a finite set of possible tempi and time signatures, a set of hypotheses is tested. For each hypothesis, a tatum* quantization is tried, returning the average normalized distance between Onset Detection Function (ODF) peaks and the closest tatum, weighted by the ODF peak values. This yields a single scalar that strongly correlates with whether the audio is a loop, and that we use for loop/non-loop classification.
Besides this score, the classification stage also yields the most likely tatum rate, which still needs disambiguation to find the beat rate. The autocorrelation of the ODF is taken and, for each bar division explaining the tatum rate, is comb-filtered. The comb-filter energy and the BPM likelihood are then combined, and the BPM with the largest score is returned.
This approach resembles existing tempo detection methods in some respects (e.g. Percival, Graham & Tzanetakis, George (2014), implemented in the Essentia framework at https://essentia.upf.edu/), insofar as it first derives an ODF and then correlates it with expected rhythmic patterns. However, the quantization distance at the core of the method is not, to the author's knowledge, used in other methods. Also, once the ODF is taken, the loop assumption lends itself to a single analysis of the entire ODF, rather than to mid-term analyses which are then combined. Finally, although it restricts the range of application, the loop assumption reduces the number of hypotheses tried, which reduces the risk of non-musical recordings being detected as musical by sheer luck. This increased robustness against false positives is essential for Audacity, where non-music users should not be bothered by wrong detections. The loop assumption is nevertheless not fundamental, and the algorithm could be implemented without it, at the cost of a higher risk of false positives.
Evaluation and benchmarking code can be found in TatumQuantizationFitBenchmarking.cpp. This code takes a tolerable false-positive rate, and outputs the corresponding loop/non-loop threshold. It also returns the Octave Error accuracy measure, as introduced in "Schreiber, H., et al. (2020). Music Tempo Estimation: Are We Done Yet?".
* A tatum is the smallest rhythmic unit in a musical piece. Quoting from https://en.wikipedia.org/wiki/Tatum_(music): "The term was coined by Jeff Bilmes (...) and is named after the influential jazz pianist Art Tatum, 'whose tatum was faster than all others'."
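The weighted quantization distance described above can be sketched as follows. This is a minimal illustration, not Audacity's implementation: the function name `WeightedQuantizationDistance`, the normalization of peak times to the loop duration, and the evenly spaced tatum grid are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the MIR namespace).
// peakTimes:  ODF peak positions, normalized to [0, 1) over the loop duration.
// peakValues: ODF values at those peaks, used as weights.
// numTatums:  number of evenly spaced tatums assumed to span the loop.
// Returns the weighted average distance to the closest tatum, in tatum units,
// so the result lies in [0, 0.5].
double WeightedQuantizationDistance(
   const std::vector<double>& peakTimes,
   const std::vector<double>& peakValues, int numTatums)
{
   assert(peakTimes.size() == peakValues.size());
   assert(numTatums > 0);
   double weightedSum = 0.;
   double weightSum = 0.;
   for (std::size_t i = 0; i < peakTimes.size(); ++i)
   {
      // Position of the peak on the tatum grid, in tatum units.
      const double gridPos = peakTimes[i] * numTatums;
      // Distance to the closest grid point. Rounding up to numTatums is
      // fine: in a loop, that grid point coincides with tatum 0.
      const double distance = std::abs(gridPos - std::round(gridPos));
      weightedSum += peakValues[i] * distance;
      weightSum += peakValues[i];
   }
   return weightSum > 0. ? weightedSum / weightSum : 0.;
}
```

In this sketch a smaller value means the peaks sit closer to the hypothesized grid. Running it once per hypothesis and keeping the best-fitting grid mirrors the hypothesis testing described above, under the stated assumptions.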
DSP utilities used by the Music Information Retrieval code. These may migrate to lib-math if needed elsewhere.
WaveMirAudioReader.cpp
WaveMirAudioReader.h
enum class MIR::FalsePositiveTolerance |
strong |
Enumerator | |
---|---|
Strict | |
Lenient |
Definition at line 24 of file MirTypes.h.
enum class MIR::TempoObtainedFrom |
strong |
How the tempo was obtained:
Enumerator | |
---|---|
Header | |
Title | |
Signal |
Definition at line 59 of file MirTypes.h.
|
strong |
MUSIC_INFORMATION_RETRIEVAL_API std::optional< double > MIR::GetBpmFromFilename | ( | const std::string & | filename | ) |
Definition at line 107 of file MusicInformationRetrieval.cpp.
Referenced by GetProjectSyncInfo(), and TEST_CASE().
float MIR::GetChecksum | ( | const MirAudioReader & | source | ) |
Definition at line 163 of file MirTestUtils.h.
References MIR::MirAudioReader::GetNumSamples(), and MIR::MirAudioReader::ReadFloats().
Referenced by TEST_CASE().
int MIR::GetDenominator | ( | TimeSignature | ts | ) |
inline |
Definition at line 46 of file MirTypes.h.
References _count.
Referenced by AudacityMirProject::ReconfigureMusicGrid().
std::optional< MusicalMeter > MIR::GetMeterUsingTatumQuantizationFit | ( | const MirAudioReader & | audio, |
FalsePositiveTolerance | tolerance, | ||
const std::function< void(double)> & | progressCallback, | ||
QuantizationFitDebugOutput * | debugOutput | ||
) |
Get the BPM of the given audio file, using the Tatum Quantization Fit method.
Definition at line 392 of file GetMeterUsingTatumQuantizationFit.cpp.
References audio, MIR::QuantizationFitDebugOutput::audioFileDuration, MIR::QuantizationFitDebugOutput::bpm, entry, MIR::anonymous_namespace{GetMeterUsingTatumQuantizationFit.cpp}::GetMostLikelyMeterFromQuantizationExperiment(), GetOnsetDetectionFunction(), GetPeakIndices(), MIR::anonymous_namespace{GetMeterUsingTatumQuantizationFit.cpp}::GetPossibleDivHierarchies(), MIR::anonymous_namespace{GetMeterUsingTatumQuantizationFit.cpp}::IsSingleEvent(), loopClassifierSettings, MIR::QuantizationFitDebugOutput::odf, MIR::QuantizationFitDebugOutput::odfPeakIndices, MIR::QuantizationFitDebugOutput::odfSr, MIR::anonymous_namespace{GetMeterUsingTatumQuantizationFit.cpp}::RunQuantizationExperiment(), MIR::QuantizationFitDebugOutput::score, MIR::QuantizationFitDebugOutput::tatumQuantization, and MIR::QuantizationFitDebugOutput::timeSignature.
Referenced by GetMusicalMeterFromSignal().
MUSIC_INFORMATION_RETRIEVAL_API std::optional< MusicalMeter > MIR::GetMusicalMeterFromSignal | ( | const MirAudioReader & | audio, |
FalsePositiveTolerance | tolerance, | ||
const std::function< void(double)> & | progressCallback, | ||
QuantizationFitDebugOutput * | debugOutput | ||
) |
Definition at line 132 of file MusicInformationRetrieval.cpp.
References audio, and GetMeterUsingTatumQuantizationFit().
Referenced by GetProjectSyncInfo(), and TEST_CASE().
std::vector< float > MIR::GetNormalizedCircularAutocorr | ( | const std::vector< float > & | x | ) |
Get the normalized, circular auto-correlation for a signal x whose length is already a power of two. Since the output is symmetric, only the left-hand side is returned, i.e., of size N/2 + 1, where N is the power of two the input was upsampled to.
Precondition: x.size() is a power of two.
Postcondition: the returned vector has size x.size() / 2 + 1.
Definition at line 73 of file MirDsp.cpp.
References IsPowOfTwo().
Referenced by MIR::anonymous_namespace{GetMeterUsingTatumQuantizationFit.cpp}::GetBestBarDivisionIndex().
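To make the size contract concrete, here is a naive O(N^2) sketch of a circular autocorrelation that returns only the first N/2 + 1 lags. It is an illustration under assumptions: the name `NaiveNormalizedCircularAutocorr` is hypothetical, "normalized" is taken to mean division by the lag-0 value, and no claim is made that this matches the method used in MirDsp.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, naive reference version (quadratic time).
// Assumption: normalization divides every lag by the lag-0 value, so that
// the first output is 1 for any non-zero input.
std::vector<float> NaiveNormalizedCircularAutocorr(const std::vector<float>& x)
{
   const std::size_t N = x.size();
   // Precondition from the documentation: the size is a power of two.
   assert(N > 0 && (N & (N - 1)) == 0);
   // The circular autocorrelation is symmetric (r[lag] == r[N - lag]),
   // so lags 0..N/2 carry all the information.
   std::vector<float> r(N / 2 + 1, 0.f);
   for (std::size_t lag = 0; lag < r.size(); ++lag)
      for (std::size_t n = 0; n < N; ++n)
         r[lag] += x[n] * x[(n + lag) % N]; // index wraps around
   const float r0 = r[0];
   if (r0 > 0.f)
      for (auto& v : r)
         v /= r0;
   return r;
}
```

For an alternating input such as {1, -1, 1, -1}, this sketch yields 1 at lag 0, -1 at lag 1 and 1 at lag 2, which reflects the period-2 structure of the signal.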
std::vector< float > MIR::GetNormalizedHann | ( | int | size | ) |
Definition at line 80 of file MirUtils.cpp.
References MIR::anonymous_namespace{MirUtils.cpp}::pi, and size.
Referenced by MIR::anonymous_namespace{MirDsp.cpp}::GetMovingAverage().
int MIR::GetNumerator | ( | TimeSignature | ts | ) |
inline |
Definition at line 39 of file MirTypes.h.
References _count.
Referenced by AudacityMirProject::ReconfigureMusicGrid().
OctaveError MIR::GetOctaveError | ( | double | expected, |
double | actual | ||
) |
Gets the tempo detection octave error, as defined in section 5. of Schreiber, H., Urbano, J. and Müller, M., 2020. Music Tempo Estimation: Are We Done Yet?. Transactions of the International Society for Music Information Retrieval, 3(1), p.111–125. DOI: https://doi.org/10.5334/tismir.43 In short, with an example: two bars of a fast 3/4 can in some cases be interpreted as one bar of 6/8. However, there are 6 beats in the former, against 2 in the latter, leading to an "octave error" of 3. In that case, the returned factor would be 3, and the remainder, log2(3 * actual / expected).
Definition at line 39 of file MirTestUtils.cpp.
Referenced by TEST_CASE().
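The factor and remainder described above can be illustrated with a short sketch. The names `OctaveErrorSketch` and `ComputeOctaveError` are hypothetical, and the candidate factor set {1, 2, 1/2, 3, 1/3} is an assumption for the example; the real set is defined in MirTestUtils.cpp.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the documented OctaveError struct.
struct OctaveErrorSketch
{
   double factor;    // multiple that best maps `actual` onto `expected`
   double remainder; // log2(factor * actual / expected)
};

OctaveErrorSketch ComputeOctaveError(double expected, double actual)
{
   assert(expected > 0. && actual > 0.);
   // Assumed candidate factors: same tempo, double, half, triple, third.
   constexpr std::array<double, 5> factors { 1., 2., .5, 3., 1. / 3. };
   OctaveErrorSketch best { 1., std::log2(actual / expected) };
   for (const double factor : factors)
   {
      const double remainder = std::log2(factor * actual / expected);
      // Keep the factor that leaves the smallest residual error.
      if (std::abs(remainder) < std::abs(best.remainder))
         best = { factor, remainder };
   }
   return best;
}
```

With an expected tempo of 180 BPM and a detected tempo of 60 BPM, this sketch selects a factor of 3 with a remainder of 0, which corresponds to the 3/4 versus 6/8 example in the description.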
std::vector< float > MIR::GetOnsetDetectionFunction | ( | const MirAudioReader & | audio, |
const std::function< void(double)> & | progressCallback, | ||
QuantizationFitDebugOutput * | debugOutput | ||
) |
Definition at line 109 of file MirDsp.cpp.
References PffftFloatVector::aligned(), audio, MIR::anonymous_namespace{MirDsp.cpp}::GetMovingAverage(), MIR::anonymous_namespace{MirDsp.cpp}::GetNoveltyMeasure(), IsPowOfTwo(), MIR::QuantizationFitDebugOutput::movingAverage, MIR::QuantizationFitDebugOutput::postProcessedStft, MIR::QuantizationFitDebugOutput::rawOdf, anonymous_namespace{ClipSegmentTest.cpp}::sampleRate, and anonymous_namespace{NoteTrack.cpp}::swap().
Referenced by GetMeterUsingTatumQuantizationFit().
std::vector< int > MIR::GetPeakIndices | ( | const std::vector< float > & | x | ) |
Definition at line 67 of file MirUtils.cpp.
Referenced by GetMeterUsingTatumQuantizationFit().
std::vector< int > MIR::GetPossibleBarDivisors | ( | int | lower, |
int | upper | ||
) |
Function to generate numbers whose prime factorization contains only twos or threes.
Definition at line 54 of file MirUtils.cpp.
References MIR::anonymous_namespace{MirUtils.cpp}::GetPowersOf2And3().
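A minimal sketch of generating such numbers follows. The name `TwoThreeSmoothNumbers` is hypothetical, and two details are assumptions rather than documented behavior: both bounds are treated as inclusive, and 1 (whose factorization is empty) counts as a valid value when it falls in range.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: all integers of the form 2^a * 3^b within
// [lower, upper], in ascending order. Inclusive bounds are an assumption.
std::vector<int> TwoThreeSmoothNumbers(int lower, int upper)
{
   assert(1 <= lower && lower <= upper);
   std::vector<int> result;
   // long long avoids overflow when multiplying past `upper`.
   for (long long p2 = 1; p2 <= upper; p2 *= 2)
      for (long long n = p2; n <= upper; n *= 3)
         if (n >= lower)
            result.push_back(static_cast<int>(n));
   std::sort(result.begin(), result.end());
   return result;
}
```

For the range 2 to 12, the sketch produces 2, 3, 4, 6, 8, 9 and 12, skipping values such as 5, 7, 10 and 11 that contain other prime factors.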
std::optional< ProjectSyncInfo > MUSIC_INFORMATION_RETRIEVAL_API MIR::GetProjectSyncInfo | ( | const ProjectSyncInfoInput & | in | ) |
Definition at line 49 of file MusicInformationRetrieval.cpp.
References MIR::ProjectSyncInfoInput::filename, FourFour, GetBpmFromFilename(), MIR::MirAudioReader::GetDuration(), GetMusicalMeterFromSignal(), Header, Lenient, MIR::ProjectSyncInfoInput::progressCallback, MIR::ProjectSyncInfoInput::projectTempo, MIR::ProjectSyncInfoInput::projectWasEmpty, MIR::anonymous_namespace{MusicInformationRetrieval.cpp}::quarternotesPerBeat, fast_float::round(), Signal, MIR::ProjectSyncInfoInput::source, Strict, MIR::ProjectSyncInfoInput::tags, Title, and MIR::ProjectSyncInfoInput::viewIsBeatsAndMeasures.
Referenced by anonymous_namespace{ProjectFileManager.cpp}::RunTempoDetection(), and TEST_CASE().
template<typename Result >
RocInfo MIR::GetRocInfo | ( | std::vector< Result > | results, |
double | allowedFalsePositiveRate = 0. | ||
) |
The Receiver Operating Characteristic (ROC) curve is a plot of the true positive rate (TPR) against the false positive rate (FPR) for the different possible thresholds of a binary classifier. The area under the curve (AUC) is a measure of the classifier's performance. The greater the AUC, the better the classifier.
Template parameters:
Result | has public members truth (boolean) and score (numeric) |
Parameters:
results | true classifications and scores of some population |
Preconditions: at least one element of results is really positive (truth is true), and at least one is really negative; 0. <= allowedFalsePositiveRate && allowedFalsePositiveRate <= 1.
Definition at line 52 of file MirTestUtils.h.
References size.
Referenced by TEST_CASE().
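The AUC part of the description can be sketched as below. This is an illustration only: `LabeledScore` and `ComputeAuc` are hypothetical names, the struct merely mimics the documented `truth` and `score` members, and the sketch does not derive a threshold from `allowedFalsePositiveRate` as the documented function is described to support.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type with the members the documentation requires.
struct LabeledScore
{
   bool truth;   // true classification
   double score; // classifier output; higher means "more likely positive"
};

// Area under the ROC curve, by trapezoidal integration of TPR over FPR
// while the decision threshold sweeps from the highest score downwards.
double ComputeAuc(std::vector<LabeledScore> results)
{
   const std::ptrdiff_t numPositives = std::count_if(
      results.begin(), results.end(),
      [](const LabeledScore& r) { return r.truth; });
   const std::ptrdiff_t numNegatives =
      static_cast<std::ptrdiff_t>(results.size()) - numPositives;
   // Same precondition as documented: both classes must be present.
   assert(numPositives > 0 && numNegatives > 0);

   std::sort(
      results.begin(), results.end(),
      [](const LabeledScore& a, const LabeledScore& b) {
         return a.score > b.score;
      });

   double auc = 0., prevTpr = 0., prevFpr = 0.;
   std::ptrdiff_t truePositives = 0, falsePositives = 0;
   for (std::size_t i = 0; i < results.size(); ++i)
   {
      if (results[i].truth)
         ++truePositives;
      else
         ++falsePositives;
      // Tied scores share one threshold, hence one ROC point.
      if (i + 1 < results.size() && results[i + 1].score == results[i].score)
         continue;
      const double tpr = static_cast<double>(truePositives) / numPositives;
      const double fpr = static_cast<double>(falsePositives) / numNegatives;
      auc += (fpr - prevFpr) * (tpr + prevTpr) / 2.;
      prevTpr = tpr;
      prevFpr = fpr;
   }
   return auc;
}
```

A classifier that ranks every positive above every negative gets an AUC of 1 in this sketch, a fully inverted ranking gets 0, and identical scores for all samples give 0.5.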
auto MIR::IsPowOfTwo | ( | int | x | ) |
constexpr |
Definition at line 28 of file MirUtils.h.
Referenced by MIR::anonymous_namespace{GetMeterUsingTatumQuantizationFit.cpp}::GetBestBarDivisionIndex(), GetNormalizedCircularAutocorr(), GetOnsetDetectionFunction(), MIR::StftFrameProvider::StftFrameProvider(), and TEST_CASE().
void MIR::PrintPythonVector | ( | std::ofstream & | ofs, |
const std::vector< T > & | v, | ||
const char * | name | ||
) |
Definition at line 133 of file MirTestUtils.h.
References name.
Referenced by TEST_CASE().
void MIR::ProgressBar | ( | int | width, |
int | percent | ||
) |
Definition at line 26 of file MirTestUtils.cpp.
Referenced by TEST_CASE().
MUSIC_INFORMATION_RETRIEVAL_API void MIR::SynchronizeProject | ( | const std::vector< std::shared_ptr< AnalyzedAudioClip > > & | clips, |
ProjectInterface & | project, | ||
bool | projectWasEmpty | ||
) |
Definition at line 149 of file MusicInformationRetrieval.cpp.
References Header, project, Signal, and Title.
Referenced by ProjectFileManager::ImportAndRunTempoDetection(), and TEST_CASE().
MIR::TEST_CASE | ( | "GetBpmFromFilename" | ) |
Definition at line 10 of file MusicInformationRetrievalTests.cpp.
MIR::TEST_CASE | ( | "GetChecksum" | ) |
Definition at line 132 of file TatumQuantizationFitBenchmarking.cpp.
MIR::TEST_CASE | ( | "GetProjectSyncInfo" | ) |
Definition at line 83 of file MusicInformationRetrievalTests.cpp.
References MIR::anonymous_namespace{MusicInformationRetrievalTests.cpp}::arbitaryInput, MIR::ProjectSyncInfoInput::filename, MIR::anonymous_namespace{MusicInformationRetrievalTests.cpp}::filename100bpm, GetProjectSyncInfo(), and MIR::ProjectSyncInfoInput::tags.
MIR::TEST_CASE | ( | "GetRocInfo" | ) |
Definition at line 73 of file TatumQuantizationFitBenchmarking.cpp.
References GetRocInfo().
MIR::TEST_CASE | ( | "StftFrameProvider" | ) |
Definition at line 46 of file StftFrameProviderTests.cpp.
References IsPowOfTwo().
MIR::TEST_CASE | ( | "SynchronizeProject" | ) |
Definition at line 176 of file MusicInformationRetrievalTests.cpp.
References Header, project, Signal, SynchronizeProject(), and Title.
MIR::TEST_CASE | ( | "TatumQuantizationFitBenchmarking" | ) |
Definition at line 160 of file TatumQuantizationFitBenchmarking.cpp.
References audio, MIR::QuantizationFitDebugOutput::audioFileDuration, MIR::QuantizationFitDebugOutput::bpm, MIR::anonymous_namespace{TatumQuantizationFitBenchmarking.cpp}::GetBenchmarkingAudioFiles(), GetBpmFromFilename(), GetChecksum(), GetMusicalMeterFromSignal(), GetOctaveError(), MIR::OnsetQuantization::lag, Lenient, MIR::OnsetQuantization::numDivisions, MIR::anonymous_namespace{TatumQuantizationFitBenchmarking.cpp}::Pretty(), ProgressBar(), runLocally, MIR::QuantizationFitDebugOutput::score, MIR::QuantizationFitDebugOutput::tatumQuantization, MIR::QuantizationFitDebugOutput::timeSignature, and ToString().
MIR::TEST_CASE | ( | "TatumQuantizationFitVisualization" | ) |
Definition at line 11 of file TatumQuantizationFitVisualization.cpp.
References audio, MIR::QuantizationFitDebugOutput::audioFileDuration, GetMusicalMeterFromSignal(), MIR::OnsetQuantization::lag, Lenient, MIR::QuantizationFitDebugOutput::movingAverage, MIR::OnsetQuantization::numDivisions, MIR::QuantizationFitDebugOutput::odf, MIR::QuantizationFitDebugOutput::odfAutoCorr, MIR::QuantizationFitDebugOutput::odfAutoCorrPeakIndices, MIR::QuantizationFitDebugOutput::odfPeakIndices, MIR::QuantizationFitDebugOutput::odfSr, MIR::QuantizationFitDebugOutput::postProcessedStft, PrintPythonVector(), MIR::QuantizationFitDebugOutput::rawOdf, runLocally, MIR::QuantizationFitDebugOutput::score, and MIR::QuantizationFitDebugOutput::tatumQuantization.
auto MIR::ToString | ( | const std::optional< TimeSignature > & | ts | ) |
Definition at line 139 of file TatumQuantizationFitBenchmarking.cpp.
References FourFour, SixEight, ThreeFour, and TwoTwo.
Referenced by audacity::sentry::Report::ReportImpl::Send(), and TEST_CASE().
const std::unordered_map< FalsePositiveTolerance, LoopClassifierSettings > MIR::loopClassifierSettings |
static |
Tolerance-dependent thresholds, used internally by GetMusicalMeterFromSignal to decide whether to return a null or valid MusicalMeter. The values compared against these thresholds are scores, which get higher as the signal is more likely to contain musical content. The thresholds are obtained by running the TatumQuantizationFitBenchmarking test case. More information there.
Definition at line 49 of file MusicInformationRetrieval.h.
Referenced by GetMeterUsingTatumQuantizationFit().
constexpr auto MIR::runLocally = false |
static constexpr |
Definition at line 29 of file MirTestUtils.h.
Referenced by TEST_CASE().