Automatic Tagging and Similarity Search for MediaBay

Dear Steinberg team,

MediaBay is, in my opinion, one of the best sample managers out there. I think it could become the best and out-compete Sononym and Splice’s browser if automatic tagging and similarity search were added.

Regards,

Pavel

What specifically do you mean by this?

@raino

Say you have a sample with a cryptic name like abc.wav. An automatic tagging system would analyze the contents of the sample and determine, for example, that it sounds like a kick with a long body and a vinyl crackle layered on top. During the import and scan of the folder containing the sample, a sample manager would then attach the corresponding words (kick, long body, vinyl) as tags to the sample.

When you then search for words like kick or vinyl, the sample tagged accordingly would be returned. The search could then be further improved by also considering distances between the search words and the actual tags. So, for example, if you search not for “vinyl kick” but for “dusty drum”, our abc.wav sample would still be found.
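To make the "distance between search words and tags" idea concrete, here is a minimal sketch. The 3-dimensional vectors in `EMBED` are hand-made toy values and the names `score`, `search` and `LIBRARY` are made up for illustration; a real system would take its vectors from a trained text or audio-text embedding model.

```python
import math

# Toy, hand-made "embeddings" (assumption: a real system would get these
# from a trained model; the numbers here only illustrate the idea).
EMBED = {
    "kick":  (1.0, 0.1, 0.0),
    "drum":  (0.9, 0.2, 0.1),
    "vinyl": (0.0, 1.0, 0.1),
    "dusty": (0.1, 0.9, 0.2),
    "pad":   (0.0, 0.1, 1.0),
}

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def score(query_words, tags):
    """Average, over the query words, of the best similarity to any tag."""
    words = [w for w in query_words if w in EMBED]
    tags = [t for t in tags if t in EMBED]
    if not words or not tags:
        return 0.0
    best = [max(cosine(EMBED[w], EMBED[t]) for t in tags) for w in words]
    return sum(best) / len(best)

def search(query, library, threshold=0.8):
    """Return sample names whose tags are close enough to the query."""
    words = query.lower().split()
    scored = [(score(words, tags), name) for name, tags in library.items()]
    return [name for s, name in sorted(scored, reverse=True) if s >= threshold]

# Hypothetical tagged library: abc.wav was auto-tagged as a vinyl kick.
LIBRARY = {"abc.wav": ["kick", "vinyl"], "xyz.wav": ["pad"]}
```

With these toy vectors, `search("dusty drum", LIBRARY)` finds abc.wav even though neither search word is one of its tags. The threshold of 0.8 is arbitrary and would need tuning against real embeddings.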

This functionality is essential if you have a large database of unidentified sounds. Tools like Sononym or Tuva currently provide tagging and tag-based search that relies on exact matching of tags and search words. I strongly believe, however, that the next step, i.e., treating search words that are close enough to a tag as a match, could also be made efficient, especially since most computers today ship with decent GPUs.

First of all, I’m 100% with you about how useful a tagging system based on the sound content would be.

Now,

This is where AI agents could be most effective. Properly trained, they could simply change the names to meaningful ones. This way, we wouldn’t even have to rely on any DAW’s specific tagging system.
You’ve mentioned Sononym, which I think uses its own tagging system, but I may be wrong. If I’m right, however, we could always write a tiny script to implement the renaming idea by querying its database and renaming the wavs accordingly. Just an idea.
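A rough sketch of what such a renaming script could look like. Big caveat: I don't know Sononym's actual database layout, so the SQLite table `samples(path, tags)` below is purely hypothetical, as are the function names; the query would have to be adapted to whatever the real database exposes.

```python
import re
import sqlite3
from pathlib import Path

def build_name(tags, original):
    """'kick, long body, vinyl' + abc.wav -> 'kick_long-body_vinyl__abc.wav'."""
    parts = [re.sub(r"[^a-z0-9]+", "-", t.strip().lower()).strip("-")
             for t in tags.split(",")]
    stem = "_".join(p for p in parts if p)
    # Keep the original name as a suffix so renamed files stay unique.
    return f"{stem}__{original.name}" if stem else original.name

def rename_from_db(db_path, dry_run=True):
    """Read (path, tags) rows and rename files; returns the (old, new) plan."""
    con = sqlite3.connect(db_path)
    try:
        # Hypothetical schema: one row per sample, tags as a comma-separated string.
        rows = con.execute("SELECT path, tags FROM samples").fetchall()
    finally:
        con.close()
    plan = []
    for path_str, tags in rows:
        src = Path(path_str)
        dst = src.with_name(build_name(tags or "", src))
        # Skip untagged, missing, or would-be-overwritten files.
        if dst == src or not src.exists() or dst.exists():
            continue
        plan.append((src, dst))
        if not dry_run:
            src.rename(dst)
    return plan
```

Running it with `dry_run=True` first only returns the planned renames without touching any file, which seems wise for a bulk operation on a sample library.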

I didn’t think about that. Good idea!

That might even be overkill (depending on how far you are ready to go). We could start by getting a good embedding from the sound files and then use that embedding to produce adequate names.

It does indeed. A brief look at the docs (so I might be incorrect) suggests that, at least for one-shots, it first uses a machine learning algorithm to classify the samples into broad categories, like “Bass” or “Pad”, and then relies on file names to assign tags. The tagging system itself seems to be pretty elaborate.

Interesting. Not that long ago doing that would have been total science fiction & now it’s like - yeah, have the droids do it.

@m.c do you know anything about the MediaBay database? I wonder if a custom app could modify entries.

Similarity algorithms for audio files, accompanied by a variety of topographical distance visualization and categorization approaches, have been around in research for more than two decades now. Many of them were available as open-source apps right from the start.
The real problem is the integration into an existing environment (MediaBay) and the fact that .vstsound files are protected, IMO.
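For reference, the core of a similarity search is small. Below is a minimal sketch that describes each sound by two crude features (RMS level and zero-crossing rate) and returns the nearest neighbours by Euclidean distance. The features, the synthetic sine "samples" and all names are illustrative assumptions; research tools typically rely on much richer descriptors such as spectral features or learned embeddings.

```python
import math

def features(samples):
    """Two crude descriptors: RMS level and zero-crossing rate."""
    n = len(samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return (rms, crossings / (n - 1))

def most_similar(query, library, k=1):
    """Names of the k library sounds closest to the query in feature space."""
    q = features(query)
    ranked = sorted(library, key=lambda name: math.dist(q, features(library[name])))
    return ranked[:k]

def sine(freq, sr=8000, n=2000, amp=0.5):
    """Synthetic test tone standing in for decoded audio samples."""
    return [amp * math.sin(2 * math.pi * freq * i / sr) for i in range(n)]

# Hypothetical library of already-decoded sounds.
LIBRARY = {"low.wav": sine(60), "mid.wav": sine(440), "high.wav": sine(3000)}
```

Here `most_similar(sine(70), LIBRARY)` picks low.wav, since a 70 Hz tone has nearly the same zero-crossing rate as the 60 Hz one. In practice the feature vectors would be computed once at scan time and stored, not recomputed per query as in this sketch.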

It surely can. However, once MediaBay rescans the files, it will revert changes to properties that are embedded in the files. What do you have in mind?

That would mean stuff like sample rate, but not, for example, Mood tags. Say you had an AI that could create a list of tags after ‘listening’ to a file; you would then add the file to the database along with its tags.

Yeah, although they also come with a variety of presumably human-generated tags, & we can add more tags because the tags are part of the database & not the .vstsound files. User-created samples, on the other hand, would have no tags to start with, so auto-generating them would be more beneficial.

Yes, totally doable.

However, I’m still a bit concerned about how these tags will be treated by MediaBay upon a rescan (I haven’t seen an issue so far) or a future update. For example, in the past I did bulk updates of properties of vstPreset files inside the database, only to find out that these properties were lost upon a MediaBay rescan, since I hadn’t actually written them into the vstPreset files themselves.

That being said, once you have a tool to properly tag a bunch of items, you can even go the other way and add the proper meta-tags inside the files themselves. This way you comply with MediaBay and you won’t have to worry when copying/moving these files to a new system.

Doing an audio scan might be difficult given there are so many formats… AI could go a long way just analyzing the names or folder locations.
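Even without AI, a plain keyword lookup on the path gets part of the way there. A minimal sketch, assuming a hand-written `KEYWORDS` table (the entries and the function name are made up for illustration and would need extending for a real library):

```python
import re
from pathlib import PurePath

# Hypothetical keyword table: path token -> tag.
KEYWORDS = {
    "kick": "kick", "kik": "kick", "bd": "kick",
    "snare": "snare", "sd": "snare",
    "hat": "hi-hat", "hh": "hi-hat",
    "vinyl": "vinyl", "pad": "pad", "bass": "bass",
}

def tags_from_path(path):
    """Derive tags from folder and file names, ignoring the extension."""
    text = str(PurePath(path).with_suffix("")).lower()
    tags = set()
    for token in re.split(r"[^a-z]+", text):
        # Also try a naive singular form, so "kicks" matches "kick".
        for candidate in (token, token[:-1]):
            if candidate in KEYWORDS:
                tags.add(KEYWORDS[candidate])
    return sorted(tags)
```

For a file stored as `Drums/Vinyl Kicks/BD_01.wav` this yields the tags kick and vinyl, while a bare `abc.wav` yields nothing, which is exactly the case where listening to the audio would be needed.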