Okay, so since the main voice is dominant, you can use the “unmixing levels” (Absolute Power) module to tackle this issue. Set the threshold to about -34.0 dB and apply. You can fine-tune it by casting and carving the main dominant frequency layer out of the other layer, then duplicating that layer, flipping its phase, and merging the two back together.
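If it helps to see why the flip-and-merge trick works, here is a rough sketch in plain Python. It is not what the module does internally; the two sine waves standing in for the voice and the background, the sample rate, and all the function names are made up for illustration. It only shows two things: what a -34 dB threshold means as a linear amplitude, and that adding a phase-flipped copy of a layer cancels that layer out of the mix.

```python
import math

def db_to_amplitude(db):
    """Convert a level in dB (relative to full scale) to linear amplitude."""
    return 10 ** (db / 20)

def flip_phase(samples):
    """Invert polarity: every sample is negated."""
    return [-s for s in samples]

def merge(a, b):
    """Mix two equal-length layers by summing them sample by sample."""
    return [x + y for x, y in zip(a, b)]

# Hypothetical test signals (assumptions, not taken from the original clip).
SAMPLE_RATE = 8000
N = 800
voice = [0.5 * math.sin(2 * math.pi * 220 * i / SAMPLE_RATE) for i in range(N)]
background = [0.01 * math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(N)]
mix = merge(voice, background)

# -34 dB as a linear amplitude, roughly 2% of full scale.
threshold = db_to_amplitude(-34.0)

# Merging the mix with a phase-flipped copy of the voice layer
# cancels the voice and leaves (almost exactly) the background.
residual = merge(mix, flip_phase(voice))
max_err = max(abs(r - b) for r, b in zip(residual, background))

print(f"threshold amplitude: {threshold:.5f}")
print(f"largest difference from the true background: {max_err:.2e}")
```

In a real edit the extracted layer is never a perfect copy of the voice, so the cancellation is only as clean as the extraction. That is why the fine-tuning step matters.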
I initially tried to post my example here to demonstrate what I was able to do with your 10-second clip, but this forum (and its administrators/moderators) makes it impossible to help people. My edit was over the forum’s file size limit, and although I could have gone back and reduced the file size, I decided it was not worth the frustration. I honestly believe that these admins/moderators have so many agendas that they don’t realize how much they hurt themselves in the process, and I’m not going to break my neck in order to follow stupid rules. I really believe that these moderators/admins purposefully set these rules in place in order to SUBTLY discourage people from using the forums and genuinely helping people (kind of like the idea of “quiet quitting” culture). It’s sad to say, but it seems like everybody is heading towards this “quiet quitting” culture and at the same time expecting a big fat reward.