Generating "real" (audible) sound from ultrasonics is a known technique that has been in use for a long time, but until now demodulation has relied on the non-linearity of the ear or of the surrounding medium.
This solution integrates everything in chips:
https://www.edn.com/earbud-speaker-performs-ultrasonic-modulation/
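
For anyone wondering how a non-linearity can turn an ultrasonic carrier back into audible sound, here is a minimal sketch. It is only an illustration under simplifying assumptions, not a model of the chip in the article: a plain squaring term stands in for the non-linearity of the ear or the medium, and the sample rate, the 40 kHz carrier, the 1 kHz tone, the modulation depth and the function names are all made up for the example.

```python
import math


def square_law_demod(fs=400_000, f_carrier=40_000, f_audio=1_000,
                     depth=0.5, duration=0.005):
    """Return (audio, recovered) sample lists for an AM ultrasonic carrier.

    All parameter values are illustrative assumptions.
    """
    n = int(round(fs * duration))
    t = [i / fs for i in range(n)]

    # Audible tone that we want to transmit.
    audio = [math.sin(2 * math.pi * f_audio * ti) for ti in t]

    # Amplitude-modulate the ultrasonic carrier with the audio.
    am = [(1 + depth * a) * math.cos(2 * math.pi * f_carrier * ti)
          for a, ti in zip(audio, t)]

    # Assumed non-linearity: a pure quadratic term.
    squared = [x * x for x in am]

    # Moving average over one carrier period suppresses the 2*f_carrier
    # component and keeps the slowly varying (audible) part.
    win = fs // f_carrier
    smoothed = [sum(squared[i:i + win]) / win for i in range(n - win + 1)]

    # Remove the DC offset left over from squaring.
    mean = sum(smoothed) / len(smoothed)
    recovered = [s - mean for s in smoothed]
    return audio[:len(recovered)], recovered


def correlation(x, y):
    """Pearson correlation between two equal-length sample lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den


if __name__ == "__main__":
    audio, recovered = square_law_demod()
    print(f"correlation with original tone: {correlation(audio, recovered):.3f}")
```

The recovered signal follows the original tone closely, with a small second-harmonic distortion that comes from squaring the envelope. Real demodulation in air or in the ear is more complicated than a pure square law, so treat this only as intuition for why the audio band reappears.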