A bit of theory.
What I mean is the following:
The mechanical vibrating system of a speaker is a DC motor: a coil, a magnet,
and a mechanical output. So this system can act as what is usually called a
four-quadrant motor. Assuming an arbitrary polarity for the moving coil, when
you put a small DC voltage across it, a DC current flows through the wire and
generates a magnetic field. That field interacts with the fixed field of the
magnet and makes the cone (or the shaft of a motor) move from its initial
position. If you reverse the polarity of that DC voltage, the same thing
happens, but the coil and cone (or the shaft) move in the opposite direction.
These are the first and third quadrants.
When the coil is no longer driven, the system returns to its rest position.
While it does so, the coil moves through the magnetic field and generates a
voltage (back EMF) across the winding. That voltage usually drives a current
through the amplifier's low output impedance, and this current in turn brakes
the cone. These are the second and fourth quadrants of operation.
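
To make the four quadrants concrete, here is a minimal sketch in Python. The
quadrant numbering (sign of velocity first, then sign of force) is one common
convention and is my assumption here, and the values of Bl, coil resistance
and amplifier output resistance are hypothetical, chosen only for
illustration.

```python
# Hypothetical driver values, for illustration only.
BL = 7.0        # force factor B*l, in T*m (assumed)
R_COIL = 6.0    # voice coil DC resistance, in ohms (assumed)
R_AMP = 0.1     # amplifier output resistance, in ohms (assumed)


def quadrant(velocity, force):
    """Return the operating quadrant 1..4, or 0 if either quantity is zero.

    Assumed convention: 1 and 3 are motoring (force and velocity have the
    same sign), 2 and 4 are braking (force opposes the motion).
    """
    if velocity == 0 or force == 0:
        return 0
    if velocity > 0:
        return 1 if force > 0 else 2
    return 3 if force < 0 else 4


def braking_force(velocity):
    """Force on an undriven coil that is shorted through the amplifier.

    The moving coil generates a back EMF e = Bl * v. That EMF drives a
    current through the coil and amplifier resistances, and the current
    produces a force that opposes the motion.
    """
    back_emf = BL * velocity
    current = -back_emf / (R_COIL + R_AMP)
    return BL * current
```

With these numbers, a cone moving forward at 0.5 m/s gives
quadrant(0.5, braking_force(0.5)) equal to 2, and the same speed backwards
gives 4, which are the two braking quadrants described above.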
So, suppose you have your speaker in a box and you apply a voltage pulse to
the coil. The cone will move, say, forward, and you will hear a "pop" caused
by the air the cone moves. When you remove the excitation, the cone returns to
its equilibrium position, causing a "boom": the cone oscillating, essentially
undamped, at the natural mechanical resonance frequency of the moving parts.
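
As a rough illustration of where that "boom" sits in frequency, the moving
parts can be treated as a simple mass on a spring. This is only a sketch: the
mass and stiffness below are hypothetical, and a real driver in a box also
sees the stiffness of the enclosed air.

```python
import math


def resonance_frequency_hz(moving_mass_kg, stiffness_n_per_m):
    """Natural frequency of an ideal mass-spring system: f = sqrt(k/m) / (2*pi)."""
    return math.sqrt(stiffness_n_per_m / moving_mass_kg) / (2.0 * math.pi)


# Hypothetical values: 20 g of moving mass on a 2000 N/m suspension.
example_f0 = resonance_frequency_hz(0.020, 2000.0)
```

With those assumed values, example_f0 comes out at roughly 50 Hz.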
The air moved inside the box will also reflect off the walls of the box and,
some time later, hit the cone, causing a new cone movement and a new voltage
at the terminals of the moving coil.
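
To get a feeling for that "some time later", here is a small sketch of the
round trip of a reflection. It assumes a single flat wall straight behind the
cone and a speed of sound of about 343 m/s (air at roughly 20 °C); the 0.3 m
depth is a hypothetical figure.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate, air at about 20 degrees C


def round_trip_delay_s(distance_to_wall_m):
    """Time for a wave to travel from the cone to the wall and back."""
    return 2.0 * distance_to_wall_m / SPEED_OF_SOUND_M_S


# Hypothetical box with 0.3 m between the cone and the back wall.
example_delay = round_trip_delay_s(0.3)
```

For that assumed depth the echo reaches the cone after about 1.75 ms.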
How the hell will your system recognize which of the coil voltages mentioned above are the desired ones and which are not?
In PWM motor controls, the coil voltage is measured during the off state of the power stage and stored in a sample-and-hold circuit; this is how a tachometric voltage is obtained. You can't do that here without generating severe audio distortion.
If you can't tell them apart, then at some time offset the received acoustic
wave, converted back into an electrical signal, will be out of phase. Your
system may then attempt to compensate for a signal that is out of phase
electrically as well as mechanically and acoustically. At that point the
feedback becomes positive (regenerative), and your system will undoubtedly
oscillate.
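
A rough way to see when this happens is to treat the returning wave as a pure
time delay, which is a simplification of mine: it ignores amplitude and every
other phase shift in the loop, and real oscillation also needs enough loop
gain. Under that assumption, a delay shifts the phase by 360 degrees times
frequency times delay. The 2 ms delay used in the example is hypothetical.

```python
import math


def phase_lag_deg(frequency_hz, delay_s):
    """Phase lag, in degrees, that a pure time delay adds at a given frequency."""
    return 360.0 * frequency_hz * delay_s


def first_inversion_frequency_hz(delay_s):
    """Lowest frequency at which the delay alone gives a 180 degree lag."""
    return 1.0 / (2.0 * delay_s)


def is_inverted(frequency_hz, delay_s):
    """True when the delayed signal opposes the original more than it follows it."""
    return math.cos(math.radians(phase_lag_deg(frequency_hz, delay_s))) < 0.0
```

For an assumed delay of 2 ms, first_inversion_frequency_hz(0.002) is 250 Hz:
around that frequency a correction meant as negative feedback arrives with
the wrong sign.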
Cordially, Osvaldo.