In practice, baffle step normally ends up in the 3-4 dB range rather than the theoretical 6 dB, because of room effects. You are right that a 4th-order transition won't match it exactly, but as long as there is a filter of about the right order in about the right spot, the resulting ripple will probably be no worse than the ripples due to diffraction, driver FR variation, etc. I also don't think that many people are using 4th-order LR electrical crossovers; most people going for 4th-order acoustic slopes use 2nd- or 3rd-order electrical filters.
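To put rough numbers on that, here is a minimal sketch of a first-order shelving filter used as a baffle step compensator. The 500 Hz center frequency, the function names, and the choice of a simple pole/zero shelf are my own assumptions for illustration, not anything from a particular design; the real corner frequency depends on your baffle and has to be measured.

```python
import math

def shelf_gain_db(f_hz, f_center_hz, step_db):
    """Gain in dB of a first-order shelf: 0 dB at low frequencies,
    -step_db at high frequencies, -step_db/2 at f_center_hz."""
    # Place the pole and zero symmetrically (on a log axis) around the center.
    r = 10 ** (step_db / 40.0)
    f_pole = f_center_hz / r
    f_zero = f_center_hz * r
    num = math.hypot(1.0, f_hz / f_zero)  # |1 + j f/f_zero|
    den = math.hypot(1.0, f_hz / f_pole)  # |1 + j f/f_pole|
    return 20.0 * math.log10(num / den)

def worst_mismatch_db(f_center_hz, step_used_db, step_needed_db, freqs_hz):
    """Largest difference in dB between the shelf you built and the one you needed."""
    return max(
        abs(shelf_gain_db(f, f_center_hz, step_used_db)
            - shelf_gain_db(f, f_center_hz, step_needed_db))
        for f in freqs_hz
    )

if __name__ == "__main__":
    # Hypothetical sweep: 20 Hz to roughly 20 kHz in third-octave steps.
    freqs = [20.0 * 2 ** (k / 3.0) for k in range(31)]
    for f in (50.0, 500.0, 5000.0):
        print(f"{f:7.0f} Hz: {shelf_gain_db(f, 500.0, 4.0):6.2f} dB")
    print("4 dB shelf where 3 dB was needed, worst error:",
          round(worst_mismatch_db(500.0, 4.0, 3.0, freqs), 2), "dB")
```

With the same center frequency, guessing the step wrong by 1 dB can never cost you more than 1 dB anywhere in the band, which is small next to typical diffraction and driver ripple.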
You are correct about the limitations of state variable crossovers, but that is beside the point here: there is no need for an asymmetric crossover to correct for baffle step.
Nothing you say is wrong, but when designing real-world speakers you have to keep things in perspective. Nothing is flat, and the transition points and magnitude are going to have to be determined experimentally if you want them to be really accurate. If at the end of the day you can get a response that is within ±3 dB, you are doing great. There is no need to sweat an error of a dB or two, unless of course it is compounding another error. By the same token, an "error" in your baffle step filter may even serendipitously correct for another anomaly! This is why designing loudspeakers is an iterative process no matter how much modeling you do.
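As a rough illustration of the compounding point, here is a small sketch that checks whether a set of measured levels stays inside a ±3 dB window. The numbers are made up purely for illustration (they are not measurements of any real speaker), and using the mean level as the reference is my own assumption; you might prefer a fixed target level.

```python
from statistics import mean

def max_deviation_db(levels_db):
    """Largest deviation in dB from the average level."""
    ref = mean(levels_db)
    return max(abs(level - ref) for level in levels_db)

def within_window(levels_db, tol_db=3.0):
    """True if every point is within +/- tol_db of the average level."""
    return max_deviation_db(levels_db) <= tol_db

# Hypothetical SPL readings in dB at a handful of frequencies.
measured = [84.0, 85.5, 86.0, 83.5, 85.0, 87.0]

# Same readings, but with a 2 dB filter error landing on the existing peak.
compounded = measured[:-1] + [measured[-1] + 2.0]

if __name__ == "__main__":
    print("as measured:", within_window(measured))
    print("with a 2 dB error stacked on the peak:", within_window(compounded))
```

The same 2 dB error applied to the dip at 83.5 dB instead would pull the response flatter, which is the "serendipitous" case.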