Is Fullrange my best bet?

That's a big one. Each of the two mics also captures both speakers, resulting in arbitrary comb filtering based on the distances between the four devices. Add to that a baffle step from a cabinet configuration you may not use.
Without giving it much thought, a better approach could have been mounting each driver in the centre of a 4x8 sheet of ply, with the back of the baffle pointed into a damped corner or covered with stood-off heavy acoustic blankets, and recording in mono.
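To put rough numbers on the comb filtering mentioned above, here is a minimal sketch that lists the notch frequencies when one mic picks up the same signal over two paths of different length. It assumes both arrivals are at equal level and in the same polarity, and a speed of sound of 343 m/s; the function name and the distances are made up for illustration, not taken from the actual recordings.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def comb_notches(path_a_m, path_b_m, f_max=20000.0):
    """Return comb-filter notch frequencies (Hz) up to f_max.

    Two equal-level, same-polarity arrivals cancel wherever the path
    difference equals an odd number of half wavelengths:
        f_k = (2k + 1) * c / (2 * delta)
    """
    delta = abs(path_a_m - path_b_m)
    if delta == 0:
        return []  # identical path lengths: no cancellation
    first = SPEED_OF_SOUND / (2.0 * delta)
    notches = []
    k = 0
    while (2 * k + 1) * first <= f_max:
        notches.append((2 * k + 1) * first)
        k += 1
    return notches


# Hypothetical example: near speaker at 1.0 m, far speaker at 1.343 m.
for f in comb_notches(1.0, 1.343, f_max=5000.0):
    print(f"{f:.0f} Hz")
```

In a real recording the far speaker arrives at a lower level, so the notches are shallower than this idealised case suggests, but their spacing still follows the path difference.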
 
That's a big one. Each of the two mics also captures both speakers, resulting in arbitrary comb filtering based on the distances between the four devices.
This is true. Now that you mention it, something similar also tends to happen near a crossover frequency. If we just focus on the part where the woofer and tweeter are within about 20 dB of each other for a 2nd-order filter, that could be a span of 3 to 4 octaves, or up to a 16:1 frequency range, where interference effects would be at a maximum.
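As a quick sanity check on the overlap span quoted above, here is a rough sketch. It assumes ideal complementary low-pass and high-pass slopes of the same order, so the level difference between the two drivers grows by about 6 dB per octave per filter order on either side of the crossover; real acoustic slopes also include the drivers' own roll-off, which this ignores. The function name is hypothetical.

```python
import math


def overlap_span(threshold_db, order):
    """Frequency span where two drivers are within threshold_db of each other.

    Assumes ideal complementary filters, where the level difference is
        20 * order * log10(f / fc)  dB,
    so the overlap region is symmetric around fc on a log frequency axis.
    Returns (frequency_ratio, octaves).
    """
    decades_each_side = threshold_db / (20.0 * order)
    ratio = 10.0 ** (2.0 * decades_each_side)
    return ratio, math.log2(ratio)


for threshold in (20.0, 24.0):
    ratio, octaves = overlap_span(threshold, order=2)
    print(f"within {threshold:.0f} dB: {ratio:.1f}:1, {octaves:.2f} octaves")
```

Under these assumptions a 20 dB window works out to about 10:1 (a bit over 3 octaves) and a 24 dB window to about 16:1 (4 octaves), so the figure depends on exactly where you draw the line.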
 
Probably just convenience: if you're lucky, you stumble upon a driver whose starting point is closer to what you want, so it requires less work.

A systematic error in the recording chain or headphones (etc.) could throw off the baseline performance, but it seems unlikely to outright lie about the relative differences.
 
I've had especially good experience judging relative performance. Absolute performance is harder to tell, since there are too many sources of error in YouTube recordings (recording axis, YouTube codec, microphones, playback system...). It might be that your headphones color the sound in such a way that you prefer a specific driver that would sound kinda **** in real life, just because it's a good match to your current frequency response.

I use the Harman target for headphones (slightly modified) and flat (slightly tilted) on speakers, so I'm at least somewhat certain that I get a minimally colored sound.

The Pluvia 7.2 HD may have just stood out for me because it has the smoothest/least highs and the most bass. It's kinda hard to tell.