I've been testing productivity apps across different devices and I'm seeing huge performance variance that makes comparison reviews really difficult. An app that runs smoothly on a high-end laptop might be unusable on a budget tablet, even though they're technically running the same software.
How do you account for performance variance in your testing? Do you test on multiple device types, or do you focus on a standard test setup? And how do you communicate this variance to users who might have very different hardware than your test setup?
I test on a range of devices and clearly document my test setup. For mobile apps, that means testing on last year's flagship, a current mid-range device, and a budget device. For desktop software, I test on systems with different amounts of RAM, different CPUs, and different storage types.
The performance variance is actually a feature, not a bug, in my testing. I report the range of performance I see across different systems, along with the most common performance level. This gives users a realistic expectation - most users will see X performance, but it could be as low as Y or as high as Z depending on their setup.
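A rough sketch of that kind of summary in Python, in case it helps. The device names and load times are made-up illustration data, and using the median as the "most common" level is my assumption; a mode or a per-tier breakdown could fit better depending on the data.

```python
from statistics import median


def summarize_range(timings_by_device):
    """Return (low, typical, high) over every run on every device.

    'typical' is the median of all runs, which is an assumed stand-in
    for the 'most common performance level'.
    """
    all_runs = [t for runs in timings_by_device.values() for t in runs]
    return min(all_runs), median(all_runs), max(all_runs)


# Hypothetical load times in seconds, three runs per device tier.
timings = {
    "flagship": [1.1, 1.2, 1.0],
    "mid_range": [2.4, 2.6, 2.5],
    "budget": [5.8, 6.3, 6.1],
}

low, typical, high = summarize_range(timings)
print(f"Typically {typical:.1f}s, ranging from {low:.1f}s to {high:.1f}s")
```

Pooling all runs like this weights each device tier equally only if every tier has the same number of runs, so keep the run counts balanced or summarize per tier.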
I use statistical methods to account for variance. Instead of just reporting average performance, I report confidence intervals. "This app typically loads in 2-3 seconds on mid-range hardware" is more useful than "average load time: 2.5 seconds" when you know there's significant variance.
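For anyone who wants to try this, here is a minimal sketch of a confidence interval for mean load time using only the Python standard library. It uses a normal approximation, which is my simplification; with only a handful of runs a t-based interval would be somewhat wider. The sample numbers are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(samples, confidence=0.95):
    """Approximate confidence interval for the mean of `samples`.

    Uses the normal approximation: mean +/- z * s / sqrt(n).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate spread")
    m = mean(samples)
    # z-score for a two-sided interval at the requested confidence level
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * stdev(samples) / sqrt(n)
    return m - margin, m + margin


# Hypothetical load times in seconds on one mid-range device.
load_times = [2.1, 2.4, 2.6, 2.9, 2.5, 2.3, 2.7, 2.5]
ci_low, ci_high = mean_confidence_interval(load_times)
print(f"Mean load time: {ci_low:.2f}s to {ci_high:.2f}s (95% CI)")
```

Note that this interval describes uncertainty about the average, not the spread of individual loads. If the goal is "most loads fall between X and Y", percentiles of the raw runs are the better thing to report.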
For mobile, I test with different network conditions and background app scenarios. An app might perform great on WiFi with no other apps running, but how does it perform on spotty cellular with 10 other apps running in the background? That's the real-world performance variance that matters to users.
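One way to keep those scenarios organized is to generate the full test matrix up front so no combination gets skipped. This is only a sketch: the network labels and background app counts below are placeholder values I picked, not a standard.

```python
from itertools import product

# Hypothetical conditions to combine; adjust to what you can actually simulate.
NETWORKS = ["wifi", "good_cellular", "spotty_cellular"]
BACKGROUND_APPS = [0, 5, 10]


def build_scenarios(networks, background_counts):
    """Return one dict per (network, background app count) combination."""
    return [
        {"network": network, "background_apps": count}
        for network, count in product(networks, background_counts)
    ]


scenarios = build_scenarios(NETWORKS, BACKGROUND_APPS)
for scenario in scenarios:
    print(scenario)
```

Recording results against each scenario makes it easy to report the best case (WiFi, nothing else running) next to the worst case (spotty cellular, 10 background apps).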