MultiHub Forum

Full Version: How do I interpret ROC AUC with imbalanced data and mismatched predictions?
I’ve been trying to get a better handle on my model's performance beyond just accuracy, so I started looking into the area under the ROC curve. Honestly, I’m a bit stuck on how to interpret the curve when my dataset is pretty imbalanced: the high AUC doesn’t seem to match what I see when I actually look at the predictions. I’m wondering if anyone else has felt that disconnect.
I hear you. The ROC curve can look great, and the AUC can be high, even when the actual predictions feel off on an imbalanced dataset.
The issue is that AUC measures ranking, not calibration, so in imbalanced cases it helps to look at calibration plots or the precision-recall curve as a complementary view.
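To make the "ranking, not calibration" point concrete, here is a minimal pure-Python sketch. Everything in it is an assumption for illustration: the toy data, the 5% positive rate, and the helper name `roc_auc`, which computes AUC as the fraction of positive/negative pairs ranked correctly. Shrinking every score by a constant factor ruins calibration but leaves the ranking, and so the AUC, unchanged.

```python
import random

def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.
    Ties count as half a win. Illustrative helper, not a library function."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

rng = random.Random(0)  # fixed seed so the toy data is reproducible

# Hypothetical imbalanced data: roughly 5% positives.
labels = [1 if rng.random() < 0.05 else 0 for _ in range(2000)]

# Hypothetical scores: positives tend to score higher, clipped to [0, 1].
scores = [min(1.0, max(0.0, rng.gauss(0.6 if y == 1 else 0.3, 0.15)))
          for y in labels]

# Same ordering, very different "probabilities": every score is quartered.
shrunk = [0.25 * s for s in scores]

print("AUC, original scores:", roc_auc(labels, scores))
print("AUC, shrunk scores:  ", roc_auc(labels, shrunk))
print("positive rate:       ", sum(labels) / len(labels))
print("mean original score: ", sum(scores) / len(scores))
print("mean shrunk score:   ", sum(shrunk) / len(shrunk))
```

The two AUC values match because only the order of the scores matters. The mean scores differ a lot, which is the kind of mismatch a calibration plot would expose and AUC cannot.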
I used to assume the ROC curve tells me how many correct positives I get, but that's not quite right. In practice it reflects how well the scores are ranked, not counts.
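A quick bit of arithmetic shows why rates and counts diverge under imbalance. The class sizes and the operating point below are made-up numbers, and `precision_from_rates` is a hypothetical helper, so treat this as a sketch rather than a result from anyone's model.

```python
def precision_from_rates(tpr, fpr, n_pos, n_neg):
    """Convert a ROC operating point (TPR, FPR) into precision for given class sizes.
    Assumes at least one example is predicted positive (tp + fp > 0)."""
    tp = tpr * n_pos  # expected true positives
    fp = fpr * n_neg  # expected false positives
    return tp / (tp + fp)

# Hypothetical imbalanced set: 100 positives, 10,000 negatives.
n_pos, n_neg = 100, 10_000

# Hypothetical operating point that looks strong on a ROC plot.
tpr, fpr = 0.90, 0.05

print("true positives: ", tpr * n_pos)
print("false positives:", fpr * n_neg)
print("precision:      ", precision_from_rates(tpr, fpr, n_pos, n_neg))

# The same operating point on a balanced set of 100 vs 100.
print("precision if balanced:", precision_from_rates(tpr, fpr, 100, 100))
```

With these numbers a 5% false positive rate still means several hundred false alarms against 90 hits, so most flagged cases are wrong even though the ROC point looks good. The same point on balanced classes gives a much higher precision.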
Maybe the smarter move is to accept that a single curve hides a lot and to mix metrics instead. Is calibration a better lens here?
Try a framing shift: on an imbalanced set, the ROC curve is a tool for comparing rankings rather than for mapping errors. That change in framing might help.
Consider a calibration view and the precision-recall curve to see where the mispredictions lie. The idea is to look beyond the ROC curve alone.
In my practice I tune thresholds and watch the confusion matrix at several cut points, and I still notice the ROC curve hiding a gap.
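For anyone who wants to try the threshold sweep described above, here is a small sketch. The labels, scores, and cut points are invented toy values, and `confusion_at` is a hypothetical helper name; swap in your own model outputs.

```python
def confusion_at(labels, scores, threshold):
    """Return (tp, fp, fn, tn), predicting positive when score >= threshold."""
    tp = fp = fn = tn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

# Toy imbalanced data (assumed for illustration): 2 positives, 18 negatives.
labels = [1, 1] + [0] * 18
scores = [0.9, 0.55,
          0.7, 0.6, 0.5, 0.45, 0.4, 0.35, 0.3, 0.3, 0.25,
          0.2, 0.2, 0.15, 0.1, 0.1, 0.1, 0.05, 0.05, 0.05]

for threshold in (0.3, 0.5, 0.7, 0.9):
    tp, fp, fn, tn = confusion_at(labels, scores, threshold)
    # Precision and recall are undefined when their denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    print(f"threshold={threshold:.1f}  TP={tp} FP={fp} FN={fn} TN={tn}  "
          f"precision={precision:.2f} recall={recall:.2f}")
```

Reading the raw counts at each cut point makes it easier to see where false positives pile up, which a single AUC number summarizes away.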