How do I interpret ROC AUC with imbalanced data and mismatched predictions?
#1
I’ve been trying to get a better handle on my model's performance beyond just accuracy, so I started looking into the area under the ROC curve. Honestly, I’m a bit stuck on how to interpret the curve when my dataset is pretty imbalanced—the high score doesn’t seem to match what I see when I actually look at the predictions. I’m wondering if anyone else has felt that disconnect.
Reply
#2
I hear you. The ROC curve can look great and the AUC can be high even when the actual predictions feel off in an imbalanced dataset.
Reply
#3
The issue is that AUC measures ranking, not calibration, so in imbalanced cases you may need to look at calibration plots or the precision-recall curve as a complementary view.
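To make that concrete, here is a rough sketch (the dataset and model are made up for illustration, not anything from the original post): on a synthetic set with roughly 5% positives, ROC AUC can look comfortable while average precision, the usual summary of the precision-recall curve, is far less flattering.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, average_precision_score
from sklearn.model_selection import train_test_split

# Synthetic data with roughly 5% positives (illustrative settings only)
X, y = make_classification(n_samples=20000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

# ROC AUC only cares about how positives rank against negatives;
# average precision (area under the PR curve) is sensitive to the base rate.
print("ROC AUC:          ", roc_auc_score(y_te, scores))
print("Average precision:", average_precision_score(y_te, scores))
```

On a 95/5 split like this the two numbers often diverge noticeably, which is exactly the gap described above.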
Reply
#4
I used to assume the ROC curve tells me how many correct positives I get, but that is not quite right: in practice it ranks scores, not counts.
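A tiny way to see the ranking point (toy numbers, nothing from a real model): any monotonic transform of the scores leaves the AUC unchanged, because only the ordering of positives versus negatives matters, and neither version of the scores tells you how many positives a given cutoff would actually catch.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy labels and scores, purely illustrative
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
scores = np.array([0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.6, 0.7])

# Cubing the scores changes every value but preserves their order
squashed = scores ** 3

print(roc_auc_score(y_true, scores))    # 1.0
print(roc_auc_score(y_true, squashed))  # identical: AUC only sees ranks

# How many positives you actually "get" is a threshold question AUC never answers
print((scores >= 0.5).sum(), "predicted positive at a 0.5 cutoff")
```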
Reply
#5
Maybe the smarter move is to accept that a single curve hides a lot and to mix metrics instead. Is calibration a better lens here?
Reply
#6
Think of it as a shift in framing: the ROC curve is a tool for comparing rankings rather than for mapping errors in an imbalanced set, and that change in perspective might help.
Reply
#7
Consider a calibration view and the precision-recall curve to see where the mispredictions lie; the idea is to nudge you beyond the ROC curve alone.
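If it helps, a sketch of both views (the labels and scores below are random stand-ins; swap in your own held-out data):

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import precision_recall_curve

# Random stand-ins for held-out labels and model scores (replace with yours)
rng = np.random.default_rng(0)
y_te = rng.binomial(1, 0.05, size=5000)
scores = np.clip(y_te * 0.4 + rng.normal(0.3, 0.2, size=5000), 0.0, 1.0)

# Calibration: do predicted probabilities match observed positive rates?
prob_true, prob_pred = calibration_curve(y_te, scores, n_bins=10)
for p_pred, p_true in zip(prob_pred, prob_true):
    print(f"mean predicted {p_pred:.2f} -> observed positive rate {p_true:.2f}")

# Precision-recall: how much precision survives as recall increases?
precision, recall, _ = precision_recall_curve(y_te, scores)
print("precision at the high-recall end:", round(precision[0], 3))
```

Plotting both views (for example with scikit-learn's CalibrationDisplay and PrecisionRecallDisplay, if your version has them) is clearer than printouts, but even the printed bins show whether a score of, say, 0.8 really corresponds to an 80% positive rate.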
Reply
#8
In my practice I tune thresholds and watch the confusion matrix at several cut points, and I still notice the ROC curve hiding a gap.
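For anyone curious what that looks like in code, a minimal version of the loop (the labels and scores below are synthetic placeholders for a real model's held-out output, and the cut points are arbitrary):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Synthetic placeholders for held-out labels and model scores
rng = np.random.default_rng(1)
y_true = rng.binomial(1, 0.05, size=10000)
scores = np.clip(y_true * 0.35 + rng.normal(0.3, 0.15, size=10000), 0.0, 1.0)

# Sweep a few cut points and watch how the confusion matrix shifts;
# a high AUC says nothing about which of these trade-offs you end up with.
for threshold in (0.3, 0.5, 0.7):
    y_pred = (scores >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print(f"threshold {threshold:.1f}: TP={tp} FP={fp} FN={fn} TN={tn}")
```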
Reply