Why is my dbt performance plateauing after scaling the data warehouse?
#1
I’ve been trying to get our reporting pipeline to run faster, and I keep hitting a wall with how we’re handling our data warehouse. We moved a bunch of our transformations into dbt last year, which helped for a while, but now even simple dashboard queries feel sluggish when pulling from the final modeled layer. I’m starting to wonder if the whole architecture is just wrong for our scale, or if I’m missing something obvious about materializations and incremental builds. Has anyone else had their performance gains from dbt sort of plateau and then fall off?
#2
dbt helped for a while, but the gains fade once the final modeled layer grows large. Sometimes the bottleneck moves from the transforms to the warehouse engine, and the dashboards end up paying the price. It could be the materializations or the way the incremental builds are wired, not just the model code. Have you measured the final select times against the precomputation steps?
#3
I wonder if the problem is not dbt but the warehouse you run on. At scale, the same model can be fine during an overnight run yet slow when interactive dashboards hit it. Maybe you are hitting concurrency limits or data skew. It could be a hardware or configuration issue rather than the modeling itself.
#4
Consider pushing more work into incremental models with a reliable unique key (`unique_key` in the dbt model config) and proper checks, so you avoid full refreshes. Often the speedup comes from reprocessing only the dates that changed. Test the incremental logic and watch how the final table actually gets updated.
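
To make the "only the changed dates" idea concrete, here is a rough sketch in plain Python with an in-memory SQLite database standing in for the warehouse. It is not dbt code: the table names (`raw_orders`, `daily_orders`) and the `incremental_load` helper are made up for illustration, and it uses a delete-then-insert pattern on the date key, which is only one of several ways to do an incremental build.

```python
import sqlite3


def build_demo():
    """Create a tiny raw table and an empty modeled table (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_orders (order_date TEXT, amount REAL)")
    conn.execute(
        "CREATE TABLE daily_orders (order_date TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (?, ?)",
        [("2024-01-01", 10.0), ("2024-01-01", 5.0), ("2024-01-02", 7.0)],
    )
    return conn


def incremental_load(conn, since_date):
    """Rebuild only the dates on or after since_date; older rows are untouched."""
    with conn:  # one transaction: readers never see the half-deleted state
        conn.execute(
            "DELETE FROM daily_orders WHERE order_date >= ?", (since_date,)
        )
        conn.execute(
            """
            INSERT INTO daily_orders (order_date, total)
            SELECT order_date, SUM(amount)
            FROM raw_orders
            WHERE order_date >= ?
            GROUP BY order_date
            """,
            (since_date,),
        )


conn = build_demo()
incremental_load(conn, "0000-00-00")  # first run behaves like a full refresh

# Late-arriving data for one existing date plus one new date.
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [("2024-01-02", 3.0), ("2024-01-03", 4.0)],
)
incremental_load(conn, "2024-01-02")  # 2024-01-01 is not recomputed

rows = conn.execute(
    "SELECT order_date, total FROM daily_orders ORDER BY order_date"
).fetchall()
print(rows)
```

The thing to test in your own models is the same as here: after a late-arriving row lands, does the incremental run produce the same table a full refresh would?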
#5
Data gravity can slow you down a lot. A raw layer that is fast today can drift as the volume grows week over week. The goal is to serve fast dashboards, not just neat tables, so think about precomputed aggregates on top of the modeled layer.
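
A small sketch of what I mean by a precomputed aggregate, again with SQLite standing in for the warehouse. The table names (`fct_events`, `agg_daily_revenue`) and the data are invented for the example; the point is only that the dashboard query reads a much smaller table and still gets the same answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fct_events (event_date TEXT, region TEXT, revenue INTEGER)"
)

# Synthetic detail rows: 7 days x 3 regions x 100 events each.
detail_rows = [
    (f"2024-01-{day:02d}", region, day * 10 + i)
    for day in range(1, 8)
    for i, region in enumerate(["amer", "apac", "emea"])
    for _ in range(100)
]
conn.executemany("INSERT INTO fct_events VALUES (?, ?, ?)", detail_rows)

# Precompute once per build instead of once per dashboard load.
conn.execute(
    """
    CREATE TABLE agg_daily_revenue AS
    SELECT event_date, region, SUM(revenue) AS revenue, COUNT(*) AS events
    FROM fct_events
    GROUP BY event_date, region
    """
)

dashboard_sql = "SELECT region, SUM(revenue) FROM {table} GROUP BY region ORDER BY region"
from_detail = conn.execute(dashboard_sql.format(table="fct_events")).fetchall()
from_aggregate = conn.execute(dashboard_sql.format(table="agg_daily_revenue")).fetchall()

detail_count = conn.execute("SELECT COUNT(*) FROM fct_events").fetchone()[0]
aggregate_count = conn.execute("SELECT COUNT(*) FROM agg_daily_revenue").fetchone()[0]
print(detail_count, aggregate_count, from_detail == from_aggregate)
```

This only works cleanly for additive measures like sums and counts. Things like distinct counts or medians cannot simply be re-summed from a daily rollup, so check each metric before routing a dashboard to the aggregate.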
#6
Maybe the framing is wrong: "speed up the dashboard" might not be the right north star. Measure end-to-end latency and compare serving paths before chasing more dbt tweaks.
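
If it helps, this is roughly the kind of harness I have in mind, as a sketch only. `measure` and `compare_paths` are names I made up, and the two lambdas are stand-ins for real calls (for example, one function that queries the modeled table directly and one that goes through whatever the dashboard actually uses).

```python
import statistics
import time


def measure(fn, repeats=5):
    """Time fn() several times and summarize wall-clock latency in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {"median_s": statistics.median(samples), "max_s": max(samples)}


def compare_paths(paths, repeats=5):
    """Run the same measurement over each named serving path."""
    return {name: measure(fn, repeats) for name, fn in paths.items()}


# Placeholder workloads; swap in functions that hit your real serving paths.
results = compare_paths(
    {
        "direct_model_query": lambda: sum(range(200_000)),
        "preaggregated_query": lambda: sum(range(1_000)),
    }
)
for name, stats in results.items():
    print(f"{name}: median={stats['median_s']:.6f}s max={stats['max_s']:.6f}s")
```

Looking at the median and the max together matters for dashboards: a path with a fine median but an ugly max usually points at queueing or cold caches rather than at the model SQL.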
#7
I like to think of dbt as a story builder. When the story grows long, the reader tires. You could try shorter chapters, like isolated sub-models and clear materialization choices, to keep the pace.
#8
You might be hitting a single hot data path, where a tiny change in the dashboard layer or a caching layer makes a big difference. dbt is not the only lever here.
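
For what a caching layer can look like at its simplest, here is a minimal time-to-live cache sketch. `TTLCache` and `expensive_query` are hypothetical names, not part of dbt or any BI tool, and a real setup would more likely lean on the caching your warehouse or dashboard product already provides.

```python
import time


class TTLCache:
    """Keep a computed result for ttl_seconds, then recompute on next access."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}     # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh enough: skip the expensive call
        value = compute()
        self._store[key] = (now, value)
        return value


# Demo with a fake clock and a counter in place of a real warehouse query.
fake_now = [0.0]
query_calls = []


def expensive_query():
    query_calls.append(1)
    return {"revenue": 1234}


cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])
cache.get_or_compute("revenue_tile", expensive_query)  # miss: runs the query
cache.get_or_compute("revenue_tile", expensive_query)  # hit: served from cache
fake_now[0] = 61.0
cache.get_or_compute("revenue_tile", expensive_query)  # expired: runs again
print(len(query_calls))
```

The trade-off is staleness: pick a TTL no longer than the interval between your dbt runs, or the dashboard can show data older than the table it reads from.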