Sales report with window function times out on 50M-row transaction table
#1
I'm a database administrator for an e-commerce platform, and our nightly sales report query has started timing out as our transaction table has grown past 50 million rows. The query involves multiple joins and a window function to calculate running totals, and while the execution plan suggests an index scan, it's still taking over 20 minutes. I've tried adding composite indexes on the join and filter columns, but I'm not seeing the performance gain I expected. I'm wondering if I need to reconsider the entire query structure or look at partitioning strategies.
Reply
#2
That setup is a classic case for partitioning and pre-aggregation. Try range-partitioning the fact table by sale_date (daily or monthly chunks) so that partition pruning lets the nightly query touch only the relevant slices. It’s not magic, but you’ll often see a dramatic cut in scan time once you isolate the hot data.
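A minimal sketch of that layout in PostgreSQL syntax, assuming a fact table with a sale_date column (all table and column names here are illustrative, not the OP's actual schema):

```sql
-- Hypothetical fact table, range-partitioned by month on sale_date.
CREATE TABLE sales (
    sale_id     bigint        NOT NULL,
    sale_date   date          NOT NULL,
    customer_id bigint        NOT NULL,
    amount      numeric(12,2) NOT NULL
) PARTITION BY RANGE (sale_date);

-- One partition per month; the nightly report then scans only the
-- partitions its date filter touches (partition pruning).
CREATE TABLE sales_2024_01 PARTITION OF sales
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE sales_2024_02 PARTITION OF sales
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- A query with a plain date-range predicate prunes to one partition;
-- EXPLAIN should show only sales_2024_01 being scanned.
SELECT sum(amount)
FROM sales
WHERE sale_date >= '2024-01-10' AND sale_date < '2024-01-20';
```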
Reply
#3
From a project I worked on, the fastest path was a rolling summary table. Compute daily aggregates (sales totals, running totals) into a separate table, refresh it every night, and keep the detail rows in another store for ad hoc needs. Then rewrite the report to use the summary plus a light join. You keep query speed without sacrificing accuracy, and staleness is bounded by a known refresh window.
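Roughly like this, in PostgreSQL syntax (the sales table and its columns are assumed, not taken from the OP's schema):

```sql
-- Hypothetical rolling summary: one row per day, refreshed nightly.
CREATE TABLE daily_sales_summary (
    sale_date   date PRIMARY KEY,
    total_sales numeric(14,2) NOT NULL,
    order_count bigint        NOT NULL
);

-- Nightly refresh: recompute only yesterday's slice of the detail table.
INSERT INTO daily_sales_summary (sale_date, total_sales, order_count)
SELECT sale_date, sum(amount), count(*)
FROM sales
WHERE sale_date = current_date - 1
GROUP BY sale_date
ON CONFLICT (sale_date) DO UPDATE
    SET total_sales = EXCLUDED.total_sales,
        order_count = EXCLUDED.order_count;

-- The report now computes the running total over ~365 summary rows
-- instead of 50M detail rows.
SELECT sale_date,
       total_sales,
       sum(total_sales) OVER (ORDER BY sale_date) AS running_total
FROM daily_sales_summary;
```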
Reply
#4
Before rearchitecting, compare the plan's row estimates against reality. Update statistics, re-analyze, and look at the actual row counts fed into each step. If the optimizer misestimates, you may be missing helpful indexes or joining more broadly than needed. Consider breaking the query into steps: first materialize the big joined subset into a temp table, then run the window function on that smaller result. Also verify that parallelism is enabled and that you're using an index-only path where possible.
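The two-step approach, sketched in PostgreSQL syntax with an assumed schema (sales, customers, and the date filter are illustrative stand-ins for the OP's joins):

```sql
-- Refresh planner statistics so row estimates match reality.
ANALYZE sales;

-- Step 1: materialize the filtered, joined subset once.
CREATE TEMP TABLE report_base AS
SELECT s.sale_date, s.customer_id, s.amount
FROM sales s
JOIN customers c ON c.customer_id = s.customer_id  -- hypothetical join
WHERE s.sale_date >= date '2024-01-01';

-- Step 2: run the window function over the much smaller temp table,
-- instead of over the full 50M-row join result.
SELECT sale_date,
       customer_id,
       amount,
       sum(amount) OVER (PARTITION BY customer_id
                         ORDER BY sale_date) AS running_total
FROM report_base;
```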
Reply
#5
Partitioning helps, but not for every workload. If your window function partitions by a large dimension, make sure you partition the table on a field that actually reduces the work (date, region), not one with a uniform distribution. Also choose indexes that support the join order, and make sure the filter is sargable so it can be applied before the joins.
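To make the sargability point concrete, a small example in PostgreSQL syntax (table and column names assumed):

```sql
-- Non-sargable: wrapping the column in a function defeats a plain
-- index on sale_date and blocks partition pruning.
SELECT * FROM sales
WHERE date_trunc('month', sale_date) = '2024-01-01';

-- Sargable rewrite: a bare range predicate on the column itself can
-- use the index and prune partitions, with identical results.
SELECT * FROM sales
WHERE sale_date >= '2024-01-01' AND sale_date < '2024-02-01';
```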
Reply
#6
Which database are you on (PostgreSQL, MySQL, SQL Server, Oracle)? Partitioning features and window function performance vary a lot, and the exact plan hinges on your engine. Also, which joins are high-cardinality? And what is your current statistics refresh cadence and maintenance window?
Reply
#7
Hold off on a full rewrite. A staged plan often pays off: 1) add a summary table or materialized view; 2) pre-filter with targeted indexes; 3) scope the window computation to run on a smaller dataset; 4) test against a parallel copy of production data to validate performance before promoting changes.
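Step 1 as a materialized view, sketched in PostgreSQL syntax (names are illustrative; other engines spell this differently):

```sql
-- Daily pre-aggregation as a materialized view.
CREATE MATERIALIZED VIEW mv_daily_sales AS
SELECT sale_date,
       sum(amount) AS total_sales,
       count(*)    AS order_count
FROM sales
GROUP BY sale_date;

-- A unique index is required for CONCURRENTLY below, and speeds up
-- date lookups in the report.
CREATE UNIQUE INDEX ON mv_daily_sales (sale_date);

-- Nightly refresh; CONCURRENTLY avoids blocking readers while the
-- view is rebuilt.
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_daily_sales;
```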
Reply

