This is one of those posts where I wish I could show actual code snippets, but since it involves a 3rd party vendor for one of my clients and I don’t have permission, I can’t.
So, I’m forced unfortunately to talk about the issue in a roundabout way.
My client uses a 3rd party tool to track documents. I’ve mentioned this before. They’ve been growing fairly fast and running into performance issues. I suppose growing fast is a good thing, but having performance issues is not.
In any case, using Query Store, I was able to send the vendor a list of queries and stats about them for them to review and to ideally improve the queries that needed work.
Yesterday they got back to me. The email was essentially we took this first query (let’s call it Doubly-Joined) and rewrote it as this second query (let’s call it Singly-Joined). I looked at the two queries, which join 4 tables. They’re very similar to each other, but the first one did join in the main table a second time (hence why I’m calling it Doubly-Joined). It’s not clear why this was done. The second query basically removed the second join and in the select clause, changed the aliases to the second join to the first join. This does give them a slightly different query plan, but ultimately, they return the same number of rows.


As you can see, the 2nd query plan is definitely a bit simpler (ignore the one warning, it’s not something that appears to be fixable here).
So, a naive take would be “we removed an unnecessary join, so of course it should be faster!” But is it?
Sometimes intuition can be correct, sometimes not so much. In this case though, it’s easy to confirm by seeing exactly how many rows are being read in each query.
I wrapped each query in a
Set Statistics IO ON/OFF
Set Statistics TIME ON/OFF
block and ran it. Here are the results
The Doubly-Joined
Table 'Table1'. Scan count 0, logical reads 337264, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Table2'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Table3'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Table4'. Scan count 1, logical reads 396, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
The Singly-joined
Table 'Table1'. Scan count 0, logical reads 337260, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Table2'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Table3'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Table4'. Scan count 1, logical reads 396, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
I’ve highlighted the relevant change. The single-joined query consistently performed with 4 fewer logical reads. Now, if the original number had been 8 and had dropped in half to 4, I’d be happy. But the change from 337264 to 337260 leaves me a bit underwhelmed. Furthermore, under multiple runs, the second query did not consistently use less CPU time, sometimes it took faster to run. Further testing was consistent in the lack of apparent improvement.
Needless to say, I don’t think this query improvement will help much. I’ve reached out to the vendor to see if they can provide more details, but honestly, I’m not hoping for much.