Salesforce Developer Soql Sort Optimization


// Salesforce - Developer - SOQL - Use sort optimization on large queries:

Do you have a SOQL query or a report that takes a long time to complete because 
you are querying data from an object that has tens of millions of rows, yet your 
business requirements won’t allow you to add a selective filter? Read on to 
learn more about sort optimization, a simple technique that many developers and 
architects overlook when applying SOQL performance tuning best practices.

Let’s say that you contacted Customer Support to place a custom 
index on a field to optimize the performance of your SOQL query. As the number 
of records in your object increases, or the data distribution changes over time, 
your filters might no longer meet the appropriate query optimizer 
selectivity threshold.

When the filters don’t meet that selectivity threshold, the query 
optimizer does a full table scan instead of using the index, and the query takes 
significantly longer to complete or simply times out altogether.

For a variety of business reasons, it might be difficult to add filters to make 
the query more selective. But if you could add just one sort clause (ORDER BY) 
on an indexed field, which in many cases makes the data more consumable by end 
users, you would have a good chance of bringing back the performance benefits 
offered by using an index.

So how would adding a sort clause make the query more selective? It wouldn’t. 
Sort clauses have nothing to do with selectivity. You might also ask, “Wouldn’t 
adding an ORDER BY clause increase the overhead because Salesforce must complete 
a sorting task in addition to selecting the appropriate rows?” The answer is 
“no.” When you have an ORDER BY clause bound by a LIMIT clause–and chances are 
that you already have one in place to avoid hitting governor limits–the query optimizer considers using the index because the index, by 
nature, is a presorted list of values. The query starts from either end of the 
index, depending on your ascending or descending order preference, and then 
works its way up or down until it reaches the number or records specified by 
the LIMIT clause. It knows exactly when to stop searching and can return the 
query results without having to do a full table scan.

Let’s dig in a bit deeper. There are a few best practices that you should know 
and follow when applying the sort optimization technique.

1. Sort on a number or date field.  The query optimizer considers 
   number (e.g., order number, customer ID) or date (e.g., CreatedDate) data 
   type fields as candidates for sorting. If you need to use a field that does 
   not have a number or date type, you can add a “hidden” field, populate it 
   with converted number values, and then place an index on it.

2. Sort on the second field for a two-column index.  If your custom index is a 
   two-column index, make sure that you have an equality filter for the first 
   field and are sorting on the second field. The equality filter for the first 
   field doesn’t need to meet the query optimizer’s selectivity 
   threshold, but it does need to appear in the query’s WHERE clause for the query optimizer to consider using the two-column index.

3. Exclude null records.  If you sort on a field that might contain null 
   records–regardless of whether you actually have any–add a filter in the WHERE 
   clause to exclude those records. Otherwise, the query optimizer 
   won’t consider using the index. This means you cannot apply sort optimization 
   if you want to include records that have null values in the sorting fields as 
   part of your results.

4. Evaluate optimization.  When you have multiple filters, and the matching 
   records don’t appear anywhere close to the beginning or end of the index, 
   Salesforce might end up scanning a large number of records on the index. 
   Similarly, if you target a large number of records with your LIMIT clause, 
   the query might take a long time to complete.

Best Use Cases for Sort Optimization:  In summary, sort optimization takes 
advantage of available indexes by picking up records in a limited, orderly 
fashion. The best sort optimization use cases involve running your SOQL queries 
to list most recent purchase orders, or to find record set boundaries such as 
“largest,” “least,” “top 50,” and so on for your record set. Sort optimization 
might have your queries less of a performance boost for something like a list of 
the 5,000 customers who most recently purchased a product that the company 
stopped selling 10 years ago. This performance difference exists because the 
query might need to scan a few thousand records on the index before it finds 
the first matching record, then keep searching until it reaches the five 
thousandth record. Unless this product sold so well until 10 years ago that 
filtering on “product” does not meet the query optimizer’s 
appropriate selectivity threshold, the query optimizer is most likely 
going to use indexes on dates and products instead of using sort optimization.
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License