The Force.com query optimizer is an engine that sits between your SOQL, reports, and list views and the database itself. Because of salesforce.com’s multitenancy, the optimizer gathers its own statistics instead of relying on the underlying database statistics. Using both these statistics and pre-queries, the optimizer generates the most optimized SQL to fetch your data. It looks at each filter in your WHERE clause to determine which index, if any, should drive your query. To determine if an index should be used to drive a query, the Force.com query optimizer checks the number of records targeted by the filter against selectivity thresholds. For a standard index, the threshold is 30 percent of the first million targeted records and 15 percent of all records after that first million. In addition, the selectivity threshold for a standard index maxes out at 1 million total targeted records, which you could reach only if you had more than 5.6 million total records. So if you had 2.5 million accounts, and your SOQL contained a filter on a standard index, that index would drive your query if the filter targeted fewer than 525,000 accounts. SELECT id FROM Account WHERE CreatedDate > 2013-01-01T00:00:00Z (30% of 1 to 1 million targeted records) + (15% of 1 million to 2.5 million targeted records) = 300,000 + 225,000 = 525,000 For a custom index, the selectivity threshold is 10 percent of the first million targeted records and 5 percent all records after that first million. In addition, the selectivity threshold for a custom index maxes out at 333,333 targeted records, which you could reach only if you had more than 5.6 million records. If the previous query were changed so that it used a filter on a field with a custom index, the threshold for 2.5 million accounts would be 175,000. SELECT id FROM Account WHERE CustomIndexedDate__c > 2013-01-01T00:00:00Z (10% of 1 to 1 million targeted records) + (5% of 1 million to 2.5 million targeted records) = 100,000 + 75,000 = 175,000 In these standard index and custom index examples, the Force.com query optimizer does use the standard and custom indexes, as each number of targeted records falls below the appropriate selectivity threshold. If, on the other hand, the number of targeted records exceeds an index’s selectivity threshold, the Force.com query optimizer does not use that index to drive the query. Common Causes of Non-Selective SOQL Queries: 1. Having Too Much Data. Whether you’re displaying a list of records through a Visualforce page or through a list view, it’s important to consider the user experience. Pagination can help, but will your users really go through a list with thousands of records? You might not have this much data in your current implementation, but if you don’t have enough selective filters, these long lists can easily become an issue as your data grows. Design your SOQL, reports, and list views with large data volumes in mind. 2. Performing Large Data Loads. Large data loads and deletions can affect query performance. The Force.com query optimizer uses the total number of records as part of the calculation for its selectivity threshold. This number takes into account your recently deleted records. A deleted record remains in the Recycle Bin for 15 days—or even less time if you exceed your storage limit, and the record has been in the Recycle Bin for at least two hours—and then that record is actually removed from the Recycle Bin or flagged for a physical delete. When the Force.com query optimizer judges returned records against its thresholds, all of the records that appear in the Recycle Bin or are marked for physical delete do still count against your total number of records. From our earlier example of accounts and a custom indexed field, the selectivity threshold was 175,000, and the total number of records was 2.5 million. Let’s say that a Bulk API job runs and deletes all records before January 1, 2013, and those records total 2.4 million. That leaves us with 100,000 non-deleted account records. If the deleted records are still in the Recycle Bin, the Force.com optimizer mistakenly thinks that the 100,000 non-deleted records fall under and meet a 2.5 million-record selectivity threshold, and it generates a query that isn’t optimized. In reality, the threshold is 10,000 targeted records (10 percent of 100,000 targeted records). If the deleted records do not need to go to the Recycle Bin, use the hard delete option in the Bulk API or contact salesforce.com Customer Support to physically delete the records. If your data loads cause the records targeted by your filter to exceed the selectivity threshold, you might need to include additional filters to make your queries selective again. 3. Using Leading % Wildcards. A LIKE condition with a leading % wildcard does not use an index. SELECT id FROM Account WHERE Name LIKE ‘%Acme%’ This is the type of query that would normally work better with SOSL. However, if you need real-time results, an alternative is to create a custom search page, which restricts leading % wildcards and adds governance on the search string(s). Within a report/list view, the CONTAINS clause translates into ‘%string%’. 4. Using NOT and !=. When your filter uses != or NOT—which includes using NOT EQUALS/CONTAINS for reports, even if the field is indexed—the Force.com query optimizer can’t use the index to drive the query. For better performance, filter using = or IN, and the reciprocal values. Use the reciprocal values instead. SELECT id FROM Case WHERE Status != ‘Closed’ SELECT id FROM Case WHERE Status IN (‘New’, ‘On Hold’, ‘Pending’, ‘ReOpened’) The following query won’t use the index on the Status field because of the NOT. The problem with this query is that even if you change it to use the reciprocal values, the index on the Status field won’t meet the selectivity threshold because of the Closed value. In this case, you do want to add additional filters to reduce the number of records retrieved. SELECT id FROM Case WHERE NOT (Status IN (‘On Hold’, ‘Pending’, ‘ReOpened’)) SELECT id FROM Case WHERE Status IN ('New', 'Closed') AND Priority = ‘High’ Using a filter on an indexed field such as CreatedDate is always recommended, but this field was not included in the original query so that we could make a point about the selectivity threshold. 5. Using Complex Joins. Complex AND/OR conditions and sub-queries require the Force.com query optimizer to produce a query that is optimized for the join, but might not perform as well as multiple issued queries would. This is especially true with the OR condition. For Force.com to use an index for an OR condition, all of the fields in the condition must be indexed and meet the selectivity threshold. If the fields in the OR condition are in multiple objects, and one or more of those fields does not meet a selectivity threshold, the query can be expensive. Filters on formula fields that are non-deterministic can’t be indexed and result in additional joins. Common formula field practices include transforming a numeric value in a related object into a text string or using a complex transformation involving multiple related objects. In both cases, if you filter on this formula field, the Force.com query optimizer must join the related objects. If you have large data volumes and are planning to use this formula field in several queries, creating a separate field to hold the value will perform better than following either of the previous common practices. You’ll need to create a workflow rule or trigger to update this second field, have this new field indexed, and use it in your queries. If the number of records returned for an index is larger than a threshold, query performance wouldn’t improved by the use of that index.