Tuesday, May 29, 2018

Sitecore and Azure Search - contains a term that is too large to process

Similar to the PostFailedForSomeDocumentsException error, when I attempted to rebuild the master index on a Sitecore 9 instance in Azure PaaS, the following error appeared in the logs:
{"key":"a1c7786e795d5d2f16525bda3d7087b9","status":false,"errorMessage":"Field 'content_1' contains a term that is too large to process. The max length for UTF-8 encoded terms is 32766 bytes. The most likely cause of this error is that filtering, sorting, and/or faceting are enabled on this field, which causes the entire field value to be indexed as a single term. Please avoid the use of these options for large fields.","statusCode":400},
I traced the content_1 field back to the field definition for _content (where the main item content is stored) in the Sitecore.ContentSearch.Azure.DefaultIndexConfiguration.config configuration file:
<field fieldName="_content" cloudFieldName="content_1" searchable="YES" retrievable="NO" facetable="NO" filterable="NO" sortable="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.Azure.CloudSearchFieldConfiguration, Sitecore.ContentSearch.Azure" />
As the Azure Search error message suggests, I set filterable, sortable and facetable to NO on this field, as shown above. I am unlikely to need any of these features on a field that holds the bulk of the index's text.
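
To get a feel for the limit quoted in the error message, here is a minimal Python sketch that measures a value the way the message describes it, in UTF-8 bytes rather than characters. The function names are my own and are not part of Sitecore or Azure Search. The only figure taken from the error is the 32766 byte limit.

```python
# Limit quoted in the Azure Search error message above.
MAX_TERM_BYTES = 32766


def utf8_length(value: str) -> int:
    """Return the size of value in bytes once it is UTF-8 encoded."""
    return len(value.encode("utf-8"))


def exceeds_term_limit(value: str, limit: int = MAX_TERM_BYTES) -> bool:
    """True if value, indexed as a single term, would be over the limit."""
    return utf8_length(value) > limit


if __name__ == "__main__":
    ascii_text = "a" * 32766   # 1 byte per character, exactly at the limit
    accented = "é" * 20000     # 2 bytes per character in UTF-8
    print(exceeds_term_limit(ascii_text))
    print(exceeds_term_limit(accented))
```

Note that non-ASCII text reaches the limit with fewer characters, because each character can take more than one byte in UTF-8.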

I also found the same error for a mysterious field named 'script':
"errorMessage":"Field 'script' contains a term that is too large to process. The max length for UTF-8 encoded terms is 32766 bytes. The most likely cause of this error is that filtering, sorting, and/or faceting are enabled on this field, which causes the entire field value to be indexed as a single term. Please avoid the use of these options for large fields."
This field wasn't as easily traced, so I simply added a number of fields called script to the list of fields excluded from indexing. A patch file for the master index with Azure Search is shared below.
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/" xmlns:search="http://www.sitecore.net/xmlconfig/search/">
  <sitecore role:require="Standalone or ContentManagement" search:require="Azure">
    <contentSearch>
      <configuration>
        <indexes>
          <index id="sitecore_master_index">
            <configuration>
              <documentOptions ref="contentSearch/indexConfigurations/defaultCloudIndexConfiguration/documentOptions">
                <exclude hint="list:AddExcludedField">
                  <scriptField tag="{CD8DA5E2-3B65-4A14-B7A6-9F41181CE172}">{CD8DA5E2-3B65-4A14-B7A6-9F41181CE172}</scriptField>
                  <scriptField2 tag="{3AC4B854-6CF9-4B30-9D52-1E5518AFF0E8}">{3AC4B854-6CF9-4B30-9D52-1E5518AFF0E8}</scriptField2>
                  <scriptField3 tag="{DF23B990-55C1-401F-8F77-4698EDBD6FA9}">{DF23B990-55C1-401F-8F77-4698EDBD6FA9}</scriptField3>
                  <scriptField4 tag="{FF1383BB-F095-4958-9B6D-E555DB653C44}">{FF1383BB-F095-4958-9B6D-E555DB653C44}</scriptField4>
                  <scriptField5 tag="{CEDA3E7F-7406-40D0-A154-7CF7E3E7E85B}">{CEDA3E7F-7406-40D0-A154-7CF7E3E7E85B}</scriptField5>
                  <scriptField6 tag="{C0A5BD91-E658-46CE-B631-77CE337D8E6E}">{C0A5BD91-E658-46CE-B631-77CE337D8E6E}</scriptField6>
                  <scriptField7 tag="{B1A94FF0-6897-47C0-9C51-AA6ACB80B1F0}">{B1A94FF0-6897-47C0-9C51-AA6ACB80B1F0}</scriptField7>
                  <scriptField8 tag="{CD8DA5E2-3B65-4A14-B7A6-9F41181CE172}">{CD8DA5E2-3B65-4A14-B7A6-9F41181CE172}</scriptField8>
                </exclude>
              </documentOptions>
            </configuration>
          </index>
        </indexes>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>
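
If there are many failures in the log, it can help to list which fields are affected before deciding what to exclude. The following is a rough Python sketch that counts the field names found in error messages like the two quoted earlier. It assumes the log text contains the message wording exactly as shown in this post, and the function name and sample text are made up for illustration.

```python
import re
from collections import Counter

# Matches the wording of the Azure Search error quoted in this post
# and captures the field name between the single quotes.
FIELD_PATTERN = re.compile(
    r"Field '([^']+)' contains a term that is too large to process"
)


def oversized_fields(log_text: str) -> Counter:
    """Count how often each field name appears in 'too large' errors."""
    return Counter(FIELD_PATTERN.findall(log_text))


if __name__ == "__main__":
    # Shortened, hypothetical log excerpt for demonstration only.
    sample = (
        "\"errorMessage\":\"Field 'content_1' contains a term that is "
        "too large to process. The max length ...\"\n"
        "\"errorMessage\":\"Field 'script' contains a term that is "
        "too large to process. The max length ...\"\n"
        "\"errorMessage\":\"Field 'script' contains a term that is "
        "too large to process. The max length ...\"\n"
    )
    for name, count in oversized_fields(sample).most_common():
        print(name, count)
```

The names it reports are the Azure Search field names (such as content_1), so they still have to be mapped back to Sitecore fields by hand, as described above.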
