Configuring full-text search
Prerequisite
- Access to the Configuration > Full text search (CM032) menu.
Introduction
SoftExpert Configuration has a menu that provides system administrators with greater control over the behavior of full-text searches in SoftExpert Suite.
Understand how the screen is divided:
[Screenshot: full-text search configuration screen, with areas A to E marked]
A - The Status tab shows the availability of the search service and information on executed indexing runs: how many records and files have been indexed, and a report on the most recent runs.
Options to start the indexing process manually are also available, but they must only be used in extreme cases and under SoftExpert's guidance.
B - In the Relevances tab, you can increase or decrease the relevance given to each criterion considered in the full-text search. To do so, move the slider to the desired value, or type the value in the field next to it.
When setting configurations in this tab, take into account that they will impact the full-text search for all SoftExpert Suite users.
C - In the Attributes for filters tab, you can configure a list of attributes that will be available as filters to refine the search. When full-text search is used, they will appear in a side panel, in the Attributes section.
D - In the Synonyms tab, you can create synonym groups that will be used to refine full-text searches.
E - Click on the Save configurations button to apply everything edited in this menu to the full-text search.
See more details of each available tab:
Status
Search service
The Search service displays a different color and message depending on the health of the search engine. The possible statuses are listed below:
Status | Information |
GREEN | It indicates that both Elasticsearch and Atlas are working and there are no index issues. However, it does not necessarily mean that the environment is fully indexed. |
RED | It indicates that Atlas is failing to connect to Elasticsearch. Elasticsearch may be having a configuration or unavailability issue. |
YELLOW | It indicates that there are replicas configured for Elasticsearch, but they are not assigned correctly to their nodes. Elasticsearch reports this status when a replica shard cannot be assigned, for example because the only available node already holds the main shard. |
CHECK_ERROR | It indicates that SoftExpert Suite is failing to communicate with the Atlas service. Atlas may be having a configuration or unavailability issue. |
In an on-premise environment with Elasticsearch managed by SoftExpert Suite (automanaged), it is possible to perform actions such as restarting Elasticsearch and configuring its memory. These actions can also be used to deploy the search service again if it is not responding.
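The status table above can be sketched as a small lookup, which may be useful when reading the status from a monitoring script. This is only an illustration: the descriptions are paraphrased from the table, and the names `STATUS_INFO`, `describe_status`, and `fetch_elasticsearch_status` are hypothetical. The second function queries the standard Elasticsearch cluster health API directly and assumes the Elasticsearch address is reachable from where the script runs. The value it returns is Elasticsearch's own health, which is not necessarily what the Status tab displays, since the tab also reflects the Atlas service.

```python
import json
from urllib.request import urlopen

# Descriptions paraphrased from the status table in this section.
STATUS_INFO = {
    "GREEN": "Elasticsearch and Atlas are working and there are no index issues.",
    "YELLOW": "Replicas are configured but not assigned correctly to their nodes.",
    "RED": "Atlas is failing to connect to Elasticsearch.",
    "CHECK_ERROR": "SoftExpert Suite is failing to communicate with the Atlas service.",
}


def describe_status(status: str) -> str:
    """Return the description of a status, ignoring case and surrounding spaces."""
    key = status.strip().upper()
    return STATUS_INFO.get(key, f"Unknown status: {status!r}")


def fetch_elasticsearch_status(base_url: str, timeout: float = 5.0) -> str:
    """Ask Elasticsearch for its own cluster health (green, yellow, or red).

    base_url is an assumption, for example "http://localhost:9200".
    This bypasses Atlas, so it cannot return CHECK_ERROR.
    """
    url = f"{base_url.rstrip('/')}/_cluster/health"
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return payload["status"].upper()
```

For example, `describe_status("yellow")` returns the replica-related description, while an unrecognized value returns a message that starts with "Unknown status".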
Indexing
What is it meant for?
For Elasticsearch to operate correctly, its indexes must be created. The indexing process reads all records and files within its scope and builds these indexes.
How long can it take?
The indexing duration depends directly on the number of records and files to be indexed; in some cases, it may take days for the indexing to complete.
Which file extensions can be indexed?
Only the txt, doc, docx, xls, xlsx, pdf, eml, odt, ods, ppt, pptx, xml, rtf, and odp extensions can be indexed.
What is the maximum size of the files to be indexed?
The maximum size of a single file to be indexed is 30 MB.
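The two rules above (allowed extensions and maximum size) can be combined into a quick pre-check before uploading files. The sketch below is not part of SoftExpert Suite; `is_indexable` is a hypothetical helper. It assumes that extensions are compared without regard to case, that 1 MB means 1024 * 1024 bytes, and that a file of exactly 30 MB is still accepted. None of these details are stated in this article.

```python
from pathlib import Path

# Extensions listed in this section.
INDEXABLE_EXTENSIONS = frozenset({
    "txt", "doc", "docx", "xls", "xlsx", "pdf", "eml",
    "odt", "ods", "ppt", "pptx", "xml", "rtf", "odp",
})

# Assumption: 1 MB = 1024 * 1024 bytes, and the 30 MB limit is inclusive.
MAX_FILE_SIZE_BYTES = 30 * 1024 * 1024


def is_indexable(file_name: str, size_bytes: int) -> bool:
    """Return True if the file has an indexable extension and fits the size limit."""
    extension = Path(file_name).suffix.lstrip(".").lower()
    has_valid_extension = extension in INDEXABLE_EXTENSIONS
    has_valid_size = 0 <= size_bytes <= MAX_FILE_SIZE_BYTES
    return has_valid_extension and has_valid_size
```

With these assumptions, a 1 KB `report.PDF` passes the check, while a PNG image or a 31 MB PDF does not.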
Relevances
Default relevance values
Field | Default value |
ID # | 30 |
Title | 4 |
Attributes | 2 |
File name | 1.5 |
Description | 1 |
File content | 1 |
Exact word | 1 |
To adjust the value of a relevance criterion, move the slider to the desired value, or type the value in the field next to it. Values from 0 to 50 are accepted.
This menu also provides a button to restore the SoftExpert Suite default values.
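The default values and the 0 to 50 range can be represented as a small data structure, for example to document or review planned changes before applying them in the screen. This is a sketch only: SoftExpert Suite does not expose these functions, and `set_relevance` and `restore_defaults` are hypothetical names. It assumes that both 0 and 50 are accepted values.

```python
# Default relevance values from the table in this section.
DEFAULT_RELEVANCES = {
    "ID #": 30.0,
    "Title": 4.0,
    "Attributes": 2.0,
    "File name": 1.5,
    "Description": 1.0,
    "File content": 1.0,
    "Exact word": 1.0,
}

# Assumption: the limits 0 and 50 are both inclusive.
MIN_RELEVANCE = 0.0
MAX_RELEVANCE = 50.0


def set_relevance(relevances: dict, field: str, value: float) -> dict:
    """Return a copy of relevances with one field changed, after validating it."""
    if field not in DEFAULT_RELEVANCES:
        raise KeyError(f"Unknown relevance field: {field!r}")
    if not MIN_RELEVANCE <= value <= MAX_RELEVANCE:
        raise ValueError(f"Relevance must be between 0 and 50, got {value}")
    updated = dict(relevances)
    updated[field] = float(value)
    return updated


def restore_defaults() -> dict:
    """Return a fresh copy of the default relevance values."""
    return dict(DEFAULT_RELEVANCES)
```

Returning a copy instead of changing the dictionary in place keeps the defaults intact, which mirrors the restore button described above.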
How does relevance work?
Elasticsearch sorts results by a similarity score calculated with the Okapi BM25 algorithm. The algorithm considers the following factors when calculating the score:
- Frequency of the searched token in each of the indexed fields.
- Size of each indexed field containing the token.
- Number of documents with the token.
These three factors are then combined in a single calculation. For example, a document in which the "bucket" token appears twice in 1,000 words is typically more relevant than a document in which it appears three times in 10,000 words.
That is because the token makes up a larger share of the shorter text, so the size of the document can outweigh the frequency. If the token frequency increases enough, the ranking can change, since the score combines all three factors.
However, a high frequency does not raise the score proportionally: each additional occurrence of a token adds less to the score than the previous one. In addition, a token that appears in many documents is considered a "common token" and contributes less to the score of each of them.
Once the algorithm score is obtained, it is multiplied by the relevance value defined for the corresponding field.
Thus, it is important to check whether any attributes of the searched document contain the token, since they will also increase the search score.
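The calculation described above can be illustrated with the classic Okapi BM25 formula. The sketch below is not the code used by Elasticsearch or SoftExpert Suite. It assumes the commonly used parameters k1 = 1.2 and b = 0.75, and the collection figures in the example (average size of 5,000 words, 1,000 documents, 100 of them containing the token) are invented for illustration. The function names are hypothetical.

```python
import math

K1 = 1.2   # assumption: controls how quickly term frequency saturates
B = 0.75   # assumption: controls how strongly field size is normalized


def idf(total_docs: int, docs_with_token: int) -> float:
    """Inverse document frequency: rarer tokens get a higher weight."""
    return math.log(1 + (total_docs - docs_with_token + 0.5) / (docs_with_token + 0.5))


def bm25(tf: int, field_len: int, avg_field_len: float,
         total_docs: int, docs_with_token: int) -> float:
    """Classic Okapi BM25 score of one token in one field."""
    length_norm = K1 * (1 - B + B * field_len / avg_field_len)
    return idf(total_docs, docs_with_token) * tf * (K1 + 1) / (tf + length_norm)


def apply_relevance(score: float, relevance: float) -> float:
    """Multiply the BM25 score by the relevance value configured for the field."""
    return score * relevance


# The "bucket" example: 2 occurrences in 1,000 words vs 3 in 10,000 words.
# The average size and document counts below are invented for illustration.
short_doc = bm25(tf=2, field_len=1000, avg_field_len=5000, total_docs=1000, docs_with_token=100)
long_doc = bm25(tf=3, field_len=10000, avg_field_len=5000, total_docs=1000, docs_with_token=100)
```

Under these assumed values, `short_doc` is greater than `long_doc`, which matches the example in the text. A match in the Title field with the default relevance of 4 would then be weighted as `apply_relevance(short_doc, 4)`.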
Attributes for filters
In this tab, it is possible to configure a list of up to 10 attributes that will be available as filters to refine the search. Each configured attribute adds one more side filter for the user.
When full-text search is used, they will appear in the left side panel, in the Attributes section.
The attributes available for selection are the same ones that can be filled out, for example, when creating a document.
Synonyms
In this tab, it is possible to create synonym groups to be used in a full-text search.
A synonym group tells the search system that all words contained in the group have the same meaning. This allows searches done within SoftExpert Suite to return results tailored to the company's needs.
For example, if a synonym group containing the words "Collaborator", "Employee", and "User" is created, whenever "Collaborator" is searched for, the system will also show any records containing the words "Employee" and "User".
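The effect of a synonym group can be pictured as expanding the searched word into every word of its group before the search runs. The sketch below only illustrates that idea. How SoftExpert Suite actually passes synonym groups to Elasticsearch is not described in this article, and the names `build_synonym_index` and `expand_term`, as well as the case-insensitive matching, are assumptions.

```python
def build_synonym_index(groups: dict) -> dict:
    """Map each word (lowercased) to the full set of words in its group."""
    index = {}
    for words in groups.values():
        normalized = {word.lower() for word in words}
        for word in normalized:
            # A word that belongs to several groups accumulates all of them.
            index.setdefault(word, set()).update(normalized)
    return index


def expand_term(term: str, index: dict) -> set:
    """Return the term plus all of its synonyms; unknown terms stay alone."""
    key = term.lower()
    return set(index.get(key, {key}))


# Hypothetical group name, with the words from the example above.
groups = {"People": ["Collaborator", "Employee", "User"]}
index = build_synonym_index(groups)
terms = expand_term("Collaborator", index)
```

Here `terms` contains "collaborator", "employee", and "user", so a search for any one of them would also match records containing the other two.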
To add a synonym group, follow the steps below:
1. Access the Synonyms tab.
2. Click on the Add button.
3. Enter the group name.
4. Enter the desired synonym and click on the Add button. Repeat this step until all synonyms have been added to the group.
5. Click on the Save button.
Conclusion
All done! Now you know how to control the behavior of the system's full-text search.