# Best practices for Redis Query Engine performance


If you're using Redis Software or Redis Cloud, see the [best practices for scalable Redis Query Engine](http://redis.io/docs/latest/operate/oss_and_stack/stack-with-enterprise/search/scalable-query-best-practices) page.


## Checklist
Below are some basic steps to ensure good performance of the Redis Query Engine (RQE).

* Create a Redis data model with your query patterns in mind.
* Ensure the Redis architecture has been sized for the expected load using the [sizing calculator](http://redis.io/redisearch-sizing-calculator/).
* Provision Redis nodes with sufficient resources (RAM, CPU, network) to support the expected maximum load.
* Review [`FT.INFO`](http://redis.io/docs/latestcommands/ft.info) and [`FT.PROFILE`](http://redis.io/docs/latestcommands/ft.profile) outputs for anomalies and/or errors.
* Conduct load testing in a test environment with real-world queries and a load generated by either [memtier_benchmark](http://github.com/redislabs/memtier_benchmark) or a custom load application.

## Indexing considerations

### General
- Favor [`TAG`](http://redis.io/docs/latest/develop/ai/search-and-query/indexing/field-and-type-options#tag-fields) over [`NUMERIC`](http://redis.io/docs/latest/develop/ai/search-and-query/indexing/field-and-type-options#numeric-fields) for use cases that only require matching.
- Favor [`TAG`](http://redis.io/docs/latest/develop/ai/search-and-query/indexing/field-and-type-options#tag-fields) over [`TEXT`](http://redis.io/docs/latest/develop/ai/search-and-query/indexing/field-and-type-options#text-fields) for use cases that don’t require full-text capabilities (pure match).

### Non-threaded search
- Put only those fields used in your queries in the index.
- Only make fields [`SORTABLE`](http://redis.io/docs/latest/develop/ai/search-and-query/advanced-concepts/sorting) if they are used in [`SORTBY`](http://redis.io/docs/latest/develop/ai/search-and-query/advanced-concepts/sorting#specifying-sortby)
queries.
- Use [`DIALECT 2`](http://redis.io/docs/latest/develop/ai/search-and-query/advanced-concepts/dialects#dialect-2).

### Threaded (query performance factor or QPF) search
- Put both query fields and any projected fields (`RETURN` or `LOAD`) in the index.
- Set all fields to `SORTABLE`.
- Set TAG fields to [UNF](http://redis.io/docs/latest/develop/ai/search-and-query/advanced-concepts/sorting#normalization-unf-option).
- Optional: Set `TEXT` fields to `NOSTEM` if the use case will support it.
- Use [`DIALECT 2`](http://redis.io/docs/latest/develop/ai/search-and-query/advanced-concepts/dialects#dialect-2).

## Query optimization

- Avoid returning large result sets.  Use `CURSOR` or `LIMIT`.
- Avoid wildcard searches.
- Avoid projecting all fields (e.g., `LOAD *`). Project only those fields that are part of the index schema.
- If queries are long-running, enable threading (query performance factor) to reduce contention for the main Redis thread.

## Validate performance (`FT.PROFILE`)

You can analyze [`FT.PROFILE`](http://redis.io/docs/latestcommands/ft.profile) output to gain insights about query execution.
The following informational items are available for analysis:

- Total execution time
- Execution time per shard
- Coordination time (for multi-sharded environments)
- Breakdown of the query into fundamental components, such as `UNION` and `INTERSECT`
- Warnings, such as `TIMEOUT`

## Anti-patterns

When designing and querying indexes in RQE, certain practices can hinder performance, scalability, and maintainability. Below are some common anti-patterns to avoid:

- **Large documents**: storing excessively large documents in Redis makes data retrieval slower and increases memory usage. Break data into smaller, focused records whenever possible.
- **Deeply-nested fields**: retrieving or indexing deeply-nested JSON fields is computationally expensive. Use a flatter schema for better performance.
- **Large result sets**: fetching unnecessarily large result sets puts a strain on memory and network resources. Limit results to only what is needed.
- **Wildcarding**: using wildcard patterns indiscriminately in queries can lead to large and inefficient scans, especially if the index size is significant.
- **Large projections**: including excessive fields in query results increases memory overhead and slows down query execution. Limit projections to essential fields.

The following examples depict an anti-pattern index schema and query, followed by corrected versions designed for scalability with RQE.

### Anti-pattern index schema

The following schema introduces challenges for scalability and performance:

```sh
FT.CREATE jsonidx:profiles ON JSON PREFIX 1 profiles: 
          SCHEMA $.tags.* as t NUMERIC SORTABLE 
                 $.firstName as name TEXT 
                 $.location as loc GEO
```

Issues:

- Minimal schema definition: the schema is sparse and lacks fields like `lastName`, `id`, and `version` that might be frequently queried. This results in additional operations to fetch these fields separately, reducing efficiency.
- Missing `SORTABLE` flag for text fields: sorting operations on unsortable fields require full-text processing, which is slow.
- Wildcard indexing: `$.tags.*` creates a broad index that can lead to excessive memory usage and reduced query performance.

### Anti-pattern query

The following query is inefficient and not optimized for vertical scaling:

```sh
FT.AGGREGATE jsonidx:profiles '@t:[1299 1299]' LOAD * LIMIT 0 10
```
Issues:

- Wildcard projection (`LOAD *`): retrieving all fields in the result set is inefficient and increases memory usage, especially if the documents are large.
- Unnecessary fields: fields that aren't required for the current operation are still fetched, slowing down execution.
- Lack of advanced query syntax: without specifying a query dialect or leveraging features like tagging, the query may perform unnecessary computations.

### Improved index schema

Here’s an optimized schema that adheres to best practices for vertical scaling:

```sh
FT.CREATE jsonidx:profiles ON JSON PREFIX 1 profiles: 
          SCHEMA $.tags.* as t NUMERIC SORTABLE 
                 $.firstName as name TEXT NOSTEM SORTABLE 
                 $.lastName as lastname TEXT NOSTEM SORTABLE 
                 $.location as loc GEO SORTABLE 
                 $.id as id TAG SORTABLE UNF 
                 $.ver as ver TAG SORTABLE UNF
```

Improvements:

- `NOSTEM` for text fields: prevents stemming on fields like `firstName` and `lastName` to allow for exact matches (e.g., "Smith" stays "Smith").
- Expanded schema: adds commonly queried fields like `lastName`, `id`, and `version`, making queries more efficient by reducing the need for post-query data retrieval.
- `TAG` fields: `id` and `ver` are defined as `TAG` fields to support fast filtering with exact matches.
- `SORTABLE` for all relevant fields: ensures that sorting operations are efficient without requiring full-text scanning.

You might be wondering why `$.tags.* as t NUMERIC SORTABLE` is acceptable in the improved schema and it wasn't previously.
The inclusion of `$.tags.*` is acceptable when:

- It has a clear purpose: it is actively used in queries, such as filtering on numeric ranges or matching specific values.
- Other fields in the schema complement it: these fields reduce over-reliance on `$.tags.*` for all query operations, distributing the load more evenly.
- Projections and limits are managed carefully: queries that use `$.tags.*` should avoid loading unnecessary fields or returning excessively large result sets.

### Improved query

The following query is better suited for vertical scaling:

```sh
FT.AGGREGATE jsonidx:profiles '@t:[1299 1299]' 
                LOAD 6 id t name lastname loc ver 
                LIMIT 0 10
                DIALECT 2
```

Improvements:

- Targeted projection: the `LOAD` clause specifies only essential fields (`id, t, name, lastname, loc, ver`), reducing memory and network overhead.
- Limited results: the `LIMIT` clause ensures the query retrieves only the first 10 results, avoiding large result sets.
- [`DIALECT 2`](http://redis.io/docs/latest/develop/ai/search-and-query/advanced-concepts/dialects#dialect-2): enables the latest RQE syntax and features, ensuring compatibility with modern capabilities.