Part 1: Preamble, Prerequisites, and Prioritization
Part 2: Distribution Styles and Distribution Keys
Part 3: Compound and Interleaved Sort Keys (Translated into Japanese)

In this installment, I’ll cover different sort key options, when to use sort keys, and how to identify the optimal sort key configuration for your tables. I’ll also present another methodology with which to work through your specific workload. This methodology offers concrete guidance on how to properly use sort keys for performance.

Defining a table with a sort key results in the physical ordering of data within each slice, based on the sort type and the columns chosen in the key definition. In Amazon Redshift, we allow a table to be defined with a compound sort key, an interleaved sort key, or no sort key. Each of these styles of sort key is useful for certain table access patterns. In practice, a compound sort key is most appropriate for the vast majority of Amazon Redshift workloads.

There are numerous benefits of ordering your data in Amazon Redshift:

- You can reduce disk I/O by improving zone map effectiveness.
- You can reduce compute overhead and I/O by avoiding or reducing the cost of sort steps.
- You can improve join performance by enabling MERGE JOIN operations.

Starting at the most fundamental question, and diving deeper, we can construct a logical flowchart applicable to any table that can be used for identifying the ideal sort type and sort key columns for your workload.

Will queries against my tables benefit from a sort key?

Almost always the answer is yes. However, in a few edge cases sorting a table doesn’t result in a performance improvement and only adds minor overhead to data ingestion.

As discussed earlier, you most often use sort keys to improve the effectiveness of the zone maps, which results in reduced block I/O for read operations. The next most frequent benefit provided by sort keys is sorting to remove, or reduce the cost of, sort steps required by SQL operations like ORDER BY, PARTITION BY, GROUP BY, and so on. Least common, but still important, is sorting to facilitate a MERGE JOIN operation. MERGE JOIN is the fastest of the three JOIN operations supported by Amazon Redshift.

If you have a table that is accessed in a pattern where none of these three optimizations benefits you, then you have one of the few scenarios where defining a sort key makes no difference. In these circumstances, you don’t need to specify a sort key.

For tables that benefit from either a compound or interleaved sort key, you can ask additional questions to determine which type of key is most appropriate. Together, these steps can be reduced to “Does this table benefit from sorting?” Expanded, that question looks like the following:

[Figure: flowchart expanding the question “Does this table benefit from sorting?”]
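The question “Does this table benefit from sorting?” can also be sketched in code. The following is a minimal illustration, not an Amazon Redshift API: the `AccessPattern` class, its field names, and the returned strings are all hypothetical, and in practice you would have to determine each flag yourself by reviewing the queries that run against the table. The sketch only encodes the three benefits named in this post (zone map effectiveness, cheaper sort steps, and MERGE JOIN).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPattern:
    """Hypothetical summary of how queries access one table."""
    uses_zone_maps: bool   # predicates could skip blocks if the data were sorted
    has_sort_steps: bool   # ORDER BY, PARTITION BY, GROUP BY, and so on
    can_merge_join: bool   # joins that could run as a MERGE JOIN


def benefits_from_sorting(pattern: AccessPattern) -> bool:
    """Return True if any of the three sort key benefits applies."""
    return (
        pattern.uses_zone_maps
        or pattern.has_sort_steps
        or pattern.can_merge_join
    )


def recommend_sort_key(pattern: AccessPattern) -> str:
    """Map the yes/no answer to the next step described in the text."""
    if benefits_from_sorting(pattern):
        # Choosing between compound and interleaved needs further questions.
        return "define a sort key (compound or interleaved)"
    return "no sort key needed"


if __name__ == "__main__":
    # Assumed example: a table that is filtered by range predicates only.
    filtered_table = AccessPattern(
        uses_zone_maps=True, has_sort_steps=False, can_merge_join=False
    )
    print(recommend_sort_key(filtered_table))
```

A table for which all three flags are false corresponds to the edge case described in the text, where defining a sort key makes no difference.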