Skip to main content

Datometry Documentation

Override Table Distribution Using the CREATE TABLE Statement

You can override table distribution using the Hyper-Q hints override_distribution_style . For example, you can specify that the CREATE TABLE statement create hash-distributed or round-robin distributed tables.

Related to the hint override_distribution_style is override_distribution_cols, which lets you override the distribution column for the table distribution style being created. The default behavior of Hyper-Q is to use columns defined as the Teradata primary index as the Azure Synapse Analytics distribution column. You can override this behavior, and specify a different column to use as the distribution column on Azure Synapse Analytics. To learn about override_distribution_cols, see Override Column Distribution Using the CREATE TABLE Statement

To learn about the syntax and usage of the Azure Synapse Analytics CREATE TABLE statement, see CREATE TABLE in the Microsoft documentation.

In the following example, Azure Synapse Analytics uses a hash distribution when creating a new table. You can override this behavior using the Hyper-Q hint override_distribution_style. In the example below, a Teradata CREATE TABLE statement is emulated on Azure Synapse Analytics.

CREATE TABLE table_name (
    column1 integer,     
    column2 integer,     
    column3 varchar(10),     
   ) 
UNIQUE PRIMARY INDEX (column1);

The above Teradata CREATE TABLE statement is emulated as shown by Hyper-Q on Azure Synapse Analytics. Note that the table distribution is a hash distribution (DISTRIBUTION = HASH).

CREATE TABLE
    [database_name].[table_name] (
        [column1] int,
        [column2] int,
        [column3] varchar(10),
    )
WITH
    (DISTRIBUTION = HASH ([column1] ) , CLUSTERED COLUMNSTORE INDEX )

You can apply the Hyper-Q hint override_distribution_style to change the distribution style. In the following example the same Teradata CREATE TABLE statement applies the hint override_distribution_style=replicated, specifying that the table use a replicate distribution style.

CREATE TABLE table_name (
-- @dtm_override_distribution_style=replicate
    column1 integer,
    column2 integer,
    column3 varchar(10));

The emulated query on Azure Synapse Analytics uses REPLICATE as the distribution style.

CREATE TABLE
    [database_name].[table_name] (
        [column1] int,
        [column2] int,
        [column3] varchar (10)
    )
WITH
    (DISTRIBUTION = REPLICATE , CLUSTERED COLUMNSTORE INDEX )
See Also

The following Hyper-Q hints allow you to override the default Hyper-Q behavior, and provide precise control over Azure Synapse Analytics table creation options.

  • Override the CREATE TABLE Statement Override the CREATE TABLE Statement You can override the default Hyper-Q table creation behavior using the synapse_table_options_override hint. Use this in situations in which you require precise control over table creation options in Azure Synapse Analytics.

  • Override Table Distribution Using the CREATE TABLE Statement  You can override table distribution using the Hyper-Q hint override_distribution_style. For example, you can specify that the CREATE TABLE statement create hash-distributed or round-robin distributed tables.

  • Override Column Distribution Using the CREATE TABLE Statement The default behavior of Hyper-Q is to use the columns defined as a Teradata primary index as Azure Synapse Analytics distribution columns. You can override the distribution of columns using the Hyper-Q hint override_distribution_cols.

Hint Syntax

The syntax for override_distribution_style is:

override_distribution_style = replicate|hash|round_robin

where:

Parameter

Description

replicate

Stores one copy of the table on each Compute node. For Azure Synapse Analytics the table is stored on a distribution database on each Compute node.

hash

Assigns each row to one distribution by hashing the value stored in distribution_column_name. The default behavior of Hyper-Q is to serialize columns defined as a Teradata primary index as Azure Synapse Analytics distribution columns. The algorithm is deterministic, which means it always hashes the same value to the same distribution. The distribution column should be defined as NOT NULL because all rows that have NULL are assigned to the same distribution.

round_robin

Distributes the rows evenly across all the distributions in a round-robin fashion. This behavior is the default for Azure Synapse Analytics.

Hint Scope

The override_distribution_style hint can be specified at the statement, request, and session level. To learn more about Hyper-Q SQL hints, and how to use them within a SQL query, see About Hyper-Q SQL Hints.

Example 7. Specify that Hyper-Q Override Azure Synapse Table Distribution Style as a Query Hint

You can specify override_distribution_style as a query hint. The following example species that the new table use a hash distribution.

CREATE TABLE table_name (
-- @dtm_override_distribution_style=hash
    column1 integer,
    column2 integer,
    column3 varchar(10));

To learn more about Hyper-Q SQL hints, and how to use them within a SQL query, see About Hyper-Q SQL Hints.



Example 8. Specify Multiple Hints to Override Table Creation Behavior in a Single Statement

You can apply several hints to override the default table creation behavior to a single CREATE TABLE statement. In the following example Hyper-Q hints override the default Teradata CREATE TABLE behaviors as specified:

  • Table layout uses a clustered index.

  • The indexed column is column2 .

  • The distribution style is round robin.

The Teradata CREATE TABLE statement with the Hyper-Q hints applied.

CREATE TABLE table_name (
 -- @dtm_override_table_layout=clustered_index
 -- @dtm_override_table_layout_cols=column2
 -- @dtm_override_distribution_style=round_robin
    column1 integer,
    column2 integer,
    column3 varchar(10));

The emulated query on Azure Synapse Analytics as as shown below.

CREATE TABLE
     [database_name].[table_name]  (
        [column1] int,
        [column2] int,
        [column3] varchar (10)
    )
WITH
    (DISTRIBUTION = ROUND_ROBIN , CLUSTERED INDEX ([column2] ) )