Specify Statistics Collection Sampling Percentage
You can specify the Teradata statistics collection sampling on Azure Synapse using the Hyper-Q hint default_stats_sample
.
Hyper-Q translates the Teradata COLLECT STATISTICS command to CREATE STATISTICS and UPDATE STATISTICS on Azure Synapse. There is a significant difference between Teradata and Azure Synapse statistics collection. To accommodate these differences, Hyper-Q observes the following behavior when creating or updating statistics.
When updating statistics with no specified sampling options, Hyper-Q uses the Teradata RESAMPLE option.
When creating or updating statistics with an explicit sampling percentage, Hyper-Q will use that percentage.
When updating statistics with either the SAMPLE or SYSTEM SAMPLE options, but no specified percentage, Hyper-Q uses the percentage specified using the
default_stats_sample
hint.When creating statistics with no options specified, or with either the SAMPLE or SYSTEM SAMPLE options, but no specified percentage, Hyper-Q uses the percentage specified using the
default_stats_sample
hint.Hyper-Q respects the percentage specified for USING SAMPLE n PERCENT
To perform statistics collection using sampling, the Teradata COLLECT STATISTICS USING command uses one of the following options: SAMPLE, SYSTEM SAMPLE, SAMPLE n PERCENT, and NO SAMPLE. If specifying a percentage, Hyper-Q uses SAMPLE n PERCENT.
When creating statistics for the first time, Hyper-Q uses the sampling percentage specified using the default_stats_sample
hint. If not specified, the default value is 20 percent.
The syntax for default_stats_sample
is:
default_stats_sample = sampling_percentage
where:
Parameter | Description |
---|---|
| The percentage of statistics sampling you specify determines if the Teradata COLLECT STATISTICS command resamples or does a full scan of the specified table. If you specify 100, Hyper-Q does a full scan. If not specified, the default value is 20 percent. |
Hint Scope
The default_stats_sample
hint can be specified at the request, session, and global level. To learn more about Hyper-Q SQL hints, and how to use them within a SQL query, see About Hyper-Q SQL Hints.
You can specify default_stats_sample
as a global setting in the Hyper-Q configuration file (dtm.ini
). This example species a sampling of 5 percent.
"gateway".default_stats_sample=5
To learn more about specifying a hint as a gateway
option in the Hyper-Q configuration file, see Gateway Configuration Parameters.
You can specify default_stats_sample
as a query hint at either the session or request level. The following example species a sampling of 10 percent.
COLLECT STATISTICS -- @dtm_default_stats_sample=10 for session COLUMN (column_name) AS statistics_name on table_name;
Hyper-Q translates the above Teradata COLLECT STATISTICS statement as the following CREATE STATISTICS on Azure Synapse.
CREATE STATISTICS statistics_name ON schema_name.table_name ([column_name] ) WITH SAMPLE 10 PERCENT
To learn more about Hyper-Q SQL hints, and how to use them within a SQL query, see About Hyper-Q SQL Hints.