Skip to main content

Datometry Documentation

Gateway Configuration Parameters

The [gateway] defines the connection to the target data warehouse. For example, Azure Synapse Analytics or Google BigQuery.

[gateways]
"gateway".name = "gateway_msadw"
"gateway".connection_string = "Driver=ODBC Driver 17 for SQL Server;Server=__HOST__;Port=__PORT__;Database=__DATABASE__;AutoTranslate=no;Uid=__USER__;Pwd=__PASSWORD__""
"gateway".host = "servername"
"gateway".port = 1433"
"gateway".tls = prefer
"gateway”.Ipv = 4
"gateway".type = msadw
"gateway".version = "100000"
"gateway".bulk_load_utility = bcp
"gateway".bulk_load_chunk_size = 1GB
"gateway".bulk_load_max_errors = 2000000000
"gateway".bulk_load_mload_session_count_override = true
"gateway".bulk_load_detect_encoding = false
"gateway”.bulk_load_locale = "en_US.UTF-8"
"gateway".allow_collect_stats_on_missing_index = true
"gateway".default_table_type = "multiset"
"gateway".parallel_conversion_available_memory_size = 3200MB
"gateway".max_tdf_size = 100MB
"gateway".max_tdf_rows = 10000000
Table 7. Gateway Parameters

Parameter

Type

Description

allow_collect_stats_on_missing_ index

Boolean

Specifies if COLLECT STATS is to be translated for missing indexes. This prevents COLLECT STATS from producing an error if the index on which it is called does not exist. For example, when Hyper-Q creates Named Primary Indexes, they are translated as Distribution columns. You can then issue the COLLECT STATS query on that Named Index.

By default allow_collect_stats_on_missing_index is set to false, which disables it. This is compatible with the behavior of Teradata, which produces an error message.

allow_index_commands

Boolean

For backend databases that support the use of indexes, if set to true Hyper-Q allows the use of index commands. If set to false SQL statements used to create or drop an index are ignored.

"gateway".allow_index_commands=true

To learn more, see Enable or Disable Index Not Supported Error Messages.

bulk_load_chunk_size

Integer

Specifies the maximum file size of each intermediate Teradata MultiLoad or FastLoad data file. When the specified file size is reached, the file is uploaded to the database and a new file is created. This value is reflected in the number of times Hyper-Q issues the bcp command and may affect job performance. For example, if the value is too small, Hyper-Q issues several bcp commands, and uploads a small file each time. If the value is too large, Hyper-Q will wait for a long time before any data is uploaded to the destination database.

Accepted values are composed of a number and a unit. Acceptable units are: TB, GB, MB, KB and B. For example 10MB indicates a max intermediate file size of 10 MB. The default value is 1GB.

"gateway".bulk_load_chunk_size = 10MB

bulk_load_detect_encoding

String

Specifies that the property of the gateway not be used.

Caution

Before modifying this parameter, contact Datometry Support to see if it is compatible with your environment.

bulk_load_err_gen2

String

To enable the Generation 2 version of bulk load error handling, specify the configuration parameter  bulk_load_err_gen2=true. If the Generation 2 bulk load processing encounters an error, it will fall back to Generation 1 error handling for errors on DML operations other than INSERT.

"gateway".bulk_load_err_gen2=true

To disable Generation 2 bulk load error handling, specify false.

"gateway".bulk_load_err_gen2=false

When set to force, Hyper-Q will perform a hard abort that interrupts bulk loader execution if it encounters an error.

"gateway".bulk_load_err_gen2=force

To learn more, see Enable Generation 2 Bulk Load Error Handling.

bulk_load_fail_on_et_errors

Boolean

You can specify if bulk load jobs should fail if an error is encountered.

The default behavior is false, in which case when errors are present in ET error tables, the job return code is a non-zero number. If set to true, the number of error retries is set to zero (0) to save the expensive error handling when the job is forced to fail.

bulk_load_fail_on_uv_errors

Boolean

You can specify if a bulk load job fails if an error is encountered.

The default behavior is false, in which case when entries are present in UT error tables, the job return code is 0. If set to true, the job return code will be a non-zero number.

bulk_load_field_sep

String

Specifies the field separator to use during bulk loading. A comma (,) is the terminator for a field.

"gateway".bulk_load_field_sep = ","

Also see the configuration parameter bulk_load_row_sep.

bulk_load_locale

String

This parameter is used by Azure Synapse Analytics for the bcp utility and is only for use with the Teradata MultiLoad and FastLoad utilities. During Teradata Multiload and Fastload jobs Hyper-Q creates temporary worktables where incoming data is stored. If the parameter is set all CHAR fields in the worktables get wide char types (NCHAR/NVARCHAR). If the property is not set the character data types will be CHAR/VARCHAR. If not specified, the value is undefined. Currently, en_US.UTF-8 is the only value this parameter accepts.

Caution

Before modifying this parameter, contact Datometry Support to see if it is compatible with your environment.

bulk_load_max_errors

Integer

Specifies the maximum error count of the Azure Synapse Analytics bulk copy program (bcp) utility. The default value is 1000. This value is passed to the bcp utility using the -m option. It specifies the maximum number of syntax errors that can occur before the bcp operation is canceled. A syntax error implies a data conversion error to the target data type. The bulk_load_max_errors total excludes any errors that are detected only at the server, such as constraint violations.

Note

A row that cannot be copied by the bcp utility is ignored and counted as a single error.

Caution

Before modifying this parameter, contact Datometry Support to see if it is compatible with your environment.

bulk_load_mload_session_ count_override

Boolean

When this parameter is set to true, the client connects the number of times returned by server. When set to false, the client connects the number of sessions set by the script.

Important

This parameter is important when your script does not specify the number of sessions.

Caution

Before modifying this parameter, contact Datometry Support to see if it is compatible with your environment.

bulk_load_retry_max_error_rows

Integer

Specifies the maximum number of times Hyper-Q will attempt to identify a row causing an error. The default value is 0.

Hyper-Q uses a slow error handling approach that may require executing a query log several times to discover the exact row causing an error. For databases with a large number of rows, this may cause performance issues due to excessive query logging. If detailed error reporting is not required, specify a value of 0.

"gateway".bulk_load_retry_max_error_rows = 0

Hyper-Q 3.93 introduced bulk_load_err_gen2, the second generation of bulk load error handling. The generation 2 bulk load error handling improves performance and provides more information about errors than the previous version of bulk load error handling. Datometry recommends using generation 2 bulk load error handling instead of bulk_load_retry_max_error_rows. To learn more, see the configuration parameter bulk_load_err_gen2.

bulk_load_row_sep

String

Specifies the row separator to use during bulk loading. A \n specifies the row terminator as a newline character.

"gateway".bulk_load_row_sep = "\n"

Also see the configuration parameter bulk_load_field_sep.

bulk_load_utility

String

Specifies the bulk load utility to use with the database cloud-based data warehouse.

Caution

Before modifying this parameter, contact Datometry Support to see if it is compatible with your environment.

The following example specifies the Azure Synapse Analytics bulk copy program (bcp).

"gateway".bulk_load_utility = bcp

connection_string

String

Specifies the database connection. Note that the parameters Server=__HOST__, Port=__PORT__, and Database=__DATABASE__ are replaced with the corresponding settings when a user first connects to the cloud data warehouse through Hyper-Q. As such, you do not need to manually edit these setting to supply a value, as it is done automatically by Hyper-Q. However, you must provide a user name and password for the Uid=__USER__ and Pwd=__PASSWORD__ settings.

"gateway".connection_string = "Driver=ODBC Driver 17 for SQL Server;Server=__HOST__;
Port=__PORT__;Database=__DATABASE__;AutoTranslate=no;Uid=__USER__;Pwd=__PASSWORD__"

Important

Beginning with Hyper-Q 3.101, the connection_string must include the AutoTranslate=no parameter. This allows Hyper-Q to support valid Latin1 characters that are not valid UTF-8 sequences.

default_table_type=multiset

String

Specifies that Hyper-Q emulate either Teradata MULTISET or SET tables for CREATE TABLE statements that do not specify a table type. The default is to create MULTISET tables.

The following example specifies that Hyper-Q emulate Teradata SET tables.

"gateway".default_table_type=set

fload_deduplication

String

Specifies how Hyper-Q processes Teradata FastLoad (FastLoad) jobs. FastLoad is a Teradata utility that quickly loads large amounts of data into empty tables in the database. The fload_deduplication parameter accepts one of the following options:

input Discards duplicate rows. A duplicate row is any row that is an exact duplicate of a different row, with the same value in every column. Duplicate rows are ignored, and are not stored in an error table.

target Applies a uniqueness violation check, whereby any row that contains a duplicate value in a unique primary index column is stored in an error table labeled Error Table 2 (also referred to as the uniqueness violation table). The target option is the default value.

If the target is a SET table, fload_deduplication both checks rows with a duplicate value in a unique primary index column, as well as checking for duplicate rows and moving them to Error Table 2. This behavior does not apply to full mode, in which duplicate rows are discarded prior to applying uniqueness violation checks regardless of table type, as is the case when using input mode.

full Applies both the input and target options, which matches the behavior of the FastLoad utility.

Note

The fload_deduplication parameter replaces the fload_allow_duplicates parameter. If both parameters are included in a deployment's dtm.ini configuration file, Hyper-Q will use fload_deduplication and the FastLoad options it specifies.

host = hostname

String

Host name of the cloud-based data warehouse server.

ignore_stats_on_missing_named_index

Boolean

Specifies that Hyper-Q ignore statistics for missing named indexes. The ignored indexes are primary indexes. Secondary indexes are correctly recorded.

To enable ignore_stats_on_missing_named_index, set it to true. The default value is false.

"gateway".ignore_stats_on_missing_named_index = true

To learn more, see Ignore Statistics for Missing Named Indexes .Ignore Statistics for Missing Named Indexes

ipv

Integer

Specifies the type of IP address to use with the cloud-based data warehouse, which is either IPv4 or IPv6.

max_tdf_rows

Integer

Hyper-Q receives the result from the backend and splits the result into tabular data format (tdf) chunks which are sent for conversion. Specify the chunk size using the number of rows. For example: "gateway".max_tdf_rows = 10000000

Caution

Before modifying this parameter, contact Datometry Support to see if it is compatible with your environment.

max_tdf_size

Integer

Hyper-Q receives the result from the backend and splits the result into tabular data format (tdf) chunks which are sent for conversion. Specify the size of the chunk size in bytes. For example:

"gateway".max_tdf_size = 100MB

Caution

Before modifying this parameter, contact Datometry Support to see if it is compatible with your environment.

name

String

Specifies the name by which the gateway is called by other objects in the Hyper-Q configuration file.

parallel_conversion_available_ memory_size

Integer

Specifies the maximum memory that can be used by the Hyper-Q Parallel Conversion Process per session, which controls the degree of paralleism. The default value is 1 MB. To calculate the recommended value for your deployment, multiply the value of the max_tdf_size parameter by the number of cores, and multiply by 2:

max_tdf_size * number of cores * 2 = maximum memory

Supported memory units are B (byte), KB (kilobyte), MB (megabyte), GB (gigabyte), TB (terabyte). For example: 3200MB

Caution

Before modifying this parameter, contact Datometry Support to see if it is compatible with your environment.

physical_props_in_show_table

Boolean

Lets you display applicable hints related to table distribution and physical layout in SHOW TABLE output. You can use the SHOW TABLE output to create the exact same table on the target database using the hints for physical table properties.

"gateway".physical_props_in_show_table = true

port

Integer

Specifies the network port of the cloud-based data warehouse server.

synapse_username_resolution

String

On Azure Synapse CURRENT_USER returns dbo, which is the default database. SYSTEM_USER returns the current user. If your user authentication uses AAD mapped users, and you want to display the email address (or login name) of your users, you must specify SYSTEM_USER.

"gateway".synapse_username_resolution = system_user

The default value is current_user.

type

String

Specifies the cloud-based data warehouse vendor. Acceptable values are: bigquery and msadw.

version

Integer

Specifies that the endpoint treat the list of schemas as the default schema. In the following example, Hyper-Q treat another_schema as the default schema.

"gateway".version = “100000”