Creates (registers) a data catalog with the specified name and properties. Catalogs created are visible to all users of the same Amazon Web Services account.
Name |
[required] The name of the data catalog to create. The catalog name must be unique
for the Amazon Web Services account and can use a maximum of 127
alphanumeric, underscore, at sign, or hyphen characters. The remainder
of the length constraint of 256 is reserved for use by Athena.
For the FEDERATED type, the catalog name has the following
considerations and limits:
The catalog name allows special characters such as _, @, \, and -.
These characters are replaced with a hyphen (-) when creating the
CloudFormation (CFN) stack name and with an underscore (_) when
creating the Lambda function and Glue connection name.
The catalog name has a theoretical limit of 128 characters. However,
because the name is used to create other resources that allow fewer
characters, and a prefix is prepended to it, the actual catalog name
limit for a FEDERATED catalog is 64 - 23 = 41 characters.
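The naming limits above can be checked before a request is sent. The following is a minimal sketch, not part of the API: the helper names `check_catalog_name` and `substitute_special` are hypothetical, and the substitution helper only approximates the replacement described above (it does not model the prefix that is prepended, because the description does not state it).

```python
import re

MAX_NAME_LENGTH = 127           # general limit stated in the description
MAX_FEDERATED_NAME_LENGTH = 41  # 64 - 23, the FEDERATED limit stated above

# Alphanumeric, underscore, at sign, or hyphen.
_ALLOWED = re.compile(r"[A-Za-z0-9_@-]+")
# Special characters that the description says are substituted.
_SPECIAL = re.compile(r"[_@\\-]")


def check_catalog_name(name: str, catalog_type: str) -> None:
    """Raise ValueError if the name breaks the limits described above."""
    if catalog_type == "FEDERATED":
        # Only the length is checked here; FEDERATED names accept
        # additional special characters.
        limit = MAX_FEDERATED_NAME_LENGTH
    else:
        limit = MAX_NAME_LENGTH
        if not _ALLOWED.fullmatch(name):
            raise ValueError(f"unsupported characters in catalog name: {name!r}")
    if not 1 <= len(name) <= limit:
        raise ValueError(f"catalog name must be 1 to {limit} characters long")


def substitute_special(name: str, replacement: str) -> str:
    """Replace _, @, \\ and - with the given character."""
    return _SPECIAL.sub(replacement, name)
```

For example, `substitute_special("sales_db@prod", "-")` gives the hyphenated form used for the stack name, and passing `"_"` gives the underscored form used for the Lambda function and Glue connection name.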
|
Type |
[required] The type of data catalog to create: LAMBDA for a federated catalog,
GLUE for a Glue Data Catalog, and HIVE for an external Apache Hive
metastore. FEDERATED is a federated catalog for which Athena creates
the connection and the Lambda function for you based on the parameters
that you pass.
|
Description |
A description of the data catalog to be created.
|
Parameters |
Specifies the Lambda function or functions to use for creating the data
catalog. This is a mapping whose values depend on the catalog type.
For the HIVE data catalog type, use the following syntax. The
metadata-function parameter is required. The sdk-version
parameter is optional and defaults to the currently supported
version.
metadata-function=lambda_arn, sdk-version=version_number
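As an illustration of the HIVE syntax above, the sketch below builds the keyword arguments for a create_data_catalog call, assuming the mapping is passed as a Python dict of strings. The helper name `hive_catalog_request` and the ARN in the usage note are placeholders, not values from this reference.

```python
from typing import Optional


def hive_catalog_request(name: str, metadata_function_arn: str,
                         sdk_version: Optional[str] = None,
                         description: str = "") -> dict:
    """Build request arguments for a HIVE data catalog."""
    # metadata-function is required; sdk-version is optional and is
    # left out so that the service default applies.
    parameters = {"metadata-function": metadata_function_arn}
    if sdk_version is not None:
        parameters["sdk-version"] = sdk_version

    request = {"Name": name, "Type": "HIVE", "Parameters": parameters}
    if description:
        request["Description"] = description
    return request
```

A call such as `hive_catalog_request("hive_meta", "arn:aws:lambda:us-east-1:111122223333:function:example-metastore")` returns a dict that could be unpacked into the operation with `**`.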
For the LAMBDA data catalog type, use one of the following sets of
required parameters, but not both.
If you have one Lambda function that processes metadata and
another for reading the actual data, use the following syntax.
Both parameters are required.
metadata-function=lambda_arn, record-function=lambda_arn
If you have a composite Lambda function that processes both
metadata and data, use the following syntax to specify your
Lambda function.
function=lambda_arn
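The two LAMBDA parameter sets are mutually exclusive, which is easy to get wrong when the values come from configuration. A minimal sketch of a client-side check follows; `lambda_parameters` is a hypothetical helper, and the real validation is performed by the service.

```python
from typing import Optional


def lambda_parameters(*, metadata_function: Optional[str] = None,
                      record_function: Optional[str] = None,
                      function: Optional[str] = None) -> dict:
    """Return the Parameters mapping for a LAMBDA data catalog."""
    uses_split = metadata_function is not None or record_function is not None

    # A composite function cannot be combined with the split pair.
    if function is not None and uses_split:
        raise ValueError(
            "pass either 'function' or the "
            "'metadata-function'/'record-function' pair, not both"
        )
    if function is not None:
        return {"function": function}

    # Split form: both ARNs are required.
    if metadata_function is None or record_function is None:
        raise ValueError(
            "'metadata-function' and 'record-function' are both required"
        )
    return {
        "metadata-function": metadata_function,
        "record-function": record_function,
    }
```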
The GLUE type takes a catalog ID parameter, which is required. The
catalog_id is the account ID of the Amazon Web Services account
to which the Glue Data Catalog belongs.
catalog-id=catalog_id
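Because catalog_id is an account ID, a simple format check can catch mistakes early. The sketch below assumes the usual 12-digit account ID format; `glue_parameters` is a hypothetical helper name.

```python
import re


def glue_parameters(account_id: str) -> dict:
    """Return the Parameters mapping for a GLUE data catalog."""
    # Account IDs are handled as strings so that leading zeros survive.
    if not re.fullmatch(r"[0-9]{12}", account_id):
        raise ValueError("catalog-id must be a 12-digit account ID")
    return {"catalog-id": account_id}
```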
The FEDERATED data catalog type uses one of the following
parameter sets, but not both. Use connection-arn for an existing Glue
connection. Use connection-type and connection-properties to
specify the configuration settings for a new connection.
- connection-arn:<glue_connection_arn_to_reuse>
- lambda-role-arn (optional): The execution role to use for the
  Lambda function. If not provided, one is created.
- connection-type:MYSQL|REDSHIFT|...., connection-properties:"<json_string>"
  For <json_string>, use escaped JSON text, as in the
  following example.
"{\"spill_bucket\":\"my_spill\",\"spill_prefix\":\"athena-spill\",\"host\":\"abc12345.snowflakecomputing.com\",\"port\":\"1234\",\"warehouse\":\"DEV_WH\",\"database\":\"TEST\",\"schema\":\"PUBLIC\",\"SecretArn\":\"arn:aws:secretsmanager:ap-south-1:111122223333:secret:snowflake-XHb67j\"}"
|
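The FEDERATED parameters described in the Parameters row above can be assembled as in the sketch below. It assumes the mapping is passed as a Python dict of strings, in which case `json.dumps` produces the JSON text for connection-properties and no manual escaping is needed. The helper name `federated_parameters` is hypothetical, and the property values used in the usage note are placeholders.

```python
import json
from typing import Optional


def federated_parameters(*, connection_arn: Optional[str] = None,
                         connection_type: Optional[str] = None,
                         connection_properties: Optional[dict] = None,
                         lambda_role_arn: Optional[str] = None) -> dict:
    """Return the Parameters mapping for a FEDERATED data catalog."""
    new_connection = (connection_type is not None
                      or connection_properties is not None)
    if connection_arn is not None and new_connection:
        raise ValueError(
            "pass either connection-arn or "
            "connection-type/connection-properties, not both"
        )

    if connection_arn is not None:
        # Reuse an existing Glue connection.
        params = {"connection-arn": connection_arn}
    elif connection_type is not None and connection_properties is not None:
        # Describe a new connection; properties are sent as JSON text.
        params = {
            "connection-type": connection_type,
            "connection-properties": json.dumps(
                connection_properties, separators=(",", ":")
            ),
        }
    else:
        raise ValueError(
            "connection-type and connection-properties must be given together"
        )

    if lambda_role_arn is not None:
        params["lambda-role-arn"] = lambda_role_arn
    return params
```

For example, `federated_parameters(connection_type="MYSQL", connection_properties={"spill_bucket": "my_spill", "host": "db.example.com", "port": "3306"})` yields a mapping whose connection-properties value is compact JSON text. If the escaped form shown in the Parameters row is needed, for instance to paste into a shell command, calling `json.dumps` on that string a second time produces it.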
Tags |
A list of comma-separated tags to add to the data catalog that is
created. All the resources that are created by the
create_data_catalog API operation with the
FEDERATED type will have the tag
federated_athena_datacatalog="true". This includes the CFN Stack, Glue
Connection, Athena DataCatalog, and all the resources created as part of
the CFN Stack (Lambda Function, IAM policies/roles).
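As a rough illustration, the sketch below converts a plain dict into a tag list and checks a resource's tags for the marker tag described above. It assumes tags are represented as a list of Key/Value dicts, as in the Python SDK; both helper names are hypothetical.

```python
from typing import Dict, List

FEDERATED_TAG_KEY = "federated_athena_datacatalog"


def to_tag_list(tags: Dict[str, str]) -> List[Dict[str, str]]:
    """Convert {"team": "data"} into [{"Key": "team", "Value": "data"}]."""
    return [{"Key": key, "Value": value} for key, value in tags.items()]


def is_federated_resource(tag_list: List[Dict[str, str]]) -> bool:
    """Return True if the tags include federated_athena_datacatalog="true"."""
    return any(
        tag.get("Key") == FEDERATED_TAG_KEY and tag.get("Value") == "true"
        for tag in tag_list
    )
```

A check like `is_federated_resource` can help when auditing which stacks, connections, or functions in an account were created on behalf of a FEDERATED catalog.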
|