class Aws::DatabaseMigrationService::Types::S3Settings

Settings for exporting data to Amazon S3.

@note When making an API call, you may pass S3Settings

data as a hash:

    {
      service_access_role_arn: "String",
      external_table_definition: "String",
      csv_row_delimiter: "String",
      csv_delimiter: "String",
      bucket_folder: "String",
      bucket_name: "String",
      compression_type: "none", # accepts none, gzip
      encryption_mode: "sse-s3", # accepts sse-s3, sse-kms
      server_side_encryption_kms_key_id: "String",
      data_format: "csv", # accepts csv, parquet
      encoding_type: "plain", # accepts plain, plain-dictionary, rle-dictionary
      dict_page_size_limit: 1,
      row_group_length: 1,
      data_page_size: 1,
      parquet_version: "parquet-1-0", # accepts parquet-1-0, parquet-2-0
      enable_statistics: false,
      include_op_for_full_load: false,
      cdc_inserts_only: false,
      timestamp_column_name: "String",
      parquet_timestamp_in_millisecond: false,
      cdc_inserts_and_updates: false,
      date_partition_enabled: false,
      date_partition_sequence: "YYYYMMDD", # accepts YYYYMMDD, YYYYMMDDHH, YYYYMM, MMYYYYDD, DDMMYYYY
      date_partition_delimiter: "SLASH", # accepts SLASH, UNDERSCORE, DASH, NONE
      use_csv_no_sup_value: false,
      csv_no_sup_value: "String",
      preserve_transactions: false,
      cdc_path: "String",
      canned_acl_for_objects: "none", # accepts none, private, public-read, public-read-write, authenticated-read, aws-exec-read, bucket-owner-read, bucket-owner-full-control
      add_column_name: false,
      cdc_max_batch_interval: 1,
      cdc_min_file_size: 1,
      csv_null_value: "String",
      ignore_header_rows: 1,
      max_file_size: 1,
      rfc_4180: false,
    }
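
For example, you might pass these settings when creating an S3
target endpoint with `Client#create_endpoint`. The following is a
minimal sketch; the region, identifier, role ARN, and bucket name
are placeholders:

    require "aws-sdk-databasemigrationservice"

    client = Aws::DatabaseMigrationService::Client.new(region: "us-east-1")

    # Create an S3 target endpoint that writes gzip-compressed .parquet
    # files. Every identifier below is a placeholder, not a required name.
    client.create_endpoint(
      endpoint_identifier: "my-s3-target",
      endpoint_type: "target",
      engine_name: "s3",
      s3_settings: {
        service_access_role_arn: "arn:aws:iam::123456789012:role/dms-s3-role",
        bucket_name: "my-target-bucket",
        data_format: "parquet",
        compression_type: "gzip",
      }
    )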

@!attribute [rw] service_access_role_arn

The Amazon Resource Name (ARN) used by the service to access the IAM
role. The role must allow the `iam:PassRole` action. It is a
required parameter that enables DMS to write and read objects from
an S3 bucket.
@return [String]

@!attribute [rw] external_table_definition

Specifies how tables are defined in the S3 source files only.
@return [String]

@!attribute [rw] csv_row_delimiter

The delimiter used to separate rows in the .csv file for both source
and target. The default is a newline (`\n`).
@return [String]

@!attribute [rw] csv_delimiter

The delimiter used to separate columns in the .csv file for both
source and target. The default is a comma.
@return [String]

@!attribute [rw] bucket_folder

An optional parameter to set a folder name in the S3 bucket. If
provided, tables are created in the path
`bucketFolder/schema_name/table_name/`. If this parameter isn't
specified, then the path used is `schema_name/table_name/`.
@return [String]

@!attribute [rw] bucket_name

The name of the S3 bucket.
@return [String]

@!attribute [rw] compression_type

An optional parameter. Set it to GZIP to compress the target files,
or set it to NONE (the default) or omit it to leave the files
uncompressed.
This parameter applies to both .csv and .parquet file formats.
@return [String]

@!attribute [rw] encryption_mode

The type of server-side encryption that you want to use for your
data. This encryption type is part of the endpoint settings or the
extra connection attributes for Amazon S3. You can choose either
`SSE_S3` (the default) or `SSE_KMS`.

<note markdown="1"> For the `ModifyEndpoint` operation, you can change the existing
value of the `EncryptionMode` parameter from `SSE_KMS` to `SSE_S3`.
But you can’t change the existing value from `SSE_S3` to `SSE_KMS`.

 </note>

To use `SSE_S3`, you need an Identity and Access Management (IAM)
role with permission to allow `"arn:aws:s3:::dms-*"` to use the
following actions:

* `s3:CreateBucket`

* `s3:ListBucket`

* `s3:DeleteBucket`

* `s3:GetBucketLocation`

* `s3:GetObject`

* `s3:PutObject`

* `s3:DeleteObject`

* `s3:GetObjectVersion`

* `s3:GetBucketPolicy`

* `s3:PutBucketPolicy`

* `s3:DeleteBucketPolicy`
@return [String]

@!attribute [rw] server_side_encryption_kms_key_id

If you are using `SSE_KMS` for the `EncryptionMode`, provide the KMS
key ID. The key that you use needs an attached policy that enables
Identity and Access Management (IAM) user permissions and allows use
of the key.

Here is a CLI example: `aws dms create-endpoint
--endpoint-identifier value --endpoint-type target --engine-name s3
--s3-settings
ServiceAccessRoleArn=value,BucketFolder=value,BucketName=value,EncryptionMode=SSE_KMS,ServerSideEncryptionKmsKeyId=value
`
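
A Ruby sketch equivalent to the preceding CLI example (every
`"value"` string is a placeholder):

    # Assumes `client` is an Aws::DatabaseMigrationService::Client.
    client.create_endpoint(
      endpoint_identifier: "value",
      endpoint_type: "target",
      engine_name: "s3",
      s3_settings: {
        service_access_role_arn: "value",
        bucket_folder: "value",
        bucket_name: "value",
        encryption_mode: "sse-kms",
        server_side_encryption_kms_key_id: "value",
      }
    )
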
@return [String]

@!attribute [rw] data_format

The format of the data that you want to use for output. You can
choose one of the following:

* `csv`\: This is a row-based file format with comma-separated
  values (.csv).

* `parquet`\: Apache Parquet (.parquet) is a columnar storage file
  format that features efficient compression and provides faster
  query response.
@return [String]

@!attribute [rw] encoding_type

The type of encoding you are using:

* `RLE_DICTIONARY` uses a combination of bit-packing and run-length
  encoding to store repeated values more efficiently. This is the
  default.

* `PLAIN` doesn't use encoding at all. Values are stored as they
  are.

* `PLAIN_DICTIONARY` builds a dictionary of the values encountered
  in a given column. The dictionary is stored in a dictionary page
  for each column chunk.
@return [String]

@!attribute [rw] dict_page_size_limit

The maximum size of an encoded dictionary page of a column. If the
dictionary page exceeds this, this column is stored using an
encoding type of `PLAIN`. This parameter defaults to 1024 * 1024
bytes (1 MiB), the maximum size of a dictionary page before it
reverts to `PLAIN` encoding. This size is used for .parquet file
format only.
@return [Integer]

@!attribute [rw] row_group_length

The number of rows in a row group. A smaller row group size provides
faster reads. But as the number of row groups grows, writes become
slower. This parameter defaults to 10,000 rows. This number
is used for .parquet file format only.

If you choose a value larger than the maximum, `RowGroupLength` is
set to the max row group length in bytes (64 * 1024 * 1024).
@return [Integer]

@!attribute [rw] data_page_size

The size of one data page in bytes. This parameter defaults to 1024
* 1024 bytes (1 MiB). This number is used for .parquet file format
only.
@return [Integer]

@!attribute [rw] parquet_version

The version of the Apache Parquet format that you want to use:
`parquet_1_0` (the default) or `parquet_2_0`.
@return [String]

@!attribute [rw] enable_statistics

A value that enables statistics for Parquet pages and row groups.
Choose `true` to enable statistics, `false` to disable. Statistics
include `NULL`, `DISTINCT`, `MAX`, and `MIN` values. This parameter
defaults to `true`. This value is used for .parquet file format
only.
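
Taken together, the .parquet settings above might be combined as in
the following sketch, with the documented defaults written out
explicitly:

    # .parquet output with the documented default values made explicit.
    s3_settings = {
      data_format: "parquet",
      encoding_type: "rle-dictionary",   # the default encoding
      dict_page_size_limit: 1024 * 1024, # 1 MiB, the default
      row_group_length: 10_000,          # rows, the default
      data_page_size: 1024 * 1024,       # 1 MiB, the default
      parquet_version: "parquet-1-0",    # the default
      enable_statistics: true,           # the default
    }
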
@return [Boolean]

@!attribute [rw] include_op_for_full_load

A value that enables a full load to write INSERT operations to the
comma-separated value (.csv) output files only to indicate how the
rows were added to the source database.

<note markdown="1"> DMS supports the `IncludeOpForFullLoad` parameter in versions 3.1.4
and later.

 </note>

For full load, records can only be inserted. By default (the `false`
setting), no information is recorded in these output files for a
full load to indicate that the rows were inserted at the source
database. If `IncludeOpForFullLoad` is set to `true` or `y`, the
INSERT is recorded as an I annotation in the first field of the .csv
file. This allows the format of your target records from a full load
to be consistent with the target records from a CDC load.

<note markdown="1"> This setting works together with the `CdcInsertsOnly` and the
`CdcInsertsAndUpdates` parameters for output to .csv files only. For
more information about how these settings work together, see
[Indicating Source DB Operations in Migrated S3 Data][1] in the
*Database Migration Service User Guide*.

 </note>

[1]: https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.S3.html#CHAP_Target.S3.Configuring.InsertOps
@return [Boolean]

@!attribute [rw] cdc_inserts_only

A value that enables a change data capture (CDC) load to write only
INSERT operations to .csv or columnar storage (.parquet) output
files. By default (the `false` setting), the first field in a .csv
or .parquet record contains the letter I (INSERT), U (UPDATE), or D
(DELETE). These values indicate whether the row was inserted,
updated, or deleted at the source database for a CDC load to the
target.

If `CdcInsertsOnly` is set to `true` or `y`, only INSERTs from the
source database are migrated to the .csv or .parquet file. For .csv
format only, how these INSERTs are recorded depends on the value of
`IncludeOpForFullLoad`. If `IncludeOpForFullLoad` is set to `true`,
the first field of every CDC record is set to I to indicate the
INSERT operation at the source. If `IncludeOpForFullLoad` is set to
`false`, every CDC record is written without a first field to
indicate the INSERT operation at the source. For more information
about how these settings work together, see [Indicating Source DB
Operations in Migrated S3 Data][1] in the *Database Migration
Service User Guide*.

<note markdown="1"> DMS supports this interaction between the
`CdcInsertsOnly` and `IncludeOpForFullLoad` parameters in versions
3.1.4 and later.

 `CdcInsertsOnly` and `CdcInsertsAndUpdates` can't both be set to
`true` for the same endpoint. Set either `CdcInsertsOnly` or
`CdcInsertsAndUpdates` to `true` for the same endpoint, but not
both.

 </note>

[1]: https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.S3.html#CHAP_Target.S3.Configuring.InsertOps
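
As a sketch, a CDC load that migrates only INSERTs and annotates
each record with I in its first field might combine these settings:

    # cdc_inserts_only and cdc_inserts_and_updates can't both be true.
    s3_settings = {
      include_op_for_full_load: true, # record the I annotation
      cdc_inserts_only: true,
    }
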
@return [Boolean]

@!attribute [rw] timestamp_column_name

A value that, when nonblank, causes DMS to add a column with timestamp
information to the endpoint data for an Amazon S3 target.

<note markdown="1"> DMS supports the `TimestampColumnName` parameter in versions 3.1.4
and later.

 </note>

DMS includes an additional `STRING` column in the .csv or .parquet
object files of your migrated data when you set
`TimestampColumnName` to a nonblank value.

For a full load, each row of this timestamp column contains a
timestamp for when the data was transferred from the source to the
target by DMS.

For a change data capture (CDC) load, each row of the timestamp
column contains the timestamp for the commit of that row in the
source database.

The string format for this timestamp column value is `yyyy-MM-dd
HH:mm:ss.SSSSSS`. By default, the precision of this value is in
microseconds. For a CDC load, the rounding of the precision depends
on the commit timestamp supported by DMS for the source database.

When the `AddColumnName` parameter is set to `true`, DMS also
includes a name for the timestamp column that you set with
`TimestampColumnName`.
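
A sketch of adding a timestamp column; the column name here is a
placeholder, not a name the service prescribes:

    s3_settings = {
      timestamp_column_name: "dms_commit_ts", # placeholder column name
      add_column_name: true, # also write column names to .csv output
    }
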
@return [String]

@!attribute [rw] parquet_timestamp_in_millisecond

A value that specifies the precision of any `TIMESTAMP` column
values that are written to an Amazon S3 object file in .parquet
format.

<note markdown="1"> DMS supports the `ParquetTimestampInMillisecond` parameter in
versions 3.1.4 and later.

 </note>

When `ParquetTimestampInMillisecond` is set to `true` or `y`, DMS
writes all `TIMESTAMP` columns in a .parquet formatted file with
millisecond precision. Otherwise, DMS writes them with microsecond
precision.

Currently, Amazon Athena and AWS Glue can handle only millisecond
precision for `TIMESTAMP` values. Set this parameter to `true` for
S3 endpoint object files that are .parquet formatted only if you
plan to query or process the data with Athena or Glue.

<note markdown="1"> DMS writes any `TIMESTAMP` column values written to an S3 file in
.csv format with microsecond precision.

 Setting `ParquetTimestampInMillisecond` has no effect on the string
format of the timestamp column value that is inserted by setting the
`TimestampColumnName` parameter.

 </note>
@return [Boolean]

@!attribute [rw] cdc_inserts_and_updates

A value that enables a change data capture (CDC) load to write
INSERT and UPDATE operations to .csv or .parquet (columnar storage)
output files. The default setting is `false`, but when
`CdcInsertsAndUpdates` is set to `true` or `y`, only INSERTs and
UPDATEs from the source database are migrated to the .csv or
.parquet file.

For .csv file format only, how these INSERTs and UPDATEs are
recorded depends on the value of the `IncludeOpForFullLoad`
parameter. If `IncludeOpForFullLoad` is set to `true`, the first
field of every CDC record is set to either `I` or `U` to indicate
INSERT and UPDATE operations at the source. But if
`IncludeOpForFullLoad` is set to `false`, CDC records are written
without an indication of INSERT or UPDATE operations at the source.
For more information about how these settings work together, see
[Indicating Source DB Operations in Migrated S3 Data][1] in the
*Database Migration Service User Guide*.

<note markdown="1"> DMS supports the use of the `CdcInsertsAndUpdates` parameter in
versions 3.3.1 and later.

 `CdcInsertsOnly` and `CdcInsertsAndUpdates` can't both be set to
`true` for the same endpoint. Set either `CdcInsertsOnly` or
`CdcInsertsAndUpdates` to `true` for the same endpoint, but not
both.

 </note>

[1]: https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.S3.html#CHAP_Target.S3.Configuring.InsertOps
@return [Boolean]

@!attribute [rw] date_partition_enabled

When set to `true`, this parameter partitions S3 bucket folders
based on transaction commit dates. The default value is `false`. For
more information about date-based folder partitioning, see [Using
date-based folder partitioning][1].

[1]: https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.S3.html#CHAP_Target.S3.DatePartitioning
@return [Boolean]

@!attribute [rw] date_partition_sequence

Identifies the sequence of the date format to use during folder
partitioning. The default value is `YYYYMMDD`. Use this parameter
when `DatePartitionEnabled` is set to `true`.
@return [String]

@!attribute [rw] date_partition_delimiter

Specifies a date separating delimiter to use during folder
partitioning. The default value is `SLASH`. Use this parameter when
`DatePartitionEnabled` is set to `true`.
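
A sketch of date-based folder partitioning with the documented
defaults made explicit; with these values, date folders take a
`yyyy/MM/dd` shape under each table path:

    s3_settings = {
      date_partition_enabled: true,
      date_partition_sequence: "YYYYMMDD", # the default
      date_partition_delimiter: "SLASH",   # the default
    }
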
@return [String]

@!attribute [rw] use_csv_no_sup_value

This setting applies if the S3 output files during a change data
capture (CDC) load are written in .csv format. If set to `true`,
then for columns not included in the supplemental log, DMS uses the
value specified by [ `CsvNoSupValue` ][1]. If not set or set to
`false`, DMS uses the null value for these columns.

<note markdown="1"> This setting is supported in DMS versions 3.4.1 and later.

 </note>

[1]: https://docs.aws.amazon.com/dms/latest/APIReference/API_S3Settings.html#DMS-Type-S3Settings-CsvNoSupValue
@return [Boolean]

@!attribute [rw] csv_no_sup_value

This setting only applies if your Amazon S3 output files during a
change data capture (CDC) load are written in .csv format. If [
`UseCsvNoSupValue` ][1] is set to true, specify a string value that
you want DMS to use for all columns not included in the supplemental
log. If you do not specify a string value, DMS uses the null value
for these columns regardless of the `UseCsvNoSupValue` setting.

<note markdown="1"> This setting is supported in DMS versions 3.4.1 and later.

 </note>

[1]: https://docs.aws.amazon.com/dms/latest/APIReference/API_S3Settings.html#DMS-Type-S3Settings-UseCsvNoSupValue
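
A sketch combining this setting with `UseCsvNoSupValue`; the marker
string is a placeholder:

    s3_settings = {
      use_csv_no_sup_value: true,
      csv_no_sup_value: "NO_SUP_LOG", # placeholder marker string
    }
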
@return [String]

@!attribute [rw] preserve_transactions

If set to `true`, DMS saves the transaction order for a change data
capture (CDC) load on the Amazon S3 target specified by [ `CdcPath`
][1]. For more information, see [Capturing data changes (CDC)
including transaction order on the S3 target][2].

<note markdown="1"> This setting is supported in DMS versions 3.4.2 and later.

 </note>

[1]: https://docs.aws.amazon.com/dms/latest/APIReference/API_S3Settings.html#DMS-Type-S3Settings-CdcPath
[2]: https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.S3.html#CHAP_Target.S3.EndpointSettings.CdcPath
@return [Boolean]

@!attribute [rw] cdc_path

Specifies the folder path of CDC files. For an S3 source, this
setting is required if a task captures change data; otherwise, it's
optional. If `CdcPath` is set, DMS reads CDC files from this path
and replicates the data changes to the target endpoint. For an S3
target, if you set [ `PreserveTransactions` ][1] to `true`, DMS
verifies that you have set this parameter to a folder path on your
S3 target where DMS can save the transaction order for the CDC load.
DMS creates this CDC folder path in either your S3 target working
directory or the S3 target location specified by [ `BucketFolder`
][2] and [ `BucketName` ][3].

For example, if you specify `CdcPath` as `MyChangedData`, and you
specify `BucketName` as `MyTargetBucket` but do not specify
`BucketFolder`, DMS creates the following CDC folder path:
`MyTargetBucket/MyChangedData`.

If you specify the same `CdcPath`, and you specify `BucketName` as
`MyTargetBucket` and `BucketFolder` as `MyTargetData`, DMS creates
the following CDC folder path:
`MyTargetBucket/MyTargetData/MyChangedData`.

For more information on CDC including transaction order on an S3
target, see [Capturing data changes (CDC) including transaction
order on the S3 target][4].

<note markdown="1"> This setting is supported in DMS versions 3.4.2 and later.

 </note>

[1]: https://docs.aws.amazon.com/dms/latest/APIReference/API_S3Settings.html#DMS-Type-S3Settings-PreserveTransactions
[2]: https://docs.aws.amazon.com/dms/latest/APIReference/API_S3Settings.html#DMS-Type-S3Settings-BucketFolder
[3]: https://docs.aws.amazon.com/dms/latest/APIReference/API_S3Settings.html#DMS-Type-S3Settings-BucketName
[4]: https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.S3.html#CHAP_Target.S3.EndpointSettings.CdcPath
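
A sketch of the second example above, preserving transaction order
under `MyTargetBucket/MyTargetData/MyChangedData`:

    s3_settings = {
      bucket_name: "MyTargetBucket",
      bucket_folder: "MyTargetData",
      preserve_transactions: true,
      cdc_path: "MyChangedData",
    }
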
@return [String]

@!attribute [rw] canned_acl_for_objects

A value that enables DMS to specify a predefined (canned) access
control list for objects created in an Amazon S3 bucket as .csv or
.parquet files. For more information about Amazon S3 canned ACLs,
see [Canned ACL][1] in the *Amazon S3 Developer Guide.*

The default value is NONE. Valid values include NONE, PRIVATE,
PUBLIC\_READ, PUBLIC\_READ\_WRITE, AUTHENTICATED\_READ,
AWS\_EXEC\_READ, BUCKET\_OWNER\_READ, and
BUCKET\_OWNER\_FULL\_CONTROL.

[1]: http://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl
@return [String]

@!attribute [rw] add_column_name

An optional parameter that, when set to `true` or `y`, adds column
name information to the .csv output file.

The default value is `false`. Valid values are `true`, `false`, `y`,
and `n`.
@return [Boolean]

@!attribute [rw] cdc_max_batch_interval

Maximum length of the interval, defined in seconds, after which to
output a file to Amazon S3.

When `CdcMaxBatchInterval` and `CdcMinFileSize` are both specified,
the file write is triggered by whichever parameter condition is met
first within a DMS CloudFormation template.

The default value is 60 seconds.
@return [Integer]

@!attribute [rw] cdc_min_file_size

Minimum file size, defined in megabytes, to reach for a file output
to Amazon S3.

When `CdcMinFileSize` and `CdcMaxBatchInterval` are both specified,
the file write is triggered by whichever parameter condition is met
first within a DMS CloudFormation template.

The default value is 32 MB.
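
A sketch of the two file-output triggers with their documented
defaults; a file is written when either condition is met first:

    s3_settings = {
      cdc_max_batch_interval: 60, # seconds, the default
      cdc_min_file_size: 32,      # megabytes, the default
    }
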
@return [Integer]

@!attribute [rw] csv_null_value

An optional parameter that specifies how DMS treats null values.
You can use this parameter to pass a user-defined string that DMS
writes to the target in place of null. For example, when target
columns are not nullable, you can use this option to differentiate
between the empty string value and the null value. So, if you set
this parameter value to the empty string ("" or ''), DMS treats the
empty string as the null value instead of `NULL`.

The default value is `NULL`. Valid values include any valid string.
@return [String]

@!attribute [rw] ignore_header_rows

When this value is set to 1, DMS ignores the first row header in a
.csv file. A value of 1 turns on the feature; a value of 0 turns off
the feature.

The default is 0.
@return [Integer]

@!attribute [rw] max_file_size

A value that specifies the maximum size (in KB) of any .csv file to
be created while migrating to an S3 target during full load.

The default value is 1,048,576 KB (1 GB). Valid values include 1 to
1,048,576.
@return [Integer]

@!attribute [rw] rfc_4180

For an S3 source, when this value is set to `true` or `y`, each
leading double quotation mark has to be followed by an ending double
quotation mark. This formatting complies with RFC 4180. When this
value is set to `false` or `n`, string literals are copied to the
target as is. In this case, a delimiter (row or column) signals the
end of the field. Thus, you can't use a delimiter as part of the
string, because it signals the end of the value.

For an S3 target, an optional parameter used to set behavior to
comply with RFC 4180 for data migrated to Amazon S3 using .csv file
format only. When this value is set to `true` or `y` using Amazon S3
as a target, if the data has quotation marks or newline characters
in it, DMS encloses the entire column with an additional pair of
double quotation marks ("). Every quotation mark within the data is
repeated twice.

The default value is `true`. Valid values include `true`, `false`,
`y`, and `n`.
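
A sketch pulling the .csv-related settings together with their
documented defaults, except `IgnoreHeaderRows`, which is turned on
here:

    s3_settings = {
      data_format: "csv",
      csv_delimiter: ",",      # the default
      csv_row_delimiter: "\n", # the default
      csv_null_value: "NULL",  # the default
      ignore_header_rows: 1,   # skip the first row header in .csv files
      rfc_4180: true,          # the default
    }
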
@return [Boolean]

@see docs.aws.amazon.com/goto/WebAPI/dms-2016-01-01/S3Settings AWS API Documentation
