class Google::Cloud::Bigquery::ExtractJob
A {Job} subclass representing an export operation that may be performed on a {Table} or {Model}. An ExtractJob
instance is returned when you call {Project#extract_job}, {Table#extract_job} or {Model#extract_job}.
@see cloud.google.com/bigquery/docs/exporting-data Exporting table data
@see cloud.google.com/bigquery-ml/docs/exporting-models Exporting models
@see cloud.google.com/bigquery/docs/reference/v2/jobs Jobs API reference
@example Export table data
    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new
    dataset = bigquery.dataset "my_dataset"
    table = dataset.table "my_table"

    extract_job = table.extract_job "gs://my-bucket/file-name.json",
                                    format: "json"
    extract_job.wait_until_done!
    extract_job.done? #=> true
@example Export a model
    require "google/cloud/bigquery"

    bigquery = Google::Cloud::Bigquery.new
    dataset = bigquery.dataset "my_dataset"
    model = dataset.model "my_model"

    extract_job = model.extract_job "gs://my-bucket/#{model.model_id}"
    extract_job.wait_until_done!
    extract_job.done? #=> true
Public Instance Methods
Checks if the destination format for the table data is [Avro](avro.apache.org/). The default is `false`. Not applicable when extracting models.
@return [Boolean] `true` when `AVRO`, `false` if not `AVRO` or not a
table extraction.
    # File lib/google/cloud/bigquery/extract_job.rb, line 145
    def avro?
      return false unless table?
      @gapi.configuration.extract.destination_format == "AVRO"
    end
Checks if the export operation compresses the data using gzip. The default is `false`. Not applicable when extracting models.
@return [Boolean] `true` when `GZIP`, `false` if not `GZIP` or not a
table extraction.
    # File lib/google/cloud/bigquery/extract_job.rb, line 104
    def compression?
      return false unless table?
      @gapi.configuration.extract.compression == "GZIP"
    end
Checks if the destination format for the table data is CSV. Tables with nested or repeated fields cannot be exported as CSV. The default is `true` for tables. Not applicable when extracting models.
@return [Boolean] `true` when `CSV`, or `false` if not `CSV` or not a
table extraction.
    # File lib/google/cloud/bigquery/extract_job.rb, line 130
    def csv?
      return false unless table?
      val = @gapi.configuration.extract.destination_format
      return true if val.nil?
      val == "CSV"
    end
The character or symbol the operation uses to delimit fields in the exported data. The default is a comma (,) for tables. Not applicable when extracting models.
@return [String, nil] A string containing the character, such as `","`,
or `nil` if not a table extraction.
    # File lib/google/cloud/bigquery/extract_job.rb, line 184
    def delimiter
      return unless table?
      val = @gapi.configuration.extract.field_delimiter
      val = "," if val.nil?
      val
    end
A hash mapping each URI or URI pattern specified in {#destinations} to the number of files written to that destination.
@return [Hash<String, Integer>] A Hash with the URI patterns as keys
and the counts as values.
    # File lib/google/cloud/bigquery/extract_job.rb, line 223
    def destinations_counts
      Hash[destinations.zip destinations_file_counts]
    end
The number of files per destination URI or URI pattern specified in {#destinations}.
@return [Array<Integer>] An array of values in the same order as the
URI patterns.
    # File lib/google/cloud/bigquery/extract_job.rb, line 212
    def destinations_file_counts
      Array @gapi.statistics.extract.destination_uri_file_counts
    end
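The pairing between destinations and file counts can be sketched with plain arrays. The URIs and counts below are illustrative values, not output from a real job, and the `nil` case is an assumed scenario (a statistics field that has not been populated), not documented behavior.

```ruby
# Illustrative values only; a real job reports counts in its statistics.
destinations = ["gs://my-bucket/part-*.json", "gs://my-bucket/summary.json"]
file_counts  = [3, 1]

# Same pairing that destinations_counts performs with zip.
counts_by_destination = Hash[destinations.zip(file_counts)]

# Array(nil) is an empty array, so a missing statistics value (an assumed
# scenario) would pair every destination with nil.
pending_counts = Hash[destinations.zip(Array(nil))]
```

Because `zip` pairs elements by position, the result is only meaningful when the counts arrive in the same order as the URI patterns, which is what the `@return` description above states.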
Checks if the destination format for the table data is [newline-delimited JSON](jsonlines.org/). The default is `false`. Not applicable when extracting models.
@return [Boolean] `true` when `NEWLINE_DELIMITED_JSON`, `false` if not
`NEWLINE_DELIMITED_JSON` or not a table extraction.
    # File lib/google/cloud/bigquery/extract_job.rb, line 117
    def json?
      return false unless table?
      @gapi.configuration.extract.destination_format == "NEWLINE_DELIMITED_JSON"
    end
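How the table-format predicates (`avro?`, `csv?`, `json?`) relate to one another can be sketched without a live job. `ExtractConfig` and `table_format_flags` are hypothetical names introduced for illustration and are not part of the library; the sketch only mirrors the rules visible in the listings above.

```ruby
# Hypothetical stand-in for the extract configuration; not a library class.
ExtractConfig = Struct.new(:source_table, :destination_format)

# Mirrors the documented rules: CSV is assumed when no format is set,
# and every table-format predicate is false unless the source is a table.
def table_format_flags config
  is_table = !config.source_table.nil?
  format = config.destination_format
  {
    csv:  is_table && (format.nil? || format == "CSV"),
    json: is_table && format == "NEWLINE_DELIMITED_JSON",
    avro: is_table && format == "AVRO"
  }
end

unset = table_format_flags ExtractConfig.new("my_table", nil)
json  = table_format_flags ExtractConfig.new("my_table", "NEWLINE_DELIMITED_JSON")
model = table_format_flags ExtractConfig.new(nil, nil)
```

In this sketch at most one flag is `true` for a table source, and all three are `false` for a model source.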
Checks if the destination format for the model is TensorFlow SavedModel. The default is `true` for models. Not applicable when extracting tables.
@return [Boolean] `true` when `ML_TF_SAVED_MODEL`, `false` if not
`ML_TF_SAVED_MODEL` or not a model extraction.
    # File lib/google/cloud/bigquery/extract_job.rb, line 157
    def ml_tf_saved_model?
      return false unless model?
      val = @gapi.configuration.extract.destination_format
      return true if val.nil?
      val == "ML_TF_SAVED_MODEL"
    end
Checks if the destination format for the model is XGBoost. The default is `false`. Not applicable when extracting tables.
@return [Boolean] `true` when `ML_XGBOOST_BOOSTER`, `false` if not
`ML_XGBOOST_BOOSTER` or not a model extraction.
    # File lib/google/cloud/bigquery/extract_job.rb, line 171
    def ml_xgboost_booster?
      return false unless model?
      @gapi.configuration.extract.destination_format == "ML_XGBOOST_BOOSTER"
    end
Whether the source of the export job is a model. See {#source}.
@return [Boolean] `true` when the source is a model, `false`
otherwise.
    # File lib/google/cloud/bigquery/extract_job.rb, line 94
    def model?
      !@gapi.configuration.extract.source_model.nil?
    end
Checks if the exported data contains a header row. The default is `true` for tables. Not applicable when extracting models.
@return [Boolean] `true` when the print header configuration is
enabled or `nil`, `false` if disabled or not a table extraction.
    # File lib/google/cloud/bigquery/extract_job.rb, line 198
    def print_header?
      return false unless table?
      val = @gapi.configuration.extract.print_header
      val = true if val.nil?
      val
    end
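The defaulting used by `delimiter` and `print_header?` can be sketched as below. `TableExtractConfig`, `effective_delimiter` and `effective_print_header` are hypothetical names for illustration only, and the sketch assumes a table source, so the "not a table extraction" branch is left out.

```ruby
# Hypothetical stand-in for the extract configuration of a table export.
TableExtractConfig = Struct.new(:field_delimiter, :print_header)

# Falls back to a comma only when no delimiter was configured.
def effective_delimiter config
  config.field_delimiter.nil? ? "," : config.field_delimiter
end

# Falls back to true only when the setting is nil; an explicit false is kept.
def effective_print_header config
  config.print_header.nil? ? true : config.print_header
end

defaults = TableExtractConfig.new(nil, nil)
custom   = TableExtractConfig.new("\t", false)
```

The explicit `nil?` check matters for the header flag: writing `config.print_header || true` would turn an explicit `false` back into `true`.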
The table or model which is exported.
@return [Table, Model, nil] A table or model instance, or `nil`.
    # File lib/google/cloud/bigquery/extract_job.rb, line 70
    def source
      if (table = @gapi.configuration.extract.source_table)
        retrieve_table table.project_id, table.dataset_id, table.table_id
      elsif (model = @gapi.configuration.extract.source_model)
        retrieve_model model.project_id, model.dataset_id, model.model_id
      end
    end
Whether the source of the export job is a table. See {#source}.
@return [Boolean] `true` when the source is a table, `false`
otherwise.
    # File lib/google/cloud/bigquery/extract_job.rb, line 84
    def table?
      !@gapi.configuration.extract.source_table.nil?
    end
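The relationship between `source`, `table?` and `model?` can be sketched as a simple dispatch on which reference is set. `SourceConfig` and `source_kind` are hypothetical names for illustration; the real `source` method returns a {Table} or {Model} instance, not a symbol.

```ruby
# Hypothetical stand-in for the extract configuration; not a library class.
SourceConfig = Struct.new(:source_table, :source_model)

# Checks the table reference first, then the model reference,
# and returns nil when neither is set.
def source_kind config
  if config.source_table
    :table
  elsif config.source_model
    :model
  end
end

from_table = source_kind SourceConfig.new("my_table", nil)
from_model = source_kind SourceConfig.new(nil, "my_model")
neither    = source_kind SourceConfig.new(nil, nil)
```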
If `#avro?` is `true` (`#format` is set to `"AVRO"`), this flag indicates whether applicable column types (such as `TIMESTAMP`) are extracted to their corresponding AVRO logical types (`timestamp-micros`) instead of only their raw types (`avro-long`). Not applicable when extracting models.
@return [Boolean] `true` when applicable column types will use their
corresponding AVRO logical types, `false` if not enabled or not a table extraction.
    # File lib/google/cloud/bigquery/extract_job.rb, line 238
    def use_avro_logical_types?
      return false unless table?
      @gapi.configuration.extract.use_avro_logical_types
    end
Protected Instance Methods
    # File lib/google/cloud/bigquery/extract_job.rb, line 464
    def retrieve_model project_id, dataset_id, model_id
      ensure_service!
      gapi = service.get_project_model project_id, dataset_id, model_id
      Model.from_gapi_json gapi, service
    rescue Google::Cloud::NotFoundError
      nil
    end
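The listing above converts a not-found error into `nil` with a method-level `rescue`. A minimal sketch of that pattern follows; `NotFoundError`, `MODELS` and `lookup_model` are hypothetical stand-ins (for `Google::Cloud::NotFoundError` and the service call) and are not part of the library.

```ruby
# Hypothetical error class standing in for Google::Cloud::NotFoundError.
class NotFoundError < StandardError; end

# Hypothetical in-memory lookup standing in for the service call.
MODELS = { "my_model" => { model_id: "my_model" } }.freeze

# Returns the stored record, or nil when the lookup raises NotFoundError.
def lookup_model model_id
  MODELS.fetch(model_id) { raise NotFoundError, "model #{model_id} not found" }
rescue NotFoundError
  nil
end

found   = lookup_model "my_model"
missing = lookup_model "no_such_model"
```

Only the not-found case is rescued, so any other error would still propagate to the caller.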