class Google::Cloud::Bigquery::ExtractJob

# ExtractJob

A {Job} subclass representing an export operation that may be performed on a {Table} or {Model}. An ExtractJob instance is returned when you call {Project#extract_job}, {Table#extract_job} or {Model#extract_job}.

@see cloud.google.com/bigquery/docs/exporting-data Exporting table data

@see cloud.google.com/bigquery-ml/docs/exporting-models Exporting models

@see cloud.google.com/bigquery/docs/reference/v2/jobs Jobs API reference

@example Export table data

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

extract_job = table.extract_job "gs://my-bucket/file-name.json",
                                format: "json"
extract_job.wait_until_done!
extract_job.done? #=> true

@example Export a model

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model"

extract_job = model.extract_job "gs://my-bucket/#{model.model_id}"

extract_job.wait_until_done!
extract_job.done? #=> true

Public Instance Methods

avro?()

Checks if the destination format for the table data is [Avro](avro.apache.org/). The default is `false`. Not applicable when extracting models.

@return [Boolean] `true` when `AVRO`, `false` if not `AVRO` or not a table extraction.
# File lib/google/cloud/bigquery/extract_job.rb, line 145
def avro?
  return false unless table?
  @gapi.configuration.extract.destination_format == "AVRO"
end
compression?()

Checks if the export operation compresses the data using gzip. The default is `false`. Not applicable when extracting models.

@return [Boolean] `true` when `GZIP`, `false` if not `GZIP` or not a table extraction.
# File lib/google/cloud/bigquery/extract_job.rb, line 104
def compression?
  return false unless table?
  @gapi.configuration.extract.compression == "GZIP"
end
csv?()

Checks if the destination format for the table data is CSV. Tables with nested or repeated fields cannot be exported as CSV. The default is `true` for tables. Not applicable when extracting models.

@return [Boolean] `true` when `CSV`, or `false` if not `CSV` or not a table extraction.
# File lib/google/cloud/bigquery/extract_job.rb, line 130
def csv?
  return false unless table?
  val = @gapi.configuration.extract.destination_format
  return true if val.nil?
  val == "CSV"
end
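
The defaulting rule in `csv?` (an unset destination format counts as CSV for a table export) can be tried without a BigQuery connection. The following is a minimal sketch, not the library's code: `ExtractConfig` and `effective_table_format` are hypothetical names introduced only for illustration.

```ruby
# Hypothetical stand-in for the extract configuration; not part of the gem.
ExtractConfig = Struct.new(:source_table, :destination_format)

# Returns the effective table export format, treating an unset format as
# "CSV", or nil when the configuration has no source table.
def effective_table_format(config)
  return nil if config.source_table.nil?
  config.destination_format || "CSV"
end

unset = ExtractConfig.new("my_table", nil)
json  = ExtractConfig.new("my_table", "NEWLINE_DELIMITED_JSON")
model = ExtractConfig.new(nil, nil)

effective_table_format(unset) #=> "CSV"
effective_table_format(json)  #=> "NEWLINE_DELIMITED_JSON"
effective_table_format(model) #=> nil
```

This mirrors why `csv?` is the only table format predicate whose default is `true`.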
delimiter()

The character or symbol the operation uses to delimit fields in the exported data. The default is a comma (,) for tables. Not applicable when extracting models.

@return [String, nil] A string containing the character, such as `","`, or `nil` if not a table extraction.
# File lib/google/cloud/bigquery/extract_job.rb, line 184
def delimiter
  return unless table?
  val = @gapi.configuration.extract.field_delimiter
  val = "," if val.nil?
  val
end
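
A consumer of the exported file might pass the value of `delimiter` to its own parsing code. The sketch below is illustrative only: `split_rows` is a hypothetical helper, the sample text is made up, and the splitting is deliberately naive (it does not handle quoted fields that contain the delimiter).

```ruby
# Hypothetical helper: splits exported text into rows of fields.
# `delimiter` is the string an ExtractJob#delimiter call would return;
# `header` says whether the first line is a header row to skip.
def split_rows(text, delimiter:, header: true)
  lines = text.lines(chomp: true)
  lines = lines.drop(1) if header
  # A limit of -1 keeps trailing empty fields.
  lines.map { |line| line.split(delimiter, -1) }
end

exported = "name|age\nAda|36\nLin|\n"
rows = split_rows(exported, delimiter: "|", header: true)
rows #=> [["Ada", "36"], ["Lin", ""]]
```

For real exports a proper CSV parser is the safer choice; the point here is only that the delimiter is data you can read from the job rather than hard-code.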
destinations()

The URI or URIs representing the Google Cloud Storage files to which the data is exported.

# File lib/google/cloud/bigquery/extract_job.rb, line 61
def destinations
  Array @gapi.configuration.extract.destination_uris
end
destinations_counts()

A hash containing the URI or URI pattern specified in {#destinations} mapped to the counts of files per destination.

@return [Hash<String, Integer>] A Hash with the URI patterns as keys and the counts as values.
# File lib/google/cloud/bigquery/extract_job.rb, line 223
def destinations_counts
  Hash[destinations.zip destinations_file_counts]
end
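
The pairing performed by `destinations_counts` is a plain zip of two arrays. A small sketch with made-up values (the URIs and counts below are assumptions standing in for `#destinations` and `#destinations_file_counts`):

```ruby
# Made-up values standing in for #destinations and #destinations_file_counts.
destinations = ["gs://my-bucket/part-*.json", "gs://my-bucket/summary.json"]
file_counts  = [3, 1]

# Pair each URI pattern with its file count, in order.
counts_by_uri = destinations.zip(file_counts).to_h

counts_by_uri["gs://my-bucket/part-*.json"]  #=> 3
counts_by_uri["gs://my-bucket/summary.json"] #=> 1

# Array#zip pads with nil when the second array is shorter, so a
# destination with no reported count maps to nil.
pending = ["gs://my-bucket/summary.json"].zip([]).to_h
pending["gs://my-bucket/summary.json"] #=> nil
```

Because missing counts surface as `nil` rather than `0`, callers that sum the values may want to guard against `nil` first.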
destinations_file_counts()

The number of files per destination URI or URI pattern specified in {#destinations}.

@return [Array<Integer>] An array of values in the same order as the URI patterns.
# File lib/google/cloud/bigquery/extract_job.rb, line 212
def destinations_file_counts
  Array @gapi.statistics.extract.destination_uri_file_counts
end
json?()

Checks if the destination format for the table data is [newline-delimited JSON](jsonlines.org/). The default is `false`. Not applicable when extracting models.

@return [Boolean] `true` when `NEWLINE_DELIMITED_JSON`, `false` if not `NEWLINE_DELIMITED_JSON` or not a table extraction.
# File lib/google/cloud/bigquery/extract_job.rb, line 117
def json?
  return false unless table?
  @gapi.configuration.extract.destination_format == "NEWLINE_DELIMITED_JSON"
end
ml_tf_saved_model?()

Checks if the destination format for the model is TensorFlow SavedModel. The default is `true` for models. Not applicable when extracting tables.

@return [Boolean] `true` when `ML_TF_SAVED_MODEL`, `false` if not `ML_TF_SAVED_MODEL` or not a model extraction.
# File lib/google/cloud/bigquery/extract_job.rb, line 157
def ml_tf_saved_model?
  return false unless model?
  val = @gapi.configuration.extract.destination_format
  return true if val.nil?
  val == "ML_TF_SAVED_MODEL"
end
ml_xgboost_booster?()

Checks if the destination format for the model is XGBoost. The default is `false`. Not applicable when extracting tables.

@return [Boolean] `true` when `ML_XGBOOST_BOOSTER`, `false` if not `ML_XGBOOST_BOOSTER` or not a model extraction.
# File lib/google/cloud/bigquery/extract_job.rb, line 171
def ml_xgboost_booster?
  return false unless model?
  @gapi.configuration.extract.destination_format == "ML_XGBOOST_BOOSTER"
end
model?()

Whether the source of the export job is a model. See {#source}.

@return [Boolean] `true` when the source is a model, `false` otherwise.
# File lib/google/cloud/bigquery/extract_job.rb, line 94
def model?
  !@gapi.configuration.extract.source_model.nil?
end
print_header?()

Checks if the exported data contains a header row. The default is `true` for tables. Not applicable when extracting models.

@return [Boolean] `true` when the print header configuration is present or `nil`, `false` if disabled or not a table extraction.
source()

The table or model that is exported.

@return [Table, Model, nil] A table or model instance, or `nil`.

# File lib/google/cloud/bigquery/extract_job.rb, line 70
def source
  if (table = @gapi.configuration.extract.source_table)
    retrieve_table table.project_id, table.dataset_id, table.table_id
  elsif (model = @gapi.configuration.extract.source_model)
    retrieve_model model.project_id, model.dataset_id, model.model_id
  end
end
table?()

Whether the source of the export job is a table. See {#source}.

@return [Boolean] `true` when the source is a table, `false` otherwise.
# File lib/google/cloud/bigquery/extract_job.rb, line 84
def table?
  !@gapi.configuration.extract.source_table.nil?
end
use_avro_logical_types?()

If `#avro?` (`#format` is set to `"AVRO"`), this flag indicates whether to enable extracting applicable column types (such as `TIMESTAMP`) to their corresponding AVRO logical types (`timestamp-micros`), instead of only using their raw types (`avro-long`). Not applicable when extracting models.

@return [Boolean] `true` when applicable column types will use their corresponding AVRO logical types, `false` if not enabled or not a table extraction.
# File lib/google/cloud/bigquery/extract_job.rb, line 238
def use_avro_logical_types?
  return false unless table?
  @gapi.configuration.extract.use_avro_logical_types
end
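
Taken together, the predicates above let calling code describe an export without inspecting the underlying API object. The following sketch assumes nothing about the gem beyond the method names documented on this page: `FakeJob` is a hypothetical stand-in so the example runs without credentials, and `describe_export` is an illustrative helper, not a library method.

```ruby
# Hypothetical stand-in exposing the same predicate names as ExtractJob.
FakeJob = Struct.new(:kind, :format, :gzip, :destinations, keyword_init: true) do
  def table?; kind == :table; end
  def model?; kind == :model; end
  def csv?;  table? && (format.nil? || format == "CSV"); end
  def json?; table? && format == "NEWLINE_DELIMITED_JSON"; end
  def avro?; table? && format == "AVRO"; end
  def compression?; table? && gzip == true; end
  def ml_tf_saved_model?; model? && (format.nil? || format == "ML_TF_SAVED_MODEL"); end
  def ml_xgboost_booster?; model? && format == "ML_XGBOOST_BOOSTER"; end
end

# Builds a one-line summary using only the public predicates, so it would
# work with any object that responds to them.
def describe_export(job)
  label =
    if job.csv? then "CSV"
    elsif job.json? then "newline-delimited JSON"
    elsif job.avro? then "Avro"
    elsif job.ml_tf_saved_model? then "TensorFlow SavedModel"
    elsif job.ml_xgboost_booster? then "XGBoost Booster"
    else "unknown format"
    end
  label += " (gzip)" if job.compression?
  kind = job.table? ? "table" : "model"
  "#{kind} -> #{label} -> #{job.destinations.join(', ')}"
end

table_job = FakeJob.new kind: :table, gzip: true,
                        destinations: ["gs://my-bucket/file.csv.gz"]
model_job = FakeJob.new kind: :model,
                        destinations: ["gs://my-bucket/my_model"]

describe_export table_job #=> "table -> CSV (gzip) -> gs://my-bucket/file.csv.gz"
describe_export model_job #=> "model -> TensorFlow SavedModel -> gs://my-bucket/my_model"
```

Checking `csv?` and `ml_tf_saved_model?` first reflects that they are the defaults for tables and models respectively when no format is set.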

Protected Instance Methods

retrieve_model(project_id, dataset_id, model_id)
# File lib/google/cloud/bigquery/extract_job.rb, line 464
def retrieve_model project_id, dataset_id, model_id
  ensure_service!
  gapi = service.get_project_model project_id, dataset_id, model_id
  Model.from_gapi_json gapi, service
rescue Google::Cloud::NotFoundError
  nil
end