gcp_cloud_storage

BETA

This component is mostly stable but breaking changes could still be made outside of major version releases if a fundamental problem with the component is found.

Downloads objects within a Google Cloud Storage bucket, optionally filtered by a prefix.

Introduced in version 1.0.0.

# Common config fields, showing default values
input:
  label: ""
  gcp_cloud_storage:
    bucket: "" # No default (required)
    prefix: ""
    scanner:
      to_the_end: {}

Metadata

This input adds the following metadata fields to each message:

- gcs_key
- gcs_bucket
- gcs_last_modified
- gcs_last_modified_unix
- gcs_content_type
- gcs_content_encoding
- All user defined metadata

You can access these metadata fields using function interpolation.
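
As a rough illustration of reading these fields, the sketch below logs the bucket and key of each downloaded object through interpolation. It assumes the `log` processor and the `metadata` interpolation function are available in your Bento version; the bucket name `my-example-bucket` is a placeholder, not a value from this page.

```yaml
input:
  gcp_cloud_storage:
    bucket: my-example-bucket # placeholder bucket name (assumption)

pipeline:
  processors:
    # Log where each message came from, using the metadata fields listed above.
    - log:
        level: INFO
        message: 'Read gs://${! metadata("gcs_bucket") }/${! metadata("gcs_key") }'
```

The same interpolation syntax should work in any config field that supports interpolation functions.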

Credentials

By default, Bento will use a shared credentials file when connecting to GCP services. You can find out more in this document.

Fields

bucket

The name of the bucket from which to download objects.

Type: string

prefix

An optional path prefix. If set, only objects whose keys begin with the prefix are consumed.

Type: string
Default: ""

scanner

The scanner by which the stream of bytes consumed will be broken out into individual messages. Scanners are useful for processing large sources of data without holding the entirety of it within memory. For example, the csv scanner allows you to process individual CSV rows without loading the entire CSV file in memory at once.

Type: scanner
Default: {"to_the_end":{}}
Requires version 1.0.0 or newer
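
A minimal sketch of swapping the default scanner for the csv scanner mentioned above, so that each CSV row becomes its own message. The bucket name and the `exports/` prefix are placeholders, and the csv scanner is shown with its default options only.

```yaml
input:
  gcp_cloud_storage:
    bucket: my-example-bucket # placeholder bucket name (assumption)
    prefix: exports/          # placeholder prefix (assumption)
    scanner:
      # Emit one message per CSV row instead of one message per object.
      csv: {}
```

With the default `to_the_end` scanner, each object would instead be read in full and delivered as a single message.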

delete_objects

Whether to delete downloaded objects from the bucket once they are processed.

Type: bool
Default: false
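
For example, a config along the following lines would remove each object after it has been processed. This is only a sketch with a placeholder bucket name and prefix; it also assumes the credentials in use are permitted to delete objects in that bucket.

```yaml
input:
  gcp_cloud_storage:
    bucket: my-example-bucket # placeholder bucket name (assumption)
    prefix: inbox/            # placeholder prefix (assumption)
    # Delete each object from the bucket once it has been processed.
    delete_objects: true
```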