gcp_cloud_storage
This component is mostly stable but breaking changes could still be made outside of major version releases if a fundamental problem with the component is found.
Downloads objects from a Google Cloud Storage bucket, optionally filtered by a prefix.
Introduced in version 1.0.0.
- Common
- Advanced
# Common config fields, showing default values
input:
  label: ""
  gcp_cloud_storage:
    bucket: "" # No default (required)
    prefix: ""
    scanner:
      to_the_end: {}
# All config fields, showing default values
input:
  label: ""
  gcp_cloud_storage:
    bucket: "" # No default (required)
    prefix: ""
    scanner:
      to_the_end: {}
    delete_objects: false
Metadata
This input adds the following metadata fields to each message:
- gcs_key
- gcs_bucket
- gcs_last_modified
- gcs_last_modified_unix
- gcs_content_type
- gcs_content_encoding
- All user defined metadata
You can access these metadata fields using function interpolation.
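As a rough illustration of interpolating these metadata fields, the sketch below writes each downloaded object to a local file named after its object key. The bucket name and output directory are hypothetical, and the example assumes the `file` output and the `metadata` interpolation function are available in your Bento version.

```yaml
input:
  gcp_cloud_storage:
    bucket: example-bucket # hypothetical bucket name

output:
  file:
    # gcs_key is one of the metadata fields listed above;
    # ./downloads is an assumed local directory.
    path: ./downloads/${! metadata("gcs_key") }
    # With the default to_the_end scanner each object is one message,
    # so write the whole message as the file contents.
    codec: all-bytes
```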
Credentials
By default Bento will use a shared credentials file when connecting to GCP services. You can find out more in this document.
Fields
bucket
The name of the bucket from which to download objects.
Type: string
prefix
An optional path prefix. If set, only objects whose keys begin with the prefix are consumed.
Type: string
Default: ""
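A minimal sketch of restricting the input to one "folder" of a bucket; the bucket name and prefix here are hypothetical placeholders:

```yaml
input:
  gcp_cloud_storage:
    bucket: example-bucket # hypothetical bucket name
    # Only objects whose keys start with this prefix are downloaded,
    # for example logs/2024/app.json but not archive/app.json.
    prefix: logs/2024/
```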
scanner
The scanner by which the stream of bytes consumed will be broken out into individual messages. Scanners are useful for processing large sources of data without holding the entirety of it within memory. For example, the csv scanner allows you to process individual CSV rows without loading the entire CSV file in memory at once.
Type: scanner
Default: {"to_the_end":{}}
Requires version 1.0.0 or newer
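For instance, swapping the default `to_the_end` scanner for the `csv` scanner mentioned above might look like the following sketch. The bucket name and prefix are hypothetical, and the `csv` scanner is left at its own defaults:

```yaml
input:
  gcp_cloud_storage:
    bucket: example-bucket # hypothetical bucket name
    prefix: exports/       # hypothetical prefix
    scanner:
      # Emit one message per CSV row rather than one message per object.
      csv: {}
```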
delete_objects
Whether to delete downloaded objects from the bucket once they are processed.
Type: bool
Default: false
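As a sketch, a config that removes each object from the bucket after it has been processed could look like this (the bucket name and prefix are hypothetical). Because deletion is destructive, it may be worth trying this against a test bucket first:

```yaml
input:
  gcp_cloud_storage:
    bucket: example-bucket # hypothetical bucket name
    prefix: inbox/         # hypothetical prefix
    # Delete each downloaded object once it has been processed.
    delete_objects: true
```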