Components
A good ninja gets clued up on its gear.
Core Components
Every Bento pipeline has at least one input, an optional buffer, an output and any number of processors:
input:
  kafka:
    addresses: [ TODO ]
    topics: [ foo, bar ]
    consumer_group: foogroup
buffer:
  type: none
pipeline:
  processors:
    - mapping: |
        message = this
        meta.link_count = links.length()
output:
  aws_s3:
    bucket: TODO
    path: '${! metadata("kafka_topic") }/${! json("message.id") }.json'
These are the main components within Bento and they provide the majority of useful behaviour.
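For trying the core components without Kafka or S3, a smaller configuration can help. The following is a minimal sketch, assuming the `stdin` input, `stdout` output and the Bloblang `content()` function and `uppercase()` method are available in your build. It omits the optional buffer and uses a single processor:

```yaml
# Read lines from standard input.
input:
  stdin: {}

# One processor: replace each message with its uppercased raw content.
pipeline:
  processors:
    - mapping: root = content().uppercase()

# Write each resulting message to standard output.
output:
  stdout: {}
```

Since the buffer section is optional, leaving it out should behave the same as setting the buffer to none.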
Observability Components
There are also the observability components http, logger, metrics, and tracer, which allow you to specify how Bento exposes observability data:
http:
  address: 0.0.0.0:4195
  enabled: true
  debug_endpoints: false
logger:
  format: json
  level: WARN
metrics:
  statsd:
    address: localhost:8125
    flush_period: 100ms
tracer:
  jaeger:
    agent_address: localhost:6831
Resource Components
Finally, there are caches and rate limits. These are components that are referenced by core components and can be shared.
input:
  http_client: # This is an input
    url: TODO
    rate_limit: foo_ratelimit # This is a reference to a rate limit
pipeline:
  processors:
    - cache: # This is a processor
        resource: baz_cache # This is a reference to a cache
        operator: add
        key: '${! json("id") }'
        value: "x"
    - mapping: root = if errored() { deleted() }
rate_limit_resources:
  - label: foo_ratelimit
    local:
      count: 500
      interval: 1s
cache_resources:
  - label: baz_cache
    memcached:
      addresses: [ localhost:11211 ]
It's also possible to configure inputs, outputs and processors as resources, which allows them to be reused throughout a configuration with the resource input, resource output and resource processor respectively.
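As a rough illustration of inputs, outputs and processors configured as resources, the sketch below declares one of each and references them by label. The labels `foo_input`, `foo_mapping` and `foo_output` are hypothetical names chosen for this example, and the `stdin` and `stdout` components are stand-ins for whatever you actually use:

```yaml
input:
  resource: foo_input # Reference to an input resource (hypothetical label)

pipeline:
  processors:
    - resource: foo_mapping # Reference to a processor resource (hypothetical label)

output:
  resource: foo_output # Reference to an output resource (hypothetical label)

# Resource definitions, each identified by its label.
input_resources:
  - label: foo_input
    stdin: {}

processor_resources:
  - label: foo_mapping
    mapping: root = content().uppercase()

output_resources:
  - label: foo_output
    stdout: {}
```

Because the definitions sit apart from where they are used, the same label can be referenced from several places in a configuration without repeating the component's settings.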
For more information about any of these component types check out their sections: