Filtering and Sampling
Configure Bento to conditionally drop messages.
Events are like eyebrows: sometimes it's best to just get rid of them. Filtering events in Bento is both easy and flexible, and this cookbook demonstrates a few different types of filtering you can do. All of these examples make use of the mapping processor but shouldn't require any prior knowledge of it.
The Basic Filter
Dropping events with Bloblang is done by mapping the function deleted() to the root of the mapped document. To remove all events indiscriminately you can simply do:
pipeline:
  processors:
    - mapping: root = deleted()
But that's most likely not what you want. Instead, we can delete an event only under certain conditions with a match or if expression:
pipeline:
  processors:
    - mapping: |
        root = if @topic.or("") == "foo" ||
          this.doc.type == "bar" ||
          this.doc.urls.contains("https://warpstreamlabs.github.io/bento/").catch(false) {
          deleted()
        }
The above config removes any events where:
- The metadata field topic is equal to foo
- The event field doc.type (a string) is equal to bar
- The event field doc.urls (an array) contains the string https://warpstreamlabs.github.io/bento/
Events that do not match any of these conditions will remain unchanged.
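The same filter can also be sketched with a match expression instead of if. This is a minimal sketch using the same fields as the config above. It assumes that Bloblang skips the assignment when no case matches, so events that match none of the cases should pass through unchanged; it is worth verifying that behaviour against your Bento version.

```yaml
pipeline:
  processors:
    - mapping: |
        # Cases are checked in order; the first one that is true deletes the event
        root = match {
          @topic.or("") == "foo" => deleted()
          this.doc.type == "bar" => deleted()
          this.doc.urls.contains("https://warpstreamlabs.github.io/bento/").catch(false) => deleted()
        }
```

Listing one condition per case can be easier to extend than a long chain of || operators, at the cost of repeating deleted() on each line.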
Sample Events
Another type of filter we might want is a sampling filter, which we can implement with a random number generator:
pipeline:
  processors:
    - mapping: |
        # Drop 50% of documents randomly
        root = if random_int() % 2 == 0 { deleted() }
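The sample rate follows from the modulus and the comparison. As a rough sketch, assuming you want to keep only about 1 in 100 documents, the condition can be inverted so that everything except multiples of 100 is dropped:

```yaml
pipeline:
  processors:
    - mapping: |
        # Keep roughly 1% of documents: drop any document whose
        # random value is not a multiple of 100
        root = if random_int() % 100 != 0 { deleted() }
```

Because the values are random, the proportion kept is only approximate, especially over a small number of events.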
We can also do this in a deterministic way by hashing events and filtering by that hash value:
pipeline:
  processors:
    - mapping: |
        # Drop ~10% of documents deterministically (same docs filtered each run)
        root = if content().hash("xxhash64").slice(-8).number() % 10 == 0 {
          deleted()
        }
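The same hashing approach can be applied to a single field rather than the whole event, so that related events are kept or dropped together. The sketch below assumes a hypothetical field user.id, which is not part of the examples above; substitute whichever field identifies the group you want to sample by.

```yaml
pipeline:
  processors:
    - mapping: |
        # Drop all events for ~10% of users; user.id is a hypothetical field.
        # string() lets the hash accept numeric IDs as well as string IDs.
        root = if this.user.id.string().hash("xxhash64").slice(-8).number() % 10 == 0 {
          deleted()
        }
```

Events without the field all hash to the same value, so they are either all kept or all dropped. Add an explicit check if those events need different handling.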