Bloblang Functions

Functions can be placed anywhere and allow you to extract information from your environment, generate values, or access data from the underlying message being mapped:

root.doc.id = uuid_v4()
root.doc.received_at = now()
root.doc.host = hostname()

Functions support both named and nameless style arguments:

root.values_one = range(start: 0, stop: this.max, step: 2)
root.values_two = range(0, this.max, 2)

General

counter

EXPERIMENTAL

This function is experimental and therefore breaking changes could be made to it outside of major version releases.

Returns a non-negative integer that increments each time it is resolved, yielding the minimum (1 by default) as the first value. Each instantiation of counter has its own independent count. Once the maximum integer (or max argument) is reached the counter resets back to the minimum.

Parameters

min <query expression, default 1> The minimum value of the counter, which is the first value that will be yielded. If this parameter is dynamic it will be resolved only once during the lifetime of the mapping.
max <query expression, default 9223372036854775807> The maximum value of the counter. Once this value is yielded the counter will reset back to the min. If this parameter is dynamic it will be resolved only once during the lifetime of the mapping.
set <(optional) query expression> An optional mapping that when specified will be executed each time the counter is resolved. When this mapping resolves to a non-negative integer value it will cause the counter to reset to this value and yield it. If this mapping is omitted or doesn't resolve to anything then the counter will increment and yield the value as normal. If this mapping resolves to null then the counter is not incremented and the current value is yielded. If this mapping resolves to a deletion then the counter is reset to the min value.

Examples

root.id = counter()

# In: {}
# Out: {"id":1}

# In: {}
# Out: {"id":2}

It's possible to increment a counter multiple times within a single mapping invocation using a map.


map foos {
  root = counter()
}

root.meow_id = null.apply("foos")
root.woof_id = null.apply("foos")


# In: {}
# Out: {"meow_id":1,"woof_id":2}

# In: {}
# Out: {"meow_id":3,"woof_id":4}

By specifying an optional set parameter it is possible to dynamically reset the counter based on input data.

root.consecutive_doggos = counter(min: 1, set: if !this.sound.lowercase().contains("woof") { 0 })

# In: {"sound":"woof woof"}
# Out: {"consecutive_doggos":1}

# In: {"sound":"woofer wooooo"}
# Out: {"consecutive_doggos":2}

# In: {"sound":"meow"}
# Out: {"consecutive_doggos":0}

# In: {"sound":"uuuuh uh uh woof uhhhhhh"}
# Out: {"consecutive_doggos":1}

The set parameter can also be utilised to peek at the counter without mutating it by returning null.

root.things = counter(set: if this.id == null { null })

# In: {"id":"a"}
# Out: {"things":1}

# In: {"id":"b"}
# Out: {"things":2}

# In: {"what":"just checking"}
# Out: {"things":2}

# In: {"id":"c"}
# Out: {"things":3}
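
As a minimal sketch of the wrap-around described for the max parameter, the following mapping bounds the counter to a small range. The field name slot is hypothetical.

```bloblang
# Cycle through a fixed set of slots; `slot` is an illustrative field name.
root.slot = counter(min: 1, max: 3)
```

Going by the parameter descriptions above, successive messages would be expected to receive 1, 2, 3 and then 1 again.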

deleted

A function that returns a result indicating that the mapping target should be deleted. Deleting a message (also known as dropping it) results in it being acknowledged as successfully processed to inputs in a Bento pipeline. For more information about error handling patterns, see the error handling documentation.

Examples

root = this
root.bar = deleted()

# In: {"bar":"bar_value","baz":"baz_value","foo":"foo value"}
# Out: {"baz":"baz_value","foo":"foo value"}

Since the result is a value it can be used to do things like remove elements of an array within map_each.

root.new_nums = this.nums.map_each(num -> if num < 10 { deleted() } else { num - 10 })

# In: {"nums":[3,11,4,17]}
# Out: {"new_nums":[1,7]}
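
Assigning deleted() to root drops the whole message rather than a single field. A minimal sketch, assuming a hypothetical type field that marks heartbeat messages:

```bloblang
# Drop heartbeat messages entirely and pass everything else through unchanged.
# The `type` field and the "heartbeat" value are assumptions for illustration.
root = if this.type == "heartbeat" { deleted() } else { this }
```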

ksuid

Generates a new ksuid each time it is invoked and returns a string representation.

Examples

root.id = ksuid()

nanoid

Generates a new nanoid each time it is invoked and returns a string representation.

Parameters

length <(optional) integer> An optional length.
alphabet <(optional) string> An optional custom alphabet to use for generating IDs. When specified, the length parameter must also be provided.

Examples

root.id = nanoid()

It is possible to specify an optional length parameter.

root.id = nanoid(54)

It is also possible to specify an optional custom alphabet after the length parameter.

root.id = nanoid(54, "abcde")

pi

Returns the value of the mathematical constant Pi.

Examples

root.radians = this.degrees * (pi() / 180)

# In: {"degrees":45}
# Out: {"radians":0.7853981633974483}

root.degrees = this.radians * (180 / pi())

# In: {"radians":0.78540}
# Out: {"degrees":45.00010522957486}

random_int

Generates a non-negative pseudo-random 64-bit integer. An optional integer argument can be provided in order to seed the random number generator.

Optional min and max arguments can be provided in order to only generate numbers within a range. Neither of these parameters can be set via a dynamic expression (i.e. from values taken from mapped data). Instead, for dynamic ranges extract a min and max manually using a modulo operator (random_int() % a + b).

Parameters

seed <query expression, default 0> A seed to use. If a query is provided it will only be resolved once during the lifetime of the mapping.
min <integer, default 0> The minimum value the randomly generated number will have.
max <integer, default 9223372036854775806> The maximum value the randomly generated number will have. The default is math.MaxInt64 - 1.

Examples

root.first = random_int()
root.second = random_int(1)
root.third = random_int(max:20)
root.fourth = random_int(min:10, max:20)
root.fifth = random_int(timestamp_unix_nano(), 5, 20)
root.sixth = random_int(seed:timestamp_unix_nano(), max:20)

It is possible to specify a dynamic seed argument, in which case the argument will only be resolved once during the lifetime of the mapping.

root.first = random_int(timestamp_unix_nano())
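
The modulo approach mentioned above for dynamic ranges can be sketched as follows. The fields lo and hi are hypothetical and are assumed to hold integers with lo <= hi, in which case the result falls between lo and hi inclusive.

```bloblang
# random_int() is non-negative, so the remainder lies in [0, hi - lo].
# Adding lo shifts it into [lo, hi]. `lo` and `hi` are illustrative field names.
root.roll = random_int() % (this.hi - this.lo + 1) + this.lo
```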

range

The range function creates an array of integers following a range between a start, stop and optional step integer argument. If the step argument is omitted then it defaults to 1. A negative step can be provided as long as stop < start.

Parameters

start <integer> The start value.
stop <integer> The stop value.
step <integer, default 1> The step value.

Examples

root.a = range(0, 10)
root.b = range(start: 0, stop: this.max, step: 2) # Using named params
root.c = range(0, -this.max, -2)

# In: {"max":10}
# Out: {"a":[0,1,2,3,4,5,6,7,8,9],"b":[0,2,4,6,8],"c":[0,-2,-4,-6,-8]}

snowflake_id

Generates a new snowflake ID each time it is invoked and returns a string representation, e.g. 1559229974454472704.

Parameters

node_id <integer, default 1> It is possible to specify the node_id.

Examples

root.id = snowflake_id()

It is possible to specify the node_id.

root.id = snowflake_id(2)

throw

Throws an error similar to a regular mapping error. This is useful for abandoning a mapping entirely given certain conditions.

Parameters

why <string> A string explanation for why an error was thrown, this will be added to the resulting error message.

Examples

root.doc.type = match {
  this.exists("header.id") => "foo"
  this.exists("body.data") => "bar"
  _ => throw("unknown type")
}
root.doc.contents = (this.body.content | this.thing.body)

# In: {"header":{"id":"first"},"thing":{"body":"hello world"}}
# Out: {"doc":{"contents":"hello world","type":"foo"}}

# In: {"nothing":"matches"}
# Out: Error("failed assignment (line 1): unknown type")

ulid

EXPERIMENTAL

This function is experimental and therefore breaking changes could be made to it outside of major version releases.

Generate a random ULID.

Parameters

encoding <string, default "crockford"> The format to encode a ULID into. Valid options are: crockford, hex
random_source <string, default "secure_random"> The source of randomness to use for generating ULIDs. "secure_random" is recommended for most use cases. "fast_random" can be used if security is not a concern.

Examples

Using the defaults of Crockford Base32 encoding and a secure random source:

root.id = ulid()

ULIDs can be hex-encoded too.

root.id = ulid("hex")

They can be generated using a fast, but unsafe, random source for use cases that are not security-sensitive.

root.id = ulid("crockford", "fast_random")
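
Since functions accept named arguments, either parameter can also be set on its own. A brief sketch using the parameter names listed above:

```bloblang
# Keep the default crockford encoding but switch the random source.
root.id = ulid(random_source: "fast_random")
```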

uuid_v4

Generates a new RFC-4122 UUID each time it is invoked and returns a string representation.

Examples

root.id = uuid_v4()

Message Info

batch_index

Returns the index of the mapped message within a batch. This is useful for applying maps only on certain messages of a batch.

Examples

root = if batch_index() > 0 { deleted() }

batch_size

Returns the size of the message batch.

Examples

root.foo = batch_size()

content

Returns the full raw contents of the mapping target message as a byte array. When mapping to a JSON field the value should be encoded using the method encode, or cast to a string directly using the method string, otherwise it will be base64 encoded by default.

Examples

root.doc = content().string()

# In: {"foo":"bar"}
# Out: {"doc":"{\"foo\":\"bar\"}"}
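
As a small sketch of the encode alternative mentioned above, the raw bytes can be written out as hex instead of being cast to a string. This assumes the hex scheme of the encode method.

```bloblang
# Encode the raw message bytes as a hex string.
root.doc = content().encode("hex")

# In: {"foo":"bar"}
# Out: {"doc":"7b22666f6f223a22626172227d"}
```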

error

If an error has occurred during the processing of a message this function returns the reported cause of the error as a string, otherwise null. For more information about error handling patterns, see the error handling documentation.

Examples

root.doc.error = error()

errored

Returns a boolean value indicating whether an error has occurred during the processing of a message. For more information about error handling patterns, see the error handling documentation.

Examples

root.doc.status = if errored() { 400 } else { 200 }
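
The two error functions are often useful together. A minimal sketch, where failure is a hypothetical field name, that annotates only the messages that have failed:

```bloblang
# Keep the original document and attach the error cause when one exists.
# When the condition is false the assignment is skipped, so `failure` is not set.
root = this
root.failure = if errored() { error() }
```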

json

Returns the value of a field within a JSON message located by a dot path argument. This function always targets the entire source JSON document regardless of the mapping context.

Parameters

path <string, default ""> An optional dot path identifying a field to obtain.

Examples

root.mapped = json("foo.bar")

# In: {"foo":{"bar":"hello world"}}
# Out: {"mapped":"hello world"}

The path argument is optional and if omitted the entire JSON payload is returned.

root.doc = json()

# In: {"foo":{"bar":"hello world"}}
# Out: {"doc":{"foo":{"bar":"hello world"}}}

metadata

Returns the value of a metadata key from the input message, or null if the key does not exist. Since values are extracted from the read-only input message they do NOT reflect changes made from within the map. In order to query metadata mutations made within a mapping use the @ operator. This function supports extracting metadata from other messages of a batch with the from method.

Parameters

key <string, default ""> An optional key of a metadata value to obtain.

Examples

root.topic = metadata("kafka_topic")

The key parameter is optional and if omitted the entire metadata contents are returned as an object.

root.all_metadata = metadata()
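
To illustrate the difference between the read-only input metadata and mutations made within the mapping, here is a minimal sketch. The key kafka_topic is reused from the example above, and the value "rerouted" is an arbitrary placeholder.

```bloblang
# Overwrite the key on the message being created.
meta kafka_topic = "rerouted"

# metadata() still reads the original input message.
root.original_topic = metadata("kafka_topic")

# The @ operator reflects the mutation made above.
root.current_topic = @kafka_topic

# The from method reads the key from another message of the batch (here the first).
root.first_topic = metadata("kafka_topic").from(0)
```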

tracing_id

EXPERIMENTAL

This function is experimental and therefore breaking changes could be made to it outside of major version releases.

Provides the message trace id. The returned value will be zeroed if the message does not contain a span.

Examples

meta trace_id = tracing_id()

tracing_span

EXPERIMENTAL

This function is experimental and therefore breaking changes could be made to it outside of major version releases.

Provides the message tracing span (created via Open Telemetry APIs) as an object serialised via text map formatting. The returned value will be null if the message does not have a span.

Examples

root.headers.traceparent = tracing_span().traceparent

# In: {"some_stuff":"just can't be explained by science"}
# Out: {"headers":{"traceparent":"00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}}

Environment

env

Returns the value of an environment variable, or null if the environment variable does not exist.

Parameters

name <string> The name of an environment variable.
no_cache <bool, default false> Force the variable lookup to occur for each mapping invocation.

Examples

root.thing.key = env("key").or("default value")
root.thing.key = env(this.thing.key_name)

When the name parameter is static this function will only resolve once and yield the same result for each invocation as an optimisation. This means that updates to env vars during runtime will not be reflected. You can disable this cache with the optional parameter no_cache, which when set to true will cause the variable lookup to be performed for each execution of the mapping.

root.thing.key = env(name: "key", no_cache: true)

file

Reads a file and returns its contents. Relative paths are resolved from the directory of the process executing the mapping. To read files relative to the mapping file, use the newer file_rel function.

Parameters

path <string> The path of the target file.
no_cache <bool, default false> Force the file to be read for each mapping invocation.

Examples

root.doc = file(env("BENTO_TEST_BLOBLANG_FILE")).parse_json()

# In: {}
# Out: {"doc":{"foo":"bar"}}

When the path parameter is static this function will only read the specified file once and yield the same result for each invocation as an optimisation. This means that updates to files during runtime will not be reflected. You can disable this cache with the optional parameter no_cache, which when set to true will cause the file to be read for each execution of the mapping.

root.doc = file(path: env("BENTO_TEST_BLOBLANG_FILE"), no_cache: true).parse_json()

# In: {}
# Out: {"doc":{"foo":"bar"}}

file_rel

Reads a file and returns its contents. Relative paths are resolved from the directory of the mapping.

Parameters

path <string> The path of the target file.
no_cache <bool, default false> Force the file to be read for each mapping invocation.

Examples

root.doc = file_rel(env("BENTO_TEST_BLOBLANG_FILE")).parse_json()

# In: {}
# Out: {"doc":{"foo":"bar"}}

When the path parameter is static this function will only read the specified file once and yield the same result for each invocation as an optimisation. This means that updates to files during runtime will not be reflected. You can disable this cache with the optional parameter no_cache, which when set to true will cause the file to be read for each execution of the mapping.

root.doc = file_rel(path: env("BENTO_TEST_BLOBLANG_FILE"), no_cache: true).parse_json()

# In: {}
# Out: {"doc":{"foo":"bar"}}

hostname

Returns a string matching the hostname of the machine running Bento.

Examples

root.thing.host = hostname()

now

Returns the current timestamp as a string in RFC 3339 format with the local timezone. Use the method ts_format in order to change the format and timezone.

Examples

root.received_at = now()
root.received_at = now().ts_format("Mon Jan 2 15:04:05 -0700 MST 2006", "UTC")

timestamp_unix

Returns the current unix timestamp in seconds.

Examples

root.received_at = timestamp_unix()

timestamp_unix_micro

Returns the current unix timestamp in microseconds.

Examples

root.received_at = timestamp_unix_micro()

timestamp_unix_milli

Returns the current unix timestamp in milliseconds.

Examples

root.received_at = timestamp_unix_milli()

timestamp_unix_nano

Returns the current unix timestamp in nanoseconds.

Examples

root.received_at = timestamp_unix_nano()

Fake Data Generation

fake

BETA

This function is mostly stable but breaking changes could still be made outside of major version releases if a fundamental problem with it is found.

Takes in a string that maps to a faker function and returns the result from that faker function. Returns an error if the given string doesn't match a supported faker function. Supported functions: latitude, longitude, unix_time, date, time_string, month_name, year_string, day_of_week, day_of_month, timestamp, century, timezone, time_period, email, mac_address, domain_name, url, username, ipv4, ipv6, password, jwt, word, sentence, paragraph, cc_type, cc_number, currency, amount_with_currency, title_male, title_female, first_name, first_name_male, first_name_female, last_name, name, gender, chinese_first_name, chinese_last_name, chinese_name, phone_number, toll_free_phone_number, e164_phone_number, uuid_hyphenated, uuid_digit. Refer to the faker docs for details on these functions.

Parameters

function <string, default ""> The name of the function to use to generate the value.

Examples

Use time_string to generate a time in the format 00:00:00:

root.time = fake("time_string")

Use email to generate a string in email address format:

root.email = fake("email")

Use jwt to generate a JWT token:

root.jwt = fake("jwt")

Use uuid_hyphenated to generate a hyphenated UUID:

root.uuid = fake("uuid_hyphenated")

Deprecated

count

The count function is a counter starting at 1 which increments after each time it is called. Count takes an argument which is an identifier for the counter, allowing you to specify multiple unique counters in your configuration.

Parameters

name <string> An identifier for the counter.

Examples

root = this
root.id = count("bloblang_function_example")

# In: {"message":"foo"}
# Out: {"id":1,"message":"foo"}

# In: {"message":"bar"}
# Out: {"id":2,"message":"bar"}

meta

Returns the value of a metadata key from the input message as a string, or null if the key does not exist. Since values are extracted from the read-only input message they do NOT reflect changes made from within the map. In order to query metadata mutations made within a mapping use the root_meta function. This function supports extracting metadata from other messages of a batch with the from method.

Parameters

key <string, default ""> An optional key of a metadata value to obtain.

Examples

root.topic = meta("kafka_topic")

The key parameter is optional and if omitted the entire metadata contents are returned as an object.

root.all_metadata = meta()

root_meta

Returns the value of a metadata key from the new message being created as a string, or null if the key does not exist. Changes made to metadata during a mapping will be reflected by this function.

Parameters

key <string, default ""> An optional key of a metadata value to obtain.

Examples

root.topic = root_meta("kafka_topic")

The key parameter is optional and if omitted the entire metadata contents are returned as an object.

root.all_metadata = root_meta()