switch
Select a child scanner dynamically for source data based on factors such as the filename.
# Config fields, showing default values
switch: [] # No default (required)
This scanner takes a list of candidate child scanners. For each source of data, the first candidate that passes is selected. A candidate without any conditions acts as a catch-all and passes for every source, so it is recommended to always place a catch-all scanner at the end of your list. If a given source of data does not pass any candidate, an error is returned and the data is rejected.
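The selection rule can be illustrated with a minimal Python sketch. It is not the scanner's implementation. The `CANDIDATES` list and the `select_scanner` function are hypothetical names introduced only for illustration, and a condition is modelled as an optional regular expression tested against the source name.

```python
import re

# Hypothetical stand-in for the candidate list: (pattern or None, scanner name).
# A pattern of None models a candidate without any conditions.
CANDIDATES = [
    (r"\.csv$", "csv"),
    (r"\.tar$", "tar"),
    (None, "to_the_end"),  # catch-all, placed last
]


def select_scanner(name, candidates=CANDIDATES):
    """Return the scanner name of the first candidate that passes for `name`."""
    for pattern, scanner in candidates:
        # A candidate with no condition passes for every source.
        if pattern is None or re.search(pattern, name):
            return scanner
    # No candidate passed: the source is rejected with an error.
    raise ValueError(f"no scanner candidate passed for {name!r}")
```

Because candidates are tried in order, a catch-all placed anywhere but last would shadow every candidate after it. Calling `select_scanner("data/notes.txt", CANDIDATES[:-1])` raises the error, which mirrors the rejection described above when no catch-all is configured.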
Fields
[].re_match_name
A regular expression to test against the name of each source of data fed into the scanner (filename or equivalent). If this pattern matches, the child scanner is selected.
Type: string
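As a rough illustration of how such a pattern behaves, the Python sketch below tests a few patterns against sample names. It assumes the pattern may match anywhere in the name unless anchored, which is why extension patterns usually end with `$`. Python's `re` module is used here only as a stand-in and is not the engine the scanner uses, and `name_matches` is a hypothetical helper.

```python
import re


def name_matches(pattern: str, name: str) -> bool:
    """Report whether `pattern` matches anywhere in `name` (unanchored search)."""
    return re.search(pattern, name) is not None


# The `$` anchor restricts the match to the end of the name.
print(name_matches(r"\.csv$", "data/report.csv"))         # True
print(name_matches(r"\.csv$", "data/report.csv.gz"))      # False

# An unescaped dot matches any character, so escape every literal dot.
print(name_matches(r"\.csv.gz$", "data/report.csvXgz"))   # True
print(name_matches(r"\.csv\.gz$", "data/report.csvXgz"))  # False
```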
[].scanner
The scanner to activate if this candidate passes.
Type: scanner
Examples
- Switch based on file name
In this example, a file input chooses a scanner based on the extension of each file:
input:
  file:
    paths: [ ./data/* ]
    scanner:
      switch:
        - re_match_name: '\.avro$'
          scanner: { avro: {} }
        - re_match_name: '\.csv$'
          scanner: { csv: {} }
        # Escape both dots so they only match a literal "."
        - re_match_name: '\.csv\.gz$'
          scanner:
            decompress:
              algorithm: gzip
              into:
                csv: {}
        - re_match_name: '\.tar$'
          scanner: { tar: {} }
        - re_match_name: '\.tar\.gz$'
          scanner:
            decompress:
              algorithm: gzip
              into:
                tar: {}
        # Catch-all: no condition, so it passes for every remaining source
        - scanner: { to_the_end: {} }