Skip to main content

Formulas

← Previous: Update Operations | Next: Aggregations →


Formula Syntax

Streams supports JavaScript-like expressions for calculated fields.


Formula Types

1. Column Reference (Non-Post-Processed)

Used in stream files during data read:

'#column.1#' + ' ' + '#column.2#'
#column.4# * #column.5#

Syntax: #column.N# where N is the column order

When: "isPostProcessed": false

Example:

{
"contentType": "formula",
"content": {
"value": "#column.4# * #column.5#",
"type": "double",
"isPostProcessed": false
}
}

2. Source Reference (Post-Processed)

Used in model files after aggregation:

count(orders[].id)
sum(orders[].total)
'source.table.column1' + ' - ' + 'source.table.column2'

When: "isPostProcessed": true

Example:

{
"contentType": "formula",
"content": {
"value": "count(orders[].id)",
"type": "integer",
"isPostProcessed": true
}
}

Available Functions

count()

Count non-null values:

count(orders[].id)          // Count order items
count(addresses[].city) // Count addresses

sum()

Sum numeric values:

sum(orders[].total)         // Total amount
sum(items[].quantity) // Total quantity

String Methods

'value'.toUpperCase()       // UPPERCASE
'value'.substring(0, 3) // Substring
'value'.length // Length

Operators

// Arithmetic
#column.1# + #column.2#
#column.1# - #column.2#
#column.1# * #column.2#
#column.1# / #column.2#

// String concatenation
'#column.1#' + ' ' + '#column.2#'

// Comparison (in future versions)
#column.1# > 100

Examples

Example 1: Total Price

{
"encapsulationKeyType": "element",
"encapsulationKey": "totalPrice",
"contentType": "formula",
"content": {
"value": "#column.4# * #column.5#",
"type": "double",
"isPostProcessed": false
}
}

Example 2: Full Name

{
"encapsulationKeyType": "element",
"encapsulationKey": "fullName",
"contentType": "formula",
"content": {
"value": "'#column.2#' + ' ' + '#column.3#'",
"type": "string",
"isPostProcessed": false
}
}

Example 3: Order Count (Post-Processed)

{
"encapsulationKeyType": "element",
"encapsulationKey": "orderCount",
"contentType": "formula",
"content": {
"value": "count(orders[].id)",
"type": "integer",
"isPostProcessed": true
}
}

Example 4: Total Revenue (Post-Processed)

{
"encapsulationKeyType": "element",
"encapsulationKey": "totalRevenue",
"contentType": "formula",
"content": {
"value": "sum(orders[].amount)",
"type": "double",
"isPostProcessed": true
}
}

isPostProcessed

ValueWhen EvaluatedCan ReferenceUse Case
falseDuring data read#column.N#Simple calculations on source columns
trueAfter aggregationArrays (arr[].field)Calculations on aggregated arrays

Best Practices

1. Quote Strings

Always quote string values:

'#column.1#' + ' ' + '#column.2#'
✗ #column.1# + ' ' + #column.2#

2. Specify Type

Always provide type hint:

"type": "integer"    // For count()
"type": "double" // For sum(), arithmetic
"type": "string" // For concatenation

3. Use Post-Processing for Arrays

Array operations require isPostProcessed: true:

{
"value": "count(items[].id)",
"isPostProcessed": true
}

4. Keep Formulas Simple

Complex logic should be in multiple fields:

✓ Good: totalPrice = quantity * price
discount = totalPrice * 0.1

✗ Bad: finalPrice = (quantity * price) - ((quantity * price) * 0.1)