Skip to main content

Data Transformations

Our data pipeline product provides two ways to create data transformation rules that clean retrieved data into the preferred format: Formulas or JavaScript functions

Formulas

You can perform various data transformations and manipulations by using our formulas. Starting from creating if conditions =IF(), replacing data =REPLACE(), concatenating data =CONCATENATE() to using our custom formulas to tell the system to remove specific rows REMOVE(). Our product offers various formulas. Some of them you might know from Microsoft Excel or Google Sheets.

Here are a few examples of formulas that you can use inside our pipeline product:

=CONCATENATE(firstName;" "; lastName)

=REPLACE(phoneNumber; 1; 0; "+49")

=IF(ERROR();REMOVE())

=IF(EMPTY(organisation);REMOVE())

=IF(OR(amount > 10; amount < 2); amount; 20)

=TEXT(closingDate;"dd.mm.yyyy")

These are just a few examples of the many formulas available in our tool. Using these, you can efficiently perform complex data manipulations and transformations.

Date & time

Function IDDescriptionSyntax
DATE Returns the specified date as the number of full days since today DATE(Year; Month; Day)
DATEDIF Calculates distance between two dates, in provided unit parameter (“D”, “M” or “Y”). DATEDIF(Date1; Date2; Units)
DATEVALUE Parses a date string and returns it as the number of full days since today DATEVALUE(Datestring)
DAY Returns the day of the given date value. DAY(Number)
DAYS Calculates the difference between two date values. DAYS(Date2; Date1)
HOUR Returns hour component of given time. HOUR(Time)
ISOWEEKNUM Returns an ISO week number that corresponds to the week of year. ISOWEEKNUM(Date)
MINUTE Returns minute component of given time. MINUTE(Time)
MONTH Returns the month for the given date value. MONTH(Number)
NOW Returns current date + time as a number of days since today. NOW()
TODAY Returns an integer representing the current date as the number of full days since today. TODAY()

Logical

Function ID Description Syntax
AND Returns true if all arguments are true. AND(Logicalvalue1; Logicalvalue2; ...; Logicalvalue30)
IF Specifies a logical test to be performed. If the test is true the second argument is returned. If it is false, the third argument is returned. The third argument is optional. IF(Test; Then value; Otherwisevalue)
ERROR Returns true if the output value contains an error. ERROR()
NOT Complements (inverts) a logical value. NOT(Logicalvalue)
SWITCH Evaluates a list of arguments, consisting of an expression followed by a value. SWITCH(Expression1; Value1[; Expression2; Value2[...; Expression_n; Value_n]])
OR Returns true if at least one argument is true. OR(Logicalvalue1; Logicalvalue2; ...; Logicalvalue30)
XOR Returns true if an odd number of arguments evaluates to true. XOR(Logicalvalue1; Logicalvalue2; ...; Logicalvalue30)

Math & trigonometry

Function ID Description Syntax
ABS Returns the absolute value of a number. ABS(Number)
CEILING Rounds a number up to the nearest multiple of Significance. CEILING(Number; Significance)
CEILING.PRECISE Rounds a number up to the nearest multiple of Significance. CEILING.PRECISE(Number[; Significance])
COMBIN Returns number of combinations (without repetitions). COMBIN(Number; Number)
COMBINA Returns number of combinations (with repetitions). COMBINA(Number; Number)
DECIMAL Converts text with characters from a number system to a positive integer in the base radix given. DECIMAL("Text"; Radix)
EVEN Rounds a positive number up to the next even integer and a negative number down to the next even integer. EVEN(Number)
EXP Returns constant e raised to the power of a number. EXP(Number)
FLOOR Rounds a number down to the nearest multiple of Significance. FLOOR(Number; Significance)
INT Rounds a number down to the nearest integer. INT(Number)
MOD Returns the remainder when one integer is divided by another. MOD(Dividend; Divisor)
POWER Returns a number raised to another number. POWER(Base; Exponent)
ROUND Rounds a number to a certain number of decimal places. ROUND(Number; Count)
ROUNDDOWN Rounds a number down, toward zero, to a certain precision. ROUNDDOWN(Number; Count)
ROUNDUP Rounds a number up, away from zero, to a certain precision. ROUNDUP(Number; Count)
SQRT Returns the positive square root of a number. SQRT(Number)
SQRTPI Returns sqrt of number times pi. SQRTPI(Number)
SUBTOTAL Computes aggregation using function specified by number. SUBTOTAL(Function; Number1; Number2; ...; Number30)
SUM Sums up the values of the specified cells. SUM(Number1; Number2; ...; Number30)

Operator

Function ID Description Syntax
ADD Adds two values. ADD(Number; Number)
CONCAT Concatenates two strings. CONCAT(String; String)
DIVIDE Divides two values. DIVIDE(Number; Number)
EQ Tests two values for equality. EQ(Value; Value)
LTE Tests two values for less-equal relation. LEQ(Value; Value)
LT Tests two values for less-than relation. LT(Value; Value)
GTE Tests two values for greater-equal relation. GEQ(Value; Value)
GT Tests two values for greater-than relation. GT(Value; Value)
MINUS Subtracts two values. MINUS(Number; Number)
MULTIPLY Multiplies two values. MULTIPLY(Number; Number)
UNARY_PERCENT Applies percent operator. UNARY_PERCENT(Number)

Text

Function ID Description Syntax
CHAR Converts a number into a character according to the current code table. CHAR(Number)
CLEAN Returns text that has been "cleaned" of line breaks and other non-printable characters. CLEAN("Text")
CONCATENATE Combines several text strings into one string. CONCATENATE("Text1"; ...; "Text30")
EXACT Returns true if both text strings are exactly the same. EXACT("Text"; "Text")
LEFT Extracts a given number of characters from the left side of a text string. LEFT("Text"; Number)
LEN Returns length of a given text. LEN("Text")
LOWER Returns text converted to lowercase. LOWER("Text")
MID Returns substring of a given length starting from Start_position. MID("Text"; Start_position; Length)
PROPER Capitalizes words given text string. PROPER("Text")
REPLACE Replaces substring of a text of a given length that starts at given position. REPLACE("Text"; Start_position; Length; "New_text")
REPT Repeats text a given number of times. REPT("Text"; Number)
SPLIT Divides the provided text using the space character as a separator and returns the substring at the zero-based position specified by the second argument. SPLIT("Lorem ipsum"; 0) -> "Lorem" – SPLIT("Lorem ipsum"; 1) -> "ipsum" SPLIT("Text"; Index)
TEXT Converts a number into text according to a given format. TEXT(Number; Format)
TRIM Strips extra spaces from text. TRIM("Text")
UPPER Returns text converted to uppercase. UPPER("Text")

Custom formulas

Function ID Description Syntax
REMOVE If called on a row, the row will be removed/ignored for the output. REMOVE()
EMPTY Returns true if the given input value is empty. EMPTY(column)
ERROR Returns true if the output value contains an error. ERROR()
VALUEAT Returns the value of the given column at the given index VALUEAT(column; index)

Best Practices

Mathematical Operators

As you can see above, you can use all the mathematical operators. But instead of using ADD(), MINUS(), DIVIDE() and MULTIPLY() we recommend you to use +, -, / and *. This saves you time when creating formulas.

For example, you would create a formula for a multiplication like =amount*costs instead of =MULTIPLY(amount; costs).

Delimiter (";" instead of ",")

If you are used to Excel, you probably use "," as the delimiter for formulas with more than one argument. When you work inside our data pipeline tool, you must adjust because our syntax prefers to use ";" as a delimiter, similar to Google Sheets.

For example, you would create a formula for an if-condition like =IF(amount > 1000; REMOVE()) instead of =IF(amount > 1000, REMOVE()).

Custom formula: REMOVE()

With the custom formula REMOVE(), you can tell our system when to ignore/discard a row for the pipeline output. Every row marked as "Removed" will not be exported or arrive at the defined output target. TIP: We recommend using REMOVE() with if-conditions to remove the unwanted rows selectively.

Custom formula: EMPTY()

With the custom formula EMPTY(), you can check whether an input value is empty. If the value equals empty, the formula returns true. If not, the formula returns false. TIP: EMPTY() can be combined with IF() and REMOVE() to selectively remove the rows whose input is empty (=IF(EMPTY(column); REMOVE())).

Custom formula: ERROR()

With the custom formula ERROR(), you can check whether a row's output throws an error regarding the validations. If the output throws an error, the formula returns true. If not, the formula returns false. TIP: ERROR() can be combined with IF() and REMOVE() to selectively remove the rows whose output throws an error (=IF(ERROR(); REMOVE())).

Compare operators

If you want to compare specfic values, you can use the following operators: =, >, <, >= or <=

For example, if you want to check if the amount is greater than 1000, the formula would be: =IF(amount > 1000; Then value; Otherwisevalue)

Javascript

JavaScript offers a wide range of functionality, including mathematical operations, string manipulation, conditional statements, loops, and more. By using JavaScript in your application, users can perform various data transformations, mappings, validations, and cleanings.

Here are a few examples of common JavaScript functions that your users might find useful:

Mathematical Operations

Addition: var x = 5 + 3;
Subtraction: var x = 5 - 3;
Multiplication: var x = 5 * 3;
Division: var x = 5 / 3;
Modulus: var x = 5 % 3;

String Manipulation

Concatenation: var name = "John" + " " + "Doe";
Length: var length = name.length;
Substring: var sub = name.substring(0, 4);
Replace: var replaced = name.replace("John", "Jane");

Conditional Statements

var age = 25;
if (age >= 18) {
return "You are an adult";
} else {
return "You are not an adult";
}

Loops

​​for (var i = 0; i < 5; i++) {
return i;
}

var numbers = [1, 2, 3, 4, 5];
numbers.forEach(function(number) {
return number;
});

Async

async (row, rowIndex) => {
try {
const response = await fetch("https://api.example.com/data");
const data = await response.json();
return data;
} catch (error) {
return row.data;
}
};

These are just a few examples of what your users can do with JavaScript. The possibilities are endless, and your users can combine these functions and add their own custom logic to perform even more complex data transformations, mappings, validations, and cleanings. For more information, check out the JavaScript documentation.

Global States

Central state management can be achieved through the use of global states. These states enable you to execute an asynchronous function once and save the output for future use across multiple columns. As a user, you can create up to 5 global states for each pipeline. The data returned from these states can then be accessed in every column mapping, making it convenient for data validation and cleaning purposes.

Global states are especially useful in scenarios where you need to make a single API request and use the results to validate input data for multiple rows and columns. By using global states, you can avoid making multiple API requests and reduce the overall processing time.

info

The memory capacity for each global state is limited to 1mb.

Best Practices

HTTP 429 Too Many Requests

The HTTP 429 Too Many Requests response status code is a HTTP standard response code indicating that the user has sent too many requests in a specified time frame, also known as "rate limiting." This status code is used to protect the server from being overwhelmed with too many requests.

To avoid hitting the rate limit, it's important to use global states to store the fetched data. For example, if you want to retrieve the data from the database, we suggest to use global states to make the async call:

async () => {
const response = await fetch(`https://api.example.com/data/${row.id}`);
const data = await response.json();
return data;
};

The data stored inside the global state can be accessed across column.