Transform Nodes

Transform nodes are the workhorses of your workflows. They sit between your data sources and outputs, processing and reshaping data as it flows through. FactoryThread provides 14 transform functions covering filtering, mapping, aggregating, joining, splitting, and converting data.

Transform Function Categories

Transform nodes fall into six categories based on what they do to your data:

| Category | Functions | Purpose |
|----------|-----------|---------|
| Row Selection | Filter, Condition, Divide | Control which rows pass through |
| Column Operations | Map, Transform, Convert | Add, remove, rename, or modify columns |
| Combining Data | Merge, Append, Lookup | Join or stack data from multiple sources |
| Aggregation | Group By, Distinct | Summarize or deduplicate rows |
| Row Expansion and Ordering | Split, Sort | Create additional rows or reorder data |
| AI-Powered | AI Classify | Categorize text with AI |

Quick Reference

Before diving into each function, here's a quick decision guide:

| I want to... | Use this function |
|--------------|-------------------|
| Keep only rows matching criteria | Filter |
| Route rows to different paths | Divide or Condition |
| Remove or rename columns | Map |
| Add calculated fields | Transform |
| Change data types | Convert |
| Join two datasets on a key | Merge |
| Combine datasets vertically | Append |
| Enrich data from a reference table | Lookup |
| Calculate totals, averages, counts | Group By |
| Remove duplicate rows | Distinct |
| Expand delimited values to rows | Split |
| Reorder rows | Sort |
| Categorize text with AI | AI Classify |

Row Selection Functions

These functions control which rows continue through your workflow.

Filter

Purpose: Include or exclude rows based on conditions.

Filter is one of the most commonly used transforms. It evaluates each row against conditions you define and only passes through rows that match. Rows that don't match are discarded.

Key capabilities:

  • 14 comparison operators (equals, contains, greater than, is null, etc.)
  • AND/OR logic for combining conditions
  • Nested condition groups for complex rules
  • Case-sensitive or insensitive text matching

Common uses:

  • Remove invalid or incomplete records
  • Select data for a specific date range
  • Filter by status, category, or region
  • Exclude test or sample data

Example: Keep only orders from the last 30 days with status "Complete":

OrderDate >= [30 days ago]
AND Status equals "Complete"
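The same logic can be sketched in plain Python (this is an illustrative equivalent, not FactoryThread's engine; the data, field names, and fixed run date are made up):

```python
from datetime import datetime, timedelta

orders = [
    {"OrderID": 1, "OrderDate": datetime(2024, 1, 10), "Status": "Complete"},
    {"OrderID": 2, "OrderDate": datetime(2023, 6, 1),  "Status": "Complete"},
    {"OrderID": 3, "OrderDate": datetime(2024, 1, 12), "Status": "Pending"},
]

# "30 days ago" relative to a fixed run date, for reproducibility
cutoff = datetime(2024, 1, 15) - timedelta(days=30)

# Filter: keep rows matching BOTH conditions (AND logic); non-matches are discarded
kept = [r for r in orders if r["OrderDate"] >= cutoff and r["Status"] == "Complete"]
```

Only order 1 survives: order 2 is too old, and order 3 fails the status condition.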

📖 Full Filter Documentation — Complete guide with all operators and examples.


Divide

Purpose: Split data into multiple parallel branches based on conditions.

Divide works like a multi-way switch. Each row is evaluated and routed to one of several output branches based on which condition it matches. Unlike Filter, Divide doesn't discard data—it directs it to different processing paths.

Key capabilities:

  • Define 2-10 output branches
  • First-match or all-match routing
  • Default branch for unmatched rows
  • Each branch becomes a separate workflow path

Common uses:

  • Different processing for different regions
  • Separate success and error records
  • Priority-based routing
  • Type-specific transformations

Example: Route orders by priority:

Branch 1: Priority equals "Urgent" → Fast-track processing
Branch 2: Priority equals "Normal" → Standard processing
Default: → Review queue
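In plain Python, first-match routing with a default branch looks like an if/elif chain (the branch names and data are illustrative):

```python
orders = [
    {"OrderID": 1, "Priority": "Urgent"},
    {"OrderID": 2, "Priority": "Normal"},
    {"OrderID": 3, "Priority": "Low"},
]

branches = {"fast_track": [], "standard": [], "review_queue": []}

# Divide: every row is routed somewhere; unlike Filter, nothing is discarded
for row in orders:
    if row["Priority"] == "Urgent":
        branches["fast_track"].append(row)
    elif row["Priority"] == "Normal":
        branches["standard"].append(row)
    else:
        branches["review_queue"].append(row)  # default branch for unmatched rows
```

Note that the total row count across all branches equals the input row count.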

📖 Full Divide Documentation — All branching patterns and configuration options.


Condition

Purpose: Add a flag column based on whether conditions are met.

Condition adds a new boolean column to your data indicating whether each row matches your criteria. Unlike Filter (which removes rows) or Divide (which routes rows), Condition keeps all rows and annotates them.

Key capabilities:

  • Adds true/false column to every row
  • Same condition builder as Filter
  • Custom column naming
  • Chain multiple conditions for complex flags

Common uses:

  • Flag records for review
  • Mark rows as valid/invalid
  • Identify exceptions
  • Pre-compute filter criteria

Example: Add "IsHighValue" column:

IF Amount > 10000 THEN true ELSE false
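A plain-Python sketch of the same annotation (illustrative data; the product's condition builder generates this logic for you):

```python
orders = [
    {"OrderID": 1, "Amount": 25000},
    {"OrderID": 2, "Amount": 500},
]

# Condition: add a boolean flag column to every row; no rows are removed
for row in orders:
    row["IsHighValue"] = row["Amount"] > 10000
```

Both rows survive; a downstream Filter on `IsHighValue` could then use the pre-computed flag.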

📖 Full Condition Documentation — Pattern examples and use cases.


Column Operations

These functions modify the columns in your data.

Map

Purpose: Select, rename, and reorder columns.

Map controls which columns appear in your output and what they're named. It's essential for cleaning up data, standardizing column names, and preparing data for destinations with specific schema requirements.

Key capabilities:

  • Include/exclude specific columns
  • Rename columns
  • Reorder columns
  • Bulk select/deselect

Common uses:

  • Remove unnecessary columns before export
  • Rename columns to match destination schema
  • Prepare data for merge operations (key alignment)
  • Simplify data for downstream processing

Example: Rename and select columns:

customer_name → CustomerName (include)
customer_id → CustomerID (include)
internal_notes → (exclude)
created_timestamp → CreatedDate (include)
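The same select-and-rename can be sketched as a dict comprehension driven by an old-name-to-new-name mapping (illustrative data):

```python
source = [{
    "customer_name": "Ada",
    "customer_id": 42,
    "internal_notes": "do not export",
    "created_timestamp": "2024-01-15",
}]

# Map: columns listed here are included (and renamed); anything else is dropped
mapping = {
    "customer_name": "CustomerName",
    "customer_id": "CustomerID",
    "created_timestamp": "CreatedDate",
}
mapped = [{new: row[old] for old, new in mapping.items()} for row in source]
```

`internal_notes` never appears in the output because it has no entry in the mapping.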

📖 Full Map Documentation — Column management patterns.


Transform

Purpose: Create new calculated columns using expressions.

Transform is the most flexible function. It lets you create new columns by applying formulas, combining existing fields, performing calculations, and using built-in functions. Think of it as adding computed columns to a spreadsheet.

Key capabilities:

  • 80+ built-in functions (text, math, date, logic)
  • Field references with ${FieldName} syntax
  • Visual expression builder
  • Multiple transforms in one node

Function categories:

  • Text: CONCAT, UPPER, LOWER, TRIM, SUBSTRING, REPLACE, SPLIT
  • Math: +, -, *, /, ABS, ROUND, FLOOR, CEILING, MOD
  • Date: YEAR, MONTH, DAY, DATEDIFF, DATEADD, NOW, TODAY
  • Logic: IF, CASE, COALESCE, ISNULL, NULLIF
  • Conversion: CAST, TODATE, TONUMBER, TOTEXT

Example: Create full address:

CONCAT(${Street}, ", ", ${City}, ", ", ${State}, " ", ${ZipCode})
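The CONCAT expression above corresponds to string formatting over the row's fields; a plain-Python sketch with made-up data:

```python
rows = [{"Street": "1 Main St", "City": "Springfield", "State": "IL", "ZipCode": "62701"}]

# Transform: add a computed column built from existing fields (CONCAT equivalent)
for r in rows:
    r["FullAddress"] = f'{r["Street"]}, {r["City"]}, {r["State"]} {r["ZipCode"]}'
```

As in the node, the source columns are untouched; the calculation only adds a new column.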

📖 Full Transform Documentation — Complete function reference and expression guide.


Convert

Purpose: Change column data types.

Convert changes the data type of columns—turning text to numbers, strings to dates, numbers to booleans, and more. It's essential when your source data has incorrect types or when preparing data for destinations with strict type requirements.

Key capabilities:

  • Text ↔ Number conversions
  • Text → Date with format patterns
  • Text ↔ Boolean with custom mappings
  • Number formatting and precision control
  • Error handling options

Supported conversions:

| From | To | Example |
|------|-----|---------|
| Text | Number | "123.45" → 123.45 |
| Text | Date | "01/15/2024" → 2024-01-15 |
| Text | Boolean | "Yes" → true |
| Number | Text | 1234.5 → "1,234.50" |
| Date | Text | 2024-01-15 → "January 15, 2024" |
| Boolean | Number | true → 1 |
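Three of these conversions sketched in plain Python (the format pattern and the text-to-boolean mapping are illustrative choices, configured per column in the node):

```python
from datetime import datetime

row = {"Price": "123.45", "OrderDate": "01/15/2024", "Active": "Yes"}

# Convert: parse text into typed values; a format pattern drives the date parse
converted = {
    "Price": float(row["Price"]),
    "OrderDate": datetime.strptime(row["OrderDate"], "%m/%d/%Y").date(),
    "Active": row["Active"].lower() in ("yes", "true", "1"),  # custom text→boolean mapping
}
```

A real pipeline would also need the node's error-handling options for values that fail to parse (for example, `float("N/A")` raises).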

📖 Full Convert Documentation — All type conversions and format patterns.


Combining Data

These functions merge data from multiple sources.

Merge

Purpose: Join two datasets based on matching key columns.

Merge combines rows from two inputs where specified columns match—like a SQL JOIN or Excel VLOOKUP. It's essential for enriching data from one source with information from another.

Key capabilities:

  • Inner, Left, Right, and Full join types
  • Single or composite keys
  • Handle duplicate matches
  • Column conflict resolution

Join types explained:

| Type | Result |
|------|--------|
| Inner | Only rows that exist in both datasets |
| Left | All rows from left + matching from right |
| Right | Matching from left + all rows from right |
| Full | All rows from both (nulls where no match) |

Example: Join orders with customer details:

Left: Orders (OrderID, CustomerID, Amount)
Right: Customers (CustomerID, Name, Email)
Join: Orders.CustomerID = Customers.CustomerID
Type: Left (keep all orders)
Result: OrderID, CustomerID, Amount, Name, Email
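A left join over this data can be sketched in plain Python: index the right side by key, then keep every left row and fill in nulls where no match exists (illustrative data):

```python
orders = [
    {"OrderID": 1, "CustomerID": "C1", "Amount": 100},
    {"OrderID": 2, "CustomerID": "C9", "Amount": 250},  # no matching customer
]
customers = [{"CustomerID": "C1", "Name": "Ada", "Email": "ada@example.com"}]

# Left join: keep all orders; customer columns become None when no match is found
by_key = {c["CustomerID"]: c for c in customers}
joined = [
    {**o,
     "Name": by_key.get(o["CustomerID"], {}).get("Name"),
     "Email": by_key.get(o["CustomerID"], {}).get("Email")}
    for o in orders
]
```

An inner join would instead drop order 2 entirely, since "C9" has no match.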

📖 Full Merge Documentation — Join strategies and performance tips.


Append

Purpose: Stack data from multiple sources vertically.

Append combines rows from two or more inputs into a single output, adding each input's rows after the previous one's. Unlike Merge (which joins columns), Append stacks rows. It's like SQL UNION ALL: duplicates are kept unless you enable the deduplicate option.

Key capabilities:

  • Combine rows from any number of sources
  • Column matching by name or position
  • Handle schema mismatches
  • Deduplicate option

When to use:

  • Combining monthly data files
  • Merging data from similar sources
  • Aggregating regional datasets
  • Stacking before/after comparisons

Example: Combine Q1-Q4 sales data:

Input 1: Q1_Sales (1,000 rows)
Input 2: Q2_Sales (1,200 rows)
Input 3: Q3_Sales (950 rows)
Input 4: Q4_Sales (1,100 rows)
Output: AllSales (4,250 rows)
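In plain Python, stacking is just list concatenation, and the output row count is the sum of the inputs (tiny illustrative lists standing in for the quarterly files):

```python
q1 = [{"Quarter": "Q1", "Sales": 100}]
q2 = [{"Quarter": "Q2", "Sales": 120}]
q3 = [{"Quarter": "Q3", "Sales": 95}]
q4 = [{"Quarter": "Q4", "Sales": 110}]

# Append: stack rows vertically (UNION ALL semantics); columns must line up by name
all_sales = q1 + q2 + q3 + q4
```

This sketch assumes identical schemas; the node's column matching and mismatch handling cover the cases where the inputs differ.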

📖 Full Append Documentation — Schema handling and patterns.


Lookup

Purpose: Enrich data by looking up values from a reference table.

Lookup adds columns to your data by finding matching rows in a reference dataset. It's optimized for the common pattern of enriching a main dataset with reference data—like looking up product prices, customer details, or code descriptions.

Key capabilities:

  • Single or composite lookup keys
  • Multiple return columns
  • Configurable no-match behavior
  • Caching for performance

Difference from Merge:

  • Lookup is optimized for reference data enrichment
  • Lookup handles the common "one main, one reference" pattern
  • Merge is more flexible for complex join scenarios

Example: Look up product prices:

Main: OrderLines (ProductCode, Quantity)
Lookup: Products (Code, Price, Description)
Match: OrderLines.ProductCode = Products.Code
Return: Price, Description
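The "one main, one reference" pattern sketched in plain Python: build the reference index once (the analogue of the node's cache), then enrich each main row (illustrative data):

```python
order_lines = [
    {"ProductCode": "A1", "Quantity": 3},
    {"ProductCode": "B2", "Quantity": 1},
]
products = [
    {"Code": "A1", "Price": 9.99, "Description": "Widget"},
    {"Code": "B2", "Price": 4.50, "Description": "Gadget"},
]

# Lookup: index the reference data once, then add the return columns per row
index = {p["Code"]: p for p in products}
for line in order_lines:
    ref = index[line["ProductCode"]]
    line["Price"] = ref["Price"]
    line["Description"] = ref["Description"]
```

A production version would use the node's no-match behavior instead of letting a missing key raise, e.g. `index.get(code)` with a default.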

📖 Full Lookup Documentation — Optimization and caching strategies.


Aggregation

These functions summarize or deduplicate your data.

Group By

Purpose: Aggregate rows by grouping columns.

Group By combines multiple rows into summary rows based on grouping keys. It calculates aggregate values like sums, counts, averages, and min/max for each group. The output has one row per unique combination of grouping columns.

Key capabilities:

  • Multiple grouping columns
  • 7 aggregate functions (SUM, COUNT, AVG, MIN, MAX, FIRST, LAST)
  • Multiple aggregations per group
  • Null handling options

Aggregate functions:

| Function | Description | Works With |
|----------|-------------|------------|
| SUM | Add all values | Numbers |
| COUNT | Count rows | Any |
| AVG | Calculate mean | Numbers |
| MIN | Find smallest | Numbers, Dates, Text |
| MAX | Find largest | Numbers, Dates, Text |
| FIRST | First value in group | Any |
| LAST | Last value in group | Any |

Example: Sales by region and product:

Group by: Region, ProductCategory
SUM: SalesAmount → TotalSales
COUNT: OrderID → OrderCount
AVG: SalesAmount → AvgOrderValue
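This aggregation sketched in plain Python: bucket rows by the grouping key, then compute one summary row per bucket (illustrative data):

```python
from collections import defaultdict

sales = [
    {"Region": "East", "ProductCategory": "Toys", "SalesAmount": 100, "OrderID": 1},
    {"Region": "East", "ProductCategory": "Toys", "SalesAmount": 300, "OrderID": 2},
    {"Region": "West", "ProductCategory": "Toys", "SalesAmount": 50,  "OrderID": 3},
]

# Group By: one output row per unique (Region, ProductCategory) combination
groups = defaultdict(list)
for row in sales:
    groups[(row["Region"], row["ProductCategory"])].append(row)

summary = [
    {
        "Region": region,
        "ProductCategory": category,
        "TotalSales": sum(r["SalesAmount"] for r in rows),                 # SUM
        "OrderCount": len(rows),                                           # COUNT
        "AvgOrderValue": sum(r["SalesAmount"] for r in rows) / len(rows),  # AVG
    }
    for (region, category), rows in groups.items()
]
```

Three input rows collapse to two summary rows, one per unique group.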

📖 Full Group By Documentation — Aggregation patterns and examples.


Distinct

Purpose: Remove duplicate rows.

Distinct keeps only unique rows based on all or selected columns. When duplicates are found, it keeps the first occurrence and discards the rest.

Key capabilities:

  • Compare all columns or selected subset
  • Keep first or last occurrence
  • Count duplicates removed
  • Preserve original order

Common uses:

  • Deduplicate after merging data
  • Get unique values from a column
  • Clean up dirty data
  • Prepare data for loading

Example: Remove duplicate customer records:

Distinct on: Email, Phone
Keep: First occurrence
Result: One row per unique email+phone combination
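Keep-first deduplication on a selected column subset, sketched in plain Python (illustrative data):

```python
customers = [
    {"Name": "Ada",    "Email": "a@x.com", "Phone": "555-1"},
    {"Name": "Ada L.", "Email": "a@x.com", "Phone": "555-1"},  # duplicate on Email+Phone
    {"Name": "Bob",    "Email": "b@x.com", "Phone": "555-2"},
]

# Distinct on (Email, Phone): keep the first occurrence, preserve original order
seen = set()
unique = []
for row in customers:
    key = (row["Email"], row["Phone"])
    if key not in seen:
        seen.add(key)
        unique.append(row)
```

Note that the second "Ada L." row is dropped even though its Name differs, because only the selected columns are compared.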

📖 Full Distinct Documentation — Deduplication strategies.


Row Expansion and Ordering

These functions change the number or order of rows.

Split

Purpose: Expand rows by splitting delimited values.

Split takes a column containing delimited values (like "a,b,c") and creates multiple rows—one for each value. All other columns are duplicated for each new row. In that sense it's the inverse of an aggregation: it increases the row count instead of reducing it.

Key capabilities:

  • Custom delimiter support
  • Split to array elements
  • Preserve original row option
  • Handle empty values

Example: Expand product tags:

Before:
| ProductID | Tags |
|-----------|-------------------|
| 1 | "red,blue,green" |

After (split on comma):
| ProductID | Tag |
|-----------|-------|
| 1 | red |
| 1 | blue |
| 1 | green |
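The expansion above sketched as a nested comprehension in plain Python, duplicating the other columns for each delimited value:

```python
products = [{"ProductID": 1, "Tags": "red,blue,green"}]

# Split: one output row per delimited value; ProductID is duplicated onto each
expanded = [
    {"ProductID": p["ProductID"], "Tag": tag}
    for p in products
    for tag in p["Tags"].split(",")
]
```

One input row with three tags becomes three output rows.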

📖 Full Split Documentation — Delimiter patterns and edge cases.


Sort

Purpose: Reorder rows by column values.

Sort arranges rows in ascending or descending order based on one or more columns. Multi-column sorting applies each sort in sequence.

Key capabilities:

  • Ascending or descending order
  • Multiple sort columns
  • Null handling (first or last)
  • Stable sort (preserves original order for ties)

Example: Sort orders by date (newest first), then by amount:

Sort 1: OrderDate (Descending)
Sort 2: Amount (Descending)
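A plain-Python sketch of this multi-column sort; since both keys are descending, a single tuple key with `reverse=True` suffices (mixed directions would need per-key handling):

```python
from datetime import date

orders = [
    {"OrderDate": date(2024, 1, 10), "Amount": 50},
    {"OrderDate": date(2024, 1, 12), "Amount": 30},
    {"OrderDate": date(2024, 1, 12), "Amount": 90},
]

# Multi-column sort: newest date first, then largest amount first within a date
ordered = sorted(orders, key=lambda r: (r["OrderDate"], r["Amount"]), reverse=True)
```

Python's `sorted` is stable, matching the node's tie-preserving behavior.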

📖 Full Sort Documentation — Multi-column sorting and performance.


AI-Powered Transforms

AI Classify

Purpose: Categorize text data using AI.

AI Classify uses large language models (via your AI connection) to categorize text values into predefined categories. It's ideal for classifying unstructured text that would be difficult to categorize with rules.

Key capabilities:

  • Connect to OpenAI or Azure OpenAI
  • Define custom categories with descriptions
  • Confidence scores
  • Batch processing for efficiency

Common uses:

  • Classify support tickets by type
  • Categorize customer feedback sentiment
  • Route documents by topic
  • Tag products by description

Example: Classify support tickets:

Categories:
- "Bug Report" - Customer reporting a software bug or error
- "Feature Request" - Request for new functionality
- "Question" - General inquiry about product usage
- "Complaint" - Expression of dissatisfaction

Input: TicketDescription
Output: TicketCategory, ConfidenceScore
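FactoryThread's configuration is declarative, so the prompt it sends is not exposed; as a rough illustration of how categories with descriptions might be assembled into a single-label classification prompt, here is a hypothetical helper (the function name and prompt wording are assumptions, not the product's implementation):

```python
categories = {
    "Bug Report": "Customer reporting a software bug or error",
    "Feature Request": "Request for new functionality",
    "Question": "General inquiry about product usage",
    "Complaint": "Expression of dissatisfaction",
}

def build_classification_prompt(text: str, categories: dict) -> str:
    """Assemble a single-label classification prompt (hypothetical helper)."""
    lines = [f'- "{name}": {desc}' for name, desc in categories.items()]
    return (
        "Classify the text into exactly one category.\n"
        "Categories:\n" + "\n".join(lines) +
        f"\n\nText: {text}\nAnswer with the category name only."
    )

prompt = build_classification_prompt("The app crashes on login.", categories)
```

The resulting prompt would then be sent to the configured AI connection, and the model's answer mapped back to the TicketCategory column.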

Choosing the Right Transform

Decision Tree

Do you need to reduce the number of rows?
├── Yes → Are you removing rows entirely?
│ ├── Yes → Use Filter
│ └── No (routing) → Use Divide
└── No → Continue...

Do you need to change columns?
├── Add new calculated columns → Use Transform
├── Rename or remove columns → Use Map
├── Change data types → Use Convert
└── No → Continue...

Do you need to combine data?
├── From the same source (stacking) → Use Append
├── From different sources (joining) → Use Merge or Lookup
└── No → Continue...

Do you need to aggregate?
├── Summarize with totals/counts → Use Group By
├── Just remove duplicates → Use Distinct
└── No → Continue...

Do you need to expand rows?
├── Split delimited values → Use Split
└── No → Continue...

Do you need to reorder rows?
└── Use Sort

Common Patterns

Pattern 1: Data Cleanup Pipeline

Entity → Filter (remove invalids) → Map (select columns) → Convert (fix types) → Transform (add calculations)

Pattern 2: Data Enrichment

Main Entity → Merge (join reference data) → Transform (calculate totals) → Map (select final columns)

Pattern 3: Aggregation Report

Entity → Filter (date range) → Group By (aggregate) → Sort (order results) → Response

Pattern 4: Multi-Source Consolidation

Entity A ─┐
Entity B ─┼→ Append → Distinct (dedupe) → Transform → Output
Entity C ─┘

Transform Performance Tips

Order Matters

The sequence of your transforms affects performance:

  1. Filter first — Remove unwanted rows before heavy processing
  2. Map early — Drop unnecessary columns to reduce data size
  3. Aggregate before output — Summarize to reduce row counts
  4. Transform last — Calculate fields on the filtered/reduced dataset

Memory Considerations

Some transforms hold data in memory:

| Transform | Memory Usage |
|-----------|--------------|
| Filter, Map, Transform, Convert | Low (stream) |
| Sort, Group By, Distinct | Moderate (buffer) |
| Merge, Lookup | Higher (one dataset in memory) |
| Append | Low (stream) |

For very large datasets:

  • Filter early to reduce data volume
  • Use database-level filtering when possible
  • Consider chunking with Divide

Next Steps