Split Function
The Split function expands a single row into multiple rows by splitting a field that contains multiple values. It's essential for normalizing denormalized data where multiple values are stored in a single field.
How It Works
Split takes a field containing delimited values and:
- Parses the delimited string into individual values
- Creates one output row for each value
- Copies all other fields to each new row
- Replaces the original field with single values
Example:
Input: 1 row with Tags = "red,blue,green"
Output: 3 rows with Tag = "red", "blue", "green"
Adding a Split
- Drag Split from the Functions section of the Element Panel
- Connect it to your data source
- Click the Split node to configure split settings
Configuration Panel
Field to Split
Select the field containing delimited values:
Dropdown: Shows all available text fields from upstream data.
Field requirements:
- Must be text/string type
- Should contain delimited values
- Empty or null fields handled gracefully
Delimiter
Specify the character(s) separating values:
Common delimiters:
| Delimiter | Character | Example Data |
|---|---|---|
| Comma | , | "red,blue,green" |
| Semicolon | ; | "A;B;C" |
| Pipe | | | "val1|val2|val3" |
| Tab | \t | "col1 col2 col3" |
| Newline | \n | "line1\nline2\nline3" |
| Space | | "word1 word2 word3" |
Custom delimiter: Enter any string:
" - " for "A - B - C"
"::" for "part1::part2::part3"
" and " for "apples and oranges and bananas"
Output Field Name
Name for the field containing individual values:
Default: Same as source field name
Custom name:
- Rename to indicate singular form
- Example: "Tags" → "Tag"
- Example: "Categories" → "Category"
Trim Values
Remove whitespace from split values:
Enabled (recommended):
"red, blue, green" → "red", "blue", "green"
Disabled:
"red, blue, green" → "red", " blue", " green"
Handle Empty Values
What to do when split produces empty values:
Skip empty (default):
"A,,B" → "A", "B" (2 rows)
Keep empty:
"A,,B" → "A", "", "B" (3 rows)
Keep Original
Optionally retain the original unsplit field:
Keep original: Yes
Output has: Tag (split value) AND Tags (original "red,blue,green")
Keep original: No (default)
Output only has: Tag (split value)
Row Expansion
Understanding Row Multiplication
Split multiplies rows:
Input:
| ID | Name | Tags |
|---|---|---|
| 1 | Item A | red,blue |
| 2 | Item B | green |
| 3 | Item C | red,blue,yellow |
Output (6 rows):
| ID | Name | Tag |
|---|---|---|
| 1 | Item A | red |
| 1 | Item A | blue |
| 2 | Item B | green |
| 3 | Item C | red |
| 3 | Item C | blue |
| 3 | Item C | yellow |
Row Count Impact
New row count = sum of values per row
Before: 100 rows, average 3 values each After: ~300 rows
Plan downstream processing for increased volume.
Preserving Row Identity
All non-split fields are duplicated:
- Primary keys appear multiple times
- Other fields copied to each new row
- Consider adding a sequence number if needed
Common Use Cases
Normalizing Tags/Categories
Scenario: Products with comma-separated tags
Input:
| ProductID | ProductName | Tags |
|---|---|---|
| P001 | Widget | electronics,gadget,sale |
After Split:
| ProductID | ProductName | Tag |
|---|---|---|
| P001 | Widget | electronics |
| P001 | Widget | gadget |
| P001 | Widget | sale |
Expanding Multi-Select Fields
Scenario: Form submissions with multiple selections
Input:
| ResponseID | Question | Answers |
|---|---|---|
| R001 | Interests | Sports;Music;Travel |
After Split:
| ResponseID | Question | Answer |
|---|---|---|
| R001 | Interests | Sports |
| R001 | Interests | Music |
| R001 | Interests | Travel |
Processing Recipients
Scenario: Emails with multiple recipients
Input:
| EmailID | Subject | Recipients |
|---|---|---|
| E001 | Meeting | a@co.com;b@co.com;c@co.com |
After Split:
| EmailID | Subject | Recipient |
|---|---|---|
| E001 | Meeting | a@co.com |
| E001 | Meeting | b@co.com |
| E001 | Meeting | c@co.com |
Breaking Apart Composite Keys
Scenario: Compound identifiers
Input:
| CompositeKey | Value |
|---|---|
| US-2024-001 | 100 |
Configuration: Split on "-"
After Split:
| CompositeKey | Value | KeyPart |
|---|---|---|
| US-2024-001 | 100 | US |
| US-2024-001 | 100 | 2024 |
| US-2024-001 | 100 | 001 |
Note: Often better to use Transform with SUBSTRING for structured keys.
Hierarchical Path Expansion
Scenario: File paths or hierarchies
Input:
| FilePath |
|---|
| /home/user/docs/file.txt |
Configuration: Split on "/"
After Split:
| FilePath | PathPart |
|---|---|
| /home/user/docs/file.txt | home |
| /home/user/docs/file.txt | user |
| /home/user/docs/file.txt | docs |
| /home/user/docs/file.txt | file.txt |
Working with Arrays
JSON Arrays
For JSON array fields, Split can expand:
Input:
| ID | Items |
|---|---|
| 1 | ["A","B","C"] |
Configuration:
- Parse JSON first if needed
- Split on array elements
Nested Arrays
For complex nested data:
- Extract the array field
- Apply Split
- Further process as needed
Combining with Other Functions
Split + Group By
Normalize then aggregate:
[Products with Tags] → [Split: Tags] → [GroupBy: Tag, COUNT ProductID]
Result: Count of products per tag
Split + Distinct
Get unique values from all rows:
[Data with Categories] → [Split: Categories] → [Distinct: Category]
Result: Unique list of all categories used
Split + Filter
Filter after splitting:
[Data] → [Split: Tags] → [Filter: Tag = "featured"]
Result: Only rows where "featured" was one of the tags
Split + Merge
Enrich split values:
[Orders with ProductCodes] → [Split: ProductCodes] → [Merge: Products on Code]
Result: Separate row for each product with full product details
Handling Edge Cases
Empty Field
When split field is empty or null:
- Default: Row is kept with null/empty value
- Skip empty: Row might be excluded
Single Value
When field contains no delimiter:
- Row passes through unchanged
- Single value in output field
Delimiter at Start/End
Input: ",A,B,C,"
Skip empty: "A", "B", "C"
Keep empty: "", "A", "B", "C", ""
Consecutive Delimiters
Input: "A,,B"
Skip empty: "A", "B"
Keep empty: "A", "", "B"
Delimiter in Value
If your data might contain the delimiter:
Problem:
Address: "123 Main St, Apt 4, City, State"
Split on comma: Wrong results
Solutions:
- Use different delimiter in source data
- Use more specific delimiter (e.g., " | ")
- Pre-process to escape or replace problematic instances
Performance Considerations
Row Explosion
Split can dramatically increase row count:
- 1,000 rows × 10 values = 10,000 rows
- Plan for expanded data volume
- Filter before split when possible
Memory Usage
Large splits consume memory:
- Each new row requires memory
- Consider batch processing for very large datasets
Downstream Impact
More rows means:
- Longer processing time
- More data to merge/join
- Larger output files
Optimization:
[Filter to relevant rows] → [Split] → [Continue]
vs.
[Split all data] → [Filter] (processes more rows)
Troubleshooting
No Rows Created
Symptoms: Output has same row count as input
Causes:
- Wrong delimiter specified
- Field doesn't contain the delimiter
- Wrong field selected
Solutions:
- Preview source data
- Check actual delimiter in data
- Verify field selection
Too Many Rows
Symptoms: Unexpected row explosion
Causes:
- Delimiter appears more than expected
- Delimiter found in regular content
Solutions:
- Use more specific delimiter
- Pre-process data to standardize
- Filter before split
Whitespace Issues
Symptoms: Values have leading/trailing spaces
Causes:
- Trim option disabled
- Source data has inconsistent spacing
Solutions:
- Enable trim option
- Add Transform after split:
TRIM({Field})
Missing Values
Symptoms: Some values not appearing
Causes:
- Empty values being skipped
- Delimiter issues
Solutions:
- Check "Handle empty values" setting
- Preview source data
Examples
Product Attribute Expansion
Scenario: Normalize product attributes for filtering
Input:
| SKU | Attributes |
|---|---|
| SKU001 | Color:Red|Size:Large|Material:Cotton |
Flow:
[Products] → [Split: Attributes on |] → [Further processing]
Output:
| SKU | Attribute |
|---|---|
| SKU001 | Color:Red |
| SKU001 | Size:Large |
| SKU001 | Material:Cotton |
Follow-up Transform:
AttributeName = SUBSTRING({Attribute}, 0, INDEXOF({Attribute}, ":"))
AttributeValue = SUBSTRING({Attribute}, INDEXOF({Attribute}, ":") + 1)
Order Line Items
Scenario: Expand order with product list
Input:
| OrderID | Products | Quantities |
|---|---|---|
| O001 | P1,P2,P3 | 1,2,1 |
Note: This requires parallel splits or different approach.
Better data model:
| OrderID | Product | Quantity |
|---|---|---|
| O001 | P1 | 1 |
| O001 | P2 | 2 |
| O001 | P3 | 1 |
Email Distribution List
Scenario: Send individual emails from distribution list
Input:
| CampaignID | Recipients | Subject |
|---|---|---|
| C001 | user1@co.com;user2@co.com;user3@co.com | Newsletter |
After Split:
| CampaignID | Recipient | Subject |
|---|---|---|
| C001 | user1@co.com | Newsletter |
| C001 | user2@co.com | Newsletter |
| C001 | user3@co.com | Newsletter |
Then: Each row triggers individual email send
Best Practices
Choose Appropriate Delimiters
When designing data:
- Use delimiters unlikely to appear in data
- Consider pipe (|) or double-pipe (||)
- Document the delimiter used
Validate Before Splitting
Ensure data quality:
- Check for unexpected delimiters
- Verify field contains expected format
- Handle edge cases
Consider Normalization
If frequently splitting:
- Consider normalizing source data
- One row per value from the start
- Reduces processing complexity
Add Row Identifiers
After splitting, track originals:
- Keep original ID
- Add sequence number within group
- Helps with debugging and tracing
Next Steps
- Transform Function - Process split values
- Group By Function - Aggregate split data
- Distinct Function - Get unique split values
- Building Flows - Complete workflow guide