Smart Transforms Build-Along Recipe with Clustering and Time Series

5
(2)

In this blog, we’ll show you how to perform white space analysis on merchandise recommendations to improve revenue and perform time series for product order volume using the smart transformations clustering and time series. This use case was part of DataDreamin’s Cup of Data vol 2 workshop, which you can find the recording to here.

Note: If you want a deep dive on Clustering check out this blog.

Getting Started with the Use Case

The purpose of this blog is to demonstrate how to solve real-world business issues using smart transforms in Data Prep recipes:

  • Perform white space analysis for sales opportunities using Clustering and Dashboard Visualization
  • Perform time series forecasting for product order volumes.

This blog uses the following Connected Objects: Account, Opportunity, Opportunity Line Item, and Product2.

  • Please ensure Data Sync is enabled and Account, Opportunity, Opportunity Line Item, Product2 are synced with the following fields (fields with asterisk are required, and the other fields are not required but useful)
  • If you do not use Opportunity Line Item or Product2 to track sales, I recommend following along with the Build-Along session with the pre-built recipe to ensure you understand the approach and can get the recipe working properly, so that you can adjust it according to your org’s needs at a later time.
AccountOpportunityOpportunity Line ItemProduct2
Account IDOpportunity ID
Line Item ID

Product ID
Account NameAccount IDOpportunity IDProduct Name
IndustryNameProduct IDProduct Family
Billing CountryStageOpportunity Product NameProduct Code
AmountQuantity
Close DateTotal Price
Closed
Won
  • To make use of the pre-built recipe JSONs, create an App called “Smart Transforms Example App” (exactly as it is written). The blog includes downloadable recipe JSONs that create the datasets in that App so this will simplify your recipe troubleshooting experience.

Smart Transforms Overview

Smart Transforms are ML-powered or algorithmic data transformations in Data Prep Recipes, designed specifically for Data Admin or tech-savvy Business Users who are interested in ML capabilities but do not have data scientists on staff and do not want to take on the added burden of managing model life cycles.

Clustering in the Data Prep recipe is an ML-powered way to segment data into similar groups based on the columns you want to segment them on. It works as follows: in the data prep recipe, you choose the data grain (eg Account) to cluster, you choose which columns are relevant to your clustering objective (eg Win Rate, Purchase Trend/Changes by Time Period, etc). The Clustering Transform then applies the ML algorithm and creates a new dimension column with the corresponding cluster label for each row. We support clustering using numerical data.

What’s Our Clustering Algorithm? We use K-Means under the hood for our clustering transform. The algorithm works by grouping data points closest to each other into a “cluster”. More specifically, “K” represents the number of clusters, and “means” is how we can calculate data points to each over: using average distance across the data points. The center of a cluster is where it has the shortest average distance to data points in the cluster, and a data point is associated with a specific cluster because it is closer to that cluster’s center than all other clusters centers (cluster centers are called “centroids”). You can find more info here. We chose K-Means because it is fast, scales to large volumes of data, and well-used in Data Science community.

Time Series Forecasting in Data Prep recipe applies statistical algorithm to forecasts data based on aggregated data points on a time scale. To forecast data, you first aggregate your data by a date field, then you choose what to forecast, how many data points to forecast, and the algorithm does the work. Because it is part of a data prep recipe, you can augment the forecast data using forward-looking external factors, write it into a dataset for visualization, or perform forecast trending comparisons.

What’s Our Time Series Forecast Algorithm? We uses Holt-Winters Exponential Smoothing to forecast data. It works the same as the timeseries function in our SAQL, and supports seasonal variations. You can find more info here.

White Space Analysis using Clustering

In this section, we will use Cluster transform in Data Prep Recipe to segment accounts using opportunities, opportunity line items, and products. While we won’t be building the dashboards in the exercise, we will need dashboards to visualize the characteristics of the clusters and finding white spaces according to cluster benchmarks. A package zip file including the final recipe and two sample dashboards are included at the bottom of the exercise, and you can install them directly into your org.

Cluster Analysis – Comparing Metrics Across Clusters

A dashboard can aid in visually comparing key business metrics and other characteristics across clusters quickly.

Screen Shot 2021-05-31 at 4.30.15 PM.png

Benchmark-Comparison-Driven White Space Finder

While it is easy to know exactly what a customer is not buying, it is not always clear why the customer is not buying certain products, nor how well that customer’s purchase patterns align with the “benchmark” of other customers’ purchase patterns. Using clustering, we can search across the typical rule-based data segmentation, to find new groups of data with similar patterns, and build new benchmarks to compare with.

Here’s an example of that benchmark-comparison-based white space finder dashboard.

Screen Shot 2021-05-31 at 4.32.12 PM.png

Project 1: Clustering Accounts by Opportunity Line Item/Product/Opportunity

In this exercise, we will cluster Accounts and create a new column for the cluster label, and also update Opportunity Line Items with the cluster label so that we can link data across Accounts and Opportunity Line Items.

Step A: Calculate Win Rate by Account
  1. Starting with a blank recipe, add Input Data using Account and Opportunity Connected Objects using the fields outlined above.
  2. Add a Filter node at the Opportunity Input: filter Opportunity with “Closed = True”
  3. Add an Aggregate node from the Filter Node; Aggregate the following values: Count of Rows, Average Amount, and Sum Amount, and Group by by Account ID
  4. At the Filter node from (2), add a branch to another Filter node. Filter Opportunity with “Won = True”.
  5. Add an Aggregate node at the Filter node from (4), aggregating the same stats (Count of Rows, Average Amount, Sum Amount, group by Account ID)
    1. You can optionally change the field names via a Transform node to avoid any confusion looking at field names.
  6. Drag the “Won” Branch to the Aggregate node from (3) to create a new Join node. Use “Lookup” as the Join Type, and ensure the Join Key is on Account ID, and rename the API Prefix to “Won”, then click “Apply”.
  7. Add a Transform node. We will use Custom Formula steps to calculate Win Rate by Rows and by Sum of Amount
    1. Add a custom formula for “Won.COUNT_Rows” / COUNT_Rows; choose “Numeric” as the output type, with Precision of 10, Scale of 2 and Default Value as 0; label it “Win Rate by Rows”
    2. Add a custom formula for “Won.SUM_Amount” / SUM_Amount; choose “Numeric'” as the output type, with Precision of 10, Scale of 2 and Default Value as 0; label it “Win Rate By Amount”
    3. Optionally, drop other fields that are not needed for clustering
    4. You can optionally apply Edit Attribute transforms on these two fields to change the API names.
  8. Drag the Transform node onto the Account input node to create a Join node (Join Type should be Lookup, with the join key is Account ID, and rename the API Prefix to “Opps”).

After Step A, your recipe should look like this:

Screen Shot 2021-05-31 at 9.19.55 PM.png

And here is the recipe JSON:

{
    "version": "52.0",
    "nodes": {
        "LOAD_DATASET0": {
            "action": "load",
            "parameters": {
                "dataset": {
                    "type": "connectedDataset",
                    "connectionName": "SFDC_LOCAL",
                    "sourceObjectName": "Account",
                    "label": "Account"
                },
                "fields": [
                    "Id",
                    "Name",
                    "Industry",
                    "BillingCountry"
                ]
            },
            "sources": []
        },
        "LOAD_DATASET1": {
            "action": "load",
            "parameters": {
                "dataset": {
                    "type": "connectedDataset",
                    "connectionName": "SFDC_LOCAL",
                    "sourceObjectName": "Opportunity",
                    "label": "Opportunity"
                },
                "fields": [
                    "Id",
                    "AccountId",
                    "Name",
                    "StageName",
                    "Amount",
                    "CloseDate",
                    "IsClosed",
                    "IsWon"
                ]
            },
            "sources": []
        },
        "FILTER0": {
            "action": "filter",
            "parameters": {
                "filterExpressions": [
                    {
                        "field": "IsClosed",
                        "operator": "EQUAL",
                        "operands": [
                            "true"
                        ],
                        "type": "TEXT"
                    }
                ]
            },
            "sources": [
                "LOAD_DATASET1"
            ]
        },
        "EXTRACT0": {
            "action": "extractGrains",
            "parameters": {
                "grainExtractions": []
            },
            "sources": [
                "FILTER0"
            ]
        },
        "AGGREGATE1": {
            "action": "aggregate",
            "parameters": {
                "aggregations": [
                    {
                        "name": "COUNT_Rows",
                        "label": "Rows",
                        "action": "COUNT"
                    },
                    {
                        "name": "AVG_Amount",
                        "label": "Average Amount",
                        "action": "AVG",
                        "source": "Amount"
                    },
                    {
                        "name": "SUM_Amount",
                        "label": "Sum of Amount",
                        "action": "SUM",
                        "source": "Amount"
                    }
                ],
                "groupings": [
                    "AccountId"
                ]
            },
            "sources": [
                "EXTRACT0"
            ]
        },
        "FILTER1": {
            "action": "filter",
            "parameters": {
                "filterExpressions": [
                    {
                        "field": "IsWon",
                        "operator": "EQUAL",
                        "operands": [
                            "true"
                        ],
                        "type": "TEXT"
                    }
                ]
            },
            "sources": [
                "FILTER0"
            ]
        },
        "EXTRACT1": {
            "action": "extractGrains",
            "parameters": {
                "grainExtractions": []
            },
            "sources": [
                "FILTER1"
            ]
        },
        "AGGREGATE3": {
            "action": "aggregate",
            "parameters": {
                "aggregations": [
                    {
                        "name": "COUNT_Rows",
                        "label": "Rows",
                        "action": "COUNT"
                    },
                    {
                        "name": "AVG_Amount",
                        "label": "Average Amount",
                        "action": "AVG",
                        "source": "Amount"
                    },
                    {
                        "name": "SUM_Amount",
                        "label": "Sum of Amount",
                        "action": "SUM",
                        "source": "Amount"
                    }
                ],
                "groupings": [
                    "AccountId"
                ]
            },
            "sources": [
                "EXTRACT1"
            ]
        },
        "JOIN0": {
            "action": "join",
            "parameters": {
                "joinType": "LOOKUP",
                "leftKeys": [
                    "AccountId"
                ],
                "rightQualifier": "Won",
                "rightKeys": [
                    "AccountId"
                ]
            },
            "schema": {
                "slice": {
                    "mode": "DROP",
                    "ignoreMissingFields": true,
                    "fields": []
                }
            },
            "sources": [
                "AGGREGATE1",
                "AGGREGATE3"
            ]
        },
        "FORMULA0": {
            "action": "formula",
            "parameters": {
                "expressionType": "SQL",
                "fields": [
                    {
                        "name": "AccountId_formula",
                        "label": "Win Rate By Row",
                        "formulaExpression": "\"Won.COUNT_Rows\" / COUNT_Rows",
                        "type": "NUMBER",
                        "precision": 10,
                        "scale": 2,
                        "defaultValue": "0"
                    }
                ]
            },
            "sources": [
                "JOIN0"
            ]
        },
        "FORMULA1": {
            "action": "formula",
            "parameters": {
                "expressionType": "SQL",
                "fields": [
                    {
                        "name": "AccountId_formula1",
                        "label": "Win Rate By Amount",
                        "formulaExpression": "\"Won.SUM_Amount\" / SUM_Amount",
                        "type": "NUMBER",
                        "precision": 10,
                        "scale": 2,
                        "defaultValue": "0"
                    }
                ]
            },
            "sources": [
                "FORMULA0"
            ]
        },
        "JOIN1": {
            "action": "join",
            "parameters": {
                "joinType": "LOOKUP",
                "leftKeys": [
                    "Id"
                ],
                "rightQualifier": "Opps",
                "rightKeys": [
                    "AccountId"
                ]
            },
            "schema": {
                "slice": {
                    "mode": "DROP",
                    "ignoreMissingFields": true,
                    "fields": [
                        "Opps.AccountId",
                        "Opps.Won.AccountId",
                        "Opps.Won.COUNT_Rows",
                        "Opps.Won.AVG_Amount",
                        "Opps.Won.SUM_Amount"
                    ]
                }
            },
            "sources": [
                "LOAD_DATASET0",
                "FORMULA1"
            ]
        }
    },
    "ui": {
        "nodes": {
            "LOAD_DATASET0": {
                "label": "Account",
                "type": "LOAD_DATASET",
                "top": 112,
                "left": 112,
                "parameters": {
                    "sampleSize": 2000
                }
            },
            "LOAD_DATASET1": {
                "label": "Opportunity",
                "type": "LOAD_DATASET",
                "top": 252,
                "left": 112,
                "parameters": {
                    "sampleSize": 2000
                }
            },
            "FILTER0": {
                "label": "Filter (Closed = True)",
                "description": "",
                "type": "FILTER",
                "top": 252,
                "left": 252
            },
            "AGGREGATE0": {
                "label": "Aggregate",
                "type": "AGGREGATE",
                "top": 252,
                "left": 392,
                "graph": {
                    "EXTRACT0": null,
                    "AGGREGATE1": null
                },
                "connectors": [
                    {
                        "source": "EXTRACT0",
                        "target": "AGGREGATE1"
                    }
                ]
            },
            "FILTER1": {
                "label": "Filter (Won = True)",
                "description": "",
                "type": "FILTER",
                "top": 392,
                "left": 392
            },
            "AGGREGATE2": {
                "label": "Aggregate",
                "type": "AGGREGATE",
                "top": 392,
                "left": 532,
                "graph": {
                    "EXTRACT1": null,
                    "AGGREGATE3": null
                },
                "connectors": [
                    {
                        "source": "EXTRACT1",
                        "target": "AGGREGATE3"
                    }
                ]
            },
            "JOIN0": {
                "label": "Join",
                "type": "JOIN",
                "top": 252,
                "left": 672
            },
            "TRANSFORM0": {
                "label": "Calculate Win Rates",
                "description": "",
                "type": "TRANSFORM",
                "top": 252,
                "left": 812,
                "graph": {
                    "FORMULA0": {
                        "parameters": {
                            "type": "BASE_FORMULA_UI"
                        }
                    },
                    "FORMULA1": {
                        "parameters": {
                            "type": "BASE_FORMULA_UI"
                        }
                    }
                },
                "connectors": [
                    {
                        "source": "FORMULA0",
                        "target": "FORMULA1"
                    }
                ]
            },
            "JOIN1": {
                "label": "Join",
                "type": "JOIN",
                "top": 252,
                "left": 952
            }
        },
        "connectors": [
            {
                "source": "LOAD_DATASET1",
                "target": "FILTER0"
            },
            {
                "source": "FILTER0",
                "target": "AGGREGATE0"
            },
            {
                "source": "FILTER0",
                "target": "FILTER1"
            },
            {
                "source": "FILTER1",
                "target": "AGGREGATE2"
            },
            {
                "source": "AGGREGATE0",
                "target": "JOIN0"
            },
            {
                "source": "AGGREGATE2",
                "target": "JOIN0"
            },
            {
                "source": "JOIN0",
                "target": "TRANSFORM0"
            },
            {
                "source": "LOAD_DATASET0",
                "target": "JOIN1"
            },
            {
                "source": "TRANSFORM0",
                "target": "JOIN1"
            }
        ],
        "hiddenColumns": []
    }
}
Step B: Cluster Account by Aggregated Opps Metrics
  1. After the Join node, add a Transform node; we will edit the field names.
    1. Click on “Columns” tab, select each row, and click on “Edit Attribute” icon from the toolbar.
    2. You can edit the Label and the API Name for the field.
    3. Continue until your Column Labels represents the screenshot below
  2. Screen Shot 2021-05-31 at 9.27.13 PM.png
  3. Add another Transform node; select a measure field, and add “Cluster” from the toolbar.
    1. Leave the Number of Clusters defaulted to 3.
    2. Choose the columns to determine the clusters “Deal Count”, “Average Deal Size”, “Account Lifetime Value”, “Win Rate By Row”, “Win Rate By Amount”
    3. Edit the Column Label if needed; the default value of “Cluster” is also fine.
  4. Add an Output node after the Transform node, name the output dataset “Smart Transforms – Clustered Accounts”, and save it in the “Smart Transforms Example App” you created earlier.
  5. Save and Run the Recipe.

Your recipe should look like this now:

Screen Shot 2021-05-31 at 9.33.41 PM.png

And here is the recipe JSON:

{
    "version": "52.0",
    "nodes": {
        "LOAD_DATASET0": {
            "action": "load",
            "parameters": {
                "dataset": {
                    "type": "connectedDataset",
                    "connectionName": "SFDC_LOCAL",
                    "sourceObjectName": "Account",
                    "label": "Account"
                },
                "fields": [
                    "Id",
                    "Name",
                    "Industry",
                    "BillingCountry"
                ]
            },
            "sources": []
        },
        "LOAD_DATASET1": {
            "action": "load",
            "parameters": {
                "dataset": {
                    "type": "connectedDataset",
                    "connectionName": "SFDC_LOCAL",
                    "sourceObjectName": "Opportunity",
                    "label": "Opportunity"
                },
                "fields": [
                    "Id",
                    "AccountId",
                    "Name",
                    "StageName",
                    "Amount",
                    "CloseDate",
                    "IsClosed",
                    "IsWon"
                ]
            },
            "sources": []
        },
        "FILTER0": {
            "action": "filter",
            "parameters": {
                "filterExpressions": [
                    {
                        "field": "IsClosed",
                        "operator": "EQUAL",
                        "operands": [
                            "true"
                        ],
                        "type": "TEXT"
                    }
                ]
            },
            "sources": [
                "LOAD_DATASET1"
            ]
        },
        "EXTRACT0": {
            "action": "extractGrains",
            "parameters": {
                "grainExtractions": []
            },
            "sources": [
                "FILTER0"
            ]
        },
        "AGGREGATE1": {
            "action": "aggregate",
            "parameters": {
                "aggregations": [
                    {
                        "name": "COUNT_Rows",
                        "label": "Rows",
                        "action": "COUNT"
                    },
                    {
                        "name": "AVG_Amount",
                        "label": "Average Amount",
                        "action": "AVG",
                        "source": "Amount"
                    },
                    {
                        "name": "SUM_Amount",
                        "label": "Sum of Amount",
                        "action": "SUM",
                        "source": "Amount"
                    }
                ],
                "groupings": [
                    "AccountId"
                ]
            },
            "sources": [
                "EXTRACT0"
            ]
        },
        "FILTER1": {
            "action": "filter",
            "parameters": {
                "filterExpressions": [
                    {
                        "field": "IsWon",
                        "operator": "EQUAL",
                        "operands": [
                            "true"
                        ],
                        "type": "TEXT"
                    }
                ]
            },
            "sources": [
                "FILTER0"
            ]
        },
        "EXTRACT1": {
            "action": "extractGrains",
            "parameters": {
                "grainExtractions": []
            },
            "sources": [
                "FILTER1"
            ]
        },
        "AGGREGATE3": {
            "action": "aggregate",
            "parameters": {
                "aggregations": [
                    {
                        "name": "COUNT_Rows",
                        "label": "Rows",
                        "action": "COUNT"
                    },
                    {
                        "name": "AVG_Amount",
                        "label": "Average Amount",
                        "action": "AVG",
                        "source": "Amount"
                    },
                    {
                        "name": "SUM_Amount",
                        "label": "Sum of Amount",
                        "action": "SUM",
                        "source": "Amount"
                    }
                ],
                "groupings": [
                    "AccountId"
                ]
            },
            "sources": [
                "EXTRACT1"
            ]
        },
        "JOIN0": {
            "action": "join",
            "parameters": {
                "joinType": "LOOKUP",
                "leftKeys": [
                    "AccountId"
                ],
                "rightQualifier": "Won",
                "rightKeys": [
                    "AccountId"
                ]
            },
            "schema": {
                "slice": {
                    "mode": "DROP",
                    "ignoreMissingFields": true,
                    "fields": []
                }
            },
            "sources": [
                "AGGREGATE1",
                "AGGREGATE3"
            ]
        },
        "FORMULA0": {
            "action": "formula",
            "parameters": {
                "expressionType": "SQL",
                "fields": [
                    {
                        "name": "AccountId_formula",
                        "label": "Win Rate By Row",
                        "formulaExpression": "\"Won.COUNT_Rows\" / COUNT_Rows",
                        "type": "NUMBER",
                        "precision": 10,
                        "scale": 2,
                        "defaultValue": "0"
                    }
                ]
            },
            "sources": [
                "JOIN0"
            ]
        },
        "FORMULA1": {
            "action": "formula",
            "parameters": {
                "expressionType": "SQL",
                "fields": [
                    {
                        "name": "AccountId_formula1",
                        "label": "Win Rate By Amount",
                        "formulaExpression": "\"Won.SUM_Amount\" / SUM_Amount",
                        "type": "NUMBER",
                        "precision": 10,
                        "scale": 2,
                        "defaultValue": "0"
                    }
                ]
            },
            "sources": [
                "FORMULA0"
            ]
        },
        "JOIN1": {
            "action": "join",
            "parameters": {
                "joinType": "LOOKUP",
                "leftKeys": [
                    "Id"
                ],
                "rightQualifier": "Opps",
                "rightKeys": [
                    "AccountId"
                ]
            },
            "schema": {
                "slice": {
                    "mode": "DROP",
                    "ignoreMissingFields": true,
                    "fields": [
                        "Opps.AccountId",
                        "Opps.Won.AccountId",
                        "Opps.Won.COUNT_Rows",
                        "Opps.Won.AVG_Amount",
                        "Opps.Won.SUM_Amount"
                    ]
                }
            },
            "sources": [
                "LOAD_DATASET0",
                "FORMULA1"
            ]
        },
        "EDIT_ATTRIBUTES0": {
            "action": "schema",
            "parameters": {
                "fields": [
                    {
                        "name": "Opps.COUNT_Rows",
                        "newProperties": {
                            "name": "DealCount",
                            "label": "Deal Count",
                            "typeProperties": null
                        }
                    }
                ]
            },
            "sources": [
                "JOIN1"
            ]
        },
        "EDIT_ATTRIBUTES1": {
            "action": "schema",
            "parameters": {
                "fields": [
                    {
                        "name": "Opps.AVG_Amount",
                        "newProperties": {
                            "name": "AverageDealSize",
                            "label": "Average Deal Size",
                            "typeProperties": null
                        }
                    }
                ]
            },
            "sources": [
                "EDIT_ATTRIBUTES0"
            ]
        },
        "EDIT_ATTRIBUTES2": {
            "action": "schema",
            "parameters": {
                "fields": [
                    {
                        "name": "Opps.SUM_Amount",
                        "newProperties": {
                            "name": "AccountLifetimeValue",
                            "label": "Account Lifetime Value",
                            "typeProperties": null
                        }
                    }
                ]
            },
            "sources": [
                "EDIT_ATTRIBUTES1"
            ]
        },
        "EDIT_ATTRIBUTES3": {
            "action": "schema",
            "parameters": {
                "fields": [
                    {
                        "name": "Opps.AccountId_formula",
                        "newProperties": {
                            "name": "WinRateByRow",
                            "label": "Win Rate By Row",
                            "typeProperties": null
                        }
                    }
                ]
            },
            "sources": [
                "EDIT_ATTRIBUTES2"
            ]
        },
        "EDIT_ATTRIBUTES4": {
            "action": "schema",
            "parameters": {
                "fields": [
                    {
                        "name": "Opps.AccountId_formula1",
                        "newProperties": {
                            "name": "WinRateByAmount",
                            "label": "Win Rate By Amount",
                            "typeProperties": null
                        }
                    }
                ]
            },
            "sources": [
                "EDIT_ATTRIBUTES3"
            ]
        },
        "CLUSTERING0": {
            "action": "clustering",
            "parameters": {
                "clusterCount": 3,
                "sourceFields": [
                    "DealCount",
                    "AverageDealSize",
                    "AccountLifetimeValue",
                    "WinRateByRow",
                    "WinRateByAmount"
                ],
                "targetField": {
                    "label": "Cluster",
                    "name": "Id_clustering"
                }
            },
            "sources": [
                "EDIT_ATTRIBUTES4"
            ]
        },
        "OUTPUT0": {
            "action": "save",
            "parameters": {
                "dataset": {
                    "type": "analyticsDataset",
                    "label": "Smart Transforms - Clustered Accounts",
                    "folderName": "Smart_Transforms_Example_App"
                },
                "fields": []
            },
            "sources": [
                "CLUSTERING0"
            ]
        }
    },
    "ui": {
        "nodes": {
            "LOAD_DATASET0": {
                "label": "Account",
                "type": "LOAD_DATASET",
                "top": 112,
                "left": 112,
                "parameters": {
                    "sampleSize": 2000
                }
            },
            "LOAD_DATASET1": {
                "label": "Opportunity",
                "type": "LOAD_DATASET",
                "top": 252,
                "left": 112,
                "parameters": {
                    "sampleSize": 2000
                }
            },
            "FILTER0": {
                "label": "Filter (Closed = True)",
                "description": "",
                "type": "FILTER",
                "top": 252,
                "left": 252
            },
            "AGGREGATE0": {
                "label": "Aggregate",
                "type": "AGGREGATE",
                "top": 252,
                "left": 392,
                "graph": {
                    "EXTRACT0": null,
                    "AGGREGATE1": null
                },
                "connectors": [
                    {
                        "source": "EXTRACT0",
                        "target": "AGGREGATE1"
                    }
                ]
            },
            "FILTER1": {
                "label": "Filter (Won = True)",
                "description": "",
                "type": "FILTER",
                "top": 392,
                "left": 392
            },
            "AGGREGATE2": {
                "label": "Aggregate",
                "type": "AGGREGATE",
                "top": 392,
                "left": 532,
                "graph": {
                    "EXTRACT1": null,
                    "AGGREGATE3": null
                },
                "connectors": [
                    {
                        "source": "EXTRACT1",
                        "target": "AGGREGATE3"
                    }
                ]
            },
            "JOIN0": {
                "label": "Join",
                "type": "JOIN",
                "top": 252,
                "left": 672
            },
            "TRANSFORM0": {
                "label": "Calculate Win Rates",
                "description": "",
                "type": "TRANSFORM",
                "top": 252,
                "left": 812,
                "graph": {
                    "FORMULA0": {
                        "parameters": {
                            "type": "BASE_FORMULA_UI"
                        }
                    },
                    "FORMULA1": {
                        "parameters": {
                            "type": "BASE_FORMULA_UI"
                        }
                    }
                },
                "connectors": [
                    {
                        "source": "FORMULA0",
                        "target": "FORMULA1"
                    }
                ]
            },
            "JOIN1": {
                "label": "Join",
                "type": "JOIN",
                "top": 112,
                "left": 952
            },
            "TRANSFORM1": {
                "label": "Rename Fields",
                "description": "",
                "type": "TRANSFORM",
                "top": 112,
                "left": 1092,
                "graph": {
                    "EDIT_ATTRIBUTES0": null,
                    "EDIT_ATTRIBUTES1": null,
                    "EDIT_ATTRIBUTES2": null,
                    "EDIT_ATTRIBUTES3": null,
                    "EDIT_ATTRIBUTES4": null
                },
                "connectors": [
                    {
                        "source": "EDIT_ATTRIBUTES0",
                        "target": "EDIT_ATTRIBUTES1"
                    },
                    {
                        "source": "EDIT_ATTRIBUTES1",
                        "target": "EDIT_ATTRIBUTES2"
                    },
                    {
                        "source": "EDIT_ATTRIBUTES2",
                        "target": "EDIT_ATTRIBUTES3"
                    },
                    {
                        "source": "EDIT_ATTRIBUTES3",
                        "target": "EDIT_ATTRIBUTES4"
                    }
                ]
            },
            "TRANSFORM2": {
                "label": "Clustering Accounts",
                "description": "",
                "type": "TRANSFORM",
                "top": 112,
                "left": 1232,
                "graph": {
                    "CLUSTERING0": null
                },
                "connectors": []
            },
            "OUTPUT0": {
                "label": "Output",
                "type": "OUTPUT",
                "top": 112,
                "left": 1372
            }
        },
        "connectors": [
            {
                "source": "LOAD_DATASET1",
                "target": "FILTER0"
            },
            {
                "source": "FILTER0",
                "target": "AGGREGATE0"
            },
            {
                "source": "FILTER0",
                "target": "FILTER1"
            },
            {
                "source": "FILTER1",
                "target": "AGGREGATE2"
            },
            {
                "source": "AGGREGATE0",
                "target": "JOIN0"
            },
            {
                "source": "AGGREGATE2",
                "target": "JOIN0"
            },
            {
                "source": "JOIN0",
                "target": "TRANSFORM0"
            },
            {
                "source": "LOAD_DATASET0",
                "target": "JOIN1"
            },
            {
                "source": "TRANSFORM0",
                "target": "JOIN1"
            },
            {
                "source": "JOIN1",
                "target": "TRANSFORM1"
            },
            {
                "source": "TRANSFORM1",
                "target": "TRANSFORM2"
            },
            {
                "source": "TRANSFORM2",
                "target": "OUTPUT0"
            }
        ],
        "hiddenColumns": []
    }
}
Step C: Apply Account Clusters to Opportunity Line Items for White Space Finder
  1. Add Data Sources for “Opportunity Line Item” and “Product2”. This should end up with two new input types.
  2. Drag the slider from Product2 and drop it onto Opportunity Line Item, and choose “Join”.
    1. Join Type: Lookup
    2. Join Key: Product ID on both sides
    3. API Prefix: Product2
  3. This step is very important: drag the “branch” icon from the “Opportunity” input node to the Join node to create a new Join node
  4. drag_and_join.gif
  5. Apply the same approach from the “Clustering Accounts” transform node: drag that Transform node onto the new Join node. This will give you Opportunity Line Item with Product Family, and Opp info, as well as Account Clustering. From the “Columns” tab, you can deselect any fields from Opportunity or Account if you do not need them at the Opportunity Line Item grain.
    1. Join Type: Lookup
    2. Join Key: Account ID on both sides
    3. API Prefix: Account
  6. You may observe that Opportunity Line Item may not have an associated account (and therefore the Cluster field is null); that would point to a data problem but for completeness sake, you can also replace the Null in Cluster column with “N/A” with another Transform node: Add a transform node after the Join node. In the Preview pane, locate and select the “Cluster” column, and click on “Replace” icon from the toolbar to replace Null with a different text; leave it in the Original Column.
  7. Create an Output node with a dataset name “Smart Transforms – Clustered Opp Products” in the “Smart Transforms Example App”.

After Step C, your recipe now should look like this:

Screen Shot 2021-05-31 at 9.51.46 PM.png

And here is the recipe JSON:

{
    "version": "52.0",
    "nodes": {
        "LOAD_DATASET0": {
            "action": "load",
            "parameters": {
                "dataset": {
                    "type": "connectedDataset",
                    "connectionName": "SFDC_LOCAL",
                    "sourceObjectName": "Account",
                    "label": "Account"
                },
                "fields": [
                    "Id",
                    "Name",
                    "Industry",
                    "BillingCountry"
                ]
            },
            "sources": []
        },
        "LOAD_DATASET1": {
            "action": "load",
            "parameters": {
                "dataset": {
                    "type": "connectedDataset",
                    "connectionName": "SFDC_LOCAL",
                    "sourceObjectName": "Opportunity",
                    "label": "Opportunity"
                },
                "fields": [
                    "Id",
                    "AccountId",
                    "Name",
                    "StageName",
                    "Amount",
                    "CloseDate",
                    "IsClosed",
                    "IsWon"
                ]
            },
            "sources": []
        },
        "FILTER0": {
            "action": "filter",
            "parameters": {
                "filterExpressions": [
                    {
                        "field": "IsClosed",
                        "operator": "EQUAL",
                        "operands": [
                            "true"
                        ],
                        "type": "TEXT"
                    }
                ]
            },
            "sources": [
                "LOAD_DATASET1"
            ]
        },
        "EXTRACT0": {
            "action": "extractGrains",
            "parameters": {
                "grainExtractions": []
            },
            "sources": [
                "FILTER0"
            ]
        },
        "AGGREGATE1": {
            "action": "aggregate",
            "parameters": {
                "aggregations": [
                    {
                        "name": "COUNT_Rows",
                        "label": "Rows",
                        "action": "COUNT"
                    },
                    {
                        "name": "AVG_Amount",
                        "label": "Average Amount",
                        "action": "AVG",
                        "source": "Amount"
                    },
                    {
                        "name": "SUM_Amount",
                        "label": "Sum of Amount",
                        "action": "SUM",
                        "source": "Amount"
                    }
                ],
                "groupings": [
                    "AccountId"
                ]
            },
            "sources": [
                "EXTRACT0"
            ]
        },
        "FILTER1": {
            "action": "filter",
            "parameters": {
                "filterExpressions": [
                    {
                        "field": "IsWon",
                        "operator": "EQUAL",
                        "operands": [
                            "true"
                        ],
                        "type": "TEXT"
                    }
                ]
            },
            "sources": [
                "FILTER0"
            ]
        },
        "EXTRACT1": {
            "action": "extractGrains",
            "parameters": {
                "grainExtractions": []
            },
            "sources": [
                "FILTER1"
            ]
        },
        "AGGREGATE3": {
            "action": "aggregate",
            "parameters": {
                "aggregations": [
                    {
                        "name": "COUNT_Rows",
                        "label": "Rows",
                        "action": "COUNT"
                    },
                    {
                        "name": "AVG_Amount",
                        "label": "Average Amount",
                        "action": "AVG",
                        "source": "Amount"
                    },
                    {
                        "name": "SUM_Amount",
                        "label": "Sum of Amount",
                        "action": "SUM",
                        "source": "Amount"
                    }
                ],
                "groupings": [
                    "AccountId"
                ]
            },
            "sources": [
                "EXTRACT1"
            ]
        },
        "JOIN0": {
            "action": "join",
            "parameters": {
                "joinType": "LOOKUP",
                "leftKeys": [
                    "AccountId"
                ],
                "rightQualifier": "Won",
                "rightKeys": [
                    "AccountId"
                ]
            },
            "schema": {
                "slice": {
                    "mode": "DROP",
                    "ignoreMissingFields": true,
                    "fields": []
                }
            },
            "sources": [
                "AGGREGATE1",
                "AGGREGATE3"
            ]
        },
        "FORMULA0": {
            "action": "formula",
            "parameters": {
                "expressionType": "SQL",
                "fields": [
                    {
                        "name": "AccountId_formula",
                        "label": "Win Rate By Row",
                        "formulaExpression": "\"Won.COUNT_Rows\" / COUNT_Rows",
                        "type": "NUMBER",
                        "precision": 10,
                        "scale": 2,
                        "defaultValue": "0"
                    }
                ]
            },
            "sources": [
                "JOIN0"
            ]
        },
        "FORMULA1": {
            "action": "formula",
            "parameters": {
                "expressionType": "SQL",
                "fields": [
                    {
                        "name": "AccountId_formula1",
                        "label": "Win Rate By Amount",
                        "formulaExpression": "\"Won.SUM_Amount\" / SUM_Amount",
                        "type": "NUMBER",
                        "precision": 10,
                        "scale": 2,
                        "defaultValue": "0"
                    }
                ]
            },
            "sources": [
                "FORMULA0"
            ]
        },
        "JOIN1": {
            "action": "join",
            "parameters": {
                "joinType": "LOOKUP",
                "leftKeys": [
                    "Id"
                ],
                "rightQualifier": "Opps",
                "rightKeys": [
                    "AccountId"
                ]
            },
            "schema": {
                "slice": {
                    "mode": "DROP",
                    "ignoreMissingFields": true,
                    "fields": [
                        "Opps.AccountId",
                        "Opps.Won.AccountId",
                        "Opps.Won.COUNT_Rows",
                        "Opps.Won.AVG_Amount",
                        "Opps.Won.SUM_Amount"
                    ]
                }
            },
            "sources": [
                "LOAD_DATASET0",
                "FORMULA1"
            ]
        },
        "EDIT_ATTRIBUTES0": {
            "action": "schema",
            "parameters": {
                "fields": [
                    {
                        "name": "Opps.COUNT_Rows",
                        "newProperties": {
                            "name": "DealCount",
                            "label": "Deal Count",
                            "typeProperties": null
                        }
                    }
                ]
            },
            "sources": [
                "JOIN1"
            ]
        },
        "EDIT_ATTRIBUTES1": {
            "action": "schema",
            "parameters": {
                "fields": [
                    {
                        "name": "Opps.AVG_Amount",
                        "newProperties": {
                            "name": "AverageDealSize",
                            "label": "Average Deal Size",
                            "typeProperties": null
                        }
                    }
                ]
            },
            "sources": [
                "EDIT_ATTRIBUTES0"
            ]
        },
        "EDIT_ATTRIBUTES2": {
            "action": "schema",
            "parameters": {
                "fields": [
                    {
                        "name": "Opps.SUM_Amount",
                        "newProperties": {
                            "name": "AccountLifetimeValue",
                            "label": "Account Lifetime Value",
                            "typeProperties": null
                        }
                    }
                ]
            },
            "sources": [
                "EDIT_ATTRIBUTES1"
            ]
        },
        "EDIT_ATTRIBUTES3": {
            "action": "schema",
            "parameters": {
                "fields": [
                    {
                        "name": "Opps.AccountId_formula",
                        "newProperties": {
                            "name": "WinRateByRow",
                            "label": "Win Rate By Row",
                            "typeProperties": null
                        }
                    }
                ]
            },
            "sources": [
                "EDIT_ATTRIBUTES2"
            ]
        },
        "EDIT_ATTRIBUTES4": {
            "action": "schema",
            "parameters": {
                "fields": [
                    {
                        "name": "Opps.AccountId_formula1",
                        "newProperties": {
                            "name": "WinRateByAmount",
                            "label": "Win Rate By Amount",
                            "typeProperties": null
                        }
                    }
                ]
            },
            "sources": [
                "EDIT_ATTRIBUTES3"
            ]
        },
        "CLUSTERING0": {
            "action": "clustering",
            "parameters": {
                "clusterCount": 3,
                "sourceFields": [
                    "DealCount",
                    "AverageDealSize",
                    "AccountLifetimeValue",
                    "WinRateByRow",
                    "WinRateByAmount"
                ],
                "targetField": {
                    "label": "Cluster",
                    "name": "Id_clustering"
                }
            },
            "sources": [
                "EDIT_ATTRIBUTES4"
            ]
        },
        "OUTPUT0": {
            "action": "save",
            "parameters": {
                "dataset": {
                    "type": "analyticsDataset",
                    "label": "Smart Transforms - Clustered Accounts",
                    "folderName": "Smart_Transforms_Example_App"
                },
                "fields": []
            },
            "sources": [
                "CLUSTERING0"
            ]
        },
        "LOAD_DATASET2": {
            "action": "load",
            "parameters": {
                "dataset": {
                    "type": "connectedDataset",
                    "connectionName": "SFDC_LOCAL",
                    "sourceObjectName": "OpportunityLineItem",
                    "label": "OpportunityLineItem"
                },
                "fields": [
                    "Id",
                    "OpportunityId",
                    "Product2Id",
                    "Name",
                    "Quantity",
                    "TotalPrice"
                ]
            },
            "sources": []
        },
        "LOAD_DATASET3": {
            "action": "load",
            "parameters": {
                "dataset": {
                    "type": "connectedDataset",
                    "connectionName": "SFDC_LOCAL",
                    "sourceObjectName": "Product2",
                    "label": "Product2"
                },
                "fields": [
                    "Id",
                    "Name",
                    "Family",
                    "ProductCode"
                ]
            },
            "sources": []
        },
        "JOIN2": {
            "action": "join",
            "parameters": {
                "joinType": "LOOKUP",
                "leftKeys": [
                    "Product2Id"
                ],
                "rightQualifier": "Product2",
                "rightKeys": [
                    "Id"
                ]
            },
            "schema": {
                "slice": {
                    "mode": "DROP",
                    "ignoreMissingFields": true,
                    "fields": []
                }
            },
            "sources": [
                "LOAD_DATASET2",
                "LOAD_DATASET3"
            ]
        },
        "JOIN3": {
            "action": "join",
            "parameters": {
                "joinType": "LOOKUP",
                "leftKeys": [
                    "OpportunityId"
                ],
                "rightQualifier": "Opp",
                "rightKeys": [
                    "Id"
                ]
            },
            "schema": {
                "slice": {
                    "mode": "DROP",
                    "ignoreMissingFields": true,
                    "fields": []
                }
            },
            "sources": [
                "JOIN2",
                "LOAD_DATASET1"
            ]
        },
        "JOIN4": {
            "action": "join",
            "parameters": {
                "joinType": "LOOKUP",
                "leftKeys": [
                    "Opp.AccountId"
                ],
                "rightQualifier": "Account",
                "rightKeys": [
                    "Id"
                ]
            },
            "schema": {
                "slice": {
                    "mode": "DROP",
                    "ignoreMissingFields": true,
                    "fields": []
                }
            },
            "sources": [
                "JOIN3",
                "CLUSTERING0"
            ]
        },
        "OUTPUT1": {
            "action": "save",
            "parameters": {
                "dataset": {
                    "type": "analyticsDataset",
                    "label": "Smart Transforms - Clustered Opp Products",
                    "folderName": "Smart_Transforms_Example_App"
                },
                "fields": []
            },
            "sources": [
                "JOIN4"
            ]
        }
    },
    "ui": {
        "nodes": {
            "LOAD_DATASET0": {
                "label": "Account",
                "type": "LOAD_DATASET",
                "top": 112,
                "left": 112,
                "parameters": {
                    "sampleSize": 2000
                }
            },
            "LOAD_DATASET1": {
                "label": "Opportunity",
                "type": "LOAD_DATASET",
                "top": 252,
                "left": 112,
                "parameters": {
                    "sampleSize": 2000
                }
            },
            "FILTER0": {
                "label": "Filter (Closed = True)",
                "description": "",
                "type": "FILTER",
                "top": 252,
                "left": 252
            },
            "AGGREGATE0": {
                "label": "Aggregate",
                "type": "AGGREGATE",
                "top": 252,
                "left": 392,
                "graph": {
                    "EXTRACT0": null,
                    "AGGREGATE1": null
                },
                "connectors": [
                    {
                        "source": "EXTRACT0",
                        "target": "AGGREGATE1"
                    }
                ]
            },
            "FILTER1": {
                "label": "Filter (Won = True)",
                "description": "",
                "type": "FILTER",
                "top": 392,
                "left": 392
            },
            "AGGREGATE2": {
                "label": "Aggregate",
                "type": "AGGREGATE",
                "top": 392,
                "left": 532,
                "graph": {
                    "EXTRACT1": null,
                    "AGGREGATE3": null
                },
                "connectors": [
                    {
                        "source": "EXTRACT1",
                        "target": "AGGREGATE3"
                    }
                ]
            },
            "JOIN0": {
                "label": "Join",
                "type": "JOIN",
                "top": 252,
                "left": 672
            },
            "TRANSFORM0": {
                "label": "Calculate Win Rates",
                "description": "",
                "type": "TRANSFORM",
                "top": 252,
                "left": 812,
                "graph": {
                    "FORMULA0": {
                        "parameters": {
                            "type": "BASE_FORMULA_UI"
                        }
                    },
                    "FORMULA1": {
                        "parameters": {
                            "type": "BASE_FORMULA_UI"
                        }
                    }
                },
                "connectors": [
                    {
                        "source": "FORMULA0",
                        "target": "FORMULA1"
                    }
                ]
            },
            "JOIN1": {
                "label": "Join",
                "type": "JOIN",
                "top": 112,
                "left": 952
            },
            "TRANSFORM1": {
                "label": "Rename Fields",
                "description": "",
                "type": "TRANSFORM",
                "top": 112,
                "left": 1092,
                "graph": {
                    "EDIT_ATTRIBUTES0": null,
                    "EDIT_ATTRIBUTES1": null,
                    "EDIT_ATTRIBUTES2": null,
                    "EDIT_ATTRIBUTES3": null,
                    "EDIT_ATTRIBUTES4": null
                },
                "connectors": [
                    {
                        "source": "EDIT_ATTRIBUTES0",
                        "target": "EDIT_ATTRIBUTES1"
                    },
                    {
                        "source": "EDIT_ATTRIBUTES1",
                        "target": "EDIT_ATTRIBUTES2"
                    },
                    {
                        "source": "EDIT_ATTRIBUTES2",
                        "target": "EDIT_ATTRIBUTES3"
                    },
                    {
                        "source": "EDIT_ATTRIBUTES3",
                        "target": "EDIT_ATTRIBUTES4"
                    }
                ]
            },
            "TRANSFORM2": {
                "label": "Clustering Accounts",
                "description": "",
                "type": "TRANSFORM",
                "top": 112,
                "left": 1232,
                "graph": {
                    "CLUSTERING0": null
                },
                "connectors": []
            },
            "OUTPUT0": {
                "label": "Output",
                "type": "OUTPUT",
                "top": 112,
                "left": 1512
            },
            "LOAD_DATASET2": {
                "label": "OpportunityLineItem",
                "type": "LOAD_DATASET",
                "top": 532,
                "left": 112,
                "parameters": {
                    "sampleSize": 2000
                }
            },
            "LOAD_DATASET3": {
                "label": "Product2",
                "type": "LOAD_DATASET",
                "top": 672,
                "left": 112,
                "parameters": {
                    "sampleSize": 2000
                }
            },
            "JOIN2": {
                "label": "Join",
                "type": "JOIN",
                "top": 532,
                "left": 252
            },
            "JOIN3": {
                "label": "Join",
                "type": "JOIN",
                "top": 532,
                "left": 392
            },
            "JOIN4": {
                "label": "Join",
                "type": "JOIN",
                "top": 532,
                "left": 1232
            },
            "OUTPUT1": {
                "label": "Output",
                "type": "OUTPUT",
                "top": 532,
                "left": 1372
            }
        },
        "connectors": [
            {
                "source": "LOAD_DATASET1",
                "target": "FILTER0"
            },
            {
                "source": "FILTER0",
                "target": "AGGREGATE0"
            },
            {
                "source": "FILTER0",
                "target": "FILTER1"
            },
            {
                "source": "FILTER1",
                "target": "AGGREGATE2"
            },
            {
                "source": "AGGREGATE0",
                "target": "JOIN0"
            },
            {
                "source": "AGGREGATE2",
                "target": "JOIN0"
            },
            {
                "source": "JOIN0",
                "target": "TRANSFORM0"
            },
            {
                "source": "LOAD_DATASET0",
                "target": "JOIN1"
            },
            {
                "source": "TRANSFORM0",
                "target": "JOIN1"
            },
            {
                "source": "JOIN1",
                "target": "TRANSFORM1"
            },
            {
                "source": "TRANSFORM1",
                "target": "TRANSFORM2"
            },
            {
                "source": "TRANSFORM2",
                "target": "OUTPUT0"
            },
            {
                "source": "LOAD_DATASET2",
                "target": "JOIN2"
            },
            {
                "source": "LOAD_DATASET3",
                "target": "JOIN2"
            },
            {
                "source": "JOIN2",
                "target": "JOIN3"
            },
            {
                "source": "LOAD_DATASET1",
                "target": "JOIN3"
            },
            {
                "source": "JOIN3",
                "target": "JOIN4"
            },
            {
                "source": "TRANSFORM2",
                "target": "JOIN4"
            },
            {
                "source": "JOIN4",
                "target": "OUTPUT1"
            }
        ],
        "hiddenColumns": []
    }
}

Now, go ahead and run the recipe, and explore the clustered account dataset.

Pro Tip: If you find that most of the data are in one cluster, your data may have outliers. Because K-Means is a distance-based algorithm, outliers tend to occupy clusters by themselves. You will need to examine the data to determine if the generated clusters are correct and business-relevant. To counteract the impact of outliers on your clusters, you can increase the number of clusters in your configuration. Alternatively, you may need to split a high-density cluster by running clustering on that portion of the data again.

To explore your clustered Accounts and clustered Opportunity Line Items, you will need to create dashboards to visualize the data. To simplify the process, here is a package zip file containing the App, the recipe, and the two dashboards for Cluster Comparison and White Space Finder. You can deploy this package into your sandbox org or dev org directly, and after running data sync and the recipe, the dashboards in the “Smart Transforms Example App” will display data correctly.

Time Series Forecasting

In this section, we will forecast sales pipeline generation and case volume.

Heads-Up: Time Series requires your data to be aggregated by a date field. If your data has multiple rows on the same date grouping, the measures must be aggregated (eg Count, Sum, Average, etc) by a date field before you can apply Time Series Forecasting, or it will not work correctly. Also, as of Summer ‘21, we do not support Partition, so you must aggregate your data by a single Date field only.

Project: Forecast

  • Create a recipe; Add Data Sources with Opportunity, Opportunity Line Item, and Product2
  • Drag Product2 onto Opportunity Line Item to create a Join node
    • Join Type: Lookup
    • Join Key: Product ID
    • API Prefix: Product2
  • Drag Opportunity onto the Join node to create another Join node
    • Join Type: Lookup
    • Join Key: Opportunity ID
    • API Prefix: Opps
  • Add Aggregate node after the Join node
    • Aggregates: Sum of Total Price, Sum of Quantity
    • Group Rows by: Close Date (by Year-Month)
  • Add a Transform node, and from the toolbar, add Time Series Forecasting.
    • Choose your Date Grouping (Year-Month) and the corresponding Year/Month fields
    • Select a Seasonality (eg 12 months in a year, or leave it as Auto to let the algorithm detect it).
    • Specify the number of points to forecast. Go with 12 in this exercise.
  • Create an output node to write your dataset.

And that is it!

Time Series Recipe.png

And here is the recipe JSON:

{
    "version": "52.0",
    "nodes": {
        "LOAD_DATASET0": {
            "action": "load",
            "parameters": {
                "dataset": {
                    "connectionName": "SFDC_LOCAL",
                    "label": "OpportunityLineItem",
                    "sourceObjectName": "OpportunityLineItem",
                    "type": "connectedDataset"
                },
                "fields": [
                    "Id",
                    "OpportunityId",
                    "Product2Id",
                    "ProductCode",
                    "Name",
                    "Quantity",
                    "TotalPrice"
                ]
            },
            "sources": []
        },
        "LOAD_DATASET1": {
            "action": "load",
            "parameters": {
                "dataset": {
                    "connectionName": "SFDC_LOCAL",
                    "label": "Opportunity",
                    "sourceObjectName": "Opportunity",
                    "type": "connectedDataset"
                },
                "fields": [
                    "Id",
                    "AccountId",
                    "Name",
                    "StageName",
                    "Amount",
                    "CloseDate",
                    "IsClosed",
                    "IsWon"
                ]
            },
            "sources": []
        },
        "LOAD_DATASET2": {
            "action": "load",
            "parameters": {
                "dataset": {
                    "connectionName": "SFDC_LOCAL",
                    "label": "Product2",
                    "sourceObjectName": "Product2",
                    "type": "connectedDataset"
                },
                "fields": [
                    "Id",
                    "Name",
                    "ProductCode",
                    "Family"
                ]
            },
            "sources": []
        },
        "JOIN0": {
            "action": "join",
            "parameters": {
                "joinType": "LOOKUP",
                "leftKeys": [
                    "Product2Id"
                ],
                "rightKeys": [
                    "Id"
                ],
                "rightQualifier": "Product2"
            },
            "schema": {
                "slice": {
                    "mode": "DROP",
                    "ignoreMissingFields": true,
                    "fields": []
                }
            },
            "sources": [
                "LOAD_DATASET0",
                "LOAD_DATASET2"
            ]
        },
        "JOIN1": {
            "action": "join",
            "parameters": {
                "joinType": "LOOKUP",
                "leftKeys": [
                    "OpportunityId"
                ],
                "rightKeys": [
                    "Id"
                ],
                "rightQualifier": "Opportunit"
            },
            "schema": {
                "slice": {
                    "mode": "DROP",
                    "ignoreMissingFields": true,
                    "fields": []
                }
            },
            "sources": [
                "JOIN0",
                "LOAD_DATASET1"
            ]
        },
        "EXTRACT0": {
            "action": "extractGrains",
            "parameters": {
                "grainExtractions": [
                    {
                        "source": "Opportunit.CloseDate",
                        "targets": [
                            {
                                "grainType": "MONTH",
                                "label": "Opportunit.Close Date Month",
                                "name": "Opportunit.CloseDate_MONTH"
                            },
                            {
                                "grainType": "YEAR",
                                "label": "Opportunit.Close Date Year",
                                "name": "Opportunit.CloseDate_YEAR"
                            }
                        ]
                    }
                ]
            },
            "sources": [
                "JOIN1"
            ]
        },
        "AGGREGATE1": {
            "action": "aggregate",
            "parameters": {
                "aggregations": [
                    {
                        "action": "SUM",
                        "label": "Sum of Quantity",
                        "name": "SUM_Quantity",
                        "source": "Quantity"
                    }
                ],
                "groupings": [
                    "Opportunit.CloseDate_MONTH",
                    "Opportunit.CloseDate_YEAR"
                ],
                "pivots": [
                    {
                        "source": "Product2.Family",
                        "values": [
                            "Accessories",
                            "Laptop",
                            "Phone",
                            "Services",
                            "Tablet"
                        ]
                    }
                ]
            },
            "sources": [
                "EXTRACT0"
            ]
        },
        "TIME_SERIES0": {
            "action": "timeSeries",
            "parameters": {
                "confidenceInterval": "None",
                "confidenceIntervalFields": {},
                "dayField": "",
                "forecastFields": [
                    "Product2.Family_1",
                    "Product2.Family_2",
                    "Product2.Family_3",
                    "Product2.Family_4",
                    "Product2.Family_5"
                ],
                "forecastLength": 6,
                "groupDatesBy": "Y-M",
                "ignoreLastTimePeriod": false,
                "model": "Auto",
                "seasonality": 12,
                "subYearField": "Opportunit.CloseDate_MONTH",
                "targetDateField": {
                    "label": "Date",
                    "name": "Forecast_Date__c212"
                },
                "targetForecastFields": [
                    {
                        "label": "Accessories_Sum of Quantity Forecast",
                        "name": "Product2.Family_1_forecast205"
                    },
                    {
                        "label": "Laptop_Sum of Quantity Forecast",
                        "name": "Product2.Family_2_forecast206"
                    },
                    {
                        "label": "Phone_Sum of Quantity Forecast",
                        "name": "Product2.Family_3_forecast207"
                    },
                    {
                        "label": "Services_Sum of Quantity Forecast",
                        "name": "Product2.Family_4_forecast208"
                    },
                    {
                        "label": "Tablet_Sum of Quantity Forecast",
                        "name": "Product2.Family_5_forecast209"
                    }
                ],
                "yearField": "Opportunit.CloseDate_YEAR"
            },
            "sources": [
                "AGGREGATE1"
            ]
        },
        "OUTPUT0": {
            "action": "save",
            "parameters": {
                "dataset": {
                    "folderName": "SmartTransform_BuildAlongWorkshop",
                    "label": "Workshop - Time Series - Product Quantity Forecast Dataset",
                    "name": "Workshop_Time_Series_Product_Quantity_Forecast_Dataset",
                    "type": "analyticsDataset"
                },
                "fields": []
            },
            "sources": [
                "TIME_SERIES0"
            ]
        }
    },
    "ui": {
        "nodes": {
            "LOAD_DATASET0": {
                "label": "OpportunityLineItem",
                "type": "LOAD_DATASET",
                "top": 112,
                "left": 112,
                "parameters": {
                    "sampleSize": 2000
                }
            },
            "LOAD_DATASET1": {
                "label": "Opportunity",
                "type": "LOAD_DATASET",
                "top": 392,
                "left": 112,
                "parameters": {
                    "sampleSize": 2000
                }
            },
            "LOAD_DATASET2": {
                "label": "Product2",
                "type": "LOAD_DATASET",
                "top": 252,
                "left": 112,
                "parameters": {
                    "sampleSize": 2000
                }
            },
            "JOIN0": {
                "label": "Join",
                "type": "JOIN",
                "top": 112,
                "left": 252
            },
            "JOIN1": {
                "label": "Join",
                "type": "JOIN",
                "top": 112,
                "left": 392
            },
            "AGGREGATE0": {
                "label": "Aggregate",
                "type": "AGGREGATE",
                "top": 112,
                "left": 532,
                "graph": {
                    "EXTRACT0": null,
                    "AGGREGATE1": null
                },
                "connectors": [
                    {
                        "source": "EXTRACT0",
                        "target": "AGGREGATE1"
                    }
                ]
            },
            "TRANSFORM0": {
                "label": "Time Series Forecasting",
                "description": "",
                "type": "TRANSFORM",
                "top": 112,
                "left": 672,
                "graph": {
                    "TIME_SERIES0": null
                },
                "connectors": []
            },
            "OUTPUT0": {
                "label": "Output",
                "type": "OUTPUT",
                "top": 112,
                "left": 812
            }
        },
        "connectors": [
            {
                "source": "LOAD_DATASET0",
                "target": "JOIN0"
            },
            {
                "source": "LOAD_DATASET2",
                "target": "JOIN0"
            },
            {
                "source": "JOIN0",
                "target": "JOIN1"
            },
            {
                "source": "LOAD_DATASET1",
                "target": "JOIN1"
            },
            {
                "source": "JOIN1",
                "target": "AGGREGATE0"
            },
            {
                "source": "AGGREGATE0",
                "target": "TRANSFORM0"
            },
            {
                "source": "TRANSFORM0",
                "target": "OUTPUT0"
            }
        ],
        "hiddenColumns": []
    }
}

Heads-Up: You need to have at least 2 full seasons of data for Time Series Forecast to forecast correctly. If you specify the seasonality to be 12 months in a year, but you have only 1 year of data, it will not forecast properly.

Pro Tip: Forecast models are “Additive” or “Multiplicative”. See more here. This model selection is to handle seasonality variations across time. If the seasonal variations are more or less the same across the datasets, then Additive model is the right choice. If seasonal variations continue to increase or decrease across the series (multiple seasons), then Multiplicative model is the right choice. If you leave it as Auto, we will run both models to determine which model fits the data better.

Heads-Up: Due to the field naming conventions in Data Prep Recipes, since you are aggregating numbers prior to forecasting them, the API names may not match what you are used to in SAQL timeseries function. You may need to update them in the recipe.

The fact that you can persist the forecast data into a dataset means that you also can apply external factors to the forecast data to adjust for known events such as new product launches or significant world events.

Conversely, since the forecast data is persisted into a dataset, it does not respond to query filter changes (i.e. drill-downs) in dashboards. Use time series forecasting for overarching trends that do not require user interaction at a dashboard level.

How useful was this post?

Click on a star to rate useful the post is!

Written by


Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.