Bindings for SQL queries

5
(6)

I’ve previously in great detail covered bindings aka interactions for compact and SAQL queries. This included data selection function, data serialization functions, and data manipulations. As CRM Analytics continues to evolve with each release we get new additions. With the introduction of Salesforce Direct, Snowflake Direct, and SQL query support more power and flexibility have been given to users, however, users still want the same flexibility as we have with compact and SAQL queries. A particular use case I’ve seen is to pass filters from SAQL queries to Snowflake Direct for security reasons. But I know there are many more use cases for making these new query types dynamic, so the product and engineering team has introduced a new set of SQL data serialization. In this blog, I will try to demystify them and provide examples of how to use them.

NOTE: If you are new to bindings I recommend you as a minimum read through the first part of demystifying bindings blogs series to understand how a binding is constructed. Or you can watch this Learning Days webinar on the same topic.

Anatomy of a Binding

There are now five Data Serialization functions that support SQL. Before getting to the individual functions, let’s make it clear what the function of a Data Serialization is, and to do this let’s review the components of a binding. Remember you can always go back to the binding blog series or this webinar to get a deep dive into the autonomy of a binding in a none SQL context.

In a binding, we as a minimum have a Data Selection and a Data Serialization and we can optionally use a Data Manipulation as well.

The Data Selection determines the data input of a binding and refers to a source query, the cell, column, or row of data to use as well as the type of binding it is.

The Data Serialization determines the output of the data input (the Data Selection) and will convert it to a specific syntax that is accepted by the query language.

Data Manipulation is an optional element in manipulating the data input such as dealing with null values, converting data input to an array etc. In a regular compact and SAQL binding, we rarely need to use this option, but it becomes relevant when using some of the new Data Serialization for SQL.

Available SQL Data Serializations

There are five SQL Data Serialization functions that take the Data Selection and convert it into a syntax readable for a Snowflake Direct, Dataset SQL, or Salesforce Direct (SOQL) query. Those Data Serializations are:

  • .asSQLSelect() – used in the SELECT statement
  • .asSQLGrouping() – used in the GROUPING statement
  • .asSQLWhere() – used in the WHERE statement
  • .asSQLHaving() – used in the HAVING statement
  • .asSQLOrder() – used in the ORDER statement

As mentioned these can all be used for Snowflake Direct, Dataset SQL, Salesforce Direct queries, CDP Direct, and Google Big Query Direct. This is done by specifying the type – we will get to how you do this. But in this blog, I’ll demonstrate the Data Serializations in a Dataset SQL query only, the approach is the same regardless of the SQL type.

Note: I delivered a webinar on the SQL data serialization, which can be reviewed instead or in addition to this blog. Find the webinar here.

NOTE: These SQL Data Serializations are supported from the Spring 22 release, whereas Google Big Query is supported from the Summer 22 release.

.asSQLSelect()

As mentioned the purpose of this Data Serialization is to be able to make the SELECT statement of a SQL query dynamic. Thus if you want to control what fields are included in the query then this is your best bet.

The Syntax
[Data Selection].asSQLSelect(SQLType)

The SQLType is either dataset, snowflake, sobject for SOQL, cdp or bigquery for Google Big Query, this needs to match the query type. In my example, I’ll be using dataset.

Expected Input

The Data Serialization is expecting an expression and an alias in an array.

["Industry", "Industry"]

or

["SUM(Amount)", "Sum_Amount"]
Expected Output

The Data Serialization will convert the Data Selection in the following format

Industry AS "Industry"

or

SUM(Amount) as "Sum_Amount"
Example

To illustrate the .asSQLSelect() Data Serialization I have a custom query with an “Expression” and an “Alias” column that is used to allow users to select between Account.Industry and Account.Type in my SQL query.

    "values": [
        {
            "Display": "Industry",
            "Expression": "\"Account.Industry\"",
            "Alias": "Account.Industry"
        },
        {
            "Display": "Account Type",
            "Expression": "\"Account.Type\"",
            "Alias": "Account.Type"
        }
    ]

NOTE: My expressions are wrapped in double-quotes, as I found my binding wouldn’t be accepted by the query when the expression had a dot in it as “Account” was unknown. No dots and you should be fine with excluding the double-quotes.

NOTE: I’ve made sure that my custom query is requiring as a minimum one selection, that way I don’t pass null values, which aren’t accepted by the SQL query.

The SQL query is pretty simple for this example and only contains a SELECT and LIMIT statement, as you can see below.

SELECT "Account.Industry" AS "Account.Industry" 
FROM "opportunity" 
LIMIT 2000

To make the SELECT statement dynamic based on the selection from the custom query I’m using a binding and leveraging the .asSQLSelect() Data Serialization to get the correct syntax expected by the SQL query. Because my SQL query is of the type dataset my Data Serialization looks like this:

.asSQLSelect(\"dataset\")

My Data Serialization expects two inputs – an expression and an alias. Looking at my table from the custom query I need to grab two values from a row, thus the Data Selection will be a row. And since I am allowing multiple selections from my custom query I am leaving my row index empty so it takes all rows that have been selected.

row(CustomSELECT.selection, [], [\"Expression\", \"Alias\"])

Putting the Data Selection and Serialization it looks like this:

{{row(CustomSELECT.selection, [], [\"Expression\", \"Alias\"]).asSQLSelect(\"dataset\")}}

The SQL query with the select binding looks like this:

SELECT {{row(CustomSELECT.selection, [], [\"Expression\", \"Alias\"]).asSQLSelect(\"dataset\")}}
FROM "opportunity" 
LIMIT 2000

.asSQLGrouping()

As mentioned the purpose of this Data Serialization is to be able to make the GROUP statement of a SQL query dynamic. When making your grouping dynamic, you need to make sure the field is also available in the select statement and thus you will most likely need two bindings.

The Syntax
[Data Selection].asSQLGrouping(SQLType, optional boolean) 

The SQLType is either dataset, snowflake, sobject for SOQL, cdp or bigquery for Google Big Query, this needs to match the query type. In my example, I’ll be using dataset.

The optional boolean if left out defaults to false, but can be set to true if you wish the Data Serialization to return the full group statement.

Expected Input

The Data Serialization is expecting the API name of the grouped field(s).

["Industry", "AccountType"]
Expected Output

The Data Serialization will convert the Data Selection in one of two ways depending on the optional boolean or keyword as mentioned in the syntax. If the boolean is false or left out the result will be:

"Industry", "AccountType"

If the boolean is set to true, the result will be:

GROUP BY "Industry", "AccountType"
Example

To illustrate the .asSQLGrouping() Data Serialization I have a custom query with an “Expression”, an “Alias” and a “Group” column that is used to allow users to select between Account.Industry and Account.Type in my SQL query. The Expression and Alias are used for the SELECT statement and binding, whereas the Group is used specifically for the GROUP statement.

NOTE: A pre-requisite for the group binding is to make sure the select statement includes the field used in the grouping, thus this example has a binding in the SELECT statement and thus additional columns in the custom query. The select binding was covered in the previous section and thus will not be explained in this section.

"values": [
        {
            "Display": "Industry",
            "Expression": "\"Account.Industry\"",
            "Alias": "Account.Industry",
            "Group": "Account.Industry"
        },
        {
            "Display": "Account Type",
            "Expression": "\"Account.Type\"",
            "Alias": "Account.Type",
            "Group": "Account.Type"
        }
    ]

NOTE: I’ve made sure that my custom query is requiring as a minimum one selection, that way I don’t pass null values, which aren’t accepted by the SQL query.

The SQL query is again pretty simple for this example – it selects and groups on industry showing the sum of amount for each industry.

SELECT "Account.Industry" AS "Account.Industry", SUM(Amount) as "sum_Amount" 
FROM "opportunity" 
GROUP BY "Account.Industry"
LIMIT 2000

To make the GROUP statement dynamic based on the selection from the custom query I’m using a binding and leveraging the .asSQLGrouping() Data Serialization to get the correct syntax expected by the SQL query. Because my SQL query is of the type dataset I make sure to define that in the Data Serialization. We also have the option to include the boolean true if we want the result to include the GROUP BY. If we leave it out or define it as false the binding will only return the value.

.asSQLGrouping(\"dataset\", false)

My Data Serialization just expects one input – the grouped field(s). Looking at the table from the custom query I just need to grab data from one cell or column. I would say a column if you are expecting multiple selections in the custom query.

column(CustomGROUPING_1.selection, [\"Group\"])

Putting the Data Selection and Serialization it looks like this:

{{column(CustomGROUPING_1.selection, [\"Group\"]).asSQLGrouping(\"dataset\", false)}}

The SQL query with the select binding looks like this:

SELECT {{row(CustomGROUPING_1.selection, [], [\"Expression\", \"Alias\"]).asSQLSelect(\"dataset\")}}, SUM(Amount) as "sum_Amount" 
FROM "opportunity" 
GROUP BY {{column(CustomGROUPING_1.selection, [\"Group\"]).asSQLGrouping(\"dataset\", false)}}
LIMIT 2000

NOTE: If you choose to true as the boolean you would need to remove GROUP BY as well, as the binding will include that in the result.

.asSQLWhere()

As mentioned the purpose of this Data Serialization is to be able to make the WHERE statement of a SQL query dynamic.

The Syntax
[Data Selection].asSQLWhere(SQLType, optional boolean) 

The SQLType is either dataset, snowflake, sobject for SOQL, cdp or bigquery for Google Big Query, this needs to match the query type. In my example, I’ll be using dataset.

The optional boolean if left out defaults to false, but can be set to true if you wish the Data Serialization to return the full WHERE statement.

Expected Input

The Data Serialization is expecting the API name of the field to filter on as well as the operator and then the value(s).

["AccountType", "IN", ["Enterprise"]]

NOTE: The value is expected to be in an array, thus you will not be able to use a cell Data Selection without using the Data Manipulation toArray().

You define the operator you want to use. The following are supported:

  • IN – used with an array
  • NOT IN – used with an array
  • = – equals for a single value
  • != – not equals for a single value
  • < – less than
  • <= – less than or equal to
  • > – greater than
  • >= – greater than or equal to
  • >< – greater than and less than
  • >=<= – greater than or equals to and less than or equal to
  • BETWEEN – used for dates
Expected Output

The Data Serialization will convert the Data Selection in one of two ways depending on the optional boolean or keyword as mentioned in the syntax. If the boolean is false or is left out it is assumed that there already is a static filter in the WHERE statement and thus AND is prefixed to the beginning of the output to make sure your query can run even when nothing is selected.

AND "AccountType" IN ('Enterprise')

If the boolean is set to true, the result will be:

WHERE "AccountType" IN ('Enterprise')
Example

To illustrate the .asSQLWhere() Data Serialization I have a SAQL query (in compact form) that groups on Account.Type and is added to a list widget. The selected values from Account.Type are used to filter the SQL query.

q = load "opportunity";
q = group q by 'Account.Type';
q = foreach q generate 'Account.Type' as 'Account.Type', count() as 'count';
q = order q by 'Account.Type' asc;
q = limit q 2000;

The SQL query is still simple for this example – again it selects and groups on industry showing the sum of amount for each industry. The binding will be used to add a WHERE statement.

SELECT "Account.Industry" AS "Account.Industry", SUM(Amount) AS "sum_Amount"
FROM "opportunity"
GROUP BY "Account.Industry"
LIMIT 2000

To add the WHERE statement and populate it with selected values from the SAQL query we will be leveraging the .asSQLWhere() Data Serialization which will make sure we get the correct output for the SQL query. Because my SQL query is of the type dataset I make sure to define that in the Data Serialization. We also have the option to include the boolean, since the SQL query doesn’t have a static filter I need to set it to true to avoid the AND prefix. This also means the output will provide the whole WHERE statement.

.asSQLWhere(\"dataset\", true)

As the user can select multiple values from Account.Type column in the SAQL query plus the expected input is an array we will be leveraging a column Data Selection.

column(AccountTypeFilter_1.selection, [\"Account.Type\"])

As mentioned, the expected input is to include the field to filter on as well as the operator. Hence the above Data Selection with the Data Serialization will not return a usable result. For the binding to work we need to include more than just the selected Account Types, to which I will be leveraging the Data Manipulation function toArray().

toArray() does what you are probably thinking – adding defined values to an array.

toArray(value 1, value 2, value 3, ...)

This first value we need is the API name of the field to filter on – Account.Type. The second value is the operator – IN. And the final value the Data Serialization is expecting is the selected values from the SAQL query – the Data Selection.

toArray(\"Account.Type\", \"IN\", column(AccountTypeFilter_1.selection, [\"Account.Type\"]))

Putting our array together with the Data Serialization your binding looks like this:

{{toArray(\"Account.Type\", \"IN\", column(AccountTypeFilter_1.selection, [\"Account.Type\"])).asSQLWhere(\"dataset\", true)}}

The SQL query with the select binding looks like this:

SELECT "Account.Industry" AS "Account.Industry", SUM(Amount) AS "sum_Amount" 
FROM "opportunity"
{{toArray(\"Account.Type\", \"IN\", column(AccountTypeFilter_1.selection, [\"Account.Type\"])).asSQLWhere(\"dataset\", true)}}
GROUP BY "Account.Industry"
LIMIT 2000

.asSQLHaving()

As mentioned the purpose of this Data Serialization is to be able to make the HAVING statement of a SQL query dynamic.

The Syntax
[Data Selection].asSQLHaving(SQLType, optional boolean) 

The SQLType is either dataset, snowflake, sobject for SOQL, cdp or bigquery for Google Big Query, this needs to match the query type. In my example, I’ll be using dataset.

The optional boolean if left out defaults to false, but can be set to true if you wish the Data Serialization to return the full HAVING statement.

Expected Input

The Data Serialization is expecting the aggregated measure to filter by as well as the operator and then the value.

["SUM(Amount)", ">=", [10000]]

NOTE: The value is expected to be in an array, thus you will not be able to use a cell Data Selection without using the Data Manipulation toArray().

You define the operator you want to use. The following are supported:

  • = – equals
  • != – not equals
  • < – less than
  • <= – less than or equal to
  • > – greater than
  • >= – greater than or equal to

NOTE: The operators >< and >=<= are not supported and you would need to create two bindings to achieve what those operators do.

Expected Output

The Data Serialization will convert the Data Selection in one of two ways depending on the optional boolean or keyword as mentioned in the syntax. If the boolean is false or is left out it is assumed that there already is a static filter in the HAVING statement and thus AND is prefixed to the beginning of the output to make sure your query can run even when nothing is selected.

AND SUM(Amount) >= 10000

If the boolean is set to true, the result will be:

HAVING SUM(Amount) >= 10000
Example

To illustrate the .asSQLHaving() Data Serialization I have a SAQL query that finds the max and min Amount, which is then added to a range widget. The selected range or values from the query is used to filter the SQL query.

q = load "opportunity";
q = group q by all;
q = foreach q generate min('Amount') as 'min', max('Amount') as 'max';

The SQL query is the same we use in the previous example – it selects and groups on industry showing the sum of amount for each industry. The binding will be used to add a HAVING statement.

SELECT "Account.Industry" AS "Account.Industry", SUM(Amount) AS "sum_Amount" 
FROM "opportunity"
GROUP BY "Account.Industry"
LIMIT 2000

To add the HAVING statement and populate it with selected values from the SAQL query we will be leveraging the .asSQLWhere() Data Serialization which will make sure we get the correct output for the SQL query. Because my SQL query is of the type dataset I make sure to define that in the Data Serialization.

Since .asSQLHaving() doesn’t support ranges we need to have two bindings. The first binding with the min value will include the true boolean to avoid the AND prefix. However, for the second binding we need the AND prefix and thus the boolean is either left out or set to false.

/*Min Binding*/ 
.asSQLHaving(\"dataset\", true)

/*Max Binding*/ 
.asSQLHaving(\"dataset\", false)

Since the expected input for the value is an array we will be using the column Data Selection even though we only need one number or cell.

/*Min Binding*/ 
column(Amount_1.selection, [\"min\"])

/*Max Binding*/
column(Amount_1.selection, [\"max\"])

As mentioned, the expected input is to include the field to filter on as well as the operator. Hence the above Data Selections with the Data Serialization will not return a usable result. For the binding to work we need to include more than just the selected Amount, to which I will be leveraging the Data Manipulation function toArray().

As mentioned before toArray() adds defined values to an array.

toArray(value 1, value 2, value 3, ...)

This first value we need is the aggregated measure to filter by – SUM(Amount). The second value is the operator – for Min >= and for Max <=. And the final value the Data Serialization is expecting is the selected value from the SAQL query – the Data Selection.

/*Min Binding*/ 
toArray(\"SUM(Amount)\", \">=\", column(Amount_1.selection, [\"min\"]))

/*Max Binding*/
toArray(\"SUM(Amount)\", \"<=\", column(Amount_1.selection, [\"max\"]))

Putting our array together with the Data Serialization your binding looks like this:

/*Min Binding*/ 
{{toArray(\"SUM(Amount)\", \">=\", column(Amount_1.selection, [\"min\"])).asSQLHaving(\"dataset\", true)}}

/*Max Binding*/
{{toArray(\"SUM(Amount)\", \"<=\", column(Amount_1.selection, [\"max\"])).asSQLHaving(\"dataset\", false)}}

The SQL query with the select binding looks like this:

SELECT "Account.Industry" AS "Account.Industry", SUM(Amount) AS "sum_Amount" 
FROM "opportunity"
GROUP BY "Account.Industry"
{{toArray(\"SUM(Amount)\", \">=\", column(Amount_1.selection, [\"min\"])).asSQLHaving(\"dataset\", true)}}{{toArray(\"SUM(Amount)\", \"<=\", column(Amount_1.selection, [\"max\"])).asSQLHaving(\"dataset\", false)}}
LIMIT 2000

.asSQLOrder()

As mentioned the purpose of this Data Serialization is to be able to make the ORDER statement of a SQL query dynamic.

The Syntax
[Data Selection].asSQLOrder(SQLType, optional boolean) 

The SQLType is either dataset, snowflake, sobject for SOQL, cdp or bigquery for Google Big Query, this needs to match the query type. In my example, I’ll be using dataset.

The optional boolean if left out defaults to false, but can be set to true if you wish the Data Serialization to return the full ORDER statement.

Expected Input

The Data Serialization is expecting the field to order on and the direction ASC or DESC in an array to account for ordering on multiple values.

[["Industry", "DESC"],["AccountType", "DESC"]]

NOTE: The value is expected to be in an array, thus you will not be able to use a cell Data Selection.

Expected Output

The Data Serialization will convert the Data Selection in one of two ways depending on the optional boolean or keyword as mentioned in the syntax. If the boolean is false or left out the result will be:

"Industry" DESC, "Account.Type" DESC

If the boolean is set to true, the result will be:

ORDER BY "Industry" DESC, "Account.Type" DESC
Example

To illustrate the .asSQLOrder() Data Serialization I have a custom query with an “Order” and a “Direction” column that is used to allow users to select the sorting order in my SQL query.

"values": [
        {
            "Display": "Industry ASC",
            "Order": "Account.Industry",
            "Direction": "ASC"
        },
        {
            "Display": "Industry DESC",
            "Order": "Account.Industry",
            "Direction": "DESC"
        }
    ]

The SQL query for this example is the same as previously – it selects and groups on industry showing the sum of amount for each industry.

SELECT "Account.Industry" AS "Account.Industry", SUM(Amount) as "sum_Amount" 
FROM "opportunity" 
GROUP BY "Account.Industry"
LIMIT 2000

To make the ORDER statement dynamic based on the selection from the custom query I’m using a binding and leveraging the .asSQLOrder() Data Serialization to get the correct syntax expected by the SQL query. Because my SQL query is of the type dataset I make sure to define that in the Data Serialization. Since a selection in the custom query isn’t required the binding may return a null value, to accommodate for that we will include the optional boolean and set it to true that way to a null value error.

.asSQLOrder("dataset", true)

My Data Serialization expects two values from a row – the field to order by and the direction. Thus we can use a row Data Selection. One thing to remember is that the Data Selection must return a two-dimensional array and even if only one selection is expected you should set the row index to all.

row(CustomORDER_1.selection, [], [\"Order\", \"Direction\"])

Putting the Data Selection and Serialization it looks like this:

{{row(CustomORDER_1.selection, [], [\"Order\", \"Direction\"]).asSQLOrder(\"dataset\", true)}}

The SQL query with the select binding looks like this:

SELECT "Account.Industry" AS "Account.Industry", SUM(Amount) as "sum_Amount" 
FROM "opportunity" 
GROUP BY "Account.Industry"
{{row(CustomORDER_1.selection, [], [\"Order\", \"Direction\"]).asSQLOrder(\"dataset\", true)}}
LIMIT 2000

How useful was this post?

Click on a star to rate useful the post is!

Written by


Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.