If you’ve been doing date manipulations using the Date/DateTime functions in recipes, you may have come across these validation error messages:
Or you swear you had created a “Date” field in a dataset, but it changed to “Date Time” magically somehow.
Or you find yourself balling up your fists, incredulous at how difficult it is to add a few hours to a Date Time field.
Allow me to offer some insights into how Date and Date Time types operate in Data Prep Recipes! I also provide some practical code snippet examples of common requests below for date/functions so make sure you check them out. And yes, after demystifying it, I’ll outline how we’ll address this mystifying experience in the upcoming release.
At First, There Were Datasets
If you have not enabled custom timezone in your org, your datasets support three data types: Measures (Numerics), Dimension (Text and Multivalue), and Dates.
In the dataset’s context, a “Date” refers to a type that has (a) the date components such as year, month, day, as well as (b) time components such as hour, minute, seconds, etc. If your Date column in a dataset does not contain any time components in the format, those time components still exist, they are just 0s.
*[In Dataset]* 2021-01-02 = 2021-01-02T00:00:00Z
This is easy to understand; basically, all Date or DateTime fields in Salesforce translate to this “Date” type in Dataset, and to support that, the “Date” type in a Dataset is actually secretly a DateTime type despite its name.
And if you enable custom timezone in your org, your datasets will then support two different Date types: “Date Only”, and “Date Time”. “Date Only”, as you’d expect, does not have time components in it. “Date Time” is equivalent to the old “Date”, which has both date and time components.
Normally, this would not be a problem. However, since recipes are built on Apache Spark and use Spark SQL, that introduced some differences.
Then Came Recipes
Recipes are built on Apache Spark, which means they support lots of different data types. To ensure compatibility with datasets, the “Date Time” type was introduced. At the same time, a decision was made to introduce “Date” type also. This “Date” type in a recipe is exactly what you’d expect in SQL: it does not contain time parts. And this was obviously ok because everyone knows that Date and DateTime are completely different. Totes.
We also introduced a bunch of functions to operate on those Date / DateTime types, like
months_between, so you can do all the date operations in recipes using many of these familiar SQL functions.
add_months takes a Date type and returns a Date Type.
now() returns a Date Time type.
to_timestamp() returns a Date Time type, because we interchangeably refer “Date Time” type as a timestamp in recipe builder).
However, since orgs without custom timezones only support the “Date” type (which is really the “Date Time” type in Recipes, or the “Timestamp” type in SQL), all of the different Dates or Date Time columns were converted to that one “Date” type in dataset, and when that dataset is used in another recipe, the “Date” type is mapped to a “Date Time” type correctly.
Confused yet? Let’s walk through an example.
A Walk Through the Park
I have an org with two custom fields on the Account object, called “Target Date” and “Target DateTime” respectively.
In my org, I do not have custom timezone enabled.
That means both of those fields will end up in TCRM as the “Date” type, which contains both the date parts and the time parts.
So of course when I pull Account into my recipe, those two fields obviously map to the “Date Time” type.
Totally obvious because I mentioned earlier that “Date” in a dataset maps to a “Date Time” in recipes. Totes. Obvs.
So say I want to add 5 months to the Target DateTime field, and I see the
add_months function. Awesome! And since I’m aware that the field type is actually “Date Time”, I conscientiously set the Output Type to be “Date Time” as well. Perfection.
Looks like an easy fix? Just change the output type to “Date”?
Or is it?
I go and output that to a dataset called “AccountWithDateFields”.
I add that new dataset back into the same recipe. Immediately, I observe the “Target_DateTime + 5 months” field has a type of “Date Time”, and not the “Date” that I initially created.
This is because when the recipe creates a dataset, the “Date” type in recipe is automatically converted to a “Date” type in dataset, which is actually a “Date Time” type. So when the dataset is added back into a recipe, it shows up correctly as a “Date Time” type.
But when I try to append the two data sources, I don’t see a corresponding field
And that’s reasonable because the “Target_DateTime + 5 months” field is actually a Date field on the left side (see the icon of a calendar? That means it’s a Date type), but it’s a Date Time type on the right (see the clock icon?), and you can’t append columns of different types together. If you had run into this, you probably then went upstream in the recipe to change one of the types explicitly using
to_timestamp to change the data type of one of the fields while cursing loudly about the inexplicability of all of this.
If your org does not have custom timezone enabled, you should stick with using only Date Time type in recipes to avoid inadvertently causing any kinds of compatibility issues!
So How Do I Do This With DateTime in Recipes?
Below, I outline some common date-related manipulation formulas for your reference. These will all definitely remain as “Date Time” type, and you will not need to worry about type conversions.
“I need to add 5 months to a Date Time field”
Instead of using
add_months function, use
Target_DateTime__c + interval 5 month
“I need to add 6 hours to a Date Time field”
Target_DateTime__c + interval 6 hour
“I need to calculate the first day of the month for a Date Time field”
You need to use
date_trunc, and not
trunc works on Date fields whiles
date_trunc works on Date Time fields.
“I need to calculate the last day of the month for a Date Time field”
date_trunc('month', Target_DateTime__c) + INTERVAL 1 month - INTERVAL 1 day
Of course, you can change the parameters in date_trunc to get the first/last day of a quarter, of a year, etc.
“I need to zero out the time parts of a Date Time field”
“I need to calculate the time difference in MINUTES between two DateTime fields”
(to_unix_timestamp(Date2) - to_unix_timestamp(Date1))/60
“I need to calculate the time difference in DAYS between two DateTime fields”
“I need to calculate the time difference in MONTHS between two DateTime fields”
“I want to convert my [datefield_sec_epoch] number value into a DateTime field”
“I want to port my SAQL date_diff calculations from data flows to recipes!”
Here’s a direct mapping of how to convert
date_diff saql usage to custom formula in recipes.
Recipe Custom Formula Equivalent
As you can see, Date/DateTime types are MUCH easier to deal with than
[MyDateField_sec_epoch] in data flows, and you can operate on them using standard operators like <, >, =, so to check if CloseDate is between a date range, simply use the expression
(Date1<=CloseDate and CloseDate <= Date2).
Do you have any specific use cases that you aren’t sure how to deal with? Reach out to me on LinkedIn, directly on the Trailblazer Community, or join us on Slack at DataTribe – we want to help you accomplish your date calculations in Data Prep Recipes!
“Why don’t you just fix this Date/DateTime confusion?”
Yes, we should, and yes we are!
Since many customers have already started using these date and date time functions, and since those are mostly appropriately named based on the source functions in Spark SQL, we need to ensure backward compatibility and not cause any unexpected surprises. We plan to simplify the experience in the recipe builder for customers in Winter ’23 so that new users don’t continue to run into those errors/warnings about type mismatching.
This content contains forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proved incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make.
Any unreleased services or features referenced in this document or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.