Sneak Preview: Data prep of the future

When you are new to Einstein Analytics and have to build a dataset for the first time, it may go all right. Yes, there are a few new concepts to understand, especially the grain level (or root object), but you can manage. It is when you have to make additional modifications in the dataflow afterward that the journey becomes a little steeper. Well, a lot of people at Salesforce have been listening and working very hard to make the vision of a simple and powerful data prep tool come true. Now the time has come for a sneak preview, all under safe harbor, of course.

First, a bit of history

When I started working with Einstein Analytics (Wave) in 2015 we didn’t have much of a UI. We had the dataset builder and the dataflow JSON, and any modifications happened by writing JSON. Then came the UI we know today for modifying the dataflow, and finally we got recipes. We now have three tools for data prep and a lot of questions about when to use which. Plus, as mentioned, the learning curve is steep, especially for the dataflow.

Well, times have changed and the tools are being consolidated. Amazing researchers have investigated what it will take to make the journey smoother, great designers have looked at how to make the interaction with the tool intuitive and pleasant, and top-notch engineers have worked hard to build the vision. All coordinated by the brilliant product manager Tim Bezold. So welcome to the future of simple and powerful data prep: Recipe 3.0. Let’s have a look.

Adding data

Getting started is easy. Heading to the recipes, I now have a drop-down option in my create recipe button to launch the new Recipe 3.0. The first option I get is to create a blank recipe or choose from a template. As I don’t have any templates saved, that option is greyed out for me. However, the idea is for you to be able to save recipes as templates so you can use them as a starting point for other recipes. An example could be flattening a hierarchy, an action you may want to take in multiple recipes so you can leverage it in security predicates later.
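To make the hierarchy example a bit more concrete, here is a minimal sketch in plain Python of what flattening a hierarchy means conceptually. It is not how Einstein Analytics implements it; the function name `flatten_hierarchy` and the sample role names are made up for illustration. The idea is that every node ends up with the full list of its ancestors, which is what a security predicate can later check against.

```python
def flatten_hierarchy(parent_of):
    """Map each node to the list of its ancestors, nearest first.

    `parent_of` maps a node id to its parent id (None for the top node).
    Illustrative only; names and structure are assumptions.
    """
    flattened = {}
    for node in parent_of:
        ancestors = []
        seen = {node}  # guards against accidental cycles in the data
        current = parent_of.get(node)
        while current is not None and current not in seen:
            ancestors.append(current)
            seen.add(current)
            current = parent_of.get(current)
        flattened[node] = ancestors
    return flattened


# Hypothetical role hierarchy: rep_1 reports to vp_sales, who reports to ceo.
roles = {"ceo": None, "vp_sales": "ceo", "rep_1": "vp_sales"}
ancestors = flatten_hierarchy(roles)
```

With a flattened list like this, a row owned by `rep_1` could be made visible to anyone who appears among that owner's ancestors, which is why the step tends to be reused across recipes.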

Selecting the blank recipe, I am prompted to choose a source. I now have all data sources in one view, so I do not have to worry about edgemart, sfdcdigest or digest nodes. I can easily search for the dataset I want to use for my new recipe, and clicking on it gives me a preview of the fields. Notice in the image below that I can select the fields I wish to include.

Get more data sources

I am of course able to add more data sources after the initial selection. I can select the option from the main menu at the top, or I can simply click on the plus right next to my “BasicInformation22k” dataset. Below you can see that I can pick from a list of options for how to modify my data further. Selecting append, you will notice I get the same window as before and I am able to select a dataset to append.

It is also possible to do the joins we know and love from our current recipes, just in an improved way. Below you can see multiple joins in the flow. Clicking on a join node gives me a preview of my data as well as the option of switching the join type between look-up, left, right, inner and outer.
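If the difference between the join types is not second nature, the following plain Python sketch may help. It is only an illustration of the general semantics, not the product's implementation: the `join` helper and the sample rows are invented here, and the look-up behavior (keep at most one match per left row) is my assumption of how it differs from a left join.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dict rows on `key`. Illustrative sketch only.

    Unmatched rows are kept as-is, without filling in missing columns.
    """
    # Index the right-hand rows by key so matches are quick to find.
    right_index = {}
    for row in right:
        right_index.setdefault(row[key], []).append(row)

    result = []
    matched_keys = set()
    for lrow in left:
        matches = right_index.get(lrow[key], [])
        if how == "lookup":
            matches = matches[:1]  # assumption: look-up keeps one match at most
        if matches:
            matched_keys.add(lrow[key])
            for rrow in matches:
                result.append({**rrow, **lrow})
        elif how in ("left", "outer", "lookup"):
            result.append(dict(lrow))  # keep left rows without a match

    if how in ("right", "outer"):
        for rrow in right:
            if rrow[key] not in matched_keys:
                result.append(dict(rrow))  # keep right rows without a match
    return result


# Hypothetical sample data: one account has two opportunities.
accounts = [{"id": 1, "name": "Acme"}, {"id": 2, "name": "Globex"}]
opps = [{"id": 1, "amount": 100}, {"id": 1, "amount": 250}, {"id": 3, "amount": 75}]

row_counts = {how: len(join(accounts, opps, "id", how))
              for how in ("lookup", "left", "right", "inner", "outer")}
```

Comparing the row counts per join type on a small sample like this is a quick way to see which rows each type keeps, which is much the same check the node preview lets you do visually.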

Data transformations

Adding transformations is just as easy as joining data. In the flow, you click the plus again and choose the transformation option, which will load a preview of your data. After selecting a column, you can decide the type of transformation you want to perform. In my example below, I am selecting my ID, which is a measure. However, I want it to be a dimension so I am able to use it for groupings later on in my dashboard. After selecting the column I have a drop-down where I can choose the type of transformation. Once selected, it is added to the transformation panel on the left-hand side. I can, if I wish, do multiple transformations within the same node.
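Conceptually, turning a measure into a dimension just means treating the number as a label instead of something you aggregate. A rough sketch in plain Python, with a made-up helper name and made-up sample rows, could look like this:

```python
from collections import Counter


def measure_to_dimension(rows, column):
    """Return copies of `rows` with `column` converted from number to text.

    Illustrative only; the helper name is an assumption, not a product API.
    """
    return [{**row, column: str(row[column])} for row in rows]


# Hypothetical rows where ID arrived as a number (a measure).
rows = [
    {"ID": 1001, "Region": "EMEA"},
    {"ID": 1002, "Region": "APAC"},
    {"ID": 1001, "Region": "EMEA"},
]

converted = measure_to_dimension(rows, "ID")

# With ID as text it can be used as a grouping, here counting rows per ID.
rows_per_id = Counter(row["ID"] for row in converted)
```

The original rows are left untouched, which mirrors the way a transformation node adds a step to the flow rather than altering the source data.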

Flow overview

Below you can see my very simple flow with two data sources, an append node, a transformation and of course the output, which is what creates the dataset. It is very easy for me to get an overview of the flow. Each node type has its own icon, and by clicking any of them I get a preview of the data as well as the details of what is happening in the node.

I don’t know about you, but I am personally excited to have this flow overview. It allows me to quickly determine the elements of my recipe and to preview my data at each step, so I know whether I am getting the desired result without having to run the recipe first.

When is it coming?

I bet as you are reading this you are thinking “when is this coming?”. Well, first of all, this is of course subject to Salesforce’s forward-looking statement (see more details at the end). That being said, the goal is to have Recipe 3.0 in beta from Summer 20 and generally available in Winter 21.

I just want to end this blog by saying a big thank you to the team that has been working very hard to make this dream of having the best data prep tool a reality soon.

Forward-looking statement

This content contains forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions prove incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make.

Any unreleased services or features referenced in this document or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.

