Cloud-native Data Prep of the future: A closer look

The Einstein Analytics Data Manager supplies fresh data to your Einstein Analytics dashboards and Einstein Discovery stories for actionable insights. Salesforce has made significant investments in the Data Manager platform as our customers process ever higher data volumes: on the scale of dozens of trillions of rows every month, split roughly equally between Salesforce data and external data. Data Prep’s mission is to enable as many people in the organization as necessary to make data available for analysis as quickly as possible. In this post, we take a closer look at the Data Prep features in Summer ’20.

Got an expansive data landscape? The Einstein Analytics Data Manager is here for you.

With many out-of-the-box connectors and the world’s fastest and most efficient way to access your Salesforce objects, getting access to raw data has never been easier. You add connections to your external systems like Snowflake or Heroku in the Connect tab. Once a sync has cached the data, you can access it in the Input node in Data Prep. You can also browse the full object model of your Salesforce org from the Data Prep Input node and add the fields you need to your Recipe.

Creating your Recipe for delicious results

Data Prep allows you to inspect and preview your data at every step of the way. An intuitive outline of the whole flow shows each step as a distinct node classified by color and shape, so users can quickly grasp what a Recipe accomplishes and navigate to where changes need to be made.

Pat it, and prick it, and mark it with “T”

Introducing the Transform node: your new Swiss Army knife for transforming your data. If you are familiar with Data Manager, picture a Recipe embedded in a Dataflow. Need to convert a data type? Reformat dates? Create calculated fields? You will find hundreds of transformations by selecting the column(s) you need to work on. A dynamic toolbar at the top of the data sample shows the transformations available for the selected data types. You can choose to create a new field or replace the original content of the field. Renaming the label or the API name is also available for all field types. Finally, a single Transform node can hold multiple transformations across all the available fields.
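To make the idea concrete, here is a minimal sketch in plain Python of the kinds of steps described above: reformatting a date in place, converting data types, and creating a calculated field. This is not how Data Prep is implemented, and the field names (CloseDate, Amount, Quantity, UnitPrice) and helper functions are hypothetical, chosen only for illustration.

```python
from datetime import datetime

# Hypothetical sample rows; the field names are invented for this illustration.
rows = [
    {"Name": "Acme", "CloseDate": "04/29/2020", "Amount": "1200.50", "Quantity": "3"},
    {"Name": "Globex", "CloseDate": "12/01/2020", "Amount": "800", "Quantity": "2"},
]


def reformat_date(value, source_format="%m/%d/%Y", target_format="%Y-%m-%d"):
    """Parse a date string in one format and write it back in another."""
    return datetime.strptime(value, source_format).strftime(target_format)


def transform(row):
    """Apply several transformations to one row, as a single node might."""
    out = dict(row)  # work on a copy so the input row is not modified
    # Replace the original content of the field with the reformatted date.
    out["CloseDate"] = reformat_date(row["CloseDate"])
    # Convert text values to numbers in place.
    out["Amount"] = float(row["Amount"])
    out["Quantity"] = int(row["Quantity"])
    # Create a new calculated field next to the existing ones.
    out["UnitPrice"] = round(out["Amount"] / out["Quantity"], 2)
    return out


transformed = [transform(row) for row in rows]
```

The sketch mirrors the two choices mentioned above: CloseDate, Amount, and Quantity are replaced in place, while UnitPrice is added as a new field.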

Take data science to the next level

The Transform node also features powerful machine learning functions to help you prepare your data. Does a dimension have too many NULL values? You can define predictor fields that train an algorithm to predict what the right value could be. Have you been loading Salesforce Service Cloud case description data? Detect sentiment in free-form text to track account health status on average and over time. Combine the sophistication of our algorithms with the scale and flexibility of our query engine (think slicing through billions of rows like a hot knife through butter), and link all possible actions back into Salesforce workflows. The Winter ’21 release* will also introduce a dedicated node for Einstein Discovery predictions, which will let you score your data using Einstein Discovery stories when the recipe runs.
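The idea of using predictor fields to fill in missing values can be sketched in a few lines of Python. The example below is a deliberately simple stand-in, not the algorithm Einstein Analytics uses: it fills a missing value with the most common value among complete rows that share the same predictor values. The function name and the sample fields (Industry, Region, Segment) are hypothetical.

```python
from collections import Counter, defaultdict


def fill_missing(rows, target, predictors):
    """Fill None values in `target` with the most common value seen among
    complete rows that share the same predictor values."""
    by_key = defaultdict(Counter)  # predictor values -> counts of target values
    overall = Counter()            # counts of target values across all rows
    for row in rows:
        if row[target] is not None:
            key = tuple(row[p] for p in predictors)
            by_key[key][row[target]] += 1
            overall[row[target]] += 1

    filled = []
    for row in rows:
        new_row = dict(row)
        if new_row[target] is None:
            key = tuple(row[p] for p in predictors)
            # Fall back to the overall most common value for unseen keys.
            counts = by_key.get(key) or overall
            if counts:
                new_row[target] = counts.most_common(1)[0][0]
        filled.append(new_row)
    return filled


# Hypothetical sample data: Segment is missing on the last two rows.
accounts = [
    {"Industry": "Retail", "Region": "EMEA", "Segment": "SMB"},
    {"Industry": "Retail", "Region": "EMEA", "Segment": "SMB"},
    {"Industry": "Retail", "Region": "EMEA", "Segment": "Enterprise"},
    {"Industry": "Banking", "Region": "APAC", "Segment": "Enterprise"},
    {"Industry": "Retail", "Region": "EMEA", "Segment": None},
    {"Industry": "Banking", "Region": "APAC", "Segment": None},
]
filled_accounts = fill_missing(accounts, "Segment", ["Industry", "Region"])
```

A real model would weigh the predictor fields statistically rather than require an exact match, but the inputs are the same: a target field with gaps and a set of fields you trust to predict it.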

The rest of the node lot: Bring more data in with Join and Append, and drop rows with Filter

The Join node now supports creating multi-value fields in the augment/lookup mode. This way you can keep your data at the left-hand-side grain without losing any data from the right-hand side. The Append node maps fields automatically, which lets you focus on the hard problems.
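A small Python sketch can show what keeping the left-hand grain while collecting every right-hand match looks like. It is an illustration only, not the Data Prep implementation. The function names and sample fields are hypothetical, and mapping appended fields by matching name is an assumption made for the sketch, since the exact mapping rules are not described here.

```python
from collections import defaultdict


def lookup_multi_value(left_rows, right_rows, left_key, right_key, right_field):
    """Augment each left row with every matching right value, as a list.

    The output has exactly one row per left row (left-hand grain), and no
    matching right-hand value is discarded.
    """
    matches = defaultdict(list)
    for right in right_rows:
        matches[right[right_key]].append(right[right_field])
    return [
        # Copy the list so output rows never share the same list object.
        {**left, right_field: list(matches.get(left[left_key], []))}
        for left in left_rows
    ]


def append_rows(first, second):
    """Append two row sets, mapping fields by matching name (an assumption).

    Fields missing on one side are filled with None.
    """
    combined = first + second
    fields = list(dict.fromkeys(name for row in combined for name in row))
    return [{name: row.get(name) for name in fields} for row in combined]


# Hypothetical sample data.
accounts = [{"AccountId": "A1", "Name": "Acme"}, {"AccountId": "A2", "Name": "Globex"}]
contacts = [{"AccountId": "A1", "Contact": "Ann"}, {"AccountId": "A1", "Contact": "Bob"}]

joined = lookup_multi_value(accounts, contacts, "AccountId", "AccountId", "Contact")
appended = append_rows([{"Name": "Acme", "Region": "EMEA"}], [{"Name": "Initech"}])
```

In this sketch, Acme still appears once in the joined result, with both contacts held in a single multi-value field, and Globex keeps its row with an empty list.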

Like the mailman, it delivers: The Output node

Einstein Analytics is your Analytics solution built natively for and in Salesforce. You can easily leverage the existing Salesforce security model by inheriting the row-level security from the Salesforce core objects. External data is secured via security predicates, which are dynamic filtering rules. Data Prep allows you to define these dataset security features in the Output node. Users can also pick the dataset alias, dataset API name, and the associated App folder. New for this next-generation tool is the ability to write to external storage systems like S3 or Snowflake. These output options will show up here once you have set up these connections in the Data Manager Connect page.
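Conceptually, a security predicate behaves like a filter that is applied to a dataset on behalf of whoever is querying it. The Python sketch below illustrates that idea with a rule of the kind "the row's owner column equals the current user's name". It does not use the actual predicate syntax or engine, and the function and field names are hypothetical.

```python
def visible_rows(rows, column, user_value):
    """Return only the rows whose `column` matches the querying user's value."""
    return [row for row in rows if row.get(column) == user_value]


# Hypothetical external dataset with an owner column.
orders = [
    {"OrderId": 1, "AccountOwner": "Ann Lee", "Amount": 500},
    {"OrderId": 2, "AccountOwner": "Bob Ray", "Amount": 750},
    {"OrderId": 3, "AccountOwner": "Ann Lee", "Amount": 120},
]

# Each user sees a different slice of the same dataset.
ann_orders = visible_rows(orders, "AccountOwner", "Ann Lee")
bob_orders = visible_rows(orders, "AccountOwner", "Bob Ray")
```

The point of the sketch is that the rule is stored once with the dataset and evaluated per user, so the same dataset can safely serve many audiences.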

What’s next for the future

The Data Manager teams are working hard on closing the functionality gap with Dataflows. Ultimately, this will give you the ability to convert your dataflows into Data Prep recipes. Features like flattening a role hierarchy, complex and custom case statements, and window and data partitioning functions will soon make their way into Data Prep. We are also tracking new features such as easy date math functions that allow you to calculate the time between two dates. Leave your comments on this page to share your thoughts and feedback. We are looking forward to learning from you!

*Forward-looking statement

This content contains forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions prove incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make.

Any unreleased services or features referenced in this document or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.

