Genie: Real-Time Hyperscale Data Platform


If you followed Dreamforce 2022, you have probably seen a bunny hopping around and heard the buzz about Genie. Genie has many components, but at its core it is about bringing all your data together in real time, so that you get a complete picture of your customer and can customize and act on any interaction.

As individuals, we generate a lot of data every day, and that data is scattered across multiple data sources. Needless to say, it’s not easy to build a single view of a customer from all of it. Genie (Customer Data Platform or CDP) aims to make it easy to connect and harmonize your data within Salesforce.

One Genie feature currently available as a closed pilot is visual point-and-click data prep, where you take data lake objects and clean, transform and enrich the data before writing it back into the data lake. Sounds familiar? It should, because we have taken the learnings from building the next-gen data prep for CRM Analytics and Industries and rebuilt it in the CDP context. That means you can harmonize your CDP data further and leverage all the available transformations, including out-of-the-box ML capabilities such as sentiment analysis and detecting missing values.

Let’s have a closer look at how you can use Data Prep with Genie.

Let’s Prep Some Data

Once you log in to Salesforce and land on the Customer Data Platform home screen, you’ll notice a new tab called “Data Prep”. From here you can access Jobs Monitor, which shows a log of all the jobs, and Recipes, where you define which data to use, how to transform it, and how to write it back to the data lake.

Access Data Prep with Job Monitor and Recipes

From the menu on the left, you can view existing recipes, create a new one or modify an existing one. Let’s have a look at a recipe that predicts page views on the website.

You can add any object from the data lake as input nodes. In this instance, we already have the objects CDP Page Views and CDP Customers.

Adding Data Lake Objects as input nodes.
Recipe canvas with input, transformations, and output nodes.

Once we have the data we want to enrich, we can start applying transformations by clicking the plus icon next to any node. The available options include joining, appending, updating, filtering, aggregating, and transforming data.

Adding new nodes to the recipe.

In this recipe we are only interested in page views coming from the website, so we use the filter node to remove rows where the category field is not “website”. Notice the preview of your data, which always shows you the effect of your transformations.

Filtering page views.
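
To make the filter step concrete, here is a minimal Python sketch of the same logic outside the recipe editor. The sample rows, the field names (`customer_id`, `category`, `views`) and the helper `filter_rows` are hypothetical and only illustrate the idea; in Data Prep you configure this with clicks, not code.

```python
# Hypothetical sample rows; the field names are assumptions for illustration.
page_views = [
    {"customer_id": "C1", "category": "website", "views": 3},
    {"customer_id": "C2", "category": "mobile app", "views": 5},
    {"customer_id": "C3", "category": "website", "views": 2},
]


def filter_rows(rows, field, value):
    """Keep only the rows where `field` equals `value`."""
    return [row for row in rows if row.get(field) == value]


# Equivalent of the filter node: drop everything that is not "website".
website_views = filter_rows(page_views, "category", "website")
```

Rows with a missing `category` are dropped as well, because `row.get` returns `None` for them.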

If we have more data objects that we want to combine, we can use the join node and match them on keys. The output is a single data set containing columns from both sources.

Joining page views and customers.
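
Conceptually, a join on keys works like the sketch below. It assumes an inner join on a hypothetical `customer_id` key; the post does not say which join type or key the recipe uses, and `inner_join` and the sample rows are made up for illustration.

```python
def inner_join(left_rows, right_rows, key):
    """Return one merged row for every left/right pair that shares `key`."""
    # Index the right side by key so each left row is matched in one lookup.
    right_index = {}
    for row in right_rows:
        right_index.setdefault(row[key], []).append(row)

    joined = []
    for left in left_rows:
        for right in right_index.get(left[key], []):
            joined.append({**left, **right})
    return joined


# Hypothetical sample data.
page_views = [
    {"customer_id": "C1", "views": 3},
    {"customer_id": "C3", "views": 2},
    {"customer_id": "C9", "views": 7},  # no matching customer, so dropped
]
customers = [
    {"customer_id": "C1", "gender": "female"},
    {"customer_id": "C3", "gender": "non-binary"},
]

joined_rows = inner_join(page_views, customers, "customer_id")
```

With an inner join, page views without a matching customer disappear from the output; a different join type would keep them with empty customer fields.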

In the recipe you’ll notice that after the join we have two streams, or branches. This lets you split the data transformations and generate multiple outputs. In the first branch, we use the aggregate node to group the data by month-year and summarize page views for female, male and non-binary customers.

Using the aggregate node to summarize page views by gender.
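
The grouping performed by the aggregate node can be sketched in a few lines of Python. The field names (`view_date`, `gender`, `views`), the sample rows and the helper `summarize_views` are assumptions for illustration, not the recipe’s actual schema.

```python
from collections import defaultdict
from datetime import date


def summarize_views(rows):
    """Sum page views per (month-year, gender) group."""
    totals = defaultdict(int)
    for row in rows:
        month_year = row["view_date"].strftime("%Y-%m")
        totals[(month_year, row["gender"])] += row["views"]
    return dict(totals)


# Hypothetical joined rows.
joined_rows = [
    {"view_date": date(2022, 9, 3), "gender": "female", "views": 3},
    {"view_date": date(2022, 9, 17), "gender": "female", "views": 4},
    {"view_date": date(2022, 9, 20), "gender": "male", "views": 2},
    {"view_date": date(2022, 10, 1), "gender": "non-binary", "views": 5},
]

monthly_totals = summarize_views(joined_rows)
```

Each key in `monthly_totals` is a `(month-year, gender)` pair, which corresponds to one row in the aggregated output.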

In the second branch, we forecast page views for each month by gender using one of the many smart transforms that leverage ML to enrich the data. This is the same algorithm that the Revenue Intelligence app for Salesforce uses for sales forecasting. You’ll notice that the preview shows sample text, as these values won’t be generated until the recipe runs.

Using the time series smart transform to predict future page views.
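
To give a feel for what a time series forecast produces, here is a deliberately simple stand-in: a least-squares linear trend extended into future months. This is not the algorithm behind the smart transform, which the post does not describe; the function `linear_trend_forecast` and the sample numbers are hypothetical.

```python
def linear_trend_forecast(history, periods):
    """Fit a straight line to `history` and extend it `periods` steps ahead.

    A simple stand-in for illustration, not the smart transform's algorithm.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")

    # Ordinary least squares with x = 0, 1, ..., n - 1.
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    slope_num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history))
    slope_den = sum((x - x_mean) ** 2 for x in range(n))
    slope = slope_num / slope_den
    intercept = y_mean - slope * x_mean

    # Predict the next `periods` points after the last observation.
    return [intercept + slope * (n + step) for step in range(periods)]


# Hypothetical monthly page views for one gender group.
monthly_views = [100, 110, 120, 130]
forecast = linear_trend_forecast(monthly_views, periods=2)
```

In the recipe you would run one such forecast per gender group, which is why the aggregate and forecast steps both group by month and gender.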

All that is left to do is create an output node for each branch to write its output back to the data lake.

Output node that maps the recipe columns to a Data Lake object.

Once you are happy with your recipe, save it. You can run the recipe directly from the editor. From the recipe overview page, you can also schedule it to run at regular intervals or after another recipe has run.

Save and run your recipe.
Schedule your recipe to run.

What’s Your Use Case?

As mentioned, Data Prep Recipes for CDP is currently in pilot with a small group of customers, and we are planning a wider beta program in a future release. Regardless, we would love to hear your use case for CDP with Data Prep, so add a comment or reach out in the community. Hearing your use case will help us create a better experience and understand what kind of data magic you want to be able to create.

Forward-looking Statement

This content contains forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make.

Any unreleased services or features referenced in this document or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make their purchase decisions based upon features that are currently available. salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.


4 thoughts on “Genie: Real-Time Hyperscale Data Platform”

  • 1
    George Esteller on November 22, 2022

    Is there a way we can be part of this pilot? Our org has data coming in from Data lakes, Marketing cloud, datasets from other org, API end points and maybe this can be something to unite them all.

    • 2
      Rikke on November 28, 2022

      Do you have CDP? That’s the foundation.

  • 3
    Tim Dries on November 23, 2022

    Hi Rikke,

    Great blog, good to see some specific stuff on Genie.
    Question: What environment is meant when you say “Data Lake”?


    • 4
      Rikke on November 28, 2022

      The CDP data lake – you can write back to an existing data lake object or you can create a new one.
