Genie: Real-Time Hyperscale Data Platform
If you have been following Dreamforce 2022, you have probably seen a bunny hopping around and heard the buzz around Genie. Genie has many components, but at its core it is about bringing all your data together in real time to give you a complete picture of your customer, so that you can customize and act on any interaction.
As individuals, we generate a lot of data every day, and that data is scattered across multiple data sources. Needless to say, it’s not easy to build a single view of a customer from all that data. Genie (Customer Data Platform or CDP) aims to make it easy to connect and harmonize your data within Salesforce.
One closed Genie pilot that is currently available is visual point-and-click data prep, where you take data lake objects and clean, transform and enrich the data before writing it back into the data lake. Sounds familiar? It should, because we have taken the learnings from building the next-gen data prep for CRM Analytics and Industries and rebuilt it in the CDP context. That means you can harmonize your CDP data further and leverage all the available transformations, including out-of-the-box ML capabilities such as sentiment analysis and missing-value detection.
Let’s have a closer look at how you can use Data Prep with Genie.
Let’s Prep Some Data
Once you log in to Salesforce and are on the Customer Data Platform home screen, you’ll notice a new tab called “Data Prep”. From here you can access the Jobs Monitor, which shows a log of all jobs, as well as recipes, which is where we define what data to use, how to transform it, and finally how to write it back to the data lake.
From the menu on the left, you can view existing recipes, create a new one, or modify an existing one. Let’s have a look at the recipe that predicts page views from the website.
You can add any object from the data lake as an input node. In this instance, we already have the objects CDP Page Views and CDP Customers.
Once we have the data we want to enrich, we can start applying transformations by clicking the plus icon next to any node. The available operations include joining, appending, updating, filtering, aggregating, and transforming data.
In this recipe we are only interested in page views coming from the website, so we use the filter node to remove rows where the category field is not “website”. Notice there is a preview of your data, so you can always see the effect of your transformations.
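To make the filter step concrete outside the point-and-click editor, here is a minimal Python sketch of the same logic. The sample rows and the field names (`customer_id`, `category`, `viewed_at`) are assumptions made up for illustration; they are not the actual CDP Page Views schema, and this is not how the filter node is implemented.

```python
# Hypothetical sample rows; the field names are assumptions for illustration.
page_views = [
    {"customer_id": "C1", "category": "website", "viewed_at": "2022-08-03"},
    {"customer_id": "C2", "category": "mobile_app", "viewed_at": "2022-08-04"},
    {"customer_id": "C1", "category": "website", "viewed_at": "2022-09-10"},
]


def filter_rows(rows, field, value):
    """Keep only the rows whose `field` equals `value`."""
    return [row for row in rows if row.get(field) == value]


# Equivalent in spirit to a filter node on category = "website".
website_views = filter_rows(page_views, "category", "website")
```

Rows with any other category, or with no category at all, are dropped, which mirrors what you would see in the data preview after adding the filter.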
If we have more data objects that we want to combine, we can do so with the join node, matching rows on key fields. This generates a single output in which the two data sources are merged.
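As a rough illustration of what joining on a key means, the sketch below performs an inner join between two small lists of records in plain Python. The choice of an inner join, the `customer_id` key and the sample fields are all assumptions for this example; the join types the node actually offers are not covered here.

```python
# Hypothetical sample data; field names and values are assumptions.
page_views = [
    {"customer_id": "C1", "category": "website"},
    {"customer_id": "C2", "category": "website"},
    {"customer_id": "C9", "category": "website"},  # no matching customer
]
customers = [
    {"customer_id": "C1", "gender": "female"},
    {"customer_id": "C2", "gender": "male"},
]


def inner_join(left, right, key):
    """Join two lists of dicts on `key`, keeping only rows that match."""
    # Index the right-hand rows by key so each lookup is a dict access.
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)

    joined = []
    for left_row in left:
        for right_row in right_by_key.get(left_row[key], []):
            # Merge the two rows; left-hand values win on name clashes.
            joined.append({**right_row, **left_row})
    return joined


joined = inner_join(page_views, customers, "customer_id")
```

The page view for `C9` has no matching customer, so an inner join drops it. A different join type would treat that row differently, which is worth checking in the preview.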
After the join, you’ll notice the recipe splits into two streams, or branches, which makes it possible to break out the data transformations and generate multiple outputs. In the first branch, we use the aggregate node to group the data by month-year and summarize page views by gender: female, male and non-binary.
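The aggregation in the first branch can be sketched in a few lines of Python. The `viewed_at` and `gender` field names and the ISO date format are assumptions for illustration, and counting rows is just one plausible way to summarize page views.

```python
from collections import Counter

# Hypothetical joined rows; field names and date format are assumptions.
joined = [
    {"viewed_at": "2022-08-03", "gender": "female"},
    {"viewed_at": "2022-08-15", "gender": "male"},
    {"viewed_at": "2022-08-20", "gender": "female"},
    {"viewed_at": "2022-09-01", "gender": "non-binary"},
]


def page_views_by_month_and_gender(rows):
    """Count page views per (month-year, gender) group."""
    counts = Counter()
    for row in rows:
        month_year = row["viewed_at"][:7]  # "YYYY-MM" prefix of an ISO date
        counts[(month_year, row["gender"])] += 1
    return dict(counts)


summary = page_views_by_month_and_gender(joined)
```

Each key in `summary` is one group, so the result has one entry per month and gender combination that actually occurs in the data.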
In the second branch, we forecast page views for each month by gender using one of the many smart transforms that leverage ML to enrich the data. This is the same algorithm that the Revenue Intelligence app for Salesforce uses for sales forecasting. You’ll notice that the preview shows sample text, as these values won’t be generated until the recipe runs.
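The forecasting algorithm itself is not described in this post, so the sketch below is only a stand-in to show the shape of the step: a monthly history per gender goes in, predicted values for future months come out. It fits a simple least-squares straight line, which is an assumption chosen for brevity and not the algorithm the smart transform uses. The function name and the sample numbers are hypothetical.

```python
def linear_trend_forecast(history, periods_ahead=1):
    """Fit y = a + b*t by least squares and extrapolate `periods_ahead` steps.

    A deliberately naive stand-in, not the smart transform's algorithm.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    slope_num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    slope_den = sum((t - t_mean) ** 2 for t in range(n))
    slope = slope_num / slope_den
    intercept = y_mean - slope * t_mean
    # History occupies t = 0 .. n-1, so the forecast starts at t = n.
    return [intercept + slope * (n - 1 + k) for k in range(1, periods_ahead + 1)]


# Hypothetical monthly page view counts per gender.
monthly_views = {"female": [100, 110, 120], "male": [90, 90, 90]}
forecasts = {
    gender: linear_trend_forecast(values, periods_ahead=2)
    for gender, values in monthly_views.items()
}
```

Forecasting each gender separately mirrors the "for each month by gender" grouping described above. A real forecast would also need to account for seasonality and much longer histories.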
All that is left to do is create an output node for each branch, which writes that branch’s output back to the data lake.
Once you are happy with your recipe, save it. You can run the recipe directly from the editor or, from the recipe overview page, schedule it to run at regular intervals or after another recipe has run.
What’s Your Use Case?
As mentioned, Data Prep Recipes for CDP is currently in pilot with a small group of customers, and we are planning a wider beta program in future releases. Regardless, we would love to hear your use case for CDP with Data Prep: add a comment or reach out in the community. Hearing your use case will help us create a better experience and understand what kind of data magic you want to be able to create.
This content contains forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions prove incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make.
Any unreleased services or features referenced in this document or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.
4 thoughts on “Genie: Real-Time Hyperscale Data Platform”
Is there a way we can be part of this pilot? Our org has data coming in from data lakes, Marketing Cloud, datasets from other orgs, and API endpoints, and maybe this can be something to unite them all.
Do you have CDP? That’s the foundation.
Great blog, good to see some specific stuff on Genie.
Question: What environment is meant when you say “Data Lake”?
The CDP data lake – you can write back to an existing data lake object or you can create a new one.