Why All Salesforce Professionals Should Know About Data Pipelines


Big statement, right? Well, I stand by it 100%. And not just because Salesforce Data Pipelines (SDP) is a big part of what I work with. Over the past few months I’ve had conversations with friends who work with Salesforce – not Tableau CRM – about data admin tasks. It made me think back to the data tasks I had when, once upon a time, I was a core Salesforce and marketing consultant, and how I really wish I had had a tool like Salesforce Data Pipelines – it would have made things a lot easier, simpler, and quicker.

If this is the first time you have heard about Salesforce Data Pipelines, don’t worry – I’ll catch you up. SDP is a relatively new Salesforce product that allows you to transform and enrich your data natively in Salesforce. It uses the same powerful technology as Tableau CRM’s data platform, including input and output connectors and recipes, and it can process billions (yes, billions – it’s not a typo) of records. I could give you a long description here, but I’d rather refer you to the blog Antonio Scaramuzzino wrote introducing SDP and simply say that adding this tool to your toolkit will make your life as a Salesforce admin or consultant so much more efficient – let me tell you why…

Two Types of Data Tasks

In these conversations, and in reflecting on my own data tasks as a Salesforce consultant, I’ve found there are two types of data tasks Salesforce Data Pipelines is perfect for:

  1. Continuous data transformation tasks; the same data task carried out on a regular basis to enhance or clean your data.
  2. Ad hoc data transformation tasks; tasks that come up just once, typically in connection with a Salesforce implementation project or a business change that affects your data.

Now let me share some personal war stories about how Salesforce Data Pipelines can help you, and how my life as a Salesforce consultant would have been easier if I had had a tool like this when I joined the Salesforce ecosystem.

Note: If you want to try out Salesforce Data Pipelines, here’s a link to create an org where you can do just that.

Continuous Data Transformation Tasks

As mentioned above, these are tasks you carry out on a regular basis – the same tedious task you have to complete every week or month. As a consultant, these were probably not the tasks I had the most of, but I still saw plenty of examples around data cleansing that would have been great to streamline with SDP. From my conversations with customers, solution engineers, and consultants, here are the typical use-case themes:

  • Data cleansing; such as ensuring a consistent format for phone numbers or validating the data of your Pardot leads.
  • Cross-object calculations; taking input from different objects, including external data, to write an output like an Account Health Score.
  • Machine-learning (ML) enhancements; leveraging the built-in smart transformations to make your data smarter, for instance by applying sentiment analysis to your cases.
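To make the first bullet concrete, here’s roughly what phone-number cleansing boils down to – a minimal Python sketch of the kind of transformation a recipe would perform. The formatting rules and the default country code are my own assumptions for illustration, not anything built into SDP:

```python
import re

def normalize_phone(raw, default_country="+1"):
    """Normalize a phone number to a bare E.164-style string.

    Illustrative rules only: strip everything except digits and '+',
    and prepend an assumed default country code when none is present.
    A real cleansing job would handle more edge cases (extensions,
    misplaced '+', national trunk prefixes, and so on).
    """
    digits = re.sub(r"[^\d+]", "", raw or "")
    if digits.startswith("+"):
        return digits
    return default_country + digits

print(normalize_phone("(555) 123-4567"))   # -> +15551234567
print(normalize_phone("+45 12 34 56 78"))  # -> +4512345678
```

In SDP you’d express the same logic with point-and-click formula transformations in a recipe; the sketch just shows how mechanical – and how automatable – the task really is.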

In a pre-SDP world you could still carry out these tasks, but it would involve code or a long process of extracting data, manipulating it, and finally loading it back into Salesforce. Needless to say, this is time-consuming and carries a high risk of error.

I know I promised real examples, but the fact is there are already blogs “out there” uncovering how you can use Salesforce Data Pipelines and recipes to carry out continuous data transformation tasks on a schedule. So let me list two blogs for you:

Remember: there’s no data extract, no external systems like Excel, no data loader – just a point-and-click UI where you build, execute on a schedule, and monitor your runs.

Ad Hoc Data Transformation Tasks

Ah, this is the category of tasks you probably didn’t consider Salesforce Data Pipelines a perfect fit for. A while ago, when I was trying to explain what Salesforce Data Pipelines is all about, it hit me how many data tasks I’ve had as a Salesforce consultant where I was frustrated beyond reason at how complex a simple data task could be to carry out, and how time-consuming it always was. So let me share two war stories of one-off data tasks that weren’t big or complex enough to automate, yet were necessary to complete.

Data Update

The euro is a relatively new currency, and not all countries adopted it from the beginning. I don’t want to go into the reasons – they’re irrelevant for this use case. The point is that I had to update every single record of every object that had currency fields, from the old currency to euro. I’ve forgotten the tedious details, but roughly I had to do the following:

  1. Extract all records with the old currency using the data loader. I had to repeat this step for every object that had a currency field.
  2. In Excel, update the currency field and make sure the new value was correct. Those of you who have handled more than a few thousand rows in Excel know you end up drinking a lot of coffee.
  3. Update all the records with the data loader. This, again, was repeated for each object.

Now, it sounds simple. It technically was. But before Christmas I had to do a trial run identifying all the steps, the correct order, and the potential errors, and write it all down in a playbook so I could execute the same steps on millions of records on January 1st, when the new currency became active. Needless to say, I was nervous about the high risk and felt the pressure of getting the job done fast.

If I had had Salesforce Data Pipelines, I could easily have created a recipe before Christmas to:

  1. Extract the objects with currency fields that used the old currency,
  2. Update the currency field,
  3. Calculate the new amount,
  4. Write the new values back to the records.

The best part is that I would have been able to preview my transformations before executing them, after which I could confidently run my recipe on millions of records in a full sandbox to validate the process. Once happy, I could deploy my recipe and wait for January 1st, when all I had to do was click the “run” button – avoiding the risk of manual processes and the slow processing of millions of records.
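The heart of that recipe – swapping the currency code and recalculating the amount – is a simple, mechanical transformation. Here’s a minimal Python sketch of the logic; the field names, the DEM-to-EUR example, and the record shape are illustrative assumptions (in SDP you’d express this with recipe transformations, not code). The 1.95583 DEM-per-EUR rate is the official fixed conversion rate:

```python
DEM_PER_EUR = 1.95583  # official fixed conversion rate

def convert_record(record, rate=DEM_PER_EUR):
    """Return a copy of a record converted from DEM to EUR.

    Assumed Salesforce-style field names (CurrencyIsoCode, Amount);
    records already in another currency pass through unchanged.
    """
    updated = dict(record)
    if updated.get("CurrencyIsoCode") == "DEM":
        updated["CurrencyIsoCode"] = "EUR"
        updated["Amount"] = round(updated["Amount"] / rate, 2)
    return updated

opportunity = {"Id": "006-demo", "CurrencyIsoCode": "DEM", "Amount": 1000.0}
print(convert_record(opportunity))  # Amount becomes 511.29, currency EUR
```

That’s the entire Excel step from the war story, reduced to one deterministic rule – exactly the kind of thing you want previewed once and then run on millions of rows.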

Data Migration

While I didn’t personally work on this type of project, migrating data from one org to another isn’t uncommon. I have seen a few projects of that sort in my career, whether because of acquisitions or simply wanting a fresh start after a messy org.

In this example, there are a lot of considerations, especially since orgs tend to be highly customized and the data has to match for this to work. However, consider that you have one org with account data that needs to be migrated to another org. Instead of extracting all the data with the data loader and uploading it to the new org, why not use the external Salesforce connector in your Salesforce Data Pipelines org to extract the data, and have a recipe create new or update existing accounts in the target org?

While I haven’t had a chance to test it out, why not use Salesforce Data Pipelines to populate your Developer or Developer Pro sandbox with a few sample records? Yes, you can create test accounts manually, but I remember the nightmare of figuring out which fields were mandatory or checked by validation rules, so I extracted 10 rows and uploaded those instead. And I haven’t forgotten the fun I had with Excel and VLOOKUP trying to match records with new IDs. It was time-consuming.
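That VLOOKUP step – matching child records to the new IDs their parents received in the target org – is really just a key lookup. A minimal Python sketch of what the matching amounts to; all IDs and field names here are made up for illustration:

```python
# Hypothetical mapping built from the migration: old-org account id -> new-org id.
id_map = {
    "001OLD1": "001NEW1",
    "001OLD2": "001NEW2",
}

# Hypothetical extracted contacts still pointing at old-org account ids.
contacts = [
    {"LastName": "Smith", "AccountId": "001OLD1"},
    {"LastName": "Jones", "AccountId": "001OLD2"},
]

# The whole VLOOKUP exercise as one dictionary lookup per record:
# rewrite each foreign key before loading into the target org.
remapped = [{**c, "AccountId": id_map[c["AccountId"]]} for c in contacts]
print(remapped)
```

In SDP terms, this is what a join transformation in a recipe does for you – no spreadsheet formulas, and no risk of a VLOOKUP silently matching the wrong row.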

Note: Before doing this, make sure the standard object you want to use is supported by the output connector – see more here.

Is There More?

That’s a silly headline, because of course there is more! In the past month I’ve talked to two Salesforce Solution Engineers who said SDP sounds great, but asked: can it work with external flat files? You know, the beloved CSV file on someone’s hard drive. Of course, there are going to be a few “ifs” and considerations here – the main one being whether the file always has the same columns with the same names, because to automate anything, consistency is important. Anyway, yes, it’s possible. Let me introduce you to Dataset Utils, a “friendly utility to load your on-prem data, whether large or small, to Einstein Analytics Datasets, with useful features such as autoloading, dataflow control, and dataset inspection”.

With Dataset Utils you can set up a listener to automatically upload your CSV file. If this sounds like something you need, check out this video that walks you through the steps to set it up.

Note: If you are handy with some code, you can also use Mohan’s plugin and the upload command to automate this process.

And that’s it! That’s why you should get to know Salesforce Data Pipelines!

I know there are many other use cases for SDP – I’ve got more myself – but this blog is already pretty long. Regardless, I hope I’ve inspired you to get creative with SDP. I would love to hear your use cases in the comments below.
