Introduction
Stop hardcoding credentials, API keys and connection strings in your Azure Data Factory linked services right now!
In this post you'll learn how to integrate Azure Data Factory with Azure Key Vault.
What is Azure Key Vault
The Microsoft Azure Key Vault is a cloud service, used for securely storing keys, certificates and secrets.
Why would I store my "secrets" in the Azure Key Vault you're asking?
Because the Key Vault makes it possible to store secrets, while keeping control access tight and under your control.
But what are these so called "secrets"?
Secrets can contain everything that has to remain secure: API keys, credentials, connection strings, certificates, etc.
We advice to create a Azure Key Vault per environment (development, test, acceptance and production). With separate vaults you increase the security of your data platform by not sharing secrets across environments.
Why should I integrate Azure Data Factory with Azure Key Vault
Integrating Azure Data Factory with Azure Key Vault grants you several benefits.
To begin with Azure Key Vault can contain credentials, connection strings, API keys, certificates and much more. No longer should we include sensitive information in Azure Data Factory
Gone are the days that we were sharing credentials by mail, Microsoft Teams or passing on small papers. With every secret stored in the Key Vault, developers no longer have to worry about them. With the correct access policies they can use every secret they need.
When you deploy your Azure Data Factory environment from development to test/acceptance/production you don't need to change any credentials or connection strings.
Instead of changing every data set, only the connection to the Key Vault in the linked service has to change. Moreover, If you add the connection to the Key Vault as dynamic content, you can automate changing the service change in the deployment pipeline.
Let's create Azure Key Vault secrets
Let us assume that you have already created an Azure Key Vault in your Azure subscription. If not, quickly add an Azure Key Vault resource from the Microsoft Azure Marketplace.
When the key vault is installed you can navigate to the resource and find an extensive menu on the left hand side. For now let's navigate to Secrets. You'll see that there are no secrets available yet.
Click on Generate/Import in the top menu to create a new secret. A new window will open, in which we'll fill in the required fields for the secret. We'll make two secrets: a SQL Server login and a SQL Server password.
For the SQL Server login we'll fill the fields Name, Value and Content type (optional). The content type might be optional but it's an easy way to know what data a secret might contain.
We'll leave the Activation Date and Expiration Date blank and set the secret to Enabled.
When you've filled the required field and click on Create on the bottom of the window.
Make another secret for the SQL Server password and fill the same fields like we did for the SQL Server login. When you've created the second secret, you should be able to see two secrets as shown in the image below.
We can almost use our secrets in Azure Data Factory. But first we need to set the access policies.
Set Azure Key Vault access policies
In order for Azure Data factory to be able to access the secrets we've made before, we need to set some access policies. To set the access policies your Azure Data Factory resource should already exist. If not add a new Azure Data Factory resource.
When you have your Azure Data Factory, let's head back to the Key Vault and click on Access Policies in the left hand menu.
You'll see that there's already one access policy in place for the account that created the Key Vault. We'll add another policy for our Azure Data Factory resource. Click on + Add Access Policy in the middle of the window.
The Data Factory gets the lowest amount of policies needed to use the secrets we've made. You can choose a template so you don't have to set individual permissions for keys, secrets and certificates but for our example we'll do it manually. We're only using secrets so we'll set Key and Certificate permissions to none. For Secret permissions we'll select Get and List. The Data Factory only needs to retrieve secrets. It doesn't need to be able to create, update, etc.
Next we need to select a principal to whom we're giving the permissions. Click on None selected and search for your data factory resource's name. Make sure your resource is selected before clicking on the Select button.
Add the access policy and now you should be able to see a new Application policy in your Key Vault. The new policy has 2 selected Secret permissions and 0 Key and Certificate permissions. Don't forget to click on Save to commit your changes.
Let's head over to Data Factory to start using the Key Vault secrets we've just made.
Let's create the Azure Key Vault linked service
In the Azure Data Factory Studio of your created Data Factory resource we will add a linked service in the Manage window.
Click on Create linked service and search for Key Vault. The Azure Key Vault icon will show and we will continue. Next we'll have to give the linked service a name and connect it to our Key Vault. We'll used Managed Identity as Authentication method and select the Key Vault from our Azure subscription. When we click on Test connection, Connection successful should pop up.
If you forgot to set the access policies in the Key Vault, Microsoft gives a reminder to do so.
Click on create to finish creating the linked service. A new linked service will now show in the linked services overview.
Now we only need to make a linked service that actually uses our secrets from the Key Vault linked service.
Use a secret in a linked service
We're still in the linked service overview and we'll add another linked service, this time for an Azure SQL. If you don't have an Azure SQL resource yet, create a new one from the Azure Marketplace. We'll use the login and password secrets to connect to the Azure SQL.
Head over to your Azure SQL resource with Azure Data Studio or SQL Server Management Studio and add a new login with the same name and password like the ones from our secrets.
When you've done that we'll create the new linked service and point it to the Azure SQL. Click on New, search for Azure SQL Database, and click on Continue. Give the linked service a name choose the Azure SQL from our subscription.
You'll notice that that for the Password you can switch to Azure Key Vault, yet for the User name you can't. If you want to use secrets for the user and the password you should make a secret for the full connection string. For now we'll only set the password as a Azure Key Vault secret.
For the User name we'll copy the content of the secret. Head over to your Key Vault, click on Secrets in the left hand menu, and click on the SQL login secret we've made earlier. You'll see the current version of the secret, if you click on the secret version a window will popup that enables you to copy the secret value.
Copy the value to the User name field, in the linked service creation window in the Data Factory Studio. When ready click on Test connection. A connection successful window should pop up. Click on create and you should see a new Azure SQL Database linked service that's making use of the Key Vault linked service.
Great success!
P.S. Don't forget to publish your changes before closing Azure Data Factory Studio.
Final thoughts
It requires some time for the initial setup of the Azure Key Vault, secrets and the linked services. Yet the benefits outweigh this one time setup.
Gone are the days of hardcoding credentials, API keys and connection strings in your Azure Data Factory!