Accessing Databricks REST API using service principal

Prakamya Aishwarya
3 min read · Jun 14, 2021


This article focuses on accessing the Azure Databricks REST API using a Service Principal (SP) certificate or secret for authentication. If you are looking to authenticate as a user, please refer to Microsoft’s official documentation.

You should also go through this article by Amine Kaabachi on ‘Ways to authenticate Azure Databricks REST API’. It is a very good read that details the various authentication options available and offers some rules for deciding which one you should use.

Another good resource that I have taken help from is this GitHub repository. Some code excerpts have been picked directly from it.

In our examples we have used a client certificate as the credential. If you are using a client secret, you can simply pass it as the client_credential, like any other password.

Accessing Azure Databricks REST API using service principal requires two things-

  1. Authentication
  2. The SP which is authenticating should be a Databricks workspace user (we can call this authorization)

If your SP is not yet a workspace user, we will discuss how to add it. You can get the list of Databricks workspace users in the Databricks UI, or you can use the SCIM API.

Users can be found under workspace in the Databricks UI

Authentication-

To authenticate using a service principal, we first obtain an Azure Active Directory (AAD) token for the SP and then use this token when sending the API request. To obtain the AAD token, we will use the Microsoft Authentication Library (MSAL). (The azure-identity package has also been used for creating the client credential, but this can be skipped..more on this in the code.)

Acquiring an AAD token for our SP

MSAL uses the OAuth 2.0 client-credential flow, which in turn uses either the SP secret or the SP certificate (you may know these as the client secret or client certificate) to obtain the AAD token.
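The token-acquisition gist is not reproduced here, so below is a minimal sketch of that step. The tenant ID, client ID and credential are placeholders you must supply; the resource ID is the well-known AAD application ID of the Azure Databricks resource, and the MSAL calls follow the library's documented client-credential flow.

```python
# Minimal sketch: acquiring an AAD token for the SP via MSAL's
# client-credential flow. Requires `pip install msal`.

# Well-known application ID of the Azure Databricks resource in AAD
DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"

def databricks_scopes():
    # The client-credential flow expects "<resource-id>/.default" as the scope
    return [f"{DATABRICKS_RESOURCE_ID}/.default"]

def acquire_sp_token(tenant_id, client_id, client_credential):
    """client_credential is the SP secret string, or, for a certificate,
    a dict like {"private_key": <PEM text>, "thumbprint": <hex thumbprint>}."""
    import msal  # imported lazily so the helper above stays stdlib-only

    app = msal.ConfidentialClientApplication(
        client_id,
        authority=f"https://login.microsoftonline.com/{tenant_id}",
        client_credential=client_credential,
    )
    result = app.acquire_token_for_client(scopes=databricks_scopes())
    if "access_token" not in result:
        raise RuntimeError(result.get("error_description", "token acquisition failed"))
    return result["access_token"]

if __name__ == "__main__":
    # Placeholders: replace with your own tenant, SP client ID and credential
    token = acquire_sp_token("<tenant-id>", "<sp-client-id>", "<sp-client-secret>")
    print("token acquired")
```

Note that when using a certificate, MSAL accepts the private key and thumbprint directly as the client_credential dict, which is why the azure-identity package is optional here.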

Authorization-

Once we have the access token, we need to make sure that our SP is present as a workspace user. If your service principal is already a workspace user, you can skip this part and directly make the API request.

To add our SP as a workspace user, we have two options-

  1. Use our SP’s management privileges to add itself as a workspace user. (This requires the SP to have the ‘Contributor’ or a higher role on the Databricks workspace resource.)
  2. Use SCIM API to add SP as a workspace user.

I would recommend the second approach, as the first requires a Contributor role for the SP, which might not be granted in production environments due to security concerns; it also violates the principle of least privilege.

1. Adding using SP management privileges-

Link to Microsoft documentation

We use this method when the service principal is not defined as a user, and we want to add it automatically as an admin user while making the API request.

  1. Make sure the SP has ‘Contributor’ or ‘Owner’ role for the databricks workspace resource.
  2. Acquire the management token for the SP.
  3. Send this management token and the SP aad token obtained earlier in the header while making the desired request.
Sending the API request using management privileges of our SP. This automatically adds our SP as a workspace user.
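Since the original gist is embedded on Medium, here is a hedged sketch of that request. The two extra header names and the management-endpoint scope follow Microsoft's documented pattern for this flow; the workspace URL and workspace resource ID are placeholders.

```python
# Sketch: calling the Databricks API with the SP's management token so the
# SP is auto-added as a workspace user. Requires `pip install requests`.

# Scope for the Azure management endpoint (the double slash is intentional,
# per Microsoft's documentation)
MANAGEMENT_SCOPE = ["https://management.core.windows.net//.default"]

def management_headers(databricks_token, mgmt_token, workspace_resource_id):
    # Pure helper: the three headers the workspace expects in this flow
    return {
        "Authorization": f"Bearer {databricks_token}",
        "X-Databricks-Azure-SP-Management-Token": mgmt_token,
        "X-Databricks-Azure-Workspace-Resource-Id": workspace_resource_id,
    }

if __name__ == "__main__":
    import requests

    # Placeholder Azure resource ID of the Databricks workspace
    resource_id = (
        "/subscriptions/<sub-id>/resourceGroups/<rg-name>"
        "/providers/Microsoft.Databricks/workspaces/<workspace-name>"
    )
    headers = management_headers("<databricks-aad-token>",
                                 "<management-aad-token>", resource_id)
    resp = requests.get("https://<workspace-url>/api/2.0/clusters/list",
                        headers=headers)
    print(resp.status_code)
```

The management token here is a second AAD token for the same SP, acquired with MANAGEMENT_SCOPE instead of the Databricks resource scope.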

Once we make the above API request, our SP gets added as a user, and subsequent calls can be made directly, i.e. without the management token.

2. Adding using SCIM API-

Link to Microsoft documentation

SCIM for service principals allows you to manage Azure Active Directory service principals in Azure Databricks.

  1. Get an AAD token for a Databricks admin user.
  2. Use this token to make the API request below.
SCIM request to add our SP as a user in databricks.
payload.json
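The SCIM gist and its payload.json can be sketched roughly as follows. The endpoint and schema URN come from the documented SCIM API for service principals; the display name is illustrative.

```python
# Sketch: adding the SP as a workspace user via the SCIM API, authenticated
# as a Databricks admin. Requires `pip install requests`.
import json

# SCIM schema URN for service principals, per the Databricks SCIM API
SCIM_SP_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal"

def scim_payload(application_id, display_name):
    # Rough equivalent of the article's payload.json
    return {
        "schemas": [SCIM_SP_SCHEMA],
        "applicationId": application_id,
        "displayName": display_name,
    }

if __name__ == "__main__":
    import requests

    resp = requests.post(
        "https://<workspace-url>/api/2.0/preview/scim/v2/ServicePrincipals",
        headers={
            "Authorization": "Bearer <admin-aad-token>",
            "Content-Type": "application/scim+json",
        },
        data=json.dumps(scim_payload("<sp-client-id>", "my-service-principal")),
    )
    print(resp.status_code)
```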

Once this is done, make the desired API request using the SP AAD token obtained earlier.

Making the desired Databricks API request.
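With the SP provisioned as a workspace user, the final request needs nothing beyond a Bearer header. A minimal sketch, with the workspace URL as a placeholder and clusters/list as an example endpoint:

```python
# Sketch: a plain Databricks REST API call using the SP's AAD token.
# Requires `pip install requests`.

def api_url(workspace_url, endpoint):
    # Pure helper: builds a Databricks REST API 2.0 URL
    return f"https://{workspace_url}/api/2.0/{endpoint}"

if __name__ == "__main__":
    import requests

    resp = requests.get(
        api_url("<workspace-url>", "clusters/list"),
        headers={"Authorization": "Bearer <sp-aad-token>"},
    )
    print(resp.json())
```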

The entire piece looks like this-
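The full gist is not reproduced here, but an end-to-end sketch under the same assumptions (MSAL client-credential flow, SP already provisioned as a workspace user, placeholders for all IDs and secrets) might look like:

```python
# End-to-end sketch: acquire an AAD token for the SP with MSAL, then call
# the Databricks REST API. Requires `pip install msal requests`.

DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"

def bearer_header(token):
    # Pure helper: the Authorization header for Databricks API calls
    return {"Authorization": f"Bearer {token}"}

def run(tenant_id, client_id, client_credential, workspace_url):
    import msal
    import requests

    app = msal.ConfidentialClientApplication(
        client_id,
        authority=f"https://login.microsoftonline.com/{tenant_id}",
        client_credential=client_credential,
    )
    result = app.acquire_token_for_client(
        scopes=[f"{DATABRICKS_RESOURCE_ID}/.default"])
    resp = requests.get(f"https://{workspace_url}/api/2.0/clusters/list",
                        headers=bearer_header(result["access_token"]))
    return resp.json()

if __name__ == "__main__":
    # All four arguments are placeholders
    print(run("<tenant-id>", "<sp-client-id>", "<sp-client-secret>",
              "<workspace-url>"))
```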

Summary

In this article we established connectivity to Azure Databricks using the REST API. There are a number of operations that can be performed on Databricks this way, including creating, submitting and monitoring jobs; importing, exporting and deleting workspace objects; and more.

If there is something inaccurate or something that can be improved, please leave a response.

If you would like to connect with me, you can find me on LinkedIn- Prakamya Aishwarya.
