CI/CD of QnA Maker Knowledge Base using Azure DevOps

Mt Swargarohini; Bali Pass trek (October 2021)

Overview

I have been working on a fascinating project that needs NLP. After evaluating some options, we decided to go with Azure’s QnA Maker service. As this NLP requirement is part of the product we are building, we had to make sure that it fits well with the rest of the implementation and does not become a hassle when we want to make changes in the future. The dataset should be open to modification, and the service should re-train and re-deploy to the different environments (dev/staging/production) whenever the dataset gets updated, as part of the deployment process. Please check out the code related to this article here.

Knowledge Base

If you need to build a smart FAQ chatbot or anything else that involves questions and their answers, then QnA Maker is the way to go. We need our solution to be smart enough to understand similar questions so that it can return the same answer. Questions like “How high is Mt. Everest” and “What’s the height of Everest” should be read as the same. For that, we first need to feed that kind of data into our QnA Maker service, so that it gets a basic understanding of which answer to return for a specific question.

A Knowledge Base (KB from here on) is a collection of questions and answers against which the QnA Maker service gets trained. Every QnA Maker instance runs against a KB, which is why it is important to manage KBs properly. A KB can be imported into QnA Maker in various formats. We’ll be importing data for our KB from an Excel file (.xlsx), so that the Excel file becomes the source of truth, which is easier to manage.

Need for CI/CD

Consider this scenario: At time T1, QnA Maker is using KB version v1 across all the environments of your solution (dev, staging, and production for now). At T2, a requirement comes in where you need to update an answer in the existing KB, which makes it v2. But you’ll also need to test this modification in dev and staging before pushing it to production. So at T2, you want dev and staging to use v2, but production to still use v1. Not only should the KB be updated across the different environments; the re-trained QnA Maker should also be re-deployed automatically so that it uses the latest KB. This makes us realize that CI/CD of the KB and QnA Maker is important.

Flow

Deployment flow

Deployment pipeline

We’ll be using Azure DevOps for CI/CD. Our aim is as follows: we want the same pipeline to deploy to different environments based on the branch. If new changes are pushed to the dev branch, the pipeline should run for dev-related changes. So if the Excel sheet containing questions and answers is updated in dev, we want the KB to be updated and published so that the QnA service can return the modified answers. Developers only need to manage questions and answers in the Excel file, while the deployment pipeline takes care of reflecting the changes wherever required.

Another small thing I’ve assumed is that each environment is deployed to its own Azure subscription. Hence, each variable group needs to hold the service connection name of the respective subscription.

We need to create a new pipeline based on our azure-pipeline.yml. Different steps/tasks of the pipeline are as follows:

  • Variables: Variable groups (under Library) are a nice way to keep different sets of variables for different environments. We want our pipeline to conditionally pick the right set of variables based on the triggering branch. If new changes are pushed to staging, then the pipeline must access only the staging variable group.
  • Replace tokens: Parameters of our IaC (ARM in our case) deployment need to be dynamic, based on the environment to which deployment is being done. Hence, we’ll need our pipeline to pick up values from variables and use those values as parameters in the ARM deployment.
  • Azure PowerShell: A PowerShell task to deploy the ARM template. The ARM template provisions the QnA Maker service and an Azure Key Vault. The template also provisions a Key Vault secret for QnA Maker’s authoring key, which we’ll need in the Python script.
  • Azure Key Vault: This task downloads secrets from Key Vault so that they can be used as environment variables of the pipeline. We are specifically interested in the authoring key of QnA Maker.
  • Scripts: pip installs the requests library for calling REST endpoints and the openpyxl library for reading Excel files easily.
  • Python script: Explained below.

Python Script

The final task of our pipeline executes the Python script, which converts the Excel data into an object that is then sent as the payload of the create/update KB REST APIs. The script accepts the following arguments: the QnA host endpoint, the authoring key, and the name of the KB that needs to be created/updated. QnA Maker’s REST API references can be found here.
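As a rough sketch (the function names and the two-column sheet layout are my own assumptions, matching the format described in the Notes section), the Excel-to-payload conversion could look like this:

```python
def rows_to_payload(rows):
    """Convert (question, answer) rows into the qnaList payload
    shape expected by the QnA Maker create/update KB REST APIs."""
    return {
        "qnaList": [
            {"id": 0, "answer": answer, "questions": [question]}
            for question, answer in rows
            if question and answer  # skip empty rows
        ]
    }

def read_rows(path):
    """Read (question, answer) pairs from the first worksheet of an .xlsx file."""
    import openpyxl  # installed by the pipeline's pip task
    sheet = openpyxl.load_workbook(path).active
    # min_row=2 skips the header row; column 1 = questions, column 2 = answers
    return [(row[0], row[1]) for row in sheet.iter_rows(min_row=2, values_only=True)]
```

The pure `rows_to_payload` helper is kept separate from the openpyxl reading so it can be unit-tested without a workbook on disk.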

For authenticating against the endpoints, we need to pass the QnA authoring key as the value of the “Ocp-Apim-Subscription-Key” header.
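A tiny helper for that (the function name is my own; the header name comes from the QnA Maker docs) might be:

```python
def qna_headers(authoring_key):
    """Headers required by every QnA Maker authoring call."""
    return {
        "Ocp-Apim-Subscription-Key": authoring_key,
        "Content-Type": "application/json",
    }

# e.g. requests.get(host + "knowledgebases/", headers=qna_headers(key))
```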

The script also checks whether the KB has already been provisioned. If yes, it simply updates it; if not, it creates a new KB.
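One way to sketch that check (helper names assumed) is to list the KBs under the authoring resource via GET /knowledgebases/ and match on the KB name:

```python
def match_kb(knowledgebases, kb_name):
    """Pure helper: return the ID of the KB with the given name, or None."""
    for kb in knowledgebases:
        if kb.get("name") == kb_name:
            return kb.get("id")
    return None

def find_kb_id(host, headers, kb_name):
    """GET {host}knowledgebases/ lists all KBs; None means we should create one."""
    import requests  # installed by the pipeline's pip task
    response = requests.get(host + "knowledgebases/", headers=headers)
    response.raise_for_status()
    return match_kb(response.json().get("knowledgebases", []), kb_name)
```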

To create a new KB, we need to make a POST call to the /create endpoint. The endpoint returns 202 after accepting the request. A small issue with this is that, at this point, our script isn’t sure whether the KB was successfully created: the endpoint has only acknowledged our request, not confirmed the creation of the KB. We need a way to check the status of KB creation.
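Sketched in Python (helper names assumed), the create call just hands back an operation ID to poll:

```python
def with_kb_name(payload, kb_name):
    """Attach the KB name to the payload; only the create API needs it."""
    return dict(payload, name=kb_name)

def create_kb(host, headers, kb_name, payload):
    """POST {host}knowledgebases/create returns 202 Accepted with an
    operationId; the KB itself is provisioned asynchronously."""
    import requests  # installed by the pipeline's pip task
    response = requests.post(host + "knowledgebases/create",
                             headers=headers,
                             json=with_kb_name(payload, kb_name))
    response.raise_for_status()  # we only got an acknowledgement, not a KB
    return response.json()["operationId"]
```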

For monitoring the status, we can use the /operations endpoint. It returns “operationState” as “Succeeded” when provisioning of the KB is complete. We need to poll this endpoint until we get the desired state. Below is the monitoring function, which returns the ID of the KB.

# Assumes `requests` and `time` are imported at the top of the script,
# and that `subscriptionKey` holds the QnA Maker authoring key.
def monitorOperation(host, operationId):
    state = ""
    count = 0
    while state != "Succeeded" and count < 10:
        response = requests.get(host + "operations/" + operationId,
                                headers={'Ocp-Apim-Subscription-Key': subscriptionKey,
                                         'Content-Type': 'application/json'})
        state = response.json().get('operationState', '')  # avoid a KeyError if the field is missing
        count = count + 1
        time.sleep(1)
    if state != "Succeeded":
        raise Exception("Something went wrong while creating KB")
    # resourceLocation looks like "/knowledgebases/{kbId}"
    return response.json()['resourceLocation'].split('/')[-1]

After creating/updating the KB, we need to make the new changes available to end users. For that, we need to /publish the KB.
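Sketched the same way (helper names assumed), publishing is a bare POST against the KB ID, which replies 204 No Content on success:

```python
def publish_url(host, kb_id):
    """Pure helper so the endpoint path can be tested without a network call."""
    return host + "knowledgebases/" + kb_id

def publish_kb(host, headers, kb_id):
    """Move the trained KB from the test index to production; 204 means success."""
    import requests  # installed by the pipeline's pip task
    response = requests.post(publish_url(host, kb_id), headers=headers)
    return response.status_code == 204
```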

To see how an application (.NET Core Console App in our case) can consume the QnA Maker service, please refer to this code.

Notes

  • The Python script assumes that the Excel sheet’s first column is for questions, and the second one is for answers. Please check out the Excel here to understand the format.
  • While provisioning Key Vault using ARM, we need to keep in mind that DevOps will need access to Key Vault secrets. Hence, it’s a good idea to specify the access while provisioning the Key Vault itself, like this:
"accessPolicies": [
                    {
                        "tenantId": "[subscription().tenantId]",
                        "objectId": "[parameters('devOpsSpnObjectId')]",
                        "permissions": {
                            "keys": [],
                            "secrets": [
                                "Get",
                                "List",
                                "Set"
                            ],
                            "certificates": []
                        }
                    }
                ],

And we also need to add the DevOps SPN’s object ID to the variable groups. Example of a variable group:

dev variable group

All the required code and files can be found here. Thanks for reading!

Authorization of applications in an Azure Function

On the way to Brahmatal summit (December 2018)

Introduction

While working with Microsoft Graph, most of us have assigned application permissions to an application so that the application can fetch data from Graph APIs based on the assigned permissions.

In this article, let’s try to imagine and develop things for the other end, the API end. By the end of this article, you’ll understand how an incoming token created by an application can be validated before fetching resources for the request. For the API, I’m using an HTTP-triggered Azure Function.

Continue reading Authorization of applications in an Azure Function

Importing Power Platform solution – 2

Kedarkantha base camp
Sunset at Kedarkantha base camp (December 2019)

This post is the second part of a two-part blog series on Importing Power Platform solution.

In this series, I would like to show how a Power Platform solution can be programmatically imported into a target environment using two ways:

  1. Delegated permission
  2. Application user

Introduction

In the first part of this blog series, I wrote about how we can import a Power Platform solution from a user’s context using delegated permission of Dynamics CRM. But what if we want to import a solution from an application’s context, where there is no user interaction at all? In this blog, I’ll try to explain the same.

Continue reading Importing Power Platform solution – 2

Importing Power Platform solution – 1

Kedarkantha summit
Sunrise at Kedarkantha summit (December 2019)

This post is the first part of a two-part blog series on Importing Power Platform solution.

In this series, I would like to show how a Power Platform solution can be programmatically imported into a target environment using two ways:

  1. Delegated permission
  2. Application user

Introduction

While Application Lifecycle Management of Power Platform solutions can be done using Power Platform Build Tools in Azure DevOps, solutions can be managed using custom code (PowerShell, Rest API, SDK API) as well. In this article, I’ll try to explain how we can import a Power Platform solution into a target environment using a delegated permission.

Continue reading Importing Power Platform solution – 1

Meeting Scheduler Bot in Teams: Bot Framework

Sunset at Brahmatal basecamp (December 2018)

Introduction

Imagine an experience where you ask your personal assistant to set up a meeting with your team, but your team is working on a big project and you don’t know at what time your team will be available for a meeting. No worries, your PA is super smart and quickly prepares a list of the first 5 most suitable timings based on the availability of all the attendees. Impressed, right? We will try to imitate the same experience by developing our own PA, a Teams bot.

Continue reading Meeting Scheduler Bot in Teams: Bot Framework