CI/CD of QnA Maker Knowledge Base using Azure DevOps

Mt Swargarohini; Bali Pass trek (October 2021)

Overview

I have been working on a fascinating project that needs NLP. After evaluating a few options, we decided to go ahead with Azure’s QnA Maker service. Because this NLP requirement is part of the product we are building, we had to make sure it fits well with the rest of the implementation and does not become a hassle when we make changes in the future. The dataset should be open for modifications, and whenever it is updated, the service should be re-trained and re-deployed to the different environments (dev/staging/production) as part of the deployment process. Please check out the code related to this article here.

Knowledge Base

If you need to build a smart FAQ chatbot, or anything else that involves questions and their answers, then QnA Maker is the way to go. We need our solution to be smart enough to recognize similar questions so that it returns the same answer. Questions like “How high is Mt. Everest?” and “What’s the height of Everest?” should be treated as the same. For that, we first need to feed that kind of data to our QnA Maker service, so that it gets a basic understanding of which answer to return for a specific question.

A Knowledge Base (KB from here on) is a collection of questions and answers against which the QnA Maker service is trained. Every QnA Maker instance runs against a KB, which is why it is very important to manage KBs properly. A KB can be imported into QnA Maker in various formats. We’ll be importing the data for our KB from an Excel file (.xlsx), so that the Excel file becomes the source of truth, which is easier to manage.

Need for CI/CD

Consider this scenario: At time T1, QnA Maker is using KB version v1 across all the environments of your solution (dev, staging, and production for now). At T2, a requirement comes in to update an answer in the existing KB, which makes it v2. But you’ll also need to test this modification in dev and staging before pushing it to production. So at T2, you want dev and staging to use v2, but production to still use v1. Updating the KB in each environment is not enough on its own: the re-trained QnA Maker should also be re-deployed automatically so that it uses the latest KB. This is why CI/CD of the KB and QnA Maker is important.

Flow

Deployment flow

Deployment pipeline

We’ll be using Azure DevOps for CI/CD purposes. Our aim is as follows: We want the same pipeline to deploy to different environments, based on the branch. If new changes are pushed to the dev branch, then the pipeline should run for dev-related changes. So if the Excel sheet containing questions and answers is updated in dev, we want the dev KB to be updated and published so that the QnA service returns the modified answers. Developers only need to manage questions and answers in the Excel file, while our deployment pipeline takes care of reflecting the changes wherever required.

Another small thing that I’ve assumed is that each environment is deployed to its own Azure subscription. Hence, each variable group needs to hold the service connection name of the respective subscription.

We need to create a new pipeline based on our azure-pipeline.yml. Different steps/tasks of the pipeline are as follows:

  • Variables: Library groups are a nice way to keep different sets of variables for different environments. We want our pipeline to conditionally pick the right set of variables, based on the triggering branch. If new changes are pushed to staging, then the pipeline must only access the staging variable group.
  • Replace tokens: Parameters of our IaC (ARM in our case) deployment need to be dynamic, based on the environment to which deployment is being done. Hence, we’ll need our pipeline to pick up values from variables and use those values as parameters in the ARM deployment.
  • Azure PowerShell: A PowerShell task deploys the ARM template. The template provisions the QnA Maker service and an Azure Key Vault, and it stores QnA Maker’s authoring key as a Key Vault secret, which we’ll need in the Python script.
  • Azure Key Vault: This task downloads secrets from Key Vault so that they can be used as environment variables of the pipeline. We are specifically interested in the authoring key of QnA Maker.
  • Scripts: pip installs the requests library for consuming REST endpoints and the openpyxl library for reading Excel files easily.
  • Python script: Explained below.
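
To make the first two steps more concrete, here is a minimal Python sketch of what they do conceptually. It is an illustration only, not how Azure DevOps implements them: the branch-to-environment mapping (including `main` for production), the `#{name}#` token delimiters, and the names `BRANCH_TO_ENVIRONMENT`, `environment_for`, and `replace_tokens` are all assumptions made for this example.

```python
import re

# Hypothetical mapping from the triggering branch to an environment name.
# The real branch names depend on your repository.
BRANCH_TO_ENVIRONMENT = {
    "refs/heads/dev": "dev",
    "refs/heads/staging": "staging",
    "refs/heads/main": "production",
}

# Assumed token format: #{variableName}#
TOKEN_PATTERN = re.compile(r"#\{\s*([\w.\-]+)\s*\}#")


def environment_for(branch_ref):
    """Return the environment whose variable set should be used for this branch."""
    try:
        return BRANCH_TO_ENVIRONMENT[branch_ref]
    except KeyError:
        raise ValueError("No environment configured for branch " + branch_ref)


def replace_tokens(template_text, variables):
    """Substitute every #{name}# token with the matching variable value.

    A token without a matching variable raises KeyError, so a missing
    variable surfaces before the ARM deployment runs.
    """
    return TOKEN_PATTERN.sub(lambda match: str(variables[match.group(1)]), template_text)
```

For example, `replace_tokens('{"value": "#{kbName}#"}', {"kbName": "faq-dev"})` yields `'{"value": "faq-dev"}'`, which is the kind of substitution applied to the ARM parameters file before the deployment step.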

Python Script

The final task of our pipeline executes the Python script, which converts the Excel data to an object and sends that object as the payload of the create/update KB REST APIs. The script accepts the following arguments: QnA Maker’s host endpoint, the authoring key, and the name of the KB that needs to be created/updated. QnA Maker’s REST API references can be found here.
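
The Excel-to-payload conversion could look roughly like the sketch below. It is not the script from the repository: the function names, the `source` label, the sequential `id` values, and the choice to group questions that share an answer into one entry are assumptions made for illustration. The payload field names (`name`, `qnaList`, `urls`, `files`) follow my understanding of the QnA Maker v4.0 create request, so please verify them against the REST reference.

```python
def read_rows(xlsx_path, min_row=1):
    """Yield (question, answer, ...) tuples from the first worksheet."""
    import openpyxl  # imported lazily so the helpers below work without it

    workbook = openpyxl.load_workbook(xlsx_path, read_only=True)
    worksheet = workbook.active
    for row in worksheet.iter_rows(min_row=min_row, values_only=True):
        yield row


def rows_to_qna_list(rows, source="Editorial"):
    """Turn (question, answer) rows into QnA entries.

    Questions with an identical answer are grouped into a single entry
    (an assumption of this sketch), and incomplete rows are skipped.
    """
    questions_by_answer = {}
    for row in rows:
        question, answer = (tuple(row) + (None, None))[:2]
        if not question or not answer:
            continue
        key = str(answer).strip()
        questions_by_answer.setdefault(key, []).append(str(question).strip())

    return [
        {"id": index, "answer": answer, "source": source,
         "questions": questions, "metadata": []}
        for index, (answer, questions) in enumerate(questions_by_answer.items(), start=1)
    ]


def build_create_payload(kb_name, qna_list):
    """Wrap the QnA entries in the body expected by the create call."""
    return {"name": kb_name, "qnaList": qna_list, "urls": [], "files": []}
```

With this shape, the two Everest questions from earlier end up in one entry with two phrasings and a single answer, which is what lets the service return the same answer for both.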

To authenticate against the endpoints, we need to pass QnA Maker’s authoring key as the value of the “Ocp-Apim-Subscription-Key” header.

The script also checks if the KB has already been provisioned. If yes, then it simply updates it, and if not, then it creates a new KB.
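
A minimal sketch of the header and the existence check is shown below. The helper names are hypothetical, and the response shape (a `knowledgebases` list whose items carry `id` and `name`) is how I understand the v4.0 list endpoint to respond, so treat it as an assumption. `host` is expected to end with a slash, as in the monitoring function later in this article.

```python
def auth_headers(authoring_key):
    """Headers sent with every authoring request."""
    return {
        "Ocp-Apim-Subscription-Key": authoring_key,
        "Content-Type": "application/json",
    }


def find_kb_id(kb_listing, kb_name):
    """Return the ID of the KB called kb_name, or None if it does not exist yet."""
    for kb in kb_listing.get("knowledgebases", []):
        if kb.get("name") == kb_name:
            return kb.get("id")
    return None


def list_kbs(host, authoring_key):
    """Fetch all KBs visible to this authoring key."""
    import requests  # imported lazily so find_kb_id can be used without it

    response = requests.get(host + "knowledgebases", headers=auth_headers(authoring_key))
    response.raise_for_status()
    return response.json()
```

The script can then branch on `find_kb_id(list_kbs(host, key), kb_name)`: a value of `None` means the KB must be created, anything else is the ID to update.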

To create a new KB, we need to make a POST call to the /create endpoint. The endpoint returns 202 after accepting the request. A small issue here is that, at this point, our script cannot be sure whether the KB was successfully created: the endpoint has only acknowledged our request and has not confirmed the creation of the KB. We need a way to check the status of KB creation.
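
The create call described above can be sketched as follows. The function name and the injectable `post` parameter are hypothetical (the latter only exists so the function can be exercised without a network), and the `operationId` field in the response body is my understanding of the v4.0 API rather than something confirmed by this article.

```python
def start_kb_creation(host, authoring_key, payload, post=None):
    """Submit the create request and return the ID of the long-running operation."""
    if post is None:
        import requests  # imported lazily so a fake can be injected instead
        post = requests.post

    response = post(
        host + "knowledgebases/create",
        headers={
            "Ocp-Apim-Subscription-Key": authoring_key,
            "Content-Type": "application/json",
        },
        json=payload,
    )
    # 202 only means the request was accepted, not that the KB exists yet.
    if response.status_code != 202:
        raise RuntimeError("Create request was not accepted: HTTP %s" % response.status_code)
    return response.json()["operationId"]
```

The returned operation ID is what gets polled afterwards to find out whether the KB was really created.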

For monitoring status, we can use the /operations endpoint. It returns “operationState” as “Succeeded” when provisioning of the KB is complete. We need to poll this endpoint until we get the desired state. Below is the monitoring function, which returns the ID of the KB.

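import time
import requests

def monitorOperation(host, operationId):
    # subscriptionKey (the QnA Maker authoring key) is expected to be defined at module level
    headers = {'Ocp-Apim-Subscription-Key': subscriptionKey, 'Content-Type': 'application/json'}
    state = ""
    count = 0
    # Poll up to 10 times, one second apart, and stop early on success or failure
    while(state not in ("Succeeded", "Failed") and count < 10):
        response = requests.get(host + "operations/" + operationId, headers=headers)
        state = response.json()['operationState']
        count = count + 1
        time.sleep(1)
    if(state != "Succeeded"):
        raise Exception("Something went wrong while creating KB")
    # The last path segment of resourceLocation is the KB ID
    return response.json()['resourceLocation'].split('/')[-1]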

After creating/updating the KB, we need to make the new changes available to end users. For that, we need to /publish the KB.
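
Putting the pieces together, the overall flow of the script can be sketched like this. The four callables (`find_kb`, `create_kb`, `update_kb`, `publish_kb`) are hypothetical stand-ins for the REST calls described above. They are passed in as parameters so that the flow itself stays independent of any particular endpoint and is not a copy of the actual script.

```python
def sync_kb(kb_name, qna_list, find_kb, create_kb, update_kb, publish_kb):
    """Create the KB if it is missing, update it otherwise, then publish it.

    Returns a (kb_id, action) tuple where action is "created" or "updated".
    """
    kb_id = find_kb(kb_name)
    if kb_id is None:
        # create_kb is expected to wait for the operation to finish
        # and return the ID of the new KB.
        kb_id = create_kb(kb_name, qna_list)
        action = "created"
    else:
        update_kb(kb_id, qna_list)
        action = "updated"

    # Publishing happens in both cases, otherwise end users keep seeing the old answers.
    publish_kb(kb_id)
    return kb_id, action
```

Keeping publish as the last step in both branches means that a failure during create or update stops the run before anything new reaches end users.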

To see how an application (.NET Core Console App in our case) can consume the QnA Maker service, please refer to this code.

Notes

  • The Python script assumes that the Excel sheet’s first column is for questions, and the second one is for answers. Please check out the Excel here to understand the format.
  • While provisioning Key Vault using ARM, we need to keep in mind that DevOps will need access to Key Vault secrets. Hence, it’s a good idea to specify the access while provisioning the Key Vault itself, like this:
"accessPolicies": [
    {
        "tenantId": "[subscription().tenantId]",
        "objectId": "[parameters('devOpsSpnObjectId')]",
        "permissions": {
            "keys": [],
            "secrets": [
                "Get",
                "List",
                "Set"
            ],
            "certificates": []
        }
    }
],

We also need to add the DevOps SPN’s object ID to the variable groups. Example of a variable group:

dev variable group

All the required code and files can be found here. Thanks for reading!
