Address flakiness in Azure SQL conformance test (#1219)

* Add script to update Azure SQL firewall rules from GitHub meta API

* Update state.azure.sql test to use unique DB names

- Add use of `databaseName` metadata to state.azure.sql test.
- Add dynamic generation of test `databaseName` to conformance.yml to
  prevent multiple test instances from racing.
- Add wait before clean-up of Azure SQL DB instance in conformance.yml
  to prevent test flakiness and accumulation of undeleted DBs.
- Remove dynamic Azure SQL firewall rule injection from conformance.yml.
- The workflow relies on the IPs used by GitHub Actions already being
  provisioned in the firewall rules.
- Update documentation for managing Azure SQL and testing instructions.

Co-authored-by: Long Dai <long.dai@intel.com>
Co-authored-by: Artur Souza <artursouza.ms@outlook.com>
Simon Leet 2021-10-27 13:38:48 -07:00 committed by GitHub
parent c4303d55db
commit 0fdeb429c6
6 changed files with 183 additions and 13 deletions

@ -38,6 +38,24 @@ By default, the script will prefix all resources it creates with your user alias
- `AzureKeyVaultSecretStoreCert.pfx` is a local copy of the cert for the Service Principal used in the `secretstore.azure.keyvault` conformance test. The path to this is referenced as part of the environment variables in the `*-conf-test-config.rc`.
- `AZURE_CREDENTIALS` contains the credentials for the Service Principal you can use to run the conformance test GitHub workflow against the created Azure resources.
### Deploying for use in GitHub workflows
If you are running the script to enable the conformance test workflow in your fork of dapr/components-contrib, you will also need to run the `allow-github-ips-in-azuresql.py` script to allow the IP addresses used by GitHub Actions through the test Azure SQL Server's firewall.
The script coalesces the IP addresses published by the GitHub meta API endpoint and adds them as firewall rules to the target SQL server, for example:
```bash
python3 allow-github-ips-in-azuresql.py --outpath ~/sql_firewall_settings --sqlserver "${AzureSqlServerName}" --resource-group "${AzureResourceGroupName}"
```
To inspect the rules before applying them, pass the `--no-deployment` flag, which generates the template for the firewall rules without deploying it:
```bash
python3 allow-github-ips-in-azuresql.py --outpath ~/sql_firewall_settings --no-deployment
```
For more details on the parameters, run the script with the `--help` flag.
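As a rough illustration of the coalescing step described above, the sketch below merges IPv4 CIDRs into start/end pairs whenever the gap to the previous range is smaller than a /16 (65536 addresses). The helper name `coalesce_cidrs` and the sample CIDRs are hypothetical and not part of the script, and the sketch assumes the input is sorted in ascending order.

```python
import ipaddress


def coalesce_cidrs(cidrs, gap=65536):
    """Merge sorted IPv4 CIDRs into [start, end] string pairs.

    Hypothetical helper for illustration only; assumes `cidrs` is sorted
    ascending and silently skips IPv6 entries.
    """
    ranges = []
    prev_start = prev_end = None
    for cidr in cidrs:
        net = ipaddress.ip_network(cidr)
        if net.version != 4:
            continue  # IPv6 ranges are excluded to stay under the rule limit
        start, end = net[0], net[-1]
        if prev_start is None:
            prev_start, prev_end = start, end
        elif int(start) - int(prev_end) < gap:
            # Close enough to the previous range: extend it instead of
            # emitting a new rule. max() guards against nested ranges.
            prev_end = max(prev_end, end)
        else:
            ranges.append([str(prev_start), str(prev_end)])
            prev_start, prev_end = start, end
    if prev_start is not None:
        ranges.append([str(prev_start), str(prev_end)])
    return ranges


# Sample input (made up): two adjacent /24s, one distant /24, one IPv6 range.
sample = ["10.0.0.0/24", "10.0.1.0/24", "10.2.0.0/24", "2001:db8::/32"]
print(coalesce_cidrs(sample))
# → [['10.0.0.0', '10.0.1.255'], ['10.2.0.0', '10.2.0.255']]
```

Coalescing trades precision for rule count: addresses in the gaps between merged ranges are also allowed through the firewall.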
## Running Azure conformance tests locally
1. Apply all the environment variables needed to run the Azure conformance test from your device, by sourcing the generated `*-conf-test-config.rc` file. For example:
@ -46,10 +64,22 @@ By default, the script will prefix all resources it creates with your user alias
source ~/azure-conf-test/myazurealias-conf-test-config.rc
```
2. Follow the [instructions for running individual conformance tests](../../../../tests/conformance/README.md#running-conformance-tests).
> The `bindings.azure.eventgrid` test and others may require additional setup before running the conformance test, such as setting up non-Azure resources like an Ngrok endpoint. See [conformance.yml](../../../../.github/workflows/conformance.yml) for details.
> The `state.azure.sql` test expects the SQL Server firewall to allow connections from the test client. The GitHub workflow relies on all IPs used by GitHub Actions already being allowed via the `allow-github-ips-in-azuresql.py` script, but when running locally, you will need to add a firewall rule for your client IP, which is reported in the initial test failure message.
>
> ```bash
> # Capture the blocked IP from the test failure
> TEST_OUTPUT="$(go test -v -tags=conftests -count=1 ./tests/conformance -run=TestStateConformance/azure.sql)"
> BLOCKED_IP=$(echo "$TEST_OUTPUT" | grep -Po "Client with IP address '\K[^']*")
>
> # Login to the account with Contributor access to the SQL server instance
> az login
> az sql server firewall-rule create --resource-group "$AzureResourceGroupName" --server "$AzureSqlServerName" -n "AllowTestClientIP" --start-ip-address "$BLOCKED_IP" --end-ip-address "$BLOCKED_IP"
> ```
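The `grep -Po` pattern above depends on the PCRE `\K` reset, which is not available in every `grep` build. A minimal sketch of the same extraction in Python follows. The function name `extract_blocked_ip` and the sample message are illustrative assumptions, modeled only on the `Client with IP address '...'` fragment matched above rather than on actual test output.

```python
import re


def extract_blocked_ip(test_output):
    """Return the first quoted IP following "Client with IP address", or None.

    Hypothetical helper mirroring the grep pattern; it does not validate
    that the captured text is a well-formed IP address.
    """
    match = re.search(r"Client with IP address '([^']*)", test_output)
    return match.group(1) if match else None


# Made-up failure text for illustration; 203.0.113.7 is a documentation IP.
sample = "login failed: Client with IP address '203.0.113.7' is not allowed"
print(extract_blocked_ip(sample))       # → 203.0.113.7
print(extract_blocked_ip("all passed"))  # → None
```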
## Running Azure conformance tests via GitHub workflows
1. Fork the `dapr/components-contrib` repo.

@ -0,0 +1,112 @@
#!/usr/bin/env python3
# ------------------------------------------------------------
# Copyright (c) Microsoft Corporation and Dapr Contributors.
# Licensed under the MIT License.
# ------------------------------------------------------------
import argparse
import ipaddress
import json
import os
import subprocess
import sys
import urllib.request

def parseArgs():
    abscwd = os.path.abspath(os.getcwd())
    arg_parser = argparse.ArgumentParser(description='Generates the IP ranges based on CIDRs for GitHub Actions from meta API.')
    arg_parser.add_argument('--outpath', type=str, default=abscwd, help='Optional. Full path to write the JSON output to.')
    arg_parser.add_argument('--sqlserver', type=str, help='Name of the Azure SQL server to update firewall rules of. Required for deployment.')
    arg_parser.add_argument('--resource-group', type=str, help='Resource group containing the target Azure SQL server. Required for deployment.')
    arg_parser.add_argument('--no-deployment', action='store_true', help='Specify this flag to generate the ARM template without deploying it.')
    args = arg_parser.parse_args()
    # --sqlserver and --resource-group are only required when deploying.
    if not args.no_deployment:
        is_missing_args = False
        if not args.sqlserver:
            print('ERROR: the following argument is required: --sqlserver')
            is_missing_args = True
        if not args.resource_group:
            print('ERROR: the following argument is required: --resource-group')
            is_missing_args = True
        if is_missing_args:
            arg_parser.print_help()
            sys.exit(-1)
    print('Arguments parsed: {}'.format(args))
    return args

def getResponse(url):
    operUrl = urllib.request.urlopen(url)
    if(operUrl.getcode()==200):
        data = operUrl.read()
        jsonData = json.loads(data)
    else:
        # Exit here rather than fall through to returning an unassigned jsonData.
        print('ERROR: failed to receive data', operUrl.getcode())
        sys.exit(-1)
    return jsonData

def writeAllowedIPRangesJSON(outpath):
    url = 'https://api.github.com/meta'
    jsonData = getResponse(url)
    ipRanges = []
    prevStart = ''
    prevEnd = ''
    # Iterate the public IP CIDRs used to run GitHub Actions, and convert them
    # into IP ranges for test SQL server firewall access.
    for cidr in jsonData['actions']:
        net = ipaddress.ip_network(cidr)
        # SQL server firewall only supports up to 128 firewall rules.
        # As a first cut, exclude all IPv6 addresses.
        if net.version == 4:
            start = net[0]
            end = net[-1]
            # print(f'{cidr} --> [{start}, {end}]')
            if prevStart == '':
                prevStart = start
            if prevEnd == '':
                prevEnd = end
            elif prevEnd + 65536 > start:
                # If the current IP range is within the granularity of a /16
                # subnet mask to the previous range, coalesce them into one.
                # This is necessary to get the number of rules down to ~100.
                prevEnd = end
            else:
                ipRange = [str(prevStart), str(prevEnd)]
                ipRanges.append(ipRange)
                prevStart = start
                prevEnd = end
    # Flush the last pending range.
    if prevStart != '' and prevEnd != '':
        ipRange = [str(prevStart), str(prevEnd)]
        ipRanges.append(ipRange)
    with open(outpath, 'w') as outfile:
        json.dump(ipRanges, outfile)

def main():
    args = parseArgs()
    # Get the GitHub IP Ranges to use as firewall allow-rules from the GitHub meta API
    ipRangesFileName = os.path.join(args.outpath, 'github-ipranges.json')
    writeAllowedIPRangesJSON(ipRangesFileName)
    print(f'INFO: GitHub Actions public IP range rules written to {ipRangesFileName}')
    # Generate the ARM template from bicep to update Azure SQL server firewall rules
    subprocess.call(['az', 'bicep', 'install'])
    firewallTemplateName = os.path.join(args.outpath, 'update-sql-firewall-rules.json')
    subprocess.call(['az', 'bicep', 'build', '--file', 'conf-test-azure-sqlserver-firewall.bicep', '--outfile', firewallTemplateName])
    print(f'INFO: ARM template to update SQL Server firewall rules written to {firewallTemplateName}')
    # Update the Azure SQL server firewall rules
    if args.no_deployment:
        print(f'INFO: --no-deployment specified, skipping update of SQL server firewall rules from {firewallTemplateName}')
    else:
        subprocess.call(['az', 'deployment', 'group', 'create', '--name', 'UpdateSQLFirewallRules', '--template-file', firewallTemplateName, '--resource-group', args.resource_group, '--parameters', f'sqlServerName={args.sqlserver}', '--parameters', f'ipRanges=@{ipRangesFileName}'])
    sys.exit(0)

if __name__ == '__main__':
    main()

@ -0,0 +1,22 @@
// ------------------------------------------------------------
// Copyright (c) Microsoft Corporation and Dapr Contributors.
// Licensed under the MIT License.
// ------------------------------------------------------------
param sqlServerName string
param rgLocation string = resourceGroup().location
param ipRanges array

resource sqlServer 'Microsoft.Sql/servers@2021-02-01-preview' = {
  name: sqlServerName
  location: rgLocation
}

resource sqlServerFirewallRule 'Microsoft.Sql/servers/firewallRules@2021-02-01-preview' = [for (ipRange, i) in ipRanges: {
  name: 'sqlGitHubRule${i}'
  parent: sqlServer
  properties: {
    endIpAddress: '${ipRange[1]}'
    startIpAddress: '${ipRange[0]}'
  }
}]

@ -191,6 +191,7 @@ RESOURCE_GROUP_NAME_VAR_NAME="AzureResourceGroupName"
SERVICE_BUS_CONNECTION_STRING_VAR_NAME="AzureServiceBusConnectionString"
SQL_SERVER_NAME_VAR_NAME="AzureSqlServerName"
SQL_SERVER_DB_NAME_VAR_NAME="AzureSqlServerDbName"
SQL_SERVER_CONNECTION_STRING_VAR_NAME="AzureSqlServerConnectionString"
STORAGE_ACCESS_KEY_VAR_NAME="AzureBlobStorageAccessKey"
@ -540,6 +541,10 @@ az keyvault secret set --name "${RESOURCE_GROUP_NAME_VAR_NAME}" --vault-name "${
echo export ${SQL_SERVER_NAME_VAR_NAME}=\"${SQL_SERVER_NAME}\" >> "${ENV_CONFIG_FILENAME}"
az keyvault secret set --name "${SQL_SERVER_NAME_VAR_NAME}" --vault-name "${KEYVAULT_NAME}" --value "${SQL_SERVER_NAME}"
# Export a default value for DB name to be used when running conformance test locally.
# This is not added to the keyvault as the conformance.yml workflow generates a unique DB name each time.
echo export ${SQL_SERVER_DB_NAME_VAR_NAME}=\"${PREFIX}SqlDb\" >> "${ENV_CONFIG_FILENAME}"
# Note that `az sql db show-connection-string` does not currently support a `go` --client type, so we construct our own here.
SQL_SERVER_CONNECTION_STRING="Server=${SQL_SERVER_NAME}.database.windows.net;port=1433;User ID=${SQL_SERVER_ADMIN_NAME};Password=${SQL_SERVER_ADMIN_PASSWORD};Encrypt=true;"
echo export ${SQL_SERVER_CONNECTION_STRING_VAR_NAME}=\"${SQL_SERVER_CONNECTION_STRING}\" >> "${ENV_CONFIG_FILENAME}"

@ -297,16 +297,11 @@ jobs:
         go mod download
         go install gotest.tools/gotestsum@latest
-    - name: Configure Azure SQL Firewall
+    - name: Generate Azure SQL DB name
       run: |
-        set +e
-        TEST_OUTPUT="$(go test -v -tags=conftests -count=1 -timeout=1m ./tests/conformance -run=TestStateConformance/azure.sql)"
-        echo "Trial run result:\n\"$TEST_OUTPUT\""
-        PUBLIC_IP=$(echo "$TEST_OUTPUT" | grep -Po "Client with IP address '\K[^']*")
-        if [[ -n ${PUBLIC_IP} ]]; then
-          echo "Setting Azure SQL firewall-rule AllowTestRunnerIP to allow $PUBLIC_IP..."
-          az sql server firewall-rule create --resource-group ${{ env.AzureResourceGroupName }} --server ${{ env.AzureSqlServerName }} -n "AllowTestRunnerIP" --start-ip-address "$PUBLIC_IP" --end-ip-address "$PUBLIC_IP"
-        fi
+        # Use UUID with `-` stripped out for DB names to prevent collisions between workflows
+        export AzureSqlServerDbName=$(cat /proc/sys/kernel/random/uuid | sed -E 's/-//g')
+        echo "AzureSqlServerDbName=$AzureSqlServerDbName" >> $GITHUB_ENV
       if: contains(matrix.component, 'azure.sql')
     - name: Run tests
@ -347,12 +342,16 @@ jobs:
       continue-on-error: true
       run: pkill ngrok; cat /tmp/ngrok.log
-    - name: Cleanup Azure SQL Firewall and test DB instance
+    - name: Cleanup Azure SQL test DB instance
       if: contains(matrix.component, 'azure.sql')
       continue-on-error: true
       run: |
-        az sql server firewall-rule delete --resource-group ${{ env.AzureResourceGroupName }} --server ${{ env.AzureSqlServerName }} -n "AllowTestRunnerIP"
-        az sql db delete --resource-group ${{ env.AzureResourceGroupName }} --server ${{ env.AzureSqlServerName }} -n dapr --yes
+        # Wait for the creation of the DB by the test to propagate to ARM, otherwise deletion succeeds as a no-op.
+        # The wait should be under 30s, but is capped at 1m as flakiness here results in an accumulation of expensive DB instances over time.
+        # Also note that the deletion call only blocks until the request is processed; do not rely on it as a mutex on the same DB,
+        # as deletion may still be ongoing in sequential runs.
+        sleep 1m
+        az sql db delete --resource-group ${{ env.AzureResourceGroupName }} --server ${{ env.AzureSqlServerName }} -n ${{ env.AzureSqlServerDbName }} --yes
     # Download the required certificates into files, and set env var pointing to their names
     - name: Clean up certs

@ -7,6 +7,8 @@ spec:
  metadata:
  - name: connectionString
    value: ${{AzureSqlServerConnectionString}}
  - name: databaseName
    value: ${{AzureSqlServerDbName}}
  - name: tableName
    value: dapr_conf_test
  - name: keyType