firewing1's blog

Thoughts on writing better documentation

Reasonable defaults

I like Python so much because the developer experience is amazing - it's an incredibly productive language, thanks in part to the Zen of Python. It explicitly calls out: there should be one-- and preferably only one --obvious way to do it.

It means Python libraries and modules often have one clear, canonical way to do things, and that implementation will typically have a 5-10 line sample that covers user needs 99% of the time. It will try to handle all the 'gotchas' for you in a reasonable manner, and only if you want to customize it do you need to check the API docs. That doesn't preclude having different ways of doing things, but there should be one obvious way.
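To illustrate (an example of my own, not taken from any particular module's docs): Python's json module is typical of this - the canonical usage fits in a handful of lines, and its defaults handle escaping, encoding, and nesting for you:

```python
import json

# The one obvious way to serialize and parse JSON in Python;
# reasonable defaults handle escaping, encoding, and nesting.
config = {"retries": 3, "endpoints": ["https://example.com/api"]}

text = json.dumps(config, indent=2)  # serialize, human-readable
restored = json.loads(text)          # parse it straight back

assert restored == config
```

Only when you need to customize behavior (custom encoders, alternate separators, and so on) do you need to reach for the API docs.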

Building blocks

Technical documentation sets will typically show you documentation for class API surface areas, maybe auto-generated from source comments. That's great! Except... it's probably not that helpful to the people who need it most. Developers are not trying to problem-solve how to use your class' API; they're problem-solving for scenarios. They are after a desired behavior, and your class' API surface area is the afterthought.

A big gripe I have with documentation - in particular that of C#/.NET - is how often it falls into the trap of documenting the usage of specific methods or classes (the basic building blocks) without documenting the broader interactions, resulting in a myopic view of the overall desired behavior a developer is after.

It leaves users to stumble through assembling the pieces to solve their scenario the hard way - as if a LEGO instruction booklet listed all the different ways each LEGO shape could be used, but contained no step-by-step assembly instructions.

Knowing your audience

Writing quality documentation is about knowing your audience. Readers of documentation will generally fall into two buckets:

  1. People who have used your project before and just need a refresher on the API surface area
  2. People who are new to your project, trying to figure out how to achieve a specific desired behavior, and seeing if it's the right fit

Very specific API documentation best helps the former, while good examples and detailed remarks help the latter.

API documentation does little to help the latter group. It assumes they already know how the parts of your project are supposed to interact, within itself or with the language's standard library. That's a great way to let users shoot themselves in the foot, instead of guiding them in the right direction.

For example, if you are deeply experienced in the C#/.NET ecosystem and its patterns, you probably already have an idea of which classes need to interact to achieve the desired behavior and the patterns necessary to avoid gotchas down the line - extensive, in-depth API docs are perfect for you!

But if you are a developer reading the docs with fresh eyes, trying to figure out "how do I issue an HTTP GET in C#?" or "how do I verify a self-signed X.509 certificate with .NET?", you won't have such good luck. You'll probably end up using HttpClient in the obvious but sub-optimal way that causes socket exhaustion, or be lulled into a false sense of security by X509Chain.Build() without realizing that nuances in the .NET implementation warrant additional verification on top of the X509Chain methods.

Example - Polly and REST APIs

Say you want to make REST API calls to an external dependency, and also want to use Polly to add resiliency to those calls. Luckily, there's a whole docs page for that! It'll even (briefly) touch upon that socket exhaustion gotcha with HttpClient.

It shows building blocks - how to add a policy to an HttpClient instance, and how to select policies.

However, the first real-world problem a user will run into is needing multiple policies. Different endpoints are going to have different needs, and most importantly: POST is not idempotent, so applying retries is going to wreak havoc when the developer encounters their first failed POST call.

After hitting that, a user may realize their error and search along the lines of 'polly httpclient (idempotent OR idempotency)' on the upstream docs or Microsoft docs and promptly come up with no results.

Broaden the search results to any website and the top post is a helpful one from Scott Hanselman that mentions this specific issue:

GOTCHAS

A few things to remember. If you are POSTing to an endpoint and applying retries, you want that operation to be idempotent.

Since the official docs are all focused on tying policies to HttpClient and subsequently how to inject it with DI, what's clear is that policies were intended to be tied to an HttpClient instance - so the obvious next question is: how do I consume multiple HttpClient references via DI so I can apply different policies?

And just like that, the docs led the user down an obvious path, but a wrong one that shoots them in the foot: you shouldn't try to inject multiple HttpClient instances.

Let's go back to our idempotency search and pick results further down the page: blog posts by twilio and no dogma, both talking about this gotcha and demonstrating how to vary policies on a single DI-injected HttpClient based on the HttpRequestMessage properties (i.e., the HTTP method) using AddPolicyHandler() or AddPolicyHandlerFromRegistry() respectively.
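The heart of that pattern - choosing whether to retry based on the HTTP method - is worth seeing in isolation. Here is a minimal, language-agnostic sketch in Python (all names are mine, not the Polly API or the approach from those posts) that retries transient failures only for idempotent methods:

```python
# Sketch: only retry requests whose HTTP method is idempotent.
# IDEMPOTENT_METHODS and send_with_retries are illustrative names,
# not part of Polly or any real library.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}

def send_with_retries(send, method, url, max_attempts=3):
    """Call send(method, url); retry transient failures only if idempotent."""
    attempts = max_attempts if method in IDEMPOTENT_METHODS else 1
    last_error = None
    for _ in range(attempts):
        try:
            return send(method, url)
        except ConnectionError as error:  # stand-in for a transient failure
            last_error = error
    raise last_error
```

A POST gets exactly one attempt while a GET may be retried safely - the same per-request behavior the AddPolicyHandler() approaches achieve in .NET.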

In short

Remember the Zen of Python?

There should be one-- and preferably only one --obvious way to do it.

Documentation should make the recommended implementation - the one that avoids the gotchas - obvious. Include a remark about idempotency in sections that apply a retry policy to HttpClient. Include a code sample that shows explicitly how to tie multiple policies to a single HttpClient, and also the recommended way to inject it with IHttpClientFactory.

What would amount to under 50 lines of sample code could show developers with fresh eyes a canonical implementation that works around the issues 99% of them will likely face, and save users hours of debugging. That's good documentation.

How to properly validate X.509 certificates in C# with .NET Core 3.1 & .NET 5+

I have recently been working on a project that required issuing certificates from a self-signed root CA, and while trying to verify those certificates from a C# project, I discovered that the AllowUnknownCertificateAuthority X509VerificationFlag was not behaving as expected; the docs (at the time) showed that this flag would ignore only UntrustedRoot, when in reality it was also ignoring the PartialChain status!

I detail the consequences in dotnet/runtime#49615; namely, X509Chain.Build() -- which is intended to return a simple bool representing whether the certificate was verified as trusted -- was returning true even if the certificate under validation was not issued by any of the trusted root CAs or those in ExtraStore (i.e., it considers a new chain consisting of only the certificate under validation, determines that to be a partial chain, and then ignores that status).

The X509VerificationFlag docs have since been corrected to include PartialChain in the behavior of the AllowUnknownCertificateAuthority flag, but contributions of further examples to the docs were rejected, and I do not believe that the docs provide sufficient clarity around the issues with the .NET 3.x and prior APIs:

Unless developers are using .NET 5 and the X509Chain.CustomTrustStore property, the return value of X509Chain.Build() should not be trusted on its own; proper certificate verification requires that developers separately verify correct chain termination (i.e., checking that the last item in the chain is indeed the signing root CA we expect).
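The termination check itself is simple to state. As a conceptual sketch (in Python, with a hypothetical helper - this is not the .NET API), compare the fingerprint of the last element of the built chain against the expected root CA:

```python
import hashlib

def chain_terminates_at(chain_der, trusted_root_der):
    """Check that a built chain actually ends at the root CA we expect.

    chain_der: list of DER-encoded certificates, leaf first, root last.
    trusted_root_der: DER bytes of the root CA we intend to trust.
    Comparing SHA-256 digests of the DER bytes is equivalent to
    comparing certificate fingerprints.
    """
    if not chain_der:
        return False
    root_fp = hashlib.sha256(chain_der[-1]).hexdigest()
    trusted_fp = hashlib.sha256(trusted_root_der).hexdigest()
    return root_fp == trusted_fp
```

In .NET terms, this corresponds to comparing the Thumbprint of the final ChainElement against your expected root even after X509Chain.Build() returns true.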

Thus, I have created a new GitHub repository stewartadam/dotnet-x509-certificate-verification that describes these issues in detail and provides code samples for securely validating X.509 certificates on both .NET Core and .NET 5, including with self-signed root CAs.

Running GitHub Actions in an Azure DevOps pipeline

Azure Pipelines does have an extensions marketplace available, but there isn't always a task to fill the gap.
The GitHub Actions Marketplace has a quickly growing list of actions, several of which relate to markdown linting - the use case I was after.

But how does one get the GitHub Action running from an Azure DevOps Pipeline?

Find the Docker image for the GitHub Action

Fortunately, it's fairly straightforward to get a GitHub Action running from your AzDO Pipeline, provided the author uses a Docker container -- there are several GitHub Action types, and not all of them use Docker.
Once you've found a marketplace Action you like, go to its repo using the link on the right-hand pane above Open Issues.

Any Docker-based action will have an action.yaml file in its repo, so if the author doesn't mention the image name in their README, peek at the runs section of action.yaml to find the registry and image name.
For example, have a look at the action.yaml from the Markdown Lint action.

If it's not a Docker-based action, sometimes you're lucky and the author provides an image anyway in the README - such is the case for Markdown Link Check, even though it is a JavaScript-based action.

Configure your pipeline

Provided a Docker container is available for the Action, it's fairly straightforward to integrate it into a DevOps pipeline after installing Docker tools onto your build agent.

First, ensure you have the docker tools available on your build agent:

...
steps:
  - task: DockerInstaller@0
    displayName: Docker Installer
    inputs:
      releaseType: stable
  ...

Next, use a script task to issue a docker run that sets up an environment similar to that created by GitHub Actions.
For most Actions, I've found that mapping the repository root as a volume at /tmp and setting the working directory to match is sufficient:

steps:
  # ...
  - script: |
      docker run --rm -i -v $(pwd):/tmp -w /tmp IMAGE_NAME:TAG ARGS_IF_APPLICABLE
    displayName: "Run GitHub Action <foo>"

For example, for the Markdown Link Check:

steps:
  # ...
  # https://github.com/marketplace/actions/markdown-link-check
  - script: |
      docker run --rm -i -v $(pwd):/tmp:ro -w /tmp ghcr.io/tcort/markdown-link-check:stable *.md **/*.md
    displayName: "Check for broken URLs"

(Note: since link checking doesn't require write access to the repo, I've also marked the /tmp volume as ro to limit its permissions - optional, but recommended.)

Some Actions may make use of GitHub's built-in environment variables.
If you run into errors due to missing variable values, look at the Action repo to find out which are used, and map them to the Azure Pipelines predefined variables as best you can:

docker run --rm -i -v "$(pwd):/tmp:ro" -w /tmp -e GITHUB_SHA="$(Build.SourceVersion)" -e GITHUB_JOB="$(System.JobId)" IMAGE_NAME:TAG ARGS

Putting it all together

Here's a complete sample job template YAML to kick off markdown linting, link checking, and spellchecking:

# Re-usable pipeline job template to perform markdown linting.

jobs:
  - job: verify_markdown
    timeoutInMinutes: 10
    pool:
      vmImage: ubuntu-latest
    steps:
      - task: DockerInstaller@0
        displayName: Docker Installer
        inputs:
          releaseType: stable

      # https://github.com/marketplace/actions/markdown-linting-action
      - script: |
          docker run --rm -v $(pwd):/tmp:ro -w /tmp avtodev/markdown-lint:master .
        displayName: "Lint markdown"

      # https://github.com/marketplace/actions/markdown-link-check
      - script: |
          docker run --rm -i -v $(pwd):/tmp:ro -w /tmp ghcr.io/tcort/markdown-link-check:stable *.md **/*.md
        displayName: "Check for broken URLs"

      # https://github.com/marketplace/actions/github-spellcheck-action
      - script: |
          docker run --rm -v $(pwd):/tmp jonasbn/github-action-spellcheck
        displayName: "Check for spelling errors"

Note - since this trick only involves setting up the Docker image in the right way, you can also run these Actions locally using the same docker run commands as your pipeline, as long as you can supply the right values for any environment variables the Action requires.

Customizing the DNS Servers used for specific clients with Unifi Security Gateway

One of the neat and relatively undocumented features of the Unifi Security Gateway (USG) is the ability to specify alternate DNS servers sent with DHCP replies for specific clients, permitting you to do things like set up pihole for only a few specific devices on your LAN (e.g. the Smart TV or a streaming stick).

This is perfect, as I didn't want to point my whole network at the pihole - any technical issue with my pihole host (config errors, docker failing, etc.) would effectively take the whole home Internet connection offline.

You can test out this feature interactively by SSHing into the USG and running these commands (replace capitals as appropriate):

configure
set service dhcp-server shared-network-name net_LANNAME_eth1_SUBNET-MASK subnet SUBNET/MASK static-mapping DASH-SEPARATED-MAC-ADDR ip-address LAN_STATIC_IP
set service dhcp-server shared-network-name net_LANNAME_eth1_SUBNET-MASK subnet SUBNET/MASK static-mapping DASH-SEPARATED-MAC-ADDR mac-address COLON:SEPARATED:MAC:ADDRESS
set service dhcp-server shared-network-name net_LANNAME_eth1_SUBNET-MASK subnet SUBNET/MASK static-mapping DASH-SEPARATED-MAC-ADDR static-mapping-parameters  "option domain-name-servers DNS_IP_FOR_OVERRIDE;"
commit
save
exit

This will edit the running configuration, but rebooting or re-provisioning will lose these changes. To persist the configuration, create/edit your config.gateway.json with a snippet like this:

{
        "service": {
                "dhcp-server": {
                        "shared-network-name": {
                                "net_LANNAME_eth1_SUBNET-MASK": {
                                        "subnet": {
                                                "SUBNET/MASK": {
                                                        "static-mapping": {
                                                                "DASH-SEPARATED-MAC-ADDRESS": {
                                                                        "host-record": "disable",
                                                                        "ip-address": "LAN_STATIC_IP",
                                                                        "mac-address": "COLON:SEPARATED:MAC:ADDRESS",
                                                                        "static-mapping-parameters": "option domain-name-servers DNS_IP_FOR_OVERRIDE;"
                                                                }
                                                        }
                                                }
                                        }
                                }
                        }
                }
        }
}

Credit for discovering the syntax: tachyonforce on the Unifi forums

For those curious about the pihole setup specifically, I used docker-compose with the pihole/pihole image on a home server to get it running:

---
version: "2.2"

services:
  pihole:
    image: pihole/pihole
    container_name: pihole
    restart: unless-stopped
    environment:
      - TZ=America/Los_Angeles
      - ServerIP=192.168.1.15
      - WEBPASSWORD=arandompw
      - VIRTUAL_HOST=externalhostname
    ports:
      - "192.168.1.15:53:53/tcp"
      - "192.168.1.15:53:53/udp"
    volumes:
      - /srv/docker-vols/pihole/etc/pihole:/etc/pihole/
      - /srv/docker-vols/pihole/etc/dnsmasq.d:/etc/dnsmasq.d

Here I used the IP address of the server in the port mapping because the server has multiple interfaces and port 53 is already in use elsewhere. Specifying the IP ensures that Docker maps the port on the correct interface.

VIRTUAL_HOST is required because I use a reverse proxy to expose internal services, so the hostname must be provided to ensure dashboard URLs resolve correctly.


Using Azure CLI 2.0 behind a web proxy with mitmproxy or Fiddler

The Azure CLI is a wonderful tool to manage Azure resources, but at times you'll run into a bizarre error (or want to reverse engineer what API call is being made for a given command) and need more information. HTTP session capture tools like Fiddler or mitmproxy are excellent for tracing HTTP calls, but since the Azure CLI constructs requests directly using the requests Python library, it ignores the Windows or macOS default proxy settings.

Here's how you can call the Azure CLI, forcing it to use the HTTP web proxy:

export HTTP_PROXY="http://localhost:8080" HTTPS_PROXY="http://localhost:8080"
az rest --debug --method put --uri "$URL" --body "$BODY"

Note that unless you just want to use an HTTP proxy, mitmproxy or Fiddler will also be intercepting HTTPS requests and presenting their own certificate. Even if you trust it in the system certificate store, Python's requests uses its own certificate bundle, resulting in an error message like this:

cli.azure.cli.core.util : HTTPSConnectionPool(host='management.azure.com', port=443): Max retries exceeded with url: /subscriptions/subid/resourceGroups/vmname/providers/microsoft.Security/locations/westus2/jitNetworkAccessPolicies/default/Initiate?api-version=2015-06-01-preview (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')])")))


Update June 2021: Azure CLI has since published guidance on this scenario, and permits customization of the certificate authority bundle by setting REQUESTS_CA_BUNDLE - see here for details.

Disabling SSL verification entirely, as originally noted below, should no longer be done unless you are stuck on an old version of the Azure CLI:

Set AZURE_CLI_DISABLE_CONNECTION_VERIFICATION=1 to also disable SSL certificate verification for the Azure CLI:

export AZURE_CLI_DISABLE_CONNECTION_VERIFICATION=1

Good to go!
