This means slightly more typing in "zero-value" cases (`nil` vs `dockerfile.Metadata{}`), but the tradeoff is that it's simpler to use and reason about (and all the struct members are pointer-type map/slice values anyhow, so copying the struct is still pretty cheap).
This also swaps the scanner error handling to return the partially parsed Metadata object alongside the scanner error -- the error already tells us the object isn't fully complete data, so it's fair/fine to return and will likely just be ignored by the caller instead. This also allows us to get to 100% code coverage. 👀
This also updates our "treat `oci-import` just like `FROM scratch`" code to *actually* parse `FROM scratch` so we can't accidentally cause "missing data" bugs there in the future, and I implemented that using `sync.OnceValues` which requires upgrading to Go 1.21, but IMO that's a worthwhile tradeoff (because `sync.OnceValues` makes that code so clean/simple).
There were some very minor/subtle bugs in how I implemented continuation that wouldn't affect any real-world parsing we did, but still bothered me because I'm me. This fixes them (and further increases test coverage as a result).
This script needs/uses a custom `BASHBREW_LIBRARY` directory, but it stores that value in the exported environment slightly too soon such that `generate-stackbrew-library.sh` picks it up (and shouldn't be).
We often use "`BASHBREW_LIBRARY` is unset (or empty)" as a conditional for whether to fall back to using "https://github.com/docker-library/official-images/raw/HEAD/library/" as an explicit prefix for querying "source of truth" values for things like supported parent architectures.
These two things collided in cc2dc88e04 (and similar commits) because the script saw `BASHBREW_LIBRARY` set, trusted it, but then fails to find the parent image.
This is the cleanest place to fix this such that `generate-stackbrew-library.sh` can take `BASHBREW_LIBRARY` from the provided environment instead of using our generated value.
I thought this was already the behavior, but I guess it was relaxed because previous iterations of this validation had to apply to the older format where we'd been less meticulous about enforcing this. Since those are all gone now, we can safely update the validation to enforce that commit hashes *must* be fully qualified.
It's been failing to upload for a while, and I don't think the failures are *completely* their fault (GitHub's got to share some of the blame, I think), but the end result is that we don't really have any good options for continuing to use the service (bad/unacceptable options include install an app with way too many privileges on the org/repo or add a "secret" with a personal access token ... and make that available to Pull Requests too 🙃).
This should be more reliable/correct than `uname -m` (because we're almost always looking for the *userspace* architecture, not the kernel architecture).
In my refactoring to use `go-git`'s `Tree` objects, I missed this edge case (that symlinks get resolved to be relative to the Git root, but our `Tree` object is a subdirectory).
This also finally adds `bashbrew context` as an explicit subcommand so that issues with this code are easier to test/debug (so we can generate the actual tarball and compare it to previous versions of it, versions generated by `git archive`, etc).
As-is, this currently generates verbatim identical checksums to 0cde8de57d/sources.sh (L90-L96) (by design). We'll wait to do any cache bust there until we implement `Dockerfile`/context filtering:
```console
$ bashbrew cat varnish:stable --format '{{ .TagEntry.GitCommit }} {{ .TagEntry.Directory }}'
0c295b528f28a98650fb2580eab6d34b30b165c4 stable/debian
$ git -C "$BASHBREW_CACHE/git" archive 0c295b528f28a98650fb2580eab6d34b30b165c4:stable/debian/ | ./tar-scrubber | sha256sum
3aef5ac859b23d65dfe5e9f2a47750e9a32852222829cfba762a870c1473fad6
$ bashbrew cat --format '{{ .ArchGitChecksum arch .TagEntry }}' varnish:stable
3aef5ac859b23d65dfe5e9f2a47750e9a32852222829cfba762a870c1473fad6
```
(Choosing `varnish:stable` there because it currently has [some 100% valid dangling symlinks](6b1c6ffedc/stable/debian/scripts) that tripped up my code beautifully 💕)
From a performance perspective (which was the original reason for looking into / implementing this), running the `meta-scripts/sources.sh` script against `--all` vs this, my local system gets ~18.5m vs ~4.5m (faster being this new pure-Go implementation).