Merge pull request #23251 from containers/renovate/github.com-cyphar-filepath-securejoin-0.x

Update module github.com/cyphar/filepath-securejoin to v0.3.0
This commit is contained in:
openshift-merge-bot[bot] 2024-07-11 18:34:16 +00:00 committed by GitHub
commit fe65b5873f
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
15 changed files with 1554 additions and 32 deletions

2
go.mod
View File

@ -27,7 +27,7 @@ require (
github.com/coreos/stream-metadata-go v0.4.4
github.com/crc-org/crc/v2 v2.38.0
github.com/crc-org/vfkit v0.5.1
github.com/cyphar/filepath-securejoin v0.2.5
github.com/cyphar/filepath-securejoin v0.3.0
github.com/digitalocean/go-qemu v0.0.0-20230711162256-2e3d0186973e
github.com/docker/distribution v2.8.3+incompatible
github.com/docker/docker v26.1.4+incompatible

4
go.sum
View File

@ -118,8 +118,8 @@ github.com/creack/pty v1.1.18 h1:n56/Zwd5o6whRC5PMGretI4IdRLlmBXYNjScPaBgsbY=
github.com/creack/pty v1.1.18/go.mod h1:MOBLtS5ELjhRRrroQr9kyvTxUAFNvYEK993ew/Vr4O4=
github.com/cyberphone/json-canonicalization v0.0.0-20231217050601-ba74d44ecf5f h1:eHnXnuK47UlSTOQexbzxAZfekVz6i+LKRdj1CU5DPaM=
github.com/cyberphone/json-canonicalization v0.0.0-20231217050601-ba74d44ecf5f/go.mod h1:uzvlm1mxhHkdfqitSA92i7Se+S9ksOn3a3qmv/kyOCw=
github.com/cyphar/filepath-securejoin v0.2.5 h1:6iR5tXJ/e6tJZzzdMc1km3Sa7RRIVBKAK32O2s7AYfo=
github.com/cyphar/filepath-securejoin v0.2.5/go.mod h1:aPGpWjXOXUn2NCNjFvBE6aRxGGx79pTxQpKOJNYHHl4=
github.com/cyphar/filepath-securejoin v0.3.0 h1:tXpmbiaeBrS/K2US8nhgwdKYnfAOnVfkcLPKFgFHeA0=
github.com/cyphar/filepath-securejoin v0.3.0/go.mod h1:F7i41x/9cBF7lzCrVsYs9fuzwRZm4NQsGTBdpp6mETc=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc h1:U9qPSI2PIWSS1VwoXQT9A3Wy9MM3WgvqSxFWenqJduM=

View File

@ -1,5 +1,5 @@
Copyright (C) 2014-2015 Docker Inc & Go Authors. All rights reserved.
Copyright (C) 2017 SUSE LLC. All rights reserved.
Copyright (C) 2017-2024 SUSE LLC. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are

View File

@ -2,31 +2,24 @@
[![Build Status](https://github.com/cyphar/filepath-securejoin/actions/workflows/ci.yml/badge.svg)](https://github.com/cyphar/filepath-securejoin/actions/workflows/ci.yml)
An implementation of `SecureJoin`, a [candidate for inclusion in the Go
standard library][go#20126]. The purpose of this function is to be a "secure"
alternative to `filepath.Join`, and in particular it provides certain
guarantees that are not provided by `filepath.Join`.
### Old API ###
> **NOTE**: This code is *only* safe if you are not at risk of other processes
> modifying path components after you've used `SecureJoin`. If it is possible
> for a malicious process to modify path components of the resolved path, then
> you will be vulnerable to some fairly trivial TOCTOU race conditions. [There
> are some Linux kernel patches I'm working on which might allow for a better
> solution.][lwn-obeneath]
>
> In addition, with a slightly modified API it might be possible to use
> `O_PATH` and verify that the opened path is actually the resolved one -- but
> I have not done that yet. I might add it in the future as a helper function
> to help users verify the path (we can't just return `/proc/self/fd/<foo>`
> because that doesn't always work transparently for all users).
This library was originally just an implementation of `SecureJoin` which was
[intended to be included in the Go standard library][go#20126] as a safer
`filepath.Join` that would restrict the path lookup to be inside a root
directory.
This is the function prototype:
The implementation was based on code that existed in several container
runtimes. Unfortunately, this API is **fundamentally unsafe** against attackers
that can modify path components after `SecureJoin` returns and before the
caller uses the path, allowing for some fairly trivial TOCTOU attacks.
```go
func SecureJoin(root, unsafePath string) (string, error)
```
`SecureJoin` (and `SecureJoinVFS`) are still provided by this library to
support legacy users, but new users are strongly suggested to avoid using
`SecureJoin` and instead use the [new api](#new-api) or switch to
[libpathrs][libpathrs].
This library **guarantees** the following:
With the above limitations in mind, this library guarantees the following:
* If no error is set, the resulting string **must** be a child path of
`root` and will not contain any symlink path components (they will all be
@ -47,7 +40,7 @@ This library **guarantees** the following:
A (trivial) implementation of this function on GNU/Linux systems could be done
with the following (note that this requires root privileges and is far more
opaque than the implementation in this library, and also requires that
`readlink` is inside the `root` path):
`readlink` is inside the `root` path and is trustworthy):
```go
package securejoin
@ -70,9 +63,105 @@ func SecureJoin(root, unsafePath string) (string, error) {
}
```
[lwn-obeneath]: https://lwn.net/Articles/767547/
[libpathrs]: https://github.com/openSUSE/libpathrs
[go#20126]: https://github.com/golang/go/issues/20126
### New API ###
While we recommend users switch to [libpathrs][libpathrs] as soon as it has a
stable release, some methods implemented by libpathrs have been ported to this
library to ease the transition. These APIs are only supported on Linux.
These APIs are implemented such that `filepath-securejoin` will
opportunistically use certain newer kernel APIs that make these operations far
more secure. In particular:
* All of the lookup operations will use [`openat2`][openat2.2] on new enough
kernels (Linux 5.6 or later) to restrict lookups through magic-links and
bind-mounts (for certain operations) and to make use of `RESOLVE_IN_ROOT` to
efficiently resolve symlinks within a rootfs.
* The APIs provide hardening against a malicious `/proc` mount to either detect
or avoid being tricked by a `/proc` that is not legitimate. This is done
using [`openat2`][openat2.2] for all users, and privileged users will also be
further protected by using [`fsopen`][fsopen.2] and [`open_tree`][open_tree.2]
(Linux 4.18 or later).
[openat2.2]: https://www.man7.org/linux/man-pages/man2/openat2.2.html
[fsopen.2]: https://github.com/brauner/man-pages-md/blob/main/fsopen.md
[open_tree.2]: https://github.com/brauner/man-pages-md/blob/main/open_tree.md
#### `OpenInRoot` ####
```go
func OpenInRoot(root, unsafePath string) (*os.File, error)
func OpenatInRoot(root *os.File, unsafePath string) (*os.File, error)
func Reopen(handle *os.File, flags int) (*os.File, error)
```
`OpenInRoot` is a much safer version of
```go
path, err := securejoin.SecureJoin(root, unsafePath)
file, err := os.OpenFile(path, unix.O_PATH|unix.O_CLOEXEC)
```
that protects against various race attacks that could lead to serious security
issues, depending on the application. Note that the returned `*os.File` is an
`O_PATH` file descriptor, which is quite restricted. Callers will probably need
to use `Reopen` to get a more usable handle (this split is done to provide
useful features like PTY spawning and to avoid users accidentally opening bad
inodes that could cause a DoS).
Callers need to be careful in how they use the returned `*os.File`. Usually it
is only safe to operate on the handle directly, and it is very easy to create a
security issue. [libpathrs][libpathrs] provides far more helpers to make using
these handles safer -- there is currently no plan to port them to
`filepath-securejoin`.
`OpenatInRoot` is like `OpenInRoot` except that the root is provided using an
`*os.File`. This allows you to ensure that multiple `OpenatInRoot` (or
`MkdirAllHandle`) calls are operating on the same rootfs.
> **NOTE**: Unlike `SecureJoin`, `OpenInRoot` will error out as soon as it hits
> a dangling symlink or non-existent path. This is in contrast to `SecureJoin`
> which treated non-existent components as though they were real directories,
> and would allow for partial resolution of dangling symlinks. These behaviours
> are at odds with how Linux treats non-existent paths and dangling symlinks,
> and so these are no longer allowed.
#### `MkdirAll` ####
```go
func MkdirAll(root, unsafePath string, mode int) error
func MkdirAllHandle(root *os.File, unsafePath string, mode int) (*os.File, error)
```
`MkdirAll` is a much safer version of
```go
path, err := securejoin.SecureJoin(root, unsafePath)
err = os.MkdirAll(path, mode)
```
that protects against the same kinds of races that `OpenInRoot` protects
against.
`MkdirAllHandle` is like `MkdirAll` except that the root is provided using an
`*os.File` (the reason for this is the same as with `OpenatInRoot`) and an
`*os.File` of the final created directory is returned (this directory is
guaranteed to be effectively identical to the directory created by
`MkdirAllHandle`, which is not possible to ensure by just using `OpenatInRoot`
after `MkdirAll`).
> **NOTE**: Unlike `SecureJoin`, `MkdirAll` will error out as soon as it hits
> a dangling symlink or non-existent path. This is in contrast to `SecureJoin`
> which treated non-existent components as though they were real directories,
> and would allow for partial resolution of dangling symlinks. These behaviours
> are at odds with how Linux treats non-existent paths and dangling symlinks,
> and so these are no longer allowed. This means that `MkdirAll` will not
> create non-existent directories referenced by a dangling symlink.
### License ###
The license of this project is the same as Go, which is a BSD 3-clause license

View File

@ -1 +1 @@
0.2.5
0.3.0

View File

@ -1,5 +1,5 @@
// Copyright (C) 2014-2015 Docker Inc & Go Authors. All rights reserved.
// Copyright (C) 2017 SUSE LLC. All rights reserved.
// Copyright (C) 2017-2024 SUSE LLC. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
@ -41,6 +41,12 @@ func IsNotExist(err error) bool {
// replaced with symlinks on the filesystem) after this function has returned.
// Such a symlink race is necessarily out-of-scope of SecureJoin.
//
// NOTE: Due to the above limitation, Linux users are strongly encouraged to
// use OpenInRoot instead, which does safely protect against these kinds of
// attacks. There is no way to solve this problem with SecureJoinVFS because
// the API is fundamentally wrong (you cannot return a "safe" path string and
// guarantee it won't be modified afterwards).
//
// Volume names in unsafePath are always discarded, regardless if they are
// provided via direct input or when evaluating symlinks. Therefore:
//

View File

@ -0,0 +1,380 @@
//go:build linux
// Copyright (C) 2024 SUSE LLC. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package securejoin
import (
"errors"
"fmt"
"os"
"path"
"path/filepath"
"slices"
"strings"
"golang.org/x/sys/unix"
)
type symlinkStackEntry struct {
// (dir, remainingPath) is what we would've returned if the link didn't
// exist. This matches what openat2(RESOLVE_IN_ROOT) would return in
// this case.
dir *os.File
remainingPath string
// linkUnwalked is the remaining path components from the original
// Readlink which we have yet to walk. When this slice is empty, we
// drop the link from the stack.
linkUnwalked []string
}
func (se symlinkStackEntry) String() string {
return fmt.Sprintf("<%s>/%s [->%s]", se.dir.Name(), se.remainingPath, strings.Join(se.linkUnwalked, "/"))
}
func (se symlinkStackEntry) Close() {
_ = se.dir.Close()
}
type symlinkStack []*symlinkStackEntry
func (s symlinkStack) IsEmpty() bool {
return len(s) == 0
}
func (s *symlinkStack) Close() {
for _, link := range *s {
link.Close()
}
// TODO: Switch to clear once we switch to Go 1.21.
*s = nil
}
var (
errEmptyStack = errors.New("[internal] stack is empty")
errBrokenSymlinkStack = errors.New("[internal error] broken symlink stack")
)
func (s *symlinkStack) popPart(part string) error {
if s.IsEmpty() {
// If there is nothing in the symlink stack, then the part was from the
// real path provided by the user, and this is a no-op.
return errEmptyStack
}
tailEntry := (*s)[len(*s)-1]
// Double-check that we are popping the component we expect.
if len(tailEntry.linkUnwalked) == 0 {
return fmt.Errorf("%w: trying to pop component %q of empty stack entry %s", errBrokenSymlinkStack, part, tailEntry)
}
headPart := tailEntry.linkUnwalked[0]
if headPart != part {
return fmt.Errorf("%w: trying to pop component %q but the last stack entry is %s (%q)", errBrokenSymlinkStack, part, tailEntry, headPart)
}
// Drop the component, but keep the entry around in case we are dealing
// with a "tail-chained" symlink.
tailEntry.linkUnwalked = tailEntry.linkUnwalked[1:]
return nil
}
func (s *symlinkStack) PopPart(part string) error {
if err := s.popPart(part); err != nil {
if errors.Is(err, errEmptyStack) {
// Skip empty stacks.
err = nil
}
return err
}
// Clean up any of the trailing stack entries that are empty.
for lastGood := len(*s) - 1; lastGood >= 0; lastGood-- {
entry := (*s)[lastGood]
if len(entry.linkUnwalked) > 0 {
break
}
entry.Close()
(*s) = (*s)[:lastGood]
}
return nil
}
func (s *symlinkStack) push(dir *os.File, remainingPath, linkTarget string) error {
// Split the link target and clean up any "" parts.
linkTargetParts := slices.DeleteFunc(
strings.Split(linkTarget, "/"),
func(part string) bool { return part == "" })
// Don't add a no-op link to the stack. You can't create a no-op link
// symlink, but if the symlink is /, partialLookupInRoot has already jumped to the
// root and so there's nothing more to do.
if len(linkTargetParts) == 0 {
return nil
}
// Copy the directory so the caller doesn't close our copy.
dirCopy, err := dupFile(dir)
if err != nil {
return err
}
// Add to the stack.
*s = append(*s, &symlinkStackEntry{
dir: dirCopy,
remainingPath: remainingPath,
linkUnwalked: linkTargetParts,
})
return nil
}
func (s *symlinkStack) SwapLink(linkPart string, dir *os.File, remainingPath, linkTarget string) error {
// If we are currently inside a symlink resolution, remove the symlink
// component from the last symlink entry, but don't remove the entry even
// if it's empty. If we are a "tail-chained" symlink (a trailing symlink we
// hit during a symlink resolution) we need to keep the old symlink until
// we finish the resolution.
if err := s.popPart(linkPart); err != nil {
if !errors.Is(err, errEmptyStack) {
return err
}
// Push the component regardless of whether the stack was empty.
}
return s.push(dir, remainingPath, linkTarget)
}
func (s *symlinkStack) PopTopSymlink() (*os.File, string, bool) {
if s.IsEmpty() {
return nil, "", false
}
tailEntry := (*s)[0]
*s = (*s)[1:]
return tailEntry.dir, tailEntry.remainingPath, true
}
// partialLookupInRoot tries to lookup as much of the request path as possible
// within the provided root (a-la RESOLVE_IN_ROOT) and opens the final existing
// component of the requested path, returning a file handle to the final
// existing component and a string containing the remaining path components.
func partialLookupInRoot(root *os.File, unsafePath string) (_ *os.File, _ string, Err error) {
unsafePath = filepath.ToSlash(unsafePath) // noop
// This is very similar to SecureJoin, except that we operate on the
// components using file descriptors. We then return the last component we
// managed open, along with the remaining path components not opened.
// Try to use openat2 if possible.
if hasOpenat2() {
return partialLookupOpenat2(root, unsafePath)
}
// Get the "actual" root path from /proc/self/fd. This is necessary if the
// root is some magic-link like /proc/$pid/root, in which case we want to
// make sure when we do checkProcSelfFdPath that we are using the correct
// root path.
logicalRootPath, err := procSelfFdReadlink(root)
if err != nil {
return nil, "", fmt.Errorf("get real root path: %w", err)
}
currentDir, err := dupFile(root)
if err != nil {
return nil, "", fmt.Errorf("clone root fd: %w", err)
}
defer func() {
if Err != nil {
_ = currentDir.Close()
}
}()
// symlinkStack is used to emulate how openat2(RESOLVE_IN_ROOT) treats
// dangling symlinks. If we hit a non-existent path while resolving a
// symlink, we need to return the (dir, remainingPath) that we had when we
// hit the symlink (treating the symlink as though it were a regular file).
// The set of (dir, remainingPath) sets is stored within the symlinkStack
// and we add and remove parts when we hit symlink and non-symlink
// components respectively. We need a stack because of recursive symlinks
// (symlinks that contain symlink components in their target).
//
// Note that the stack is ONLY used for book-keeping. All of the actual
// path walking logic is still based on currentPath/remainingPath and
// currentDir (as in SecureJoin).
var symlinkStack symlinkStack
defer symlinkStack.Close()
var (
linksWalked int
currentPath string
remainingPath = unsafePath
)
for remainingPath != "" {
// Save the current remaining path so if the part is not real we can
// return the path including the component.
oldRemainingPath := remainingPath
// Get the next path component.
var part string
if i := strings.IndexByte(remainingPath, '/'); i == -1 {
part, remainingPath = remainingPath, ""
} else {
part, remainingPath = remainingPath[:i], remainingPath[i+1:]
}
// Skip any "//" components.
if part == "" {
continue
}
// Apply the component lexically to the path we are building.
// currentPath does not contain any symlinks, and we are lexically
// dealing with a single component, so it's okay to do a filepath.Clean
// here.
nextPath := path.Join("/", currentPath, part)
// If we logically hit the root, just clone the root rather than
// opening the part and doing all of the other checks.
if nextPath == "/" {
if err := symlinkStack.PopPart(part); err != nil {
return nil, "", fmt.Errorf("walking into root with part %q failed: %w", part, err)
}
// Jump to root.
rootClone, err := dupFile(root)
if err != nil {
return nil, "", fmt.Errorf("clone root fd: %w", err)
}
_ = currentDir.Close()
currentDir = rootClone
currentPath = nextPath
continue
}
// Try to open the next component.
nextDir, err := openatFile(currentDir, part, unix.O_PATH|unix.O_NOFOLLOW|unix.O_CLOEXEC, 0)
switch {
case err == nil:
st, err := nextDir.Stat()
if err != nil {
_ = nextDir.Close()
return nil, "", fmt.Errorf("stat component %q: %w", part, err)
}
switch st.Mode() & os.ModeType {
case os.ModeDir:
// If we are dealing with a directory, simply walk into it.
_ = currentDir.Close()
currentDir = nextDir
currentPath = nextPath
// The part was real, so drop it from the symlink stack.
if err := symlinkStack.PopPart(part); err != nil {
return nil, "", fmt.Errorf("walking into directory %q failed: %w", part, err)
}
// If we are operating on a .., make sure we haven't escaped.
// We only have to check for ".." here because walking down
// into a regular component component cannot cause you to
// escape. This mirrors the logic in RESOLVE_IN_ROOT, except we
// have to check every ".." rather than only checking after a
// rename or mount on the system.
if part == ".." {
// Make sure the root hasn't moved.
if err := checkProcSelfFdPath(logicalRootPath, root); err != nil {
return nil, "", fmt.Errorf("root path moved during lookup: %w", err)
}
// Make sure the path is what we expect.
fullPath := logicalRootPath + nextPath
if err := checkProcSelfFdPath(fullPath, currentDir); err != nil {
return nil, "", fmt.Errorf("walking into %q had unexpected result: %w", part, err)
}
}
case os.ModeSymlink:
// We don't need the handle anymore.
_ = nextDir.Close()
// Unfortunately, we cannot readlink through our handle and so
// we need to do a separate readlinkat (which could race to
// give us an error if the attacker swapped the symlink with a
// non-symlink).
linkDest, err := readlinkatFile(currentDir, part)
if err != nil {
if errors.Is(err, unix.EINVAL) {
// The part was not a symlink, so assume that it's a
// regular file. It is possible for it to be a
// directory (if an attacker is swapping a directory
// and non-directory at this subpath) but erroring out
// here is better anyway.
err = fmt.Errorf("%w: path component %q is invalid: %w", errPossibleAttack, part, unix.ENOTDIR)
}
return nil, "", err
}
linksWalked++
if linksWalked > maxSymlinkLimit {
return nil, "", &os.PathError{Op: "partialLookupInRoot", Path: logicalRootPath + "/" + unsafePath, Err: unix.ELOOP}
}
// Swap out the symlink's component for the link entry itself.
if err := symlinkStack.SwapLink(part, currentDir, oldRemainingPath, linkDest); err != nil {
return nil, "", fmt.Errorf("walking into symlink %q failed: push symlink: %w", part, err)
}
// Update our logical remaining path.
remainingPath = linkDest + "/" + remainingPath
// Absolute symlinks reset any work we've already done.
if path.IsAbs(linkDest) {
// Jump to root.
rootClone, err := dupFile(root)
if err != nil {
return nil, "", fmt.Errorf("clone root fd: %w", err)
}
_ = currentDir.Close()
currentDir = rootClone
currentPath = "/"
}
default:
// For any other file type, we can't walk further and so we've
// hit the end of the lookup. The handling is very similar to
// ENOENT from openat(2), except that we return a handle to the
// component we just walked into (and we drop the component
// from the symlink stack).
_ = currentDir.Close()
// The part existed, so drop it from the symlink stack.
if err := symlinkStack.PopPart(part); err != nil {
return nil, "", fmt.Errorf("walking into non-directory %q failed: %w", part, err)
}
// If there are any remaining components in the symlink stack,
// we are still within a symlink resolution and thus we hit a
// dangling symlink. So pretend that the first symlink in the
// stack we hit was an ENOENT (to match openat2).
if oldDir, remainingPath, ok := symlinkStack.PopTopSymlink(); ok {
_ = nextDir.Close()
return oldDir, remainingPath, nil
}
// The current component exists, so return it.
return nextDir, remainingPath, nil
}
case errors.Is(err, os.ErrNotExist):
// If there are any remaining components in the symlink stack, we
// are still within a symlink resolution and thus we hit a dangling
// symlink. So pretend that the first symlink in the stack we hit
// was an ENOENT (to match openat2).
if oldDir, remainingPath, ok := symlinkStack.PopTopSymlink(); ok {
_ = currentDir.Close()
return oldDir, remainingPath, nil
}
// We have hit a final component that doesn't exist, so we have our
// partial open result. Note that we have to use the OLD remaining
// path, since the lookup failed.
return currentDir, oldRemainingPath, nil
default:
return nil, "", err
}
}
// All of the components existed!
return currentDir, "", nil
}

View File

@ -0,0 +1,228 @@
//go:build linux
// Copyright (C) 2024 SUSE LLC. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package securejoin
import (
"errors"
"fmt"
"io"
"os"
"path/filepath"
"slices"
"strings"
"golang.org/x/sys/unix"
)
var (
errInvalidMode = errors.New("invalid permission mode")
errPossibleAttack = errors.New("possible attack detected")
)
// MkdirAllHandle is equivalent to MkdirAll, except that it is safer to use in
// two respects:
//
// - The caller provides the root directory as an *os.File (preferably O_PATH)
// handle. This means that the caller can be sure which root directory is
// being used. Note that this can be emulated by using /proc/self/fd/... as
// the root path with MkdirAll.
//
// - Once all of the directories have been created, an *os.File (O_PATH) handle
// to the directory at unsafePath is returned to the caller. This is done in
// an effectively-race-free way (an attacker would only be able to swap the
// final directory component), which is not possible to emulate with
// MkdirAll.
//
// In addition, the returned handle is obtained far more efficiently than doing
// a brand new lookup of unsafePath (such as with SecureJoin or openat2) after
// doing MkdirAll. If you intend to open the directory after creating it, you
// should use MkdirAllHandle.
func MkdirAllHandle(root *os.File, unsafePath string, mode int) (_ *os.File, Err error) {
// Make sure there are no os.FileMode bits set.
if mode&^0o7777 != 0 {
return nil, fmt.Errorf("%w for mkdir 0o%.3o", errInvalidMode, mode)
}
// Try to open as much of the path as possible.
currentDir, remainingPath, err := partialLookupInRoot(root, unsafePath)
if err != nil {
return nil, fmt.Errorf("find existing subpath of %q: %w", unsafePath, err)
}
defer func() {
if Err != nil {
_ = currentDir.Close()
}
}()
// If there is an attacker deleting directories as we walk into them,
// detect this proactively. Note this is guaranteed to detect if the
// attacker deleted any part of the tree up to currentDir.
//
// Once we walk into a dead directory, partialLookupInRoot would not be
// able to walk further down the tree (directories must be empty before
// they are deleted), and if the attacker has removed the entire tree we
// can be sure that anything that was originally inside a dead directory
// must also be deleted and thus is a dead directory in its own right.
//
// This is mostly a quality-of-life check, because mkdir will simply fail
// later if the attacker deletes the tree after this check.
if err := isDeadInode(currentDir); err != nil {
return nil, fmt.Errorf("finding existing subpath of %q: %w", unsafePath, err)
}
// Re-open the path to match the O_DIRECTORY reopen loop later (so that we
// always return a non-O_PATH handle). We also check that we actually got a
// directory.
if reopenDir, err := Reopen(currentDir, unix.O_DIRECTORY|unix.O_CLOEXEC); errors.Is(err, unix.ENOTDIR) {
return nil, fmt.Errorf("cannot create subdirectories in %q: %w", currentDir.Name(), unix.ENOTDIR)
} else if err != nil {
return nil, fmt.Errorf("re-opening handle to %q: %w", currentDir.Name(), err)
} else {
currentDir = reopenDir
}
remainingParts := strings.Split(remainingPath, string(filepath.Separator))
if slices.Contains(remainingParts, "..") {
// The path contained ".." components after the end of the "real"
// components. We could try to safely resolve ".." here but that would
// add a bunch of extra logic for something that it's not clear even
// needs to be supported. So just return an error.
//
// If we do filepath.Clean(remainingPath) then we end up with the
// problem that ".." can erase a trailing dangling symlink and produce
// a path that doesn't quite match what the user asked for.
return nil, fmt.Errorf("%w: yet-to-be-created path %q contains '..' components", unix.ENOENT, remainingPath)
}
// Make sure the mode doesn't have any type bits.
mode &^= unix.S_IFMT
// What properties do we expect any newly created directories to have?
var (
// While umask(2) is a per-thread property, and thus this value could
// vary between threads, a functioning Go program would LockOSThread
// threads with different umasks and so we don't need to LockOSThread
// for this entire mkdirat loop (if we are in the locked thread with a
// different umask, we are already locked and there's nothing for us to
// do -- and if not then it doesn't matter which thread we run on and
// there's nothing for us to do).
expectedMode = uint32(unix.S_IFDIR | (mode &^ getUmask()))
// We would want to get the fs[ug]id here, but we can't access those
// from userspace. In practice, nobody uses setfs[ug]id() anymore, so
// just use the effective [ug]id (which is equivalent to the fs[ug]id
// for programs that don't use setfs[ug]id).
expectedUid = uint32(unix.Geteuid())
expectedGid = uint32(unix.Getegid())
)
// Create the remaining components.
for _, part := range remainingParts {
switch part {
case "", ".":
// Skip over no-op paths.
continue
}
// NOTE: mkdir(2) will not follow trailing symlinks, so we can safely
// create the finaly component without worrying about symlink-exchange
// attacks.
if err := unix.Mkdirat(int(currentDir.Fd()), part, uint32(mode)); err != nil {
err = &os.PathError{Op: "mkdirat", Path: currentDir.Name() + "/" + part, Err: err}
// Make the error a bit nicer if the directory is dead.
if err2 := isDeadInode(currentDir); err2 != nil {
err = fmt.Errorf("%w (%w)", err, err2)
}
return nil, err
}
// Get a handle to the next component. O_DIRECTORY means we don't need
// to use O_PATH.
var nextDir *os.File
if hasOpenat2() {
nextDir, err = openat2File(currentDir, part, &unix.OpenHow{
Flags: unix.O_NOFOLLOW | unix.O_DIRECTORY | unix.O_CLOEXEC,
Resolve: unix.RESOLVE_BENEATH | unix.RESOLVE_NO_SYMLINKS | unix.RESOLVE_NO_XDEV,
})
} else {
nextDir, err = openatFile(currentDir, part, unix.O_NOFOLLOW|unix.O_DIRECTORY|unix.O_CLOEXEC, 0)
}
if err != nil {
return nil, err
}
_ = currentDir.Close()
currentDir = nextDir
// Make sure that the directory matches what we expect. An attacker
// could have swapped the directory between us making it and opening
// it. There's no way for us to be sure that the directory is
// _precisely_ the same as the directory we created, but if we are in
// an empty directory with the same owner and mode as the one we
// created then there is nothing the attacker could do with this new
// directory that they couldn't do with the old one.
if stat, err := fstat(currentDir); err != nil {
return nil, fmt.Errorf("check newly created directory: %w", err)
} else {
if stat.Mode != expectedMode {
return nil, fmt.Errorf("%w: newly created directory %q has incorrect mode 0o%.3o (expected 0o%.3o)", errPossibleAttack, currentDir.Name(), stat.Mode, expectedMode)
}
if stat.Uid != expectedUid || stat.Gid != expectedGid {
return nil, fmt.Errorf("%w: newly created directory %q has incorrect owner %d:%d (expected %d:%d)", errPossibleAttack, currentDir.Name(), stat.Uid, stat.Gid, expectedUid, expectedGid)
}
// Check that the directory is empty. We only need to check for
// a single entry, and we should get EOF if the directory is
// empty.
_, err := currentDir.Readdirnames(1)
if !errors.Is(err, io.EOF) {
if err == nil {
err = fmt.Errorf("%w: newly created directory %q is non-empty", errPossibleAttack, currentDir.Name())
}
return nil, fmt.Errorf("check if newly created directory %q is empty: %w", currentDir.Name(), err)
}
// Reset the offset.
_, _ = currentDir.Seek(0, unix.SEEK_SET)
}
}
return currentDir, nil
}
// MkdirAll is a race-safe alternative to the Go stdlib's os.MkdirAll function,
// where the new directory is guaranteed to be within the root directory (if an
// attacker can move directories from inside the root to outside the root, the
// created directory tree might be outside of the root but the key constraint
// is that at no point will we walk outside of the directory tree we are
// creating).
//
// Effectively, MkdirAll(root, unsafePath, mode) is equivalent to
//
// path, _ := securejoin.SecureJoin(root, unsafePath)
// err := os.MkdirAll(path, mode)
//
// But is much safer. The above implementation is unsafe because if an attacker
// can modify the filesystem tree between SecureJoin and MkdirAll, it is
// possible for MkdirAll to resolve unsafe symlink components and create
// directories outside of the root.
//
// If you plan to open the directory after you have created it or want to use
// an open directory handle as the root, you should use MkdirAllHandle instead.
// This function is a wrapper around MkdirAllHandle.
//
// NOTE: The mode argument must be set the unix mode bits (unix.S_I...), not
// the Go generic mode bits (os.Mode...).
func MkdirAll(root, unsafePath string, mode int) error {
rootDir, err := os.OpenFile(root, unix.O_PATH|unix.O_DIRECTORY|unix.O_CLOEXEC, 0)
if err != nil {
return err
}
defer rootDir.Close()
f, err := MkdirAllHandle(rootDir, unsafePath, mode)
if err != nil {
return err
}
_ = f.Close()
return nil
}

View File

@ -0,0 +1,83 @@
//go:build linux
// Copyright (C) 2024 SUSE LLC. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package securejoin
import (
"fmt"
"os"
"golang.org/x/sys/unix"
)
// OpenatInRoot is equivalent to OpenInRoot, except that the root is provided
// using an *os.File handle, to ensure that the correct root directory is used.
func OpenatInRoot(root *os.File, unsafePath string) (*os.File, error) {
handle, remainingPath, err := partialLookupInRoot(root, unsafePath)
if err != nil {
return nil, err
}
if remainingPath != "" {
_ = handle.Close()
return nil, &os.PathError{Op: "securejoin.OpenInRoot", Path: unsafePath, Err: unix.ENOENT}
}
return handle, nil
}
// OpenInRoot safely opens the provided unsafePath within the root.
// Effectively, OpenInRoot(root, unsafePath) is equivalent to
//
// path, _ := securejoin.SecureJoin(root, unsafePath)
// handle, err := os.OpenFile(path, unix.O_PATH|unix.O_CLOEXEC)
//
// But is much safer. The above implementation is unsafe because if an attacker
// can modify the filesystem tree between SecureJoin and OpenFile, it is
// possible for the returned file to be outside of the root.
//
// Note that the returned handle is an O_PATH handle, meaning that only a very
// limited set of operations will work on the handle. This is done to avoid
// accidentally opening an untrusted file that could cause issues (such as a
// disconnected TTY that could cause a DoS, or some other issue). In order to
// use the returned handle, you can "upgrade" it to a proper handle using
// Reopen.
func OpenInRoot(root, unsafePath string) (*os.File, error) {
rootDir, err := os.OpenFile(root, unix.O_PATH|unix.O_DIRECTORY|unix.O_CLOEXEC, 0)
if err != nil {
return nil, err
}
defer rootDir.Close()
return OpenatInRoot(rootDir, unsafePath)
}
// Reopen takes an *os.File handle and re-opens it through /proc/self/fd.
// Reopen(file, flags) is effectively equivalent to
//
// fdPath := fmt.Sprintf("/proc/self/fd/%d", file.Fd())
// os.OpenFile(fdPath, flags|unix.O_CLOEXEC)
//
// But with some extra hardenings to ensure that we are not tricked by a
// maliciously-configured /proc mount. While this attack scenario is not
// common, in container runtimes it is possible for higher-level runtimes to be
// tricked into configuring an unsafe /proc that can be used to attack file
// operations. See CVE-2019-19921 for more details.
func Reopen(handle *os.File, flags int) (*os.File, error) {
procRoot, err := getProcRoot()
if err != nil {
return nil, err
}
flags |= unix.O_CLOEXEC
fdPath := fmt.Sprintf("fd/%d", handle.Fd())
return doProcSelfMagiclink(procRoot, fdPath, func(procDirHandle *os.File, base string) (*os.File, error) {
// Rather than just wrapping openatFile, open-code it so we can copy
// handle.Name().
reopenFd, err := unix.Openat(int(procDirHandle.Fd()), base, flags, 0)
if err != nil {
return nil, fmt.Errorf("reopen fd %d: %w", handle.Fd(), err)
}
return os.NewFile(uintptr(reopenFd), handle.Name()), nil
})
}

View File

@ -0,0 +1,128 @@
//go:build linux
// Copyright (C) 2024 SUSE LLC. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package securejoin
import (
"errors"
"fmt"
"os"
"path/filepath"
"strings"
"sync"
"testing"
"golang.org/x/sys/unix"
)
var (
hasOpenat2Bool bool
hasOpenat2Once sync.Once
testingForceHasOpenat2 *bool
)
func hasOpenat2() bool {
if testing.Testing() && testingForceHasOpenat2 != nil {
return *testingForceHasOpenat2
}
hasOpenat2Once.Do(func() {
fd, err := unix.Openat2(unix.AT_FDCWD, ".", &unix.OpenHow{
Flags: unix.O_PATH | unix.O_CLOEXEC,
Resolve: unix.RESOLVE_NO_SYMLINKS | unix.RESOLVE_IN_ROOT,
})
if err == nil {
hasOpenat2Bool = true
_ = unix.Close(fd)
}
})
return hasOpenat2Bool
}
func scopedLookupShouldRetry(how *unix.OpenHow, err error) bool {
// RESOLVE_IN_ROOT (and RESOLVE_BENEATH) can return -EAGAIN if we resolve
// ".." while a mount or rename occurs anywhere on the system. This could
// happen spuriously, or as the result of an attacker trying to mess with
// us during lookup.
//
// In addition, scoped lookups have a "safety check" at the end of
// complete_walk which will return -EXDEV if the final path is not in the
// root.
return how.Resolve&(unix.RESOLVE_IN_ROOT|unix.RESOLVE_BENEATH) != 0 &&
(errors.Is(err, unix.EAGAIN) || errors.Is(err, unix.EXDEV))
}
const scopedLookupMaxRetries = 10
func openat2File(dir *os.File, path string, how *unix.OpenHow) (*os.File, error) {
fullPath := dir.Name() + "/" + path
// Make sure we always set O_CLOEXEC.
how.Flags |= unix.O_CLOEXEC
var tries int
for tries < scopedLookupMaxRetries {
fd, err := unix.Openat2(int(dir.Fd()), path, how)
if err != nil {
if scopedLookupShouldRetry(how, err) {
// We retry a couple of times to avoid the spurious errors, and
// if we are being attacked then returning -EAGAIN is the best
// we can do.
tries++
continue
}
return nil, &os.PathError{Op: "openat2", Path: fullPath, Err: err}
}
// If we are using RESOLVE_IN_ROOT, the name we generated may be wrong.
// NOTE: The procRoot code MUST NOT use RESOLVE_IN_ROOT, otherwise
// you'll get infinite recursion here.
if how.Resolve&unix.RESOLVE_IN_ROOT == unix.RESOLVE_IN_ROOT {
if actualPath, err := rawProcSelfFdReadlink(fd); err == nil {
fullPath = actualPath
}
}
return os.NewFile(uintptr(fd), fullPath), nil
}
return nil, &os.PathError{Op: "openat2", Path: fullPath, Err: errPossibleAttack}
}
// partialLookupOpenat2 is an alternative implementation of
// partialLookupInRoot, using openat2(RESOLVE_IN_ROOT) to more safely get a
// handle to the deepest existing child of the requested path within the root.
func partialLookupOpenat2(root *os.File, unsafePath string) (*os.File, string, error) {
// TODO: Implement this as a git-bisect-like binary search.
unsafePath = filepath.ToSlash(unsafePath) // noop
endIdx := len(unsafePath)
for endIdx > 0 {
subpath := unsafePath[:endIdx]
handle, err := openat2File(root, subpath, &unix.OpenHow{
Flags: unix.O_PATH | unix.O_CLOEXEC,
Resolve: unix.RESOLVE_IN_ROOT | unix.RESOLVE_NO_MAGICLINKS,
})
if err == nil {
// Jump over the slash if we have a non-"" remainingPath.
if endIdx < len(unsafePath) {
endIdx += 1
}
// We found a subpath!
return handle, unsafePath[endIdx:], nil
}
if errors.Is(err, unix.ENOENT) || errors.Is(err, unix.ENOTDIR) {
// That path doesn't exist, let's try the next directory up.
endIdx = strings.LastIndexByte(subpath, '/')
continue
}
return nil, "", fmt.Errorf("open subpath: %w", err)
}
// If we couldn't open anything, the whole subpath is missing. Return a
// copy of the root fd so that the caller doesn't close this one by
// accident.
rootClone, err := dupFile(root)
if err != nil {
return nil, "", err
}
return rootClone, unsafePath, nil
}

View File

@ -0,0 +1,59 @@
//go:build linux
// Copyright (C) 2024 SUSE LLC. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package securejoin
import (
"os"
"path/filepath"
"golang.org/x/sys/unix"
)
func dupFile(f *os.File) (*os.File, error) {
fd, err := unix.FcntlInt(f.Fd(), unix.F_DUPFD_CLOEXEC, 0)
if err != nil {
return nil, os.NewSyscallError("fcntl(F_DUPFD_CLOEXEC)", err)
}
return os.NewFile(uintptr(fd), f.Name()), nil
}
func openatFile(dir *os.File, path string, flags int, mode int) (*os.File, error) {
// Make sure we always set O_CLOEXEC.
flags |= unix.O_CLOEXEC
fd, err := unix.Openat(int(dir.Fd()), path, flags, uint32(mode))
if err != nil {
return nil, &os.PathError{Op: "openat", Path: dir.Name() + "/" + path, Err: err}
}
// All of the paths we use with openatFile(2) are guaranteed to be
// lexically safe, so we can use path.Join here.
fullPath := filepath.Join(dir.Name(), path)
return os.NewFile(uintptr(fd), fullPath), nil
}
func fstatatFile(dir *os.File, path string, flags int) (unix.Stat_t, error) {
var stat unix.Stat_t
if err := unix.Fstatat(int(dir.Fd()), path, &stat, flags); err != nil {
return stat, &os.PathError{Op: "fstatat", Path: dir.Name() + "/" + path, Err: err}
}
return stat, nil
}
func readlinkatFile(dir *os.File, path string) (string, error) {
size := 4096
for {
linkBuf := make([]byte, size)
n, err := unix.Readlinkat(int(dir.Fd()), path, linkBuf)
if err != nil {
return "", &os.PathError{Op: "readlinkat", Path: dir.Name() + "/" + path, Err: err}
}
if n != size {
return string(linkBuf[:n]), nil
}
// Possible truncation, resize the buffer.
size *= 2
}
}

View File

@ -0,0 +1,481 @@
//go:build linux
// Copyright (C) 2024 SUSE LLC. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package securejoin
import (
"errors"
"fmt"
"os"
"path/filepath"
"runtime"
"strconv"
"sync"
"golang.org/x/sys/unix"
)
func fstat(f *os.File) (unix.Stat_t, error) {
var stat unix.Stat_t
if err := unix.Fstat(int(f.Fd()), &stat); err != nil {
return stat, &os.PathError{Op: "fstat", Path: f.Name(), Err: err}
}
return stat, nil
}
func fstatfs(f *os.File) (unix.Statfs_t, error) {
var statfs unix.Statfs_t
if err := unix.Fstatfs(int(f.Fd()), &statfs); err != nil {
return statfs, &os.PathError{Op: "fstatfs", Path: f.Name(), Err: err}
}
return statfs, nil
}
// The kernel guarantees that the root inode of a procfs mount has an
// f_type of PROC_SUPER_MAGIC and st_ino of PROC_ROOT_INO.
const (
procSuperMagic = 0x9fa0 // PROC_SUPER_MAGIC
procRootIno = 1 // PROC_ROOT_INO
)
func verifyProcRoot(procRoot *os.File) error {
if statfs, err := fstatfs(procRoot); err != nil {
return err
} else if statfs.Type != procSuperMagic {
return fmt.Errorf("%w: incorrect procfs root filesystem type 0x%x", errUnsafeProcfs, statfs.Type)
}
if stat, err := fstat(procRoot); err != nil {
return err
} else if stat.Ino != procRootIno {
return fmt.Errorf("%w: incorrect procfs root inode number %d", errUnsafeProcfs, stat.Ino)
}
return nil
}
var (
hasNewMountApiBool bool
hasNewMountApiOnce sync.Once
)
func hasNewMountApi() bool {
hasNewMountApiOnce.Do(func() {
// All of the pieces of the new mount API we use (fsopen, fsconfig,
// fsmount, open_tree) were added together in Linux 5.1[1,2], so we can
// just check for one of the syscalls and the others should also be
// available.
//
// Just try to use open_tree(2) to open a file without OPEN_TREE_CLONE.
// This is equivalent to openat(2), but tells us if open_tree is
// available (and thus all of the other basic new mount API syscalls).
// open_tree(2) is most light-weight syscall to test here.
//
// [1]: merge commit 400913252d09
// [2]: <https://lore.kernel.org/lkml/153754740781.17872.7869536526927736855.stgit@warthog.procyon.org.uk/>
fd, err := unix.OpenTree(-int(unix.EBADF), "/", unix.OPEN_TREE_CLOEXEC)
if err == nil {
hasNewMountApiBool = true
_ = unix.Close(fd)
}
})
return hasNewMountApiBool
}
func fsopen(fsName string, flags int) (*os.File, error) {
// Make sure we always set O_CLOEXEC.
flags |= unix.FSOPEN_CLOEXEC
fd, err := unix.Fsopen(fsName, flags)
if err != nil {
return nil, os.NewSyscallError("fsopen "+fsName, err)
}
return os.NewFile(uintptr(fd), "fscontext:"+fsName), nil
}
func fsmount(ctx *os.File, flags, mountAttrs int) (*os.File, error) {
// Make sure we always set O_CLOEXEC.
flags |= unix.FSMOUNT_CLOEXEC
fd, err := unix.Fsmount(int(ctx.Fd()), flags, mountAttrs)
if err != nil {
return nil, os.NewSyscallError("fsmount "+ctx.Name(), err)
}
return os.NewFile(uintptr(fd), "fsmount:"+ctx.Name()), nil
}
func newPrivateProcMount() (*os.File, error) {
procfsCtx, err := fsopen("proc", unix.FSOPEN_CLOEXEC)
if err != nil {
return nil, err
}
defer procfsCtx.Close()
// Try to configure hidepid=ptraceable,subset=pid if possible, but ignore errors.
_ = unix.FsconfigSetString(int(procfsCtx.Fd()), "hidepid", "ptraceable")
_ = unix.FsconfigSetString(int(procfsCtx.Fd()), "subset", "pid")
// Get an actual handle.
if err := unix.FsconfigCreate(int(procfsCtx.Fd())); err != nil {
return nil, os.NewSyscallError("fsconfig create procfs", err)
}
return fsmount(procfsCtx, unix.FSMOUNT_CLOEXEC, unix.MS_RDONLY|unix.MS_NODEV|unix.MS_NOEXEC|unix.MS_NOSUID)
}
func openTree(dir *os.File, path string, flags uint) (*os.File, error) {
dirFd := -int(unix.EBADF)
dirName := "."
if dir != nil {
dirFd = int(dir.Fd())
dirName = dir.Name()
}
// Make sure we always set O_CLOEXEC.
flags |= unix.OPEN_TREE_CLOEXEC
fd, err := unix.OpenTree(dirFd, path, flags)
if err != nil {
return nil, &os.PathError{Op: "open_tree", Path: path, Err: err}
}
return os.NewFile(uintptr(fd), dirName+"/"+path), nil
}
func clonePrivateProcMount() (_ *os.File, Err error) {
// Try to make a clone without using AT_RECURSIVE if we can. If this works,
// we can be sure there are no over-mounts and so if the root is valid then
// we're golden. Otherwise, we have to deal with over-mounts.
procfsHandle, err := openTree(nil, "/proc", unix.OPEN_TREE_CLONE)
if err != nil || testingForcePrivateProcRootOpenTreeAtRecursive(procfsHandle) {
procfsHandle, err = openTree(nil, "/proc", unix.OPEN_TREE_CLONE|unix.AT_RECURSIVE)
}
if err != nil {
return nil, fmt.Errorf("creating a detached procfs clone: %w", err)
}
defer func() {
if Err != nil {
_ = procfsHandle.Close()
}
}()
if err := verifyProcRoot(procfsHandle); err != nil {
return nil, err
}
return procfsHandle, nil
}
func privateProcRoot() (*os.File, error) {
if !hasNewMountApi() {
return nil, fmt.Errorf("new mount api: %w", unix.ENOTSUP)
}
// Try to create a new procfs mount from scratch if we can. This ensures we
// can get a procfs mount even if /proc is fake (for whatever reason).
procRoot, err := newPrivateProcMount()
if err != nil || testingForcePrivateProcRootOpenTree(procRoot) {
// Try to clone /proc then...
procRoot, err = clonePrivateProcMount()
}
return procRoot, err
}
var (
procRootHandle *os.File
procRootError error
procRootOnce sync.Once
errUnsafeProcfs = errors.New("unsafe procfs detected")
)
func unsafeHostProcRoot() (_ *os.File, Err error) {
procRoot, err := os.OpenFile("/proc", unix.O_PATH|unix.O_NOFOLLOW|unix.O_DIRECTORY|unix.O_CLOEXEC, 0)
if err != nil {
return nil, err
}
defer func() {
if Err != nil {
_ = procRoot.Close()
}
}()
if err := verifyProcRoot(procRoot); err != nil {
return nil, err
}
return procRoot, nil
}
func doGetProcRoot() (*os.File, error) {
procRoot, err := privateProcRoot()
if err != nil || testingForceGetProcRootUnsafe(procRoot) {
// Fall back to using a /proc handle if making a private mount failed.
// If we have openat2, at least we can avoid some kinds of over-mount
// attacks, but without openat2 there's not much we can do.
procRoot, err = unsafeHostProcRoot()
}
return procRoot, err
}
func getProcRoot() (*os.File, error) {
procRootOnce.Do(func() {
procRootHandle, procRootError = doGetProcRoot()
})
return procRootHandle, procRootError
}
var (
haveProcThreadSelf bool
haveProcThreadSelfOnce sync.Once
)
type procThreadSelfCloser func()
// procThreadSelf returns a handle to /proc/thread-self/<subpath> (or an
// equivalent handle on older kernels where /proc/thread-self doesn't exist).
// Once finished with the handle, you must call the returned closer function
// (runtime.UnlockOSThread). You must not pass the returned *os.File to other
// Go threads or use the handle after calling the closer.
//
// This is similar to ProcThreadSelf from runc, but with extra hardening
// applied and using *os.File.
func procThreadSelf(procRoot *os.File, subpath string) (_ *os.File, _ procThreadSelfCloser, Err error) {
haveProcThreadSelfOnce.Do(func() {
// If the kernel doesn't support thread-self, it doesn't matter which
// /proc handle we use.
_, err := fstatatFile(procRoot, "thread-self", unix.AT_SYMLINK_NOFOLLOW)
haveProcThreadSelf = (err == nil)
})
// We need to lock our thread until the caller is done with the handle
// because between getting the handle and using it we could get interrupted
// by the Go runtime and hit the case where the underlying thread is
// swapped out and the original thread is killed, resulting in
// pull-your-hair-out-hard-to-debug issues in the caller.
runtime.LockOSThread()
defer func() {
if Err != nil {
runtime.UnlockOSThread()
}
}()
// Figure out what prefix we want to use.
threadSelf := "thread-self/"
if !haveProcThreadSelf || testingForceProcSelfTask() {
/// Pre-3.17 kernels don't have /proc/thread-self, so do it manually.
threadSelf = "self/task/" + strconv.Itoa(unix.Gettid()) + "/"
if _, err := fstatatFile(procRoot, threadSelf, unix.AT_SYMLINK_NOFOLLOW); err != nil || testingForceProcSelf() {
// In this case, we running in a pid namespace that doesn't match
// the /proc mount we have. This can happen inside runc.
//
// Unfortunately, there is no nice way to get the correct TID to
// use here because of the age of the kernel, so we have to just
// use /proc/self and hope that it works.
threadSelf = "self/"
}
}
// Grab the handle.
var (
handle *os.File
err error
)
if hasOpenat2() {
// We prefer being able to use RESOLVE_NO_XDEV if we can, to be
// absolutely sure we are operating on a clean /proc handle that
// doesn't have any cheeky overmounts that could trick us (including
// symlink mounts on top of /proc/thread-self). RESOLVE_BENEATH isn't
// stricly needed, but just use it since we have it.
//
// NOTE: /proc/self is technically a magic-link (the contents of the
// symlink are generated dynamically), but it doesn't use
// nd_jump_link() so RESOLVE_NO_MAGICLINKS allows it.
//
// NOTE: We MUST NOT use RESOLVE_IN_ROOT here, as openat2File uses
// procSelfFdReadlink to clean up the returned f.Name() if we use
// RESOLVE_IN_ROOT (which would lead to an infinite recursion).
handle, err = openat2File(procRoot, threadSelf+subpath, &unix.OpenHow{
Flags: unix.O_PATH | unix.O_CLOEXEC,
Resolve: unix.RESOLVE_BENEATH | unix.RESOLVE_NO_XDEV | unix.RESOLVE_NO_MAGICLINKS,
})
if err != nil {
return nil, nil, fmt.Errorf("%w: %w", errUnsafeProcfs, err)
}
} else {
handle, err = openatFile(procRoot, threadSelf+subpath, unix.O_PATH|unix.O_CLOEXEC, 0)
if err != nil {
return nil, nil, fmt.Errorf("%w: %w", errUnsafeProcfs, err)
}
defer func() {
if Err != nil {
_ = handle.Close()
}
}()
// We can't detect bind-mounts of different parts of procfs on top of
// /proc (a-la RESOLVE_NO_XDEV), but we can at least be sure that we
// aren't on the wrong filesystem here.
if statfs, err := fstatfs(handle); err != nil {
return nil, nil, err
} else if statfs.Type != procSuperMagic {
return nil, nil, fmt.Errorf("%w: incorrect /proc/self/fd filesystem type 0x%x", errUnsafeProcfs, statfs.Type)
}
}
return handle, runtime.UnlockOSThread, nil
}
var (
hasStatxMountIdBool bool
hasStatxMountIdOnce sync.Once
)
func hasStatxMountId() bool {
hasStatxMountIdOnce.Do(func() {
var (
stx unix.Statx_t
// We don't care which mount ID we get. The kernel will give us the
// unique one if it is supported.
wantStxMask uint32 = unix.STATX_MNT_ID_UNIQUE | unix.STATX_MNT_ID
)
err := unix.Statx(-int(unix.EBADF), "/", 0, int(wantStxMask), &stx)
hasStatxMountIdBool = (err == nil && (stx.Mask&wantStxMask != 0))
})
return hasStatxMountIdBool
}
func checkSymlinkOvermount(dir *os.File, path string) error {
// If we don't have statx(STATX_MNT_ID*) support, we can't do anything.
if !hasStatxMountId() {
return nil
}
var (
stx unix.Statx_t
// We don't care which mount ID we get. The kernel will give us the
// unique one if it is supported.
wantStxMask uint32 = unix.STATX_MNT_ID_UNIQUE | unix.STATX_MNT_ID
)
// Get the mntId of our procfs handle.
err := unix.Statx(int(dir.Fd()), "", unix.AT_EMPTY_PATH, int(wantStxMask), &stx)
if err != nil {
return &os.PathError{Op: "statx", Path: dir.Name(), Err: err}
}
if stx.Mask&wantStxMask == 0 {
// It's not a kernel limitation, for some reason we couldn't get a
// mount ID. Assume it's some kind of attack.
return fmt.Errorf("%w: could not get mnt id of dir %s", errUnsafeProcfs, dir.Name())
}
expectedMountId := stx.Mnt_id
// Get the mntId of the target symlink.
stx = unix.Statx_t{}
err = unix.Statx(int(dir.Fd()), path, unix.AT_SYMLINK_NOFOLLOW, int(wantStxMask), &stx)
if err != nil {
return &os.PathError{Op: "statx", Path: dir.Name() + "/" + path, Err: err}
}
if stx.Mask&wantStxMask == 0 {
// It's not a kernel limitation, for some reason we couldn't get a
// mount ID. Assume it's some kind of attack.
return fmt.Errorf("%w: could not get mnt id of symlink %s", errUnsafeProcfs, path)
}
gotMountId := stx.Mnt_id
// As long as the directory mount is alive, even with wrapping mount IDs,
// we would expect to see a different mount ID here. (Of course, if we're
// using unsafeHostProcRoot() then an attaker could change this after we
// did this check.)
if expectedMountId != gotMountId {
return fmt.Errorf("%w: symlink %s/%s has an overmount obscuring the real link (mount ids do not match %d != %d)", errUnsafeProcfs, dir.Name(), path, expectedMountId, gotMountId)
}
return nil
}
func doProcSelfMagiclink[T any](procRoot *os.File, subPath string, fn func(procDirHandle *os.File, base string) (T, error)) (T, error) {
// We cannot operate on the magic-link directly with a handle, we need to
// create a handle to the parent of the magic-link and then do
// single-component operations on it.
dir, base := filepath.Dir(subPath), filepath.Base(subPath)
procDirHandle, closer, err := procThreadSelf(procRoot, dir)
if err != nil {
return *new(T), fmt.Errorf("get safe /proc/thread-self/%s handle: %w", dir, err)
}
defer procDirHandle.Close()
defer closer()
// Try to detect if there is a mount on top of the symlink we are about to
// read. If we are using unsafeHostProcRoot(), this could change after we
// check it (and there's nothing we can do about that) but for
// privateProcRoot() this should be guaranteed to be safe (at least since
// Linux 5.12[1], when anonymous mount namespaces were completely isolated
// from external mounts including mount propagation events).
//
// [1]: Linux commit ee2e3f50629f ("mount: fix mounting of detached mounts
// onto targets that reside on shared mounts").
if err := checkSymlinkOvermount(procDirHandle, base); err != nil {
return *new(T), fmt.Errorf("check safety of %s proc magiclink: %w", subPath, err)
}
return fn(procDirHandle, base)
}
func doRawProcSelfFdReadlink(procRoot *os.File, fd int) (string, error) {
fdPath := fmt.Sprintf("fd/%d", fd)
return doProcSelfMagiclink(procRoot, fdPath, readlinkatFile)
}
func rawProcSelfFdReadlink(fd int) (string, error) {
procRoot, err := getProcRoot()
if err != nil {
return "", err
}
return doRawProcSelfFdReadlink(procRoot, fd)
}
func procSelfFdReadlink(f *os.File) (string, error) {
return rawProcSelfFdReadlink(int(f.Fd()))
}
var (
errPossibleBreakout = errors.New("possible breakout detected")
errInvalidDirectory = errors.New("wandered into deleted directory")
errDeletedInode = errors.New("cannot verify path of deleted inode")
)
func isDeadInode(file *os.File) error {
// If the nlink of a file drops to 0, there is an attacker deleting
// directories during our walk, which could result in weird /proc values.
// It's better to error out in this case.
stat, err := fstat(file)
if err != nil {
return fmt.Errorf("check for dead inode: %w", err)
}
if stat.Nlink == 0 {
err := errDeletedInode
if stat.Mode&unix.S_IFMT == unix.S_IFDIR {
err = errInvalidDirectory
}
return fmt.Errorf("%w %q", err, file.Name())
}
return nil
}
func getUmask() int {
// umask is a per-thread property, but it is inherited by children, so we
// need to lock our OS thread to make sure that no other goroutine runs in
// this thread and no goroutines are spawned from this thread until we
// revert to the old umask.
//
// We could parse /proc/self/status to avoid this get-set problem, but
// /proc/thread-self requires LockOSThread anyway, so there's no real
// benefit over just using umask(2).
runtime.LockOSThread()
umask := unix.Umask(0)
unix.Umask(umask)
runtime.UnlockOSThread()
return umask
}
func checkProcSelfFdPath(path string, file *os.File) error {
if err := isDeadInode(file); err != nil {
return err
}
actualPath, err := procSelfFdReadlink(file)
if err != nil {
return fmt.Errorf("get path of handle: %w", err)
}
if actualPath != path {
return fmt.Errorf("%w: handle path %q doesn't match expected path %q", errPossibleBreakout, actualPath, path)
}
return nil
}

View File

@ -0,0 +1,68 @@
//go:build linux
// Copyright (C) 2024 SUSE LLC. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package securejoin
import (
"os"
"testing"
)
type forceGetProcRootLevel int
const (
forceGetProcRootDefault forceGetProcRootLevel = iota
forceGetProcRootOpenTree // force open_tree()
forceGetProcRootOpenTreeAtRecursive // force open_tree(AT_RECURSIVE)
forceGetProcRootUnsafe // force open()
)
var testingForceGetProcRoot *forceGetProcRootLevel
func testingCheckClose(check bool, f *os.File) bool {
if check {
if f != nil {
_ = f.Close()
}
return true
}
return false
}
func testingForcePrivateProcRootOpenTree(f *os.File) bool {
return testing.Testing() && testingForceGetProcRoot != nil &&
testingCheckClose(*testingForceGetProcRoot >= forceGetProcRootOpenTree, f)
}
func testingForcePrivateProcRootOpenTreeAtRecursive(f *os.File) bool {
return testing.Testing() && testingForceGetProcRoot != nil &&
testingCheckClose(*testingForceGetProcRoot >= forceGetProcRootOpenTreeAtRecursive, f)
}
func testingForceGetProcRootUnsafe(f *os.File) bool {
return testing.Testing() && testingForceGetProcRoot != nil &&
testingCheckClose(*testingForceGetProcRoot >= forceGetProcRootUnsafe, f)
}
type forceProcThreadSelfLevel int
const (
forceProcThreadSelfDefault forceProcThreadSelfLevel = iota
forceProcSelfTask
forceProcSelf
)
var testingForceProcThreadSelf *forceProcThreadSelfLevel
func testingForceProcSelfTask() bool {
return testing.Testing() && testingForceProcThreadSelf != nil &&
*testingForceProcThreadSelf >= forceProcSelfTask
}
func testingForceProcSelf() bool {
return testing.Testing() && testingForceProcThreadSelf != nil &&
*testingForceProcThreadSelf >= forceProcSelf
}

View File

@ -1,4 +1,4 @@
// Copyright (C) 2017 SUSE LLC. All rights reserved.
// Copyright (C) 2017-2024 SUSE LLC. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

4
vendor/modules.txt vendored
View File

@ -441,8 +441,8 @@ github.com/crc-org/vfkit/pkg/util
# github.com/cyberphone/json-canonicalization v0.0.0-20231217050601-ba74d44ecf5f
## explicit
github.com/cyberphone/json-canonicalization/go/src/webpki.org/jsoncanonicalizer
# github.com/cyphar/filepath-securejoin v0.2.5
## explicit; go 1.13
# github.com/cyphar/filepath-securejoin v0.3.0
## explicit; go 1.20
github.com/cyphar/filepath-securejoin
# github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc
## explicit