Commit Graph

228 Commits

Author SHA1 Message Date
Brent Baude cad4d0ee9f Fix path for omvf vars on Darwin/arm64
On darwin arm64, we need to set the location of the ovmf vars. It should be put into the imageDir (also known as as dataDir).  But because qemu determines the image path late in Init(), the image path is set something like a stream marker.

Fixes #20361

[NO NEW TESTS NEEDED]

Signed-off-by: Brent Baude <bbaude@redhat.com>
2023-10-18 10:40:08 -05:00
openshift-ci[bot] aa0e96e781
Merge pull request #20274 from ashley-cui/cleanup
Machine: Teardown on init failure
2023-10-13 14:22:46 +00:00
openshift-ci[bot] 5afa949a43
Merge pull request #20322 from jakecorrenti/set-lock
Implement SetLock for all virt providers
2023-10-12 23:15:40 +00:00
Daniel J Walsh cb53bcf23f
Run codespell
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-10-12 12:45:44 -04:00
Jake Correnti 987dc2b8bb SetLock for all virt providers
Implements a shared `GetLock` function for virtualization providers. Returns
a pointer to a lockfile used for serializing write operations.

[NO NEW TESTS NEEDED]

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-10-12 12:06:31 -04:00
Ashley Cui 61e0b64b91 Machine: Teardown on init failure
If init fails, or if a SIGINT is sent during init, podman machine should remove all files and configs
created during the init. This includes config jsons, image files, ssh
id's, and system connections. On Windows, the VM instances are also
unregistered.

Signed-off-by: Ashley Cui <acui@redhat.com>
2023-10-12 09:26:06 -04:00
Jake Correnti 0414f88b3a Create Qemu command wrapper
Creates a wrapper around the Qemu command line implementation to prevent
the need to hard-code the different command line options in Init and
Start.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-10-04 23:17:15 -04:00
Jake Correnti 9b39641116 Remove `c.ExtraFiles` line in machine
Removes the line in applehv and qemu `machine.go` file. These are
remnants from #19723. This lines was written to add stdin, stdout,
stderr as extra files, but that is not how `c.ExtraFiles` works (unlike
`os.ProcAttr`).

go source: https://cs.opensource.google/go/go/+/go1.21.1:src/os/exec/exec.go;l=147

[NO NEW TESTS NEEDED]

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-09-28 19:24:06 -04:00
Ashley Cui bcfd9f3403 New machines should show Never as LastUp
After creating a podman machine, and before starting it, the LastUp value for podman machine ls should display Never. Previously, the LastUp value was the same as creation time. This also changes the LastUp value for inspect to ZeroTime instead of creation time.

Signed-off-by: Ashley Cui <acui@redhat.com>
2023-09-28 14:16:26 -04:00
Brent Baude 5b3801776b Various updates for hyperv and machine e2e tests
This PR is a mishmash of updates needed so that the hyperv provider can
begin to passd the machine e2e tests.

Summary as follows:
* Added custom error handling for machine errors so that all providers
  can generate the same formatted error messages.  The ones implemented
  thus far are needed for the basic and init tests.  More will come as
  they are identified.
* Vendored new libhvee for better memory inspection.  The memory type
  changed from uint32 to uint64.
* Some machine e2e tests used linux-specific utilities to check various
  error conditions and messages (like pgrep).  Those were made into
  functions and implemented on an operating system level.

[NO NEW TESTS NEEDED]

Signed-off-by: Brent Baude <bbaude@redhat.com>
2023-09-21 08:52:02 -05:00
OpenShift Merge Robot 53e6a4435f
Merge pull request #20031 from ashley-cui/winmake
Makefile equiv Powershell script
2023-09-20 22:57:59 -04:00
Ashley Cui 113b41b6f5 Buildtag out unix commands for common OS files
Unix only code crept into shared portions of the machine codebase,
preventing builds on Windows. Move them into unix-only files.

[NO NEW TESTS NEEDED]

Signed-off-by: Ashley Cui <acui@redhat.com>
2023-09-19 12:19:24 -04:00
Jake Correnti 289e59ee1f Implement gvproxy networking using cmdline wrapper
Converts the host networking code in `podman machine` to use the
`GvproxyCommand` type introduced in containers/gvisor-tap-vsock#258

[NO NEW TESTS NEEDED]

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-09-19 09:20:26 -04:00
Ashley Cui da81bc13a1 Add rootful status to machine inspect
Podman machine inspect now shows if the machine is rootful

Signed-off-by: Ashley Cui <acui@redhat.com>
2023-08-25 11:27:08 -04:00
Brent Baude d3618719b1 Dedup and refactor image acquisition
As promised in #19596, this pr deduplicates and refactors image
acquisition.  All virt providers that use FCOS as its default now use
the same code.

[NO NEW TESTS NEEDED]

Signed-off-by: Brent Baude <bbaude@redhat.com>
2023-08-24 20:52:03 -05:00
Daniel J Walsh 62a22c5d60
Run codespell on code
Also cleanup --rm=true to be just --rm

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-08-03 07:00:30 -04:00
Valentin Rothberg 8b7701f522 machine: QEMU: recover from failed start
After a failed start, we can run into (somehow inconsistent) states
where the machine won't start because a previous QEMU process is still
running and the PID file is being used.  Stop didn't resolve the issue
as this state wasn't detected.

Allow to recover from this state by a) detecting it during start and
error out with a more helpful message than the error QEMU would
otherwise spit out, and b) by enabling stop to kill the dangling QEMU
process - even after a failed stop.

With the changes, a recovery may look as follows:
```
_  podman git:(main) _ ./bin/darwin/podman machine start
Starting machine "podman-machine-default"
Error: cannot start VM "podman-machine-default": another instance of "/opt/homebrew/bin/qemu-system-aarch64" is already running with process ID 970: please stop and restart the VM
_  podman git:(main) _ ./bin/darwin/podman machine stop
Machine "podman-machine-default" stopped successfully
_  podman git:(main) _ ./bin/darwin/podman machine start
Starting machine "podman-machine-default"
Waiting for VM ...
```

Please note that this change does not prevent us from running into such
inconsistent states but only allows for recovering from them.

[NO NEW TESTS NEEDED] - there is no reliable reproducer.

Fixes: #16054
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-08-02 11:08:26 +02:00
Jake Correnti 21ebe0e90a Move `writeConfig` logic to shared function
Moves the shared logic from `writeConfig` into a shared function in
`pkg/machine/machine_common.go`

[NO NEW TESTS NEEDED]

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-08-01 21:40:14 -04:00
Jake Correnti 597ccff0bc Move some logic of `setRootful` to a common file
Moves most of the logic of `setRootful` to the common file
`pkg/machine/machine_common.go`.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-08-01 21:40:06 -04:00
Jake Correnti 98cf8462ad move `removeFilesAndConnections` to a common file
Moves `removeFilesAndConnections` to the common file
`pkg/machine/connections.go` to be reused by multiple hypervisors.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-08-01 21:13:58 -04:00
Jake Correnti 75a8f13c4a Move `waitAPIAndPrintInfo` to common file
Moves `waitAPIAndPrintInfo` into the common file
`pkg/machine/machine_common.go` allowing applehv and qemu to share the
code.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-08-01 21:13:58 -04:00
Jake Correnti 55c7b5ceca Move `addSSHConnectionsToPodmanSocket` code to shared file
Moves the implementation of `addSSHConnectionsToPodmanSocket` into the
common file `pkg/machine/machine_common.go`. The implementation was
shared between the hypervisors and does not need to be implemented
multiple times.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-08-01 21:13:52 -04:00
Jake Correnti 850482b314 Move alternate image acquisition to separate function
Moves acquisition of an alternate image provided by the user out of
`acquireVMImage` in `pkg/machine/<hypervisor>/machine.go` and into
`pkg/machine/pull.go` as its own function.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-08-01 09:53:38 -04:00
Jake Correnti 906af5bbc6 Move `getDevNullFiles` into a common file
Moves `getDevNullFiles` into a new common file,
`pkg/machine/machine_common.go`, preventing the re-implementation of the
function across the different hypervisor implementations.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-08-01 08:52:23 -04:00
Jake Correnti d6847b19c8 Convert QEMU functions to methods with documentation
Converts new functions added in #19311 to methods and adds
documentation.

[NO NEW TESTS NEEDED]

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-07-31 15:55:47 -04:00
Valentin Rothberg 416a471eed machine: QEMU: lock VM on stop/rm/set
Lock the machine when stopping, removing or changing its attributes to
make sure write accesses are serialized which should prevent a number of
issues and inconsistencies reported.

[NO NEW TESTS NEEDED]

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-07-28 13:57:59 +02:00
OpenShift Merge Robot bd0fe69cad
Merge pull request #19385 from jakecorrenti/breakup-qemu-config-funcs
Breakup qemu config funcs
2023-07-28 08:37:42 +02:00
Jake Correnti 3523b9b052 Break QEMU `config.go` code into its own functions
Breaks some of the code in QEMU's `VirtProvider` implementation located
at `pkg/machine/qemu/config.go` into its own functions. Aids in
improving the readability of the code.

[NO NEW TESTS NEEDED]

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-07-27 09:53:21 -04:00
Valentin Rothberg c341a0ffe0 machine: QEMU: lock VM on start
Lock the VM on start.  If the machine is in the "starting" state we know
that a previous start has failed and guide the user into resolving the
issue.

Concurrent starts will busy wait and return the expected "already
running" error.

NOTE: this change is only looking at the start issue (#18662).  Other
commands such as stop and update should also lock and will be updated
in a future change.  I expect the underlying issue to apply to all
machine providers, not only QEMU.  It's desirable to aim for extending
the machine interface to also allow to `Lock()` and `Unlock()`.  After
acquiring the lock, the VM should automatically be reloaded/updated.

[NO NEW TESTS NEEDED]

Fixes: #18662
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-07-27 14:16:02 +02:00
Jake Correnti b57091ac92 Reduce qemu machine function sizes
The functions for QEMU's `VM` interface implementation (`machine.go`)
had quite large functions. Pulls out some code that could be moved to
its own function for easier readability.

[NO NEW TESTS NEEDED]

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-07-24 09:31:58 -04:00
OpenShift Merge Robot a6bdccdb85
Merge pull request #19217 from baude/applehvpass3
Podman machine AppleHV pass number 3
2023-07-13 19:03:46 +02:00
Brent Baude 1443e2918c Podman machine AppleHV pass number 3
* Enabled user-mode networking with gvproxy
* VirtIOFS volumes supported

Signed-off-by: Brent Baude <bbaude@redhat.com>

[NO NEW TESTS NEEDED]

Signed-off-by: Brent Baude <bbaude@redhat.com>
2023-07-13 09:06:34 -05:00
Valentin Rothberg 8c16322a84 machine start: qemu: wait for SSH readiness
During the exponential backoff waiting for the machine to be fully up
and running, also make sure that SSH is ready.  The systemd dependencies
of the ready.service include the sshd.service among others but that is
not enough.

Other CoreOS users reported the same issue on IRC, so I feel fairly
confident to use the pragmatic approach of making sure SSH works on the
client side.  #17403 is quite old and there are other pressing machine
issues that need attention.

[NO NEW TESTS NEEDED]

Fixes: #17403
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-07-13 08:57:07 +02:00
Valentin Rothberg a0b7655523 machine start: qemu: adjust backoffs
Make sure that starting a qemu machine uses proper exponential backoffs
and that a single variable isn't shared across multiple backoffs.

DO NOT BACKPORT: I want to avoid backporting this PR to the upcoming 4.6
release as it increases the flakiness of machine start (see #17403). On
my M2 machine, the flake rate seems to have increased with this change
and I strongly suspect that additional/redundant sleep after waiting for
the machine to be running and listening reduced the flakiness.  My hope
is to have more predictable behavior and find the sources of the flakes
soon.

[NO NEW TESTS NEEDED] - still too flaky to add a test to CI.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-07-05 09:40:33 +02:00
Valentin Rothberg 9fb2f8e100 podman machine start: fix ready service
When debugging #17403, the logs of sshd indicates that Podman tried to
ssh into the machine too soon as the `core` user has not yet been fully
set up:

 > error: kex_exchange_identification: Connection closed by remote host
 > fatal: Access denied for user core by PAM account configuration [preauth]

@dustymabe found that the we may have to wait for systemd-user sessions
to be up.  Doing that reduces the flake rate on my M2 machine but does
not entirely fix the issue.

Since I have seen multiple symptoms of flakiness, I think it does not
hurt to add the systemd-user sessions to the dependencies of the ready
service and continue investigating.

[NO NEW TESTS NEEDED] - once we have a fix out, I want to exercise
frequent stop/start in the machine tests but they won't pass now.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-06-30 10:50:30 +02:00
Jake Correnti 516034215f Re-organize hypervisor implementations
Ensures that for each hypervisor implementation, their `config.go` file
deals with implementing the `VirtProvider` interface while the
`machine.go` file is for implementing the `VM` interface.

Moves the `Virtualization` type into a common file and
created wrappers for the individual hypervisors. Allows for shared
functions that are exactly the same while providing the flexibility to
create hypervisor-specific implementations of the functions.

[NO NEW TESTS NEEDED]

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
2023-06-23 11:33:19 -04:00
Black-Hole1 04a1cdfa96
refactor(machine): remove hard code
Use exported variables instead of hard-coded strings.

Ref: https://github.com/containers/common/pull/1516

Signed-off-by: Black-Hole1 <bh@bugs.cc>
2023-06-21 18:49:12 +08:00
Black-Hole1 81e63227e6
fix(machine): throw `connect: connection refused` after set proxy
When the `machine start` command is executed, Podman automatically retrieves the current host's `*_PROXY` environment variable and assigns it directly to the virtual machine in QEMU. However, most `*_PROXY` variables are set with `127.0.0.1` or `localhost`, such as `127.0.0.1:8888`. This causes failures in network-related operations within the virtual machine due to incorrect proxy settings.

Fixes: #14087
Signed-off-by: Black-Hole1 <bh@bugs.cc>
2023-06-21 01:01:58 +08:00
Black-Hole1 c7a8d29f12
refactor: improve get ssh path duplicate code
Signed-off-by: Black-Hole1 <bh@bugs.cc>
2023-06-07 09:03:35 +08:00
Erik Sjölund 685c736185 source code comments and docs: fix typos, language, Markdown layout
- fix a/an before noun
- fix loose -> lose
- fix "the the"
- fix lets -> let's
- fix Markdown layout
- fix a few typos
- remove unnecessary text in troubleshooting.md

Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
2023-05-22 07:52:16 +02:00
Paul Holzinger ce07860a1c
machine: fix default connection URL to use 127.0.0.1
gvproxy listens on 127.0.0.1, using localhost as hostname can result in
the client trying to connect to the ipv6 localhost (`::1`). This will
fail as shown in the issue. This switches the hostname in the system
connection to 127.0.0.1 to fix this problem.
I switched the qemu, hyperV and WSL backend. I haven't touched the
applehv code because it uses two different ips and I am not sure what is
the correct thing there. I leave this to Brent to figure out.

[NO NEW TESTS NEEDED]

[1] https://github.com/containers/gvisor-tap-vsock/blob/main/cmd/gvproxy/main.go#L197-L199

Fixes #16470

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-05-16 10:55:31 +02:00
Jason T. Greene 5a176f09c2 Set machine docker.sock according to rootful flag
Signed-off-by: Jason T. Greene <jason.greene@redhat.com>
2023-05-14 23:56:15 -05:00
OpenShift Merge Robot cd9a95922f
Merge pull request #18359 from Luap99/machine-connection
machine: qemu only remove connection after confirmation
2023-05-01 13:07:56 -04:00
OpenShift Merge Robot 832b098471
Merge pull request #18303 from n1hility/user-mode
Add user-mode networking feature to Windows/WSL
2023-04-26 16:01:48 -04:00
Paul Holzinger 64959b744f
pkg/machine: rework RemoveConnection()
It really does not make sense to call RemoveConnection() twice and then
update the config file a third time in updateDefaultMachineinConfig().
This results in unnecessary reads/writes and more code.

Simplyfy this into one function that is only called once and do all
updates at once.

[NO NEW TESTS NEEDED]

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-04-26 16:57:22 +02:00
Paul Holzinger 2296e71e39
machine: qemu only remove connection after confirmation
the connection remove call must be done inside the function that is
returned so that we wait until the user confirmed it.

Fixes #18330

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-04-26 16:44:28 +02:00
Ashley Cui f3c3ef72dc Recover from failed podman machine start
In rare instances, if podman machine start does not exit correctly, the machine can be left in a "Starting" state, when in reality the machine is stopped. This prevents the user from actually starting the machine. This commit makes sure that on `podman machine stop`, we check if this is the case, and correctly set the starting state to false, allowing the user to start their machine again.

Signed-off-by: Ashley Cui <acui@redhat.com>
2023-04-25 09:29:15 -04:00
Jason T. Greene 230ddbe0ca Add user mode networking feature to Windows
Signed-off-by: Jason T. Greene <jason.greene@redhat.com>
2023-04-24 17:11:54 -05:00
Nathan Henrie 6014f26c47
Revert "Resolve symlink path for qemu directory if possible"
This reverts commit 6b6458916e (Resolve
symlink path for qemu directory if possible).

Fully resolving the symlink to qemu solves some issues for
aarch64-darwin nix with regards to finding `edk2-aarch64-code.fd`, but
unfortunately the fully resolved path includes the version number,
making it so that even patch updates break the path to
homebrew-installed qemu files.

Fixes https://github.com/containers/podman/issues/18111

[NO NEW TESTS NEEDED]

Signed-off-by: Nathan Henrie <nate@n8henrie.com>
2023-04-24 10:06:43 -06:00
Brent Baude 8019dc9e60 hyperv: add podman socket mapping
on machine start, create a socket representing the machine's podman
service socket so local (to the host) applications can take advanatge of
it.

[NO NEW TESTS NEEDED]

Signed-off-by: Brent Baude <bbaude@redhat.com>
2023-04-19 16:41:34 -05:00