Extensive refactoring to address review comments and hopefully simplify
Fix policy configuration identities in sif
- Actually allow something in ValidatePolicyConfigurationScope;
SIF is one of the cases where it's actually a bit plausible
that a policy rejecting some filesystem sources might be desirable.
- Fix PolicyConfigurationNamespaces not to include the file name itself,
and "/"
Add tests for sifTransport and sifReference
The NewImage and NewImageSource tests are rather
pointless, but we don't want to require and invoke
fakeroot etc. on every unit test run, at least for now.
Use ref.file instead of ref.resolvedFile in newImageSource
Consistently with the dir: design, if the user specifies a relative
path, use it directly so that we don't introduce races against
changes to the directory structure.
Beautify sif_transport.go
- Use the usual order (transport implementation, followed by
reference implementation)
- Copy&paste more of the comments, to reinforce the contract
requirements.
Fix the package name directive
Reorganize imports
... to follow the usual convention.
Don't use pkg/errors in sif.
Mostly replace its uses with fmt.Errorf(...%w...).
Fix uses of fmt.Errorf
- Use %w instead of %v for error wrapping
- Use errors.New when the string is constant
Don't prefix wrapped error context with "error "
... to match most of c/image code, where we have previously
removed such prefixes.
Return true from HasThreadSafeGetBlob
... because that's the case for the current implementation,
although it makes no difference for the current c/image/copy
caller, when this source only provides one layer.
Rename sifImageSource.blobID to blobDigest
Rename sifImageSource.configID to configDigest
Remove workdir if newImageSource fails
Rename sifImageSource.blob* to layer*
... to differentiate the layer data from the config, which
is also a "blob" in the ImageSource naming.
Remove sifImageSource.diffID
The value only needs to be known inside newImageSource,
so pass it around in a variable/return value.
Remove unused sifImageSource.diffSize
... which allows us to make getLayerInfo
a function without a (partially-created) sifImageSource parent
object.
Move layerTime computation from createBlob to getBlobInfo
If anything, the latter is a bit more accurate (capturing
the time of the last update of the file we are creating, vs.
the time of the initial creation), but we want to eventually
replace that with a value from the SIF header anyway.
Remove sifImageSource.layerTime
It's only necessary in newImageSource, so use
a return value and a local variable for that.
Remove unused sifImageSource.layerType
Remove sifImageSource.configSize
This value is already stored in the sifImageSource.config slice,
so don't store a redundant copy.
Rename workdir to workDir
following usual Go patterns.
Rename tarpath to tarPath everywhere
Return layerDigest and layerSize from getBlobInfo
... instead of writing it to partially-initialized sifImageSource.
Provide a path to getBlobInfo instead of reading it from sifImageSource
This makes getBlobInfo independent of the partially-created sifImageSource.
Don't create a compressed layer from the SIF file
Compression is very costly, principally in CPU time.
Many use cases (notably import to c/storage) would
only end up decompressing the data again.
Those that do need the data compressed, like push
to a registry, can use the copy pipeline's streaming compression
implementation, often without needing to store the compressed
version in a temporary file.
So this is likely to improve both CPU time usage and (maximum) disk
space usage - at the very least against the current implementation
which doesn't even remove the uncompressed version after creating
the compressed one :)
This is a minimal version of the change; we are now computing
the layer's digest twice. We'll fix that soon.
Rename tarPath to layerPath
... to be consistent with the other variables.
Don't compute the DiffID separately
It's the same value as the layer digest, now that
the layer is just the uncompressed tarball.
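A small sketch of why the two values coincide, using only the standard library; sha256Digest and gzipBytes are hypothetical helpers, and the byte slice merely stands in for real tar contents. The DiffID is the digest of the uncompressed data, so it equals the layer digest only while the layer itself is uncompressed.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sha256Digest returns a digest string in the "sha256:<hex>" form.
func sha256Digest(data []byte) string {
	sum := sha256.Sum256(data)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// gzipBytes compresses data; it exists only to contrast with the
// uncompressed case.
func gzipBytes(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	tarball := []byte("stand-in for uncompressed tar contents")
	diffID := sha256Digest(tarball)

	// Uncompressed layer: the layer digest is the DiffID.
	fmt.Println(sha256Digest(tarball) == diffID) // true

	// Compressed layer: the digest covers the gzip stream instead.
	compressed, err := gzipBytes(tarball)
	if err != nil {
		panic(err)
	}
	fmt.Println(sha256Digest(compressed) == diffID) // false
}
```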
Rename fgz to f
The file is now expected to not be compressed.
Rename blobDigester to digester
... just to be a bit shorter.
Inline some single-use variables when building the manifest
Use a struct initializer instead of a set of assignments for config
Inline single-use variables when building a config
Also remove some fairly redundant comments.
Use a switch in sifImageSource.GetBlob
... to make the structure a tiny bit less repetitive.
Close the SIF image object in newImageSource
Nothing actually needs it afterwards.
Rename UnloadSIFImage() to Close() to indirectly silence
a linter about handling the error; it can't fail in practice,
and isn't quite worth handling.
Explicitly specify a MediaType field in the generated OCI manifest
... to follow best practices vs. schema confusion attacks
(although this generated manifest is clearly not a schema confusion
attack).
Beautify loadedSifImage.Close
Turn loadedSifImage.GetConfig into CommandLine
- Make loadedSifImage independent of the OCI format details.
- Make it clear at the call site that only the command is actually
provided.
- Don't return an error value which is always nil, which makes the caller
simpler.
Remove lookup of the sif.DataEnvVar descriptor
It is unused in this codebase, and it's unclear
what, if anything, it is used for anywhere else.
Make SifImage.parseEnvironment and SifImage.parseRunscript stand-alone
... i.e. independent from SifImage, so that we can more
easily unit-test them.
The res *[]string parameter is rather ugly, but we'll
refactor it away soon enough.
Should not change behavior.
Split parseDefFile from SifImage.generateConfig
... to have an easily unit-testable bit of code.
Should not change behavior. This destructively assigns
to image.envlist and image.cmdlist instead of appending,
but it should be the only writer at that point.
Add a smoke test for parseDefFile
Use a state machine for parseDefFile
... instead of nested scanner.Scan() loops
and a goto.
Should not change behavior.
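For illustration only, a state machine of this kind might look like the sketch below. parseSections is a hypothetical stand-in, not the real parseDefFile; the assumption that any other line starting with "%" ends the current section is mine.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseSections collects the lines of the %environment and %runscript
// sections of a definition file. Hypothetical sketch: a single loop and
// an explicit state replace nested scanner loops and a goto.
func parseSections(def string) ([]string, []string, error) {
	type state int
	const (
		other state = iota
		inEnvironment
		inRunscript
	)

	var environment, runscript []string
	cur := other
	scanner := bufio.NewScanner(strings.NewReader(def))
	for scanner.Scan() {
		line := scanner.Text()
		trimmed := strings.TrimSpace(line)
		switch {
		case trimmed == "%environment":
			cur = inEnvironment
		case trimmed == "%runscript":
			cur = inRunscript
		case strings.HasPrefix(trimmed, "%"):
			cur = other // assumed: any other section header ends the current one
		case cur == inEnvironment:
			environment = append(environment, line)
		case cur == inRunscript:
			runscript = append(runscript, line)
		}
	}
	return environment, runscript, scanner.Err()
}

func main() {
	def := "Bootstrap: docker\n%environment\nexport A=1\n%runscript\necho hi\n%labels\nkey value\n"
	environment, runscript, err := parseSections(def)
	if err != nil {
		panic(err)
	}
	fmt.Println(environment, runscript)
}
```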
Remove a misleading comment
Now, with GetDescriptor, more than one matching
descriptor results in an error, so there isn't anything
to assume about a single value.
Move DataDeffile descriptor lookup into generateConfig
It's the only user of that data
Remove deffile and defReader from loadedSifImage
Turn them into trivial local variables in generateConfig()
The code remains a bit convoluted; we'll clean that up soon.
Simplify generateConfig
Eliminate both deffile and defReader.
Pass %environment and %runscript to generateRunscript explicitly
That will eventually make it easier to unit-test
Return the generated script from generateRunscript instead of updating image
This makes generateRunscript stand-alone and easy to unit-test.
Add a smoke test for generateRunscript
It's not much, but better than nothing.
Use InjectedScript instead of Runscript for the script we generate
... everywhere, to differentiate that script from the %runscript
section contents.
Store injectedScript as a []byte instead of bytes.Buffer
No need to keep around the intermediate form, and this allows
us to change the implementation.
Use strings.Join and Sprintf instead of bytes.Buffer in generateInjectedScript
Assuming this is not performance-critical, the code is much shorter,
and clearly cannot fail (just like the previous version, which is documented
to panic rather than return the errors that version unnecessarily handled).
Note that this might change behavior for empty %environment or %runscript
sections: we now add extra empty lines. That shouldn't make a difference.
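A rough sketch of the shape this takes; buildScript is a hypothetical name, and the shebang and exact layout are assumptions rather than the real generated script. It also shows the extra empty lines for empty sections.

```go
package main

import (
	"fmt"
	"strings"
)

// buildScript joins the two sections into one script. Hypothetical
// sketch: the "#!/bin/bash" header and the line layout are assumed.
// Neither strings.Join nor fmt.Sprintf can fail, so no error is returned.
func buildScript(environment, runscript []string) []byte {
	script := fmt.Sprintf("#!/bin/bash\n%s\n%s\n",
		strings.Join(environment, "\n"),
		strings.Join(runscript, "\n"))
	return []byte(script)
}

func main() {
	fmt.Printf("%q\n", buildScript([]string{"export A=1"}, []string{"echo hi"}))
	// Empty sections still contribute their newline each.
	fmt.Printf("%q\n", buildScript(nil, nil))
}
```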
Remove the unnecessary error return value from generateInjectedScript
Remove loadedSifImage.envlist
All users are local to generateConfig.
Don't use loadedSifImage.cmdlist for storing %runscript
It's just a local value to generateConfig, and we no longer
use cmdlist for both %runscript and the final command line.
Replace loadedSifImage.cmdlist with a single command string
The array only ever has one element, so get rid of the array.
Beautify generateConfig
Always refer to environment and runscript in the same order.
Move generateInjectedScript after parseDefFile
First create the data, then consume it.
Simplify generateConfig
Only have the fallback to "bash" if no script is
available in a single place.
Rename resultDesc to desc
For such a short-lived variable we can have a shorter name.
Rename tempdir to tempDir throughout
... to follow Go conventions.
Remove a Sync call on the squashfs copy
We'd actually prefer that data not to hit the disk;
we want to remove it as soon as possible.
Instead, scope the deferred Close() so that it happens
before we consume the file.
Pass around squashFSPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Pass around tarPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Remove the generated tar file on failure
e.g. if creating it runs out of space.
Beautify SquashFSToTarLayer
Explicitly return values instead of relying on named return
values to make the data (in this case, error) flow more explicit.
Use Sprintf instead of string concatenation for the generated script
It is a bit more manageable that way.
Also actually start the script with a recognized shebang instead
of a newline.
Don't hard-code "squashfs-root" all over the place.
Pass extractedRootPath around instead, and use (unsquashfs -d) to
override the built-in default.
Inline a single-use cmd variable
Rename xcmd to cmd
... now that the cmd name is available.
Use explicit return statements instead of named return values
Make loadedImage always passed by reference
The struct contains a stateful *sif.FileImage, which
makes no sense to copy; so don't get into that habit
even in cases where it might be safe.
Use a constant for the /podman/runscript path
... instead of hard-coding it all over the place, and even
assuming a specific directory structure.
Add more context to write failures
Don't write to stderr; return error output to the caller
Remove the generated script immediately after using it
Make the tar file creation cancellable using the provided context
Also add TODO notes in other places where we would prefer
the copy to be cancellable.
Rename runUnSquashFSTar to createTarFromSIFInputs
We are going to have it handle the injectedScript
as well.
Pass scriptPath to exec.Command instead of hard-coding a constant
This allows us not to care about the working directory of the
script, as well.
Move the scriptPath decision to SquashFSToTarLayer
That's the only place that is aware of tempDir now.
Pass injectedScript to writeInjectedScript
... to make it independent of loadedSifImage; it
will go away entirely soon.
Call writeInjectedScript from createTarFromSIFInputs
... so that createTarFromSIFInputs is responsible for both
creating and consuming extractedRootPath, without any external
interference.
Move the cleanup of extractedRootPath to createTarFromSIFInputs
... to make it a tiny bit more self-contained, now that it
handles the injectedScript as well.
Add a comment to more clearly document the allocation of paths in SquashFSToTarLayer
Remove loadedSifImage.rootfs
Instead, determine the value in SquashFSToTarLayer.
This means that we now try to interpret the deffile before checking
for rootfs presence, changing the possible order of errors. That shouldn't
be much of a difference for valid images.
Rename generateConfig to processDefFile
Have processDefFile return the values instead of writing to loadedSifImage
We will eliminate the loadedSifImage members entirely soon.
Pass sif.FileImage to processDefFile explicitly
... instead of using the image.fimg member.
Rename SquashFSToTarLayer to convertSIFToElements
We are going to have it return other values as well.
Return also the command line from convertSIFToElements
This turns it into the central point of the conversion
process, instead of the fairly ambient loadedSifImage object.
Call processDefFile only in convertSIFToElements
This allows us to remove loadedSifImage.injectedScript and
loadedSifImage.command, making loadedSifImage finally
a trivial wrapper around sif.FileImage - and we'll eliminate
that wrapper next.
Inline loadedSifImage.GetSIFID
We don't really need that abstraction.
Inline loadedSifImage.GetSIFArch
We don't really need that abstraction.
Make convertSIFToElements a stand-alone function
Eliminating the last non-trivial user of loadedSifImage.
Eliminate loadedSifImage
Finally, eliminate the loadedSifImage type entirely.
It doesn't really make sense to inject a layer of abstraction
between sifImageSource and sif.FileImage, purely for the abstraction.
loadedSifImage was only ever used in one way, as an essentially
procedural step; that is now served by the convertSIFToElements
function, rather than being split between the loadedSifImage constructor
and the original tarball creation method.
(convertSIFToElements might eventually return a struct with named
fields if there were many, but it doesn't make sense for newImageSource
to hold an object and fill it up one step at a time.)
Use the last modification time from the SIF header for OCI creation time
Add a TODO note
Add a note about (unsquashfs -o)
Add debug logs around long-running operations
... and make sure to include paths of the relevant files.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-01-07 07:27:36 +08:00
package sif
2021-09-27 22:02:53 +08:00
import (
"bufio"
"context"
"fmt"
"io"
"os"
"os/exec"
"path/filepath"
"strings"
It's not much, but better than nothing.
Use InjectedScript instead of Runscript for the script we generate
... everywhere, to differentiate that script from the %runscript
section contents.
Store injectedScript as a []byte instead of bytes.Buffer
No need to keep around the intermediate form, and this allows
us to change the implementation.
Use strings.Join and Sprintf instead of bytes.Buffer in generateInjectedScript
Assuming this is not performance-critical, the code is much shorter,
and clearly cannot fail (just like the previous version, which is documented
to panic rather than return the errors that version unnecessarily handled).
Note that this might change behavior for empty %environment or %runscript
sections: we now add extra empty lines. That shouldn't make a difference.
Remove the unnecessary error return value from generateInjectedScript
Remove loadedSifImage.envlist
All users are local to generateConfig.
Don't use loadedSifImage.cmdlist for storing %runscript
It's just a local value to generateConfig, and we no longer
use cmdlist for both %runscript and the final command line.
Replace loadedSifImage.cmdlist with a single command string
The array only ever has one element, so get rid of the array.
Beautify generateConfig
Always refer to environment and runscript in the same order.
Move generateInjectedScript after parseDefFile
First create the data, then consume it.
Simplify generateConfig
Only have the fallback to "bash" if no script is
available in a single place.
Rename resultDesc to desc
For such a short-lived variable we can have a shorter name.
Rename tempdir to tempDir throughout
... to follow Go conventions.
Remove a Sync call on the squashfs copy
We'd actually prefer that data not to hit the disk;
we want to remove it as soon as possible.
Instead, scope the deferred Close() so that it happens
before we consume the file.
Pass around squashFSPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Pass around tarPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Remove the generated tar file on failure
e.g. if creating it runs out of space.
Beautify SquashFSToTarLayer
Explicitly return values instead of relying on named return
values to make the data (in this case, error) flow more explicit.
Use Sprintf instead of string concatenation for the generated script
It is a bit more manageable that way.
Also actually start the script with a recognized shebang instead
of a newline.
Don't hard-code "squashfs-root" all over the place.
Pass extractedRootPath around instead, and use (unsquashfs -d) to
override the built-in default.
Inline a single-use cmd variable
Rename xcmd to cmd
... now that the cmd name is available.
Use explicit return statements instead of named return values
Make loadedImage always passed by reference
The struct contains a stateful *sif.FileImage, which
makes no sense to copy; so don't get into that habit
even in cases where it might be safe.
Use a constant for the /podman/runscript path
... instead of hard-coding it over the place, and even
assuming a specific directory structure.
Add more context to write failures
Don't write to stderr; return error output to the caller
Remove the generated script immediately after using it
Make the tar file creation cancellable using the provided context
Also add TODO notes in other places where we would prefer
the copy to be cancellable.
Rename runUnSquashFSTar to createTarFromSIFInputs
We are going to have it handle the injectedScript
as well.
Pass scriptPath to exec.Command instead of hard-coding a constant
This allows us not to care about the working directory of the
script, as well.
Move the scriptPath decision to SquashFSToTarLayer
That's the only place that is aware of tempDir now.
Pass injectedScript to writeInjectedScript
... to make it independent of loadedSifImage; it
will go away entirely soon.
Call writeInjectedScript from createTarFromSIFInputs
... so that createTarFromSIFInputs is responsible for both
creating and consuming extractedRootPath, without any external
interference.
Move the cleanup of extractedRootPath to createTarFromSIFInputs
... to make it a tiny bit more self-contained, now that it
handles the injectedScript as well.
Add a comment to more clearly document the alocation of paths in SquashFSToTarLayer
Remove loadedSifImage.rootfs
Instead, determine the value in SquashFSToTarLayer.
This means that we now try to interpret the deffile before checking
for rootfs presence, changing the possible order of errors. That shouldn't
be much of a difference for valid images.
Rename generateConfig to processDefFile
Have processDefFile return the values instead of writing to loadedSifImage
We will eliminate the loadedSifImage members entirely soon.
Pass sif.FileImage to processDefFile explicitly
... instead of using the image.fimg member.
Rename SquashFSToTarLayer to convertSIFToElements
We are going to have it return other values as well.
Return also the command line from convertSIFToElements
This turns it into the central point of the conversion
process, instead of the fairly ambient loadedSifImage object.
Call processDefFile only in convertSIFToElements
This allows us to remove loadedSifImage.injectedScript and
loadedSifImage.command, making loadedSifImage finally
a trivial wrapper around sif.FileImage - and we'll eliminate
that wrapper next.
Inline loadedSifImage.GetSIFID
We don't really need that abstraction.
Inline loadedSifImage.GetSIFArch
We don't really need that abstraction.
Make convertSIFToElements a stand-alone function
Eliminating the last non-trivial user of loadedSifImage.
Eliminate loadedSifImage
Finally, eliminate the loadedSifImage type entirely.
It doesn't really make sense to inject a layer of abstraction
between sifImageSource and sif.FileImage, purely for the abstraction.
loadedSifImage was only ever used in one way, as an essentially
procedural step; that is now served by the convertSIFToElements
function, rather than being split between the loadedSifImage constructor
and the original tarball creation method.
(convertSIFToElements might eventually return a struct with named
fields if there were many, but it doesn't make sense for newImageSource
to hold an object and fill it up one step at a time.)
Use the last modification time from the SIF header for OCI creation time
Add a TODO note
Add a note about (unsquashfs -o)
Add debug logs around long-running operations
... and make sure to include paths of the relevant files.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-01-07 07:27:36 +08:00
"github.com/sirupsen/logrus"
2021-11-18 05:41:22 +08:00
"github.com/sylabs/sif/v2/pkg/sif"
2021-09-27 22:02:53 +08:00
)
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-01-07 07:27:36 +08:00
// injectedScriptTargetPath is the path injectedScript should be written to in the created image.
const injectedScriptTargetPath = "/podman/runscript"
2021-09-27 22:02:53 +08:00
be much of a difference for valid images.
Rename generateConfig to processDefFile
Have processDefFile return the values instead of writing to loadedSifImage
We will eliminate the loadedSifImage members entirely soon.
Pass sif.FileImage to processDefFile explicitly
... instead of using the image.fimg member.
Rename SquashFSToTarLayer to convertSIFToElements
We are going to have it return other values as well.
Return also the command line from convertSIFToElements
This turns it into the central point of the conversion
process, instead of the fairly ambient loadedSifImage object.
Call processDefFile only in convertSIFToElements
This allows us to remove loadedSifImage.injectedScript and
loadedSifImage.command, making loadedSifImage finally
a trivial wrapper around sif.FileImage - and we'll eliminate
that wrapper next.
Inline loadedSifImage.GetSIFID
We don't really need that abstraction.
Inline loadedSifImage.GetSIFArch
We don't really need that abstraction.
Make convertSIFToElements a stand-alone function
Eliminating the last non-trivial user of loadedSifImage.
Eliminate loadedSifImage
Finally, eliminate the loadedSifImage type entirely.
It doesn't really make sense to inject a layer of abstraction
between sifImageSource and sif.FileImage, purely for the abstraction.
loadedSifImage was only ever used in one way, as an essentially
procedural step; that is now served by the convertSIFToElements
function, rather than being split between the loadedSifImage constructor
and the original tarball creation method.
(convertSIFToElements might eventually return a struct with named
fields if there were many, but it doesn't make sense for newImageSource
to hold an object and fill it up one step at a time.)
Use the last modification time from the SIF header for OCI creation time
Add a TODO note
Add a note about (unsquashfs -o)
Add debug logs around long-running operations
... and make sure to include paths of the relevant files.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-01-07 07:27:36 +08:00
|
|
|
// parseDefFile parses a SIF definition file from reader,
// and returns non-trivial contents of the %environment and %runscript sections.
func parseDefFile(reader io.Reader) ([]string, []string, error) {
	type parserState int
	const (
		parsingOther parserState = iota
		parsingEnvironment
		parsingRunscript
	)
	environment := []string{}
	runscript := []string{}
	state := parsingOther
	scanner := bufio.NewScanner(reader)
	for scanner.Scan() {
		s := strings.TrimSpace(scanner.Text())
2022-01-07 07:27:36 +08:00
		switch {
		case s == `%environment`:
			state = parsingEnvironment
		case s == `%runscript`:
			state = parsingRunscript
		case strings.HasPrefix(s, "%"):
			state = parsingOther
		case state == parsingEnvironment:
			if s != "" && !strings.HasPrefix(s, "#") {
				environment = append(environment, s)
			}
		case state == parsingRunscript:
			runscript = append(runscript, s)
		default: // parsingOther: ignore the line
2021-09-27 22:02:53 +08:00
		}
	}
	if err := scanner.Err(); err != nil {
2022-01-07 07:27:36 +08:00
		return nil, nil, fmt.Errorf("reading lines from SIF definition file object: %w", err)
2021-09-27 22:02:53 +08:00
	}
Extensive refactoring to address review comments and hopefully simplify
Fix policy configuration identities in sif
- Actually allow something in ValidatePolicyConfigurationScope ;
SIF is one of the cases where it's actually a bit plausible
that a policy rejecting some filesystem sources might be desirable.
- Fix PolicyConfigurationNamespaces not to include the file name itself,
and "/"
Add tests for sifTransport and sifReference
The NewImage and NewImageSource tests are rather
pointless, but we don't want to require and invoke
fakeroot etc. on every unit test run, at least for now.
Use ref.file instead of ref.resolvedFile in newImageSource
Consistently with the dir: design, if the user specifies a relative
path, use it directly so that we don't introduce races against
changes to the directory structure.
Beautify sif_transport.go
- Use the usual order (transport implementation, followed by
reference implementation)
- Copy&paste more of the comments, to reinforce the contract
requirements.
Fix the package name directive
Reorganize imports
... to follow the usual convention.
Don't use pkg/errors in sif.
Mostly replace its uses with fmt.Errorf(...%w...).
Fix uses of fmt.Errorf
- Use %w instead of %v for error wrapping
- Use errors.New when the string is constant
Don't prefix wrapped error context with "error "
... to match most of c/image code, where we have previously
removed such prefixes.
Return true from HasThreadSafeGetBlob
... because that's the case for the current implementation,
although it makes no difference for the current c/image/copy
caller, when this source only provides one layer.
Rename sifImageSource.blobID to blobDigest
Rename sifImageSource.configID to configDigest
Remove workdir if newImageSource fails
Rename sifImageSource.blob* to layer*
... to differentiate the layer data from the config, which
is also a "blob" in the ImageSource naming.
Remove sifImageSource.diffID
The value only needs to be known inside newImageSource,
so pass it around in a variable/return value.
Remove unused sifImageSource.diffSize
... which allows us to make getLayerInfo
a function without a (partially-created) sifImageSource parent
object.
Move layerTime computation from createBlob to getBlobInfo
If anything, the latter is a bit more accurate (capturing
the time of the last update of the file we are creating, vs.
the time of the initial creation), but we want to eventually
that with a value from the SIF header anyway.
Remove sifImageSource.layerTime
It's only necessary in newImageSource, so use
a return value and a local variable for that.
Remove unused sifImageSource.layerType
Remove sifImageSource.configSize
This value is already stored in the sifImageSource.config slice,
so don't store a redundant copy.
Rename workdir to workDir
following usual Go patterns.
Rename tarpath to tarPath everywhere
Return layerDigest and layerSize from getBlobInfo
... instead of writing it to partially-initialized sifImageSource.
Provide a path to getBlobInfo instead of reading it from sifImageSource
This makes getBlobInfo independent of the partially-created sifImageSource.
Don't create a compressed layer from the SIF file
Compression is very costly, principially in CPU time.
Many use cases (notably import to c/storage) would
only end up decompressing the data again.
Those that do neeed the data compressed, like push
to a registry, can use the copy pipeline's streaming compression
implementation, often without needing to store the compressed
version in a temporary file.
So this is likely to improve both CPU time usage and (maximum) disk
space usage - at the very least against the current implementation
which doens't even remove the uncompressed version after creating
the compressed one :)
This is a minimal version of the change, we are now computing
the layer's digest twice. We'll fix that soon.
Rename tarPath to layerPath
... to be consistent with the other variables.
Don't compute the DiffID separately
It's the same value as the layer digest, now that
the layer is just the uncompressed tarball.
Rename fgz to f
The file is now expected to not be compressed.
Rename blobDigester to digester
... just to be a bit shorter.
Inline some single-use variables when building the manifest
Use a struct initializer instead of a set of assignments for config
Inline single-use variables when building a config
Also remove some fairly redundant comments.
Use a switch if sifImageSource.GetBlob
... to make the structure a tiny bit less repetitive.
Close the SIF image object in newImageSource
Nothing actually needs it afterwards.
Rename UnloadSIFImage() to Close() to indirectly silence
a linter about handling the error; it can't fail in practice,
and isn't quite worth handling.
Explicitly specify a MediaType field in the generated OCI manifest
... to follow best practices vs. schema confusion attacks
(although this generated manifest is clearly not a schema confusion
attack).
Beautify loadedSifImage.Close
Turn loadedSifImage.GetConfig into CommandLine
- Make loadedSifImage independent of the OCI format details.
- Make it clear at the call site that only the command is actually
provided.
- Don't return an error value which is always nil, which makes the caller
simpler.
Remove lookup of the sif.DataEnvVar descriptor
It is unused in this codebase, and it's unclear
what, if anything, it is used for anywhere else.
Make SifImage.parseEnvironment and SifImage.parseRunscript stand-alone
... i.e. independent from SifImage, so that we can more
easily unit-test it.
The res *[]string parameter is rather ugly, but we'll
refactor it away soon enough.
Should not change behavior.
Split parseDefFile from SifImage.generateConfig
... to have an easily unit-testable bit of code.
Should not change behavior. This destructively assigns
to image.envlist and image.cmdlist instead of appending,
but it should be the only writer at that point.
Add a smoke test for parseDefFile
Use a state machine for parseDefFile
... instead of nested scanner.Scan() loops
and a goto.
Should not change behavior.
Remove a misleading comment
Now, with GetDescriptor, more than one matching
descriptor results in an error, so there isn't anything
to assume about a single value.
Move DataDeffile descriptor lookup into generateConfig
It's the only user of that data
Remove deffile and defReader from loadedSifImage
Turn them into trivial local variables in generateConfig()
The code remains a bit convoluted; we'll clean that up soon.
Simplify generateConfig
Eliminate both deffile and defReader.
Pass %environment and %runscript to generateRunscript explicitly
That will eventually make it easier to unit-test
Return the generated script from generateRunscript instead of updating image
This makes generateRunscript stand-alone and easy to unit-test.
Add a smoke test for generateRunscript
It's not much, but better than nothing.
Use InjectedScript instead of Runscript for the script we generate
... everywhere, to differentiate that script from the %runscript
section contents.
Store injectedScript as a []byte instead of bytes.Buffer
No need to keep around the intermediate form, and this allows
us to change the implementation.
Use strings.Join and Sprintf instead of bytes.Buffer in generateInjectedScript
Assuming this is not performance-critical, the code is much shorter,
and clearly cannot fail (just like the previous version, which is documented
to panic rather than return the errors that version unnecessarily handled).
Note that this might change behavior for empty %environment or %runscript
sections: we now add extra empty lines. That shouldn't make a difference.
Remove the unnecessary error return value from generateInjectedScript
Remove loadedSifImage.envlist
All users are local to generateConfig.
Don't use loadedSifImage.cmdlist for storing %runscript
It's just a local value to generateConfig, and we no longer
use cmdlist for both %runscript and the final command line.
Replace loadedSifImage.cmdlist with a single command string
The array only ever has one element, so get rid of the array.
Beautify generateConfig
Always refer to environment and runscript in the same order.
Move generateInjectedScript after parseDefFile
First create the data, then consume it.
Simplify generateConfig
Only have the fallback to "bash" if no script is
available in a single place.
Rename resultDesc to desc
For such a short-lived variable we can have a shorter name.
Rename tempdir to tempDir throughout
... to follow Go conventions.
Remove a Sync call on the squashfs copy
We'd actually prefer that data not to hit the disk;
we want to remove it as soon as possible.
Instead, scope the deferred Close() so that it happens
before we consume the file.
Pass around squashFSPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Pass around tarPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Remove the generated tar file on failure
e.g. if creating it runs out of space.
Beautify SquashFSToTarLayer
Explicitly return values instead of relying on named return
values to make the data (in this case, error) flow more explicit.
Use Sprintf instead of string concatenation for the generated script
It is a bit more manageable that way.
Also actually start the script with a recognized shebang instead
of a newline.
Don't hard-code "squashfs-root" all over the place.
Pass extractedRootPath around instead, and use (unsquashfs -d) to
override the built-in default.
Inline a single-use cmd variable
Rename xcmd to cmd
... now that the cmd name is available.
Use explicit return statements instead of named return values
Make loadedImage always passed by reference
The struct contains a stateful *sif.FileImage, which
makes no sense to copy; so don't get into that habit
even in cases where it might be safe.
Use a constant for the /podman/runscript path
... instead of hard-coding it all over the place, and even
assuming a specific directory structure.
Add more context to write failures
Don't write to stderr; return error output to the caller
Remove the generated script immediately after using it
Make the tar file creation cancellable using the provided context
Also add TODO notes in other places where we would prefer
the copy to be cancellable.
Rename runUnSquashFSTar to createTarFromSIFInputs
We are going to have it handle the injectedScript
as well.
Pass scriptPath to exec.Command instead of hard-coding a constant
This allows us not to care about the working directory of the
script, as well.
Move the scriptPath decision to SquashFSToTarLayer
That's the only place that is aware of tempDir now.
Pass injectedScript to writeInjectedScript
... to make it independent of loadedSifImage; it
will go away entirely soon.
Call writeInjectedScript from createTarFromSIFInputs
... so that createTarFromSIFInputs is responsible for both
creating and consuming extractedRootPath, without any external
interference.
Move the cleanup of extractedRootPath to createTarFromSIFInputs
... to make it a tiny bit more self-contained, now that it
handles the injectedScript as well.
Add a comment to more clearly document the allocation of paths in SquashFSToTarLayer
Remove loadedSifImage.rootfs
Instead, determine the value in SquashFSToTarLayer.
This means that we now try to interpret the deffile before checking
for rootfs presence, changing the possible order of errors. That shouldn't
be much of a difference for valid images.
Rename generateConfig to processDefFile
Have processDefFile return the values instead of writing to loadedSifImage
We will eliminate the loadedSifImage members entirely soon.
Pass sif.FileImage to processDefFile explicitly
... instead of using the image.fimg member.
Rename SquashFSToTarLayer to convertSIFToElements
We are going to have it return other values as well.
Return also the command line from convertSIFToElements
This turns it into the central point of the conversion
process, instead of the fairly ambient loadedSifImage object.
Call processDefFile only in convertSIFToElements
This allows us to remove loadedSifImage.injectedScript and
loadedSifImage.command, making loadedSifImage finally
a trivial wrapper around sif.FileImage - and we'll eliminate
that wrapper next.
Inline loadedSifImage.GetSIFID
We don't really need that abstraction.
Inline loadedSifImage.GetSIFArch
We don't really need that abstraction.
Make convertSIFToElements a stand-alone function
Eliminating the last non-trivial user of loadedSifImage.
Eliminate loadedSifImage
Finally, eliminate the loadedSifImage type entirely.
It doesn't really make sense to inject a layer of abstraction
between sifImageSource and sif.FileImage, purely for the abstraction.
loadedSifImage was only ever used in one way, as an essentially
procedural step; that is now served by the convertSIFToElements
function, rather than being split between the loadedSifImage constructor
and the original tarball creation method.
(convertSIFToElements might eventually return a struct with named
fields if there were many, but it doesn't make sense for newImageSource
to hold an object and fill it up one step at a time.)
Use the last modification time from the SIF header for OCI creation time
Add a TODO note
Add a note about (unsquashfs -o)
Add debug logs around long-running operations
... and make sure to include paths of the relevant files.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-01-07 07:27:36 +08:00
	return environment, runscript, nil
2021-09-27 22:02:53 +08:00
}
2022-01-07 07:27:36 +08:00
// generateInjectedScript generates a shell script based on
// SIF definition file %environment and %runscript data, and returns it.
func generateInjectedScript(environment []string, runscript []string) []byte {
	script := fmt.Sprintf("#!/bin/bash\n"+
		"%s\n"+
		"%s\n", strings.Join(environment, "\n"), strings.Join(runscript, "\n"))
	return []byte(script)
2021-09-27 22:02:53 +08:00
}
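As a small usage sketch, the program below copies generateInjectedScript from the listing above so that it compiles on its own, then calls it with and without %environment lines. The sample inputs are made up for illustration. The second call shows the behavior the commit message notes for empty sections: an extra empty line is emitted.

```go
package main

import (
	"fmt"
	"strings"
)

// generateInjectedScript is copied from the listing so this example is self-contained.
func generateInjectedScript(environment []string, runscript []string) []byte {
	script := fmt.Sprintf("#!/bin/bash\n"+
		"%s\n"+
		"%s\n", strings.Join(environment, "\n"), strings.Join(runscript, "\n"))
	return []byte(script)
}

func main() {
	// Both sections present: shebang, environment lines, then runscript lines.
	full := generateInjectedScript([]string{"export FOO=bar"}, []string{"echo hello"})
	fmt.Printf("%q\n", full)

	// Empty %environment: strings.Join of a nil slice is "", so an empty line remains.
	noEnv := generateInjectedScript(nil, []string{"echo hello"})
	fmt.Printf("%q\n", noEnv)
}
```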
Extensive refactoring to address review comments and hopefully simplify
Fix policy configuration identities in sif
- Actually allow something in ValidatePolicyConfigurationScope ;
SIF is one of the cases where it's actually a bit plausible
that a policy rejecting some filesystem sources might be desirable.
- Fix PolicyConfigurationNamespaces not to include the file name itself,
and "/"
Add tests for sifTransport and sifReference
The NewImage and NewImageSource tests are rather
pointless, but we don't want to require and invoke
fakeroot etc. on every unit test run, at least for now.
Use ref.file instead of ref.resolvedFile in newImageSource
Consistently with the dir: design, if the user specifies a relative
path, use it directly so that we don't introduce races against
changes to the directory structure.
Beautify sif_transport.go
- Use the usual order (transport implementation, followed by
reference implementation)
- Copy&paste more of the comments, to reinforce the contract
requirements.
Fix the package name directive
Reorganize imports
... to follow the usual convention.
Don't use pkg/errors in sif.
Mostly replace its uses with fmt.Errorf(...%w...).
Fix uses of fmt.Errorf
- Use %w instead of %v for error wrapping
- Use errors.New when the string is constant
Don't prefix wrapped error context with "error "
... to match most of c/image code, where we have previously
removed such prefixes.
Return true from HasThreadSafeGetBlob
... because that's the case for the current implementation,
although it makes no difference for the current c/image/copy
caller, when this source only provides one layer.
Rename sifImageSource.blobID to blobDigest
Rename sifImageSource.configID to configDigest
Remove workdir if newImageSource fails
Rename sifImageSource.blob* to layer*
... to differentiate the layer data from the config, which
is also a "blob" in the ImageSource naming.
Remove sifImageSource.diffID
The value only needs to be known inside newImageSource,
so pass it around in a variable/return value.
Remove unused sifImageSource.diffSize
... which allows us to make getLayerInfo
a function without a (partially-created) sifImageSource parent
object.
Move layerTime computation from createBlob to getBlobInfo
If anything, the latter is a bit more accurate (capturing
the time of the last update of the file we are creating, vs.
the time of the initial creation), but we want to eventually
that with a value from the SIF header anyway.
Remove sifImageSource.layerTime
It's only necessary in newImageSource, so use
a return value and a local variable for that.
Remove unused sifImageSource.layerType
Remove sifImageSource.configSize
This value is already stored in the sifImageSource.config slice,
so don't store a redundant copy.
Rename workdir to workDir
following usual Go patterns.
Rename tarpath to tarPath everywhere
Return layerDigest and layerSize from getBlobInfo
... instead of writing it to partially-initialized sifImageSource.
Provide a path to getBlobInfo instead of reading it from sifImageSource
This makes getBlobInfo independent of the partially-created sifImageSource.
Don't create a compressed layer from the SIF file
Compression is very costly, principially in CPU time.
Many use cases (notably import to c/storage) would
only end up decompressing the data again.
Those that do neeed the data compressed, like push
to a registry, can use the copy pipeline's streaming compression
implementation, often without needing to store the compressed
version in a temporary file.
So this is likely to improve both CPU time usage and (maximum) disk
space usage - at the very least against the current implementation
which doens't even remove the uncompressed version after creating
the compressed one :)
This is a minimal version of the change, we are now computing
the layer's digest twice. We'll fix that soon.
Rename tarPath to layerPath
... to be consistent with the other variables.
Don't compute the DiffID separately
It's the same value as the layer digest, now that
the layer is just the uncompressed tarball.
Rename fgz to f
The file is now expected to not be compressed.
Rename blobDigester to digester
... just to be a bit shorter.
Inline some single-use variables when building the manifest
Use a struct initializer instead of a set of assignments for config
Inline single-use variables when building a config
Also remove some fairly redundant comments.
Use a switch if sifImageSource.GetBlob
... to make the structure a tiny bit less repetitive.
Close the SIF image object in newImageSource
Nothing actually needs it afterwards.
Rename UnloadSIFImage() to Close() to indirectly silence
a linter about handling the error; it can't fail in practice,
and isn't quite worth handling.
Explicitly specify a MediaType field in the generated OCI manifest
... to follow best practices vs. schema confusion attacks
(although this generated manfiest is clearly not a schema confusion
attack).
Beautify loadedSifImage.Close
Turn loadedSifImage.GetConfig into CommandLine
- Make loadedSifImage independent of the OCI format details.
- Make it clear at the call site that only the command is actually
provided.
- Don't return an error value which is always nil, which makes the caller
simpler.
Remove lookup of the sif.DataEnvVar descriptor
It is unused in this codebase, and it's unclear
what, if anything, it is used for anywhere else.
Make SifImage.parseEnvironment and SifImage.parseRunscript stand-alone
... i.e. independent from SifImage, so that we can more
easily unit-test it.
The res *[]string parameter is rather ugly, but we'll
refactor it away soon enough.
Should not change behavior.
Split parseDefFile from SifImage.generateConfig
... to have an easily unit-testable bit of code.
Should not change behavior. This destructively assigns
to image.envlist and image.cmdlist instead of appending,
but it should be the only writer at that point.
Add a smoke test for parseDefFile
Use a state machine for parseDefFile
... instead of a nesting scanner.Scan() loops
and a goto.
Should not change behavior.
Remove a misleading comment
Now, with GetDescriptor, more than one matching
descriptor results in an error, so there isn't anything
to assume about a single value.
Move DataDeffile descriptor lookup into generateConfig
It's the only user of that data
Remove deffile and defReader from loadedSifImage
Turn them into trivial local variables in generateConfig()
The code remains a big convoluted, we'll clean that up soon.
Simplify generateConfig
Eliminate both deffile and defReader.
Pass %environment and %runscript to generateRunscript explicitly
That will eventually make it easier to unit-test
Return the generated script from generateRunscript instad of updating image
This makes generateRunscript stand-alone and easy to unit-test.
Add a smoke test for generateRunscript
It's not much, but better than nothing.
Use InjectedScript instead of Runscript for the script we generate
... everywhere, to differentiate that script from the %runscript
section contents.
Store injectedScript as a []byte instead of bytes.Buffer
No need to keep around the intermediate form, and this allows
us to change the implementation.
Use strings.Join and Sprintf instead of bytes.Buffer in generateInjectedScript
Assuming this is not performance-critical, the code is much shorter,
and clearly cannot fail (just like the previous version, which is documented
to panic rather than return the errors that version unnecessarily handled).
Note that this might change behavior for empty %environment or %runscript
sections: we now add extra empty lines. That shouldn't make a difference.
Remove the unnecessary error return value from generateInjectedScript
Remove loadedSifImage.envlist
All users are local to generateConfig.
Don't use loadedSifImage.cmdlist for storing %runscript
It's just a local value to generateConfig, and we no longer
use cmdlist for both %runscript and the final command line.
Replace loadedSifImage.cmdlist with a single command string
The array only ever has one element, so get rid of the array.
Beautify generateConfig
Always refer to environment and runscript in the same order.
Move generateInjectedScript after parseDefFile
First create the data, then consume it.
Simplify generateConfig
Only have the fallback to "bash" if no script is
available in a single place.
Rename resultDesc to desc
For such a short-lived variable we can have a shorter name.
Rename tempdir to tempDir throughout
... to follow Go conventions.
Remove a Sync call on the squashfs copy
We'd actually prefer that data not to hit the disk;
we want to remove it as soon as possible.
Instead, scope the deferred Close() so that it happens
before we consume the file.
Pass around squashFSPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Pass around tarPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Remove the generated tar file on failure
e.g. if creating it runs out of space.
Beautify SquashFSToTarLayer
Explicitly return values instead of relying on named return
values to make the data (in this case, error) flow more explicit.
Use Sprintf instead of string concatenation for the generated script
It is a bit more manageable that way.
Also actually start the script with a recognized shebang instead
of a newline.
Don't hard-code "squashfs-root" all over the place.
Pass extractedRootPath around instead, and use (unsquashfs -d) to
override the built-in default.
Inline a single-use cmd variable
Rename xcmd to cmd
... now that the cmd name is available.
Use explicit return statements instead of named return values
Make loadedImage always passed by reference
The struct contains a stateful *sif.FileImage, which
makes no sense to copy; so don't get into that habit
even in cases where it might be safe.
Use a constant for the /podman/runscript path
... instead of hard-coding it all over the place, and even
assuming a specific directory structure.
Add more context to write failures
Don't write to stderr; return error output to the caller
Remove the generated script immediately after using it
Make the tar file creation cancellable using the provided context
Also add TODO notes in other places where we would prefer
the copy to be cancellable.
Rename runUnSquashFSTar to createTarFromSIFInputs
We are going to have it handle the injectedScript
as well.
Pass scriptPath to exec.Command instead of hard-coding a constant
This allows us not to care about the working directory of the
script, as well.
Move the scriptPath decision to SquashFSToTarLayer
That's the only place that is aware of tempDir now.
Pass injectedScript to writeInjectedScript
... to make it independent of loadedSifImage; it
will go away entirely soon.
Call writeInjectedScript from createTarFromSIFInputs
... so that createTarFromSIFInputs is responsible for both
creating and consuming extractedRootPath, without any external
interference.
Move the cleanup of extractedRootPath to createTarFromSIFInputs
... to make it a tiny bit more self-contained, now that it
handles the injectedScript as well.
Add a comment to more clearly document the allocation of paths in SquashFSToTarLayer
Remove loadedSifImage.rootfs
Instead, determine the value in SquashFSToTarLayer.
This means that we now try to interpret the deffile before checking
for rootfs presence, changing the possible order of errors. That shouldn't
be much of a difference for valid images.
Rename generateConfig to processDefFile
Have processDefFile return the values instead of writing to loadedSifImage
We will eliminate the loadedSifImage members entirely soon.
Pass sif.FileImage to processDefFile explicitly
... instead of using the image.fimg member.
Rename SquashFSToTarLayer to convertSIFToElements
We are going to have it return other values as well.
Return also the command line from convertSIFToElements
This turns it into the central point of the conversion
process, instead of the fairly ambient loadedSifImage object.
Call processDefFile only in convertSIFToElements
This allows us to remove loadedSifImage.injectedScript and
loadedSifImage.command, making loadedSifImage finally
a trivial wrapper around sif.FileImage - and we'll eliminate
that wrapper next.
Inline loadedSifImage.GetSIFID
We don't really need that abstraction.
Inline loadedSifImage.GetSIFArch
We don't really need that abstraction.
Make convertSIFToElements a stand-alone function
Eliminating the last non-trivial user of loadedSifImage.
Eliminate loadedSifImage
Finally, eliminate the loadedSifImage type entirely.
It doesn't really make sense to inject a layer of abstraction
between sifImageSource and sif.FileImage, purely for the abstraction.
loadedSifImage was only ever used in one way, as an essentially
procedural step; that is now served by the convertSIFToElements
function, rather than being split between the loadedSifImage constructor
and the original tarball creation method.
(convertSIFToElements might eventually return a struct with named
fields if there were many, but it doesn't make sense for newImageSource
to hold an object and fill it up one step at a time.)
Use the last modification time from the SIF header for OCI creation time
Add a TODO note
Add a note about (unsquashfs -o)
Add debug logs around long-running operations
... and make sure to include paths of the relevant files.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-01-07 07:27:36 +08:00
// processDefFile finds sif.DataDeffile in sifImage, if any,
// and returns:
// - the command to run
// - contents of a script to inject as injectedScriptTargetPath, or nil
func processDefFile(sifImage *sif.FileImage) (string, []byte, error) {
	var environment, runscript []string

	desc, err := sifImage.GetDescriptor(sif.WithDataType(sif.DataDeffile))
	if err == nil {
		environment, runscript, err = parseDefFile(desc.GetReader())
		if err != nil {
			return "", nil, err
		}
	}
|
|
|
|
|
|
|
|
	var command string
	var injectedScript []byte
	if len(environment) == 0 && len(runscript) == 0 {
		command = "bash"
		injectedScript = nil
	} else {
		injectedScript = generateInjectedScript(environment, runscript)
		command = injectedScriptTargetPath
	}
	return command, injectedScript, nil
}
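The branch at the end of the function above falls back to "bash" when the definition file has neither an %environment nor a %runscript section, and otherwise points the command at a generated script. A minimal stand-alone sketch of that selection follows. It makes several assumptions: `chooseCommand` is a hypothetical name, the constant's value is taken from the "/podman/runscript" path mentioned in the commit log, and the body of `generateInjectedScript` here is only an illustration of the strings.Join and Sprintf approach the log describes, not the actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed value: the commit log mentions a constant for the
// /podman/runscript path, but the real definition is not shown here.
const injectedScriptTargetPath = "/podman/runscript"

// generateInjectedScript is a hypothetical stand-in for the real helper.
// It joins both sections with newlines and starts the script with a shebang.
func generateInjectedScript(environment, runscript []string) []byte {
	script := fmt.Sprintf("#!/bin/bash\n%s\n%s\n",
		strings.Join(environment, "\n"),
		strings.Join(runscript, "\n"))
	return []byte(script)
}

// chooseCommand (hypothetical name) mirrors the branch above: with no
// %environment and no %runscript there is nothing to inject, so it
// returns "bash" and a nil script.
func chooseCommand(environment, runscript []string) (string, []byte) {
	if len(environment) == 0 && len(runscript) == 0 {
		return "bash", nil
	}
	return injectedScriptTargetPath, generateInjectedScript(environment, runscript)
}

func main() {
	command, script := chooseCommand(nil, nil)
	fmt.Printf("command=%q scriptIsNil=%v\n", command, script == nil)

	command, script = chooseCommand([]string{"export FOO=bar"}, []string{"echo hello"})
	fmt.Printf("command=%q\n%s", command, script)
}
```

Returning a nil script in the fallback case lets a caller such as writeInjectedScript skip writing the file with a plain nil check.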
func writeInjectedScript(extractedRootPath string, injectedScript []byte) error {
	if injectedScript == nil {
		return nil
	}
|
Extensive refactoring to address review comments and hopefully simplify
Fix policy configuration identities in sif
- Actually allow something in ValidatePolicyConfigurationScope ;
SIF is one of the cases where it's actually a bit plausible
that a policy rejecting some filesystem sources might be desirable.
- Fix PolicyConfigurationNamespaces not to include the file name itself,
and "/"
Add tests for sifTransport and sifReference
The NewImage and NewImageSource tests are rather
pointless, but we don't want to require and invoke
fakeroot etc. on every unit test run, at least for now.
Use ref.file instead of ref.resolvedFile in newImageSource
Consistently with the dir: design, if the user specifies a relative
path, use it directly so that we don't introduce races against
changes to the directory structure.
Beautify sif_transport.go
- Use the usual order (transport implementation, followed by
reference implementation)
- Copy&paste more of the comments, to reinforce the contract
requirements.
Fix the package name directive
Reorganize imports
... to follow the usual convention.
Don't use pkg/errors in sif.
Mostly replace its uses with fmt.Errorf(...%w...).
Fix uses of fmt.Errorf
- Use %w instead of %v for error wrapping
- Use errors.New when the string is constant
Don't prefix wrapped error context with "error "
... to match most of c/image code, where we have previously
removed such prefixes.
Return true from HasThreadSafeGetBlob
... because that's the case for the current implementation,
although it makes no difference for the current c/image/copy
caller, when this source only provides one layer.
Rename sifImageSource.blobID to blobDigest
Rename sifImageSource.configID to configDigest
Remove workdir if newImageSource fails
Rename sifImageSource.blob* to layer*
... to differentiate the layer data from the config, which
is also a "blob" in the ImageSource naming.
Remove sifImageSource.diffID
The value only needs to be known inside newImageSource,
so pass it around in a variable/return value.
Remove unused sifImageSource.diffSize
... which allows us to make getLayerInfo
a function without a (partially-created) sifImageSource parent
object.
Move layerTime computation from createBlob to getBlobInfo
If anything, the latter is a bit more accurate (capturing
the time of the last update of the file we are creating, vs.
the time of the initial creation), but we want to eventually
that with a value from the SIF header anyway.
Remove sifImageSource.layerTime
It's only necessary in newImageSource, so use
a return value and a local variable for that.
Remove unused sifImageSource.layerType
Remove sifImageSource.configSize
This value is already stored in the sifImageSource.config slice,
so don't store a redundant copy.
Rename workdir to workDir
following usual Go patterns.
Rename tarpath to tarPath everywhere
Return layerDigest and layerSize from getBlobInfo
... instead of writing it to partially-initialized sifImageSource.
Provide a path to getBlobInfo instead of reading it from sifImageSource
This makes getBlobInfo independent of the partially-created sifImageSource.
Don't create a compressed layer from the SIF file
Compression is very costly, principially in CPU time.
Many use cases (notably import to c/storage) would
only end up decompressing the data again.
Those that do neeed the data compressed, like push
to a registry, can use the copy pipeline's streaming compression
implementation, often without needing to store the compressed
version in a temporary file.
So this is likely to improve both CPU time usage and (maximum) disk
space usage - at the very least against the current implementation
which doens't even remove the uncompressed version after creating
the compressed one :)
This is a minimal version of the change, we are now computing
the layer's digest twice. We'll fix that soon.
Rename tarPath to layerPath
... to be consistent with the other variables.
Don't compute the DiffID separately
It's the same value as the layer digest, now that
the layer is just the uncompressed tarball.
Rename fgz to f
The file is now expected to not be compressed.
Rename blobDigester to digester
... just to be a bit shorter.
Inline some single-use variables when building the manifest
Use a struct initializer instead of a set of assignments for config
Inline single-use variables when building a config
Also remove some fairly redundant comments.
Use a switch in sifImageSource.GetBlob
... to make the structure a tiny bit less repetitive.
Close the SIF image object in newImageSource
Nothing actually needs it afterwards.
Rename UnloadSIFImage() to Close() to indirectly silence
a linter about handling the error; it can't fail in practice,
and isn't quite worth handling.
Explicitly specify a MediaType field in the generated OCI manifest
... to follow best practices vs. schema confusion attacks
(although this generated manifest is clearly not a schema confusion
attack).
Beautify loadedSifImage.Close
Turn loadedSifImage.GetConfig into CommandLine
- Make loadedSifImage independent of the OCI format details.
- Make it clear at the call site that only the command is actually
provided.
- Don't return an error value which is always nil, which makes the caller
simpler.
Remove lookup of the sif.DataEnvVar descriptor
It is unused in this codebase, and it's unclear
what, if anything, it is used for anywhere else.
Make SifImage.parseEnvironment and SifImage.parseRunscript stand-alone
... i.e. independent from SifImage, so that we can more
easily unit-test it.
The res *[]string parameter is rather ugly, but we'll
refactor it away soon enough.
Should not change behavior.
Split parseDefFile from SifImage.generateConfig
... to have an easily unit-testable bit of code.
Should not change behavior. This destructively assigns
to image.envlist and image.cmdlist instead of appending,
but it should be the only writer at that point.
Add a smoke test for parseDefFile
Use a state machine for parseDefFile
... instead of nested scanner.Scan() loops
and a goto.
Should not change behavior.
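
A state-machine parser of the kind described above could look roughly like the following. This is a hedged sketch, not the actual parseDefFile: the function name, the state names, and the assumption that a section starts at any line whose first word begins with "%" are all illustrative.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

type parserState int

const (
	stateOther parserState = iota // outside the sections we care about
	stateEnvironment
	stateRunscript
)

// parseDefFileSketch collects the lines of the %environment and %runscript
// sections in a single scanner loop; the current section is tracked in state.
func parseDefFileSketch(def string) ([]string, []string, error) {
	var environment, runscript []string
	state := stateOther
	scanner := bufio.NewScanner(strings.NewReader(def))
	for scanner.Scan() {
		line := scanner.Text()
		trimmed := strings.TrimSpace(line)
		if strings.HasPrefix(trimmed, "%") {
			// A section header switches the state; it is not part of any body.
			switch strings.Fields(trimmed)[0] {
			case "%environment":
				state = stateEnvironment
			case "%runscript":
				state = stateRunscript
			default:
				state = stateOther
			}
			continue
		}
		switch state {
		case stateEnvironment:
			environment = append(environment, line)
		case stateRunscript:
			runscript = append(runscript, line)
		}
	}
	return environment, runscript, scanner.Err()
}

func main() {
	env, run, err := parseDefFileSketch("%environment\nexport A=1\n%runscript\necho hi\n")
	fmt.Println(env, run, err)
}
```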
Remove a misleading comment
Now, with GetDescriptor, more than one matching
descriptor results in an error, so there isn't anything
to assume about a single value.
Move DataDeffile descriptor lookup into generateConfig
It's the only user of that data
Remove deffile and defReader from loadedSifImage
Turn them into trivial local variables in generateConfig()
The code remains a bit convoluted; we'll clean that up soon.
Simplify generateConfig
Eliminate both deffile and defReader.
Pass %environment and %runscript to generateRunscript explicitly
That will eventually make it easier to unit-test
Return the generated script from generateRunscript instead of updating image
This makes generateRunscript stand-alone and easy to unit-test.
Add a smoke test for generateRunscript
It's not much, but better than nothing.
Use InjectedScript instead of Runscript for the script we generate
... everywhere, to differentiate that script from the %runscript
section contents.
Store injectedScript as a []byte instead of bytes.Buffer
No need to keep around the intermediate form, and this allows
us to change the implementation.
Use strings.Join and Sprintf instead of bytes.Buffer in generateInjectedScript
Assuming this is not performance-critical, the code is much shorter,
and clearly cannot fail (just like the previous version, which is documented
to panic rather than return the errors that version unnecessarily handled).
Note that this might change behavior for empty %environment or %runscript
sections: we now add extra empty lines. That shouldn't make a difference.
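
The strings.Join and Sprintf approach can be sketched as below. The function name and the exact script layout (including the "#!/bin/bash" first line) are assumptions made for illustration, not the real generateInjectedScript; the point is that the function cannot fail, and that empty sections yield extra empty lines.

```go
package main

import (
	"fmt"
	"strings"
)

// generateInjectedScriptSketch builds the script from the two section bodies.
// There is no error return because string formatting cannot fail here.
func generateInjectedScriptSketch(environment, runscript []string) []byte {
	script := fmt.Sprintf("#!/bin/bash\n%s\n%s\n",
		strings.Join(environment, "\n"),
		strings.Join(runscript, "\n"))
	return []byte(script)
}

func main() {
	fmt.Printf("%q\n", generateInjectedScriptSketch([]string{"export A=1"}, []string{"echo hi"}))
	// Empty sections still contribute their trailing newline.
	fmt.Printf("%q\n", generateInjectedScriptSketch(nil, nil))
}
```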
Remove the unnecessary error return value from generateInjectedScript
Remove loadedSifImage.envlist
All users are local to generateConfig.
Don't use loadedSifImage.cmdlist for storing %runscript
It's just a local value to generateConfig, and we no longer
use cmdlist for both %runscript and the final command line.
Replace loadedSifImage.cmdlist with a single command string
The array only ever has one element, so get rid of the array.
Beautify generateConfig
Always refer to environment and runscript in the same order.
Move generateInjectedScript after parseDefFile
First create the data, then consume it.
Simplify generateConfig
Only have the fallback to "bash" if no script is
available in a single place.
Rename resultDesc to desc
For such a short-lived variable we can have a shorter name.
Rename tempdir to tempDir throughout
... to follow Go conventions.
Remove a Sync call on the squashfs copy
We'd actually prefer that data not to hit the disk;
we want to remove it as soon as possible.
Instead, scope the deferred Close() so that it happens
before we consume the file.
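
Scoping a deferred Close() so that it runs before the file is consumed is commonly done with an immediately-invoked function. The sketch below assumes hypothetical names (writeThenConsume, the temporary file name) and only reads the file back where the real code would run an external tool on it.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// writeThenConsume writes data to a temporary file and then reads it back.
// The anonymous function limits the scope of the deferred Close, so the file
// is closed before it is consumed, without calling Sync.
func writeThenConsume(data []byte) (string, error) {
	dir, err := os.MkdirTemp("", "sif-sketch")
	if err != nil {
		return "", fmt.Errorf("creating temporary directory: %w", err)
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "rootfs.squashfs")

	if err := func() error { // A scope for defer
		f, err := os.Create(path)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(f, bytes.NewReader(data))
		return err
	}(); err != nil {
		return "", fmt.Errorf("writing %s: %w", path, err)
	}

	// The file is closed at this point; consume it.
	contents, err := os.ReadFile(path)
	if err != nil {
		return "", fmt.Errorf("reading %s: %w", path, err)
	}
	return string(contents), nil
}

func main() {
	fmt.Println(writeThenConsume([]byte("squashfs bytes")))
}
```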
Pass around squashFSPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Pass around tarPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Remove the generated tar file on failure
e.g. if creating it runs out of space.
Beautify SquashFSToTarLayer
Explicitly return values instead of relying on named return
values to make the data (in this case, error) flow more explicit.
Use Sprintf instead of string concatenation for the generated script
It is a bit more manageable that way.
Also actually start the script with a recognized shebang instead
of a newline.
Don't hard-code "squashfs-root" all over the place.
Pass extractedRootPath around instead, and use (unsquashfs -d) to
override the built-in default.
Inline a single-use cmd variable
Rename xcmd to cmd
... now that the cmd name is available.
Use explicit return statements instead of named return values
Make loadedImage always passed by reference
The struct contains a stateful *sif.FileImage, which
makes no sense to copy; so don't get into that habit
even in cases where it might be safe.
Use a constant for the /podman/runscript path
... instead of hard-coding it all over the place, and even
assuming a specific directory structure.
Add more context to write failures
Don't write to stderr; return error output to the caller
Remove the generated script immediately after using it
Make the tar file creation cancellable using the provided context
Also add TODO notes in other places where we would prefer
the copy to be cancellable.
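
A cancellable external command that returns its output to the caller, rather than writing to stderr, might look like this sketch. It relies on exec.CommandContext from the standard library; runCancellable, runWithCancelledContext, and the "tar --version" invocation are hypothetical and stand in for the real tar creation step.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
)

// runCancellable runs name with args and stops it if ctx is cancelled.
// Combined stdout/stderr is included in the returned error, not printed.
func runCancellable(ctx context.Context, name string, args ...string) error {
	cmd := exec.CommandContext(ctx, name, args...)
	output, err := cmd.CombinedOutput()
	if err != nil {
		return fmt.Errorf("running %s: %w, output: %q", name, err, output)
	}
	return nil
}

// runWithCancelledContext shows that an already-cancelled context
// prevents the command from being started.
func runWithCancelledContext() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return runCancellable(ctx, "tar", "--version")
}

func main() {
	fmt.Println(runWithCancelledContext())
}
```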
Rename runUnSquashFSTar to createTarFromSIFInputs
We are going to have it handle the injectedScript
as well.
Pass scriptPath to exec.Command instead of hard-coding a constant
This allows us not to care about the working directory of the
script, as well.
Move the scriptPath decision to SquashFSToTarLayer
That's the only place that is aware of tempDir now.
Pass injectedScript to writeInjectedScript
... to make it independent of loadedSifImage; it
will go away entirely soon.
Call writeInjectedScript from createTarFromSIFInputs
... so that createTarFromSIFInputs is responsible for both
creating and consuming extractedRootPath, without any external
interference.
Move the cleanup of extractedRootPath to createTarFromSIFInputs
... to make it a tiny bit more self-contained, now that it
handles the injectedScript as well.
Add a comment to more clearly document the allocation of paths in SquashFSToTarLayer
Remove loadedSifImage.rootfs
Instead, determine the value in SquashFSToTarLayer.
This means that we now try to interpret the deffile before checking
for rootfs presence, changing the possible order of errors. That shouldn't
be much of a difference for valid images.
Rename generateConfig to processDefFile
Have processDefFile return the values instead of writing to loadedSifImage
We will eliminate the loadedSifImage members entirely soon.
Pass sif.FileImage to processDefFile explicitly
... instead of using the image.fimg member.
Rename SquashFSToTarLayer to convertSIFToElements
We are going to have it return other values as well.
Return also the command line from convertSIFToElements
This turns it into the central point of the conversion
process, instead of the fairly ambient loadedSifImage object.
Call processDefFile only in convertSIFToElements
This allows us to remove loadedSifImage.injectedScript and
loadedSifImage.command, making loadedSifImage finally
a trivial wrapper around sif.FileImage - and we'll eliminate
that wrapper next.
Inline loadedSifImage.GetSIFID
We don't really need that abstraction.
Inline loadedSifImage.GetSIFArch
We don't really need that abstraction.
Make convertSIFToElements a stand-alone function
Eliminating the last non-trivial user of loadedSifImage.
Eliminate loadedSifImage
Finally, eliminate the loadedSifImage type entirely.
It doesn't really make sense to inject a layer of abstraction
between sifImageSource and sif.FileImage, purely for the abstraction.
loadedSifImage was only ever used in one way, as an essentially
procedural step; that is now served by the convertSIFToElements
function, rather than being split between the loadedSifImage constructor
and the original tarball creation method.
(convertSIFToElements might eventually return a struct with named
fields if there were many, but it doesn't make sense for newImageSource
to hold an object and fill it up one step at a time.)
Use the last modification time from the SIF header for OCI creation time
Add a TODO note
Add a note about (unsquashfs -o)
Add debug logs around long-running operations
... and make sure to include paths of the relevant files.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-01-07 07:27:36 +08:00
filePath := filepath.Join(extractedRootPath, injectedScriptTargetPath)
parentDirPath := filepath.Dir(filePath)
if err := os.MkdirAll(parentDirPath, 0755); err != nil {
return fmt.Errorf("creating %s: %w", parentDirPath, err)
2021-09-27 22:02:53 +08:00
}
2022-04-14 01:33:42 +08:00
if err := os.WriteFile(filePath, injectedScript, 0755); err != nil {
2022-01-07 07:27:36 +08:00
return fmt.Errorf("writing %s to %s: %w", injectedScriptTargetPath, filePath, err)
2021-09-27 22:02:53 +08:00
}
return nil
}
script, as well.
Move the scriptPath decision to SquashFSToTarLayer
That's the only place that is aware of tempDir now.
Pass injectedScript to writeInjectedScript
... to make it independent of loadedSifImage; it
will go away entirely soon.
Call writeInjectedScript from createTarFromSIFInputs
... so that createTarFromSIFInputs is responsible for both
creating and consuming extractedRootPath, without any external
interference.
Move the cleanup of extractedRootPath to createTarFromSIFInputs
... to make it a tiny bit more self-contained, now that it
handles the injectedScript as well.
Add a comment to more clearly document the alocation of paths in SquashFSToTarLayer
Remove loadedSifImage.rootfs
Instead, determine the value in SquashFSToTarLayer.
This means that we now try to interpret the deffile before checking
for rootfs presence, changing the possible order of errors. That shouldn't
be much of a difference for valid images.
Rename generateConfig to processDefFile
Have processDefFile return the values instead of writing to loadedSifImage
We will eliminate the loadedSifImage members entirely soon.
Pass sif.FileImage to processDefFile explicitly
... instead of using the image.fimg member.
Rename SquashFSToTarLayer to convertSIFToElements
We are going to have it return other values as well.
Return also the command line from convertSIFToElements
This turns it into the central point of the conversion
process, instead of the fairly ambient loadedSifImage object.
Call processDefFile only in convertSIFToElements
This allows us to remove loadedSifImage.injectedScript and
loadedSifImage.command, making loadedSifImage finally
a trivial wrapper around sif.FileImage - and we'll eliminate
that wrapper next.
Inline loadedSifImage.GetSIFID
We don't really need that abstraction.
Inline loadedSifImage.GetSIFArch
We don't really need that abstraction.
Make convertSIFToElements a stand-alone function
Eliminating the last non-trivial user of loadedSifImage.
Eliminate loadedSifImage
Finally, eliminate the loadedSifImage type entirely.
It doesn't really make sense to inject a layer of abstraction
between sifImageSource and sif.FileImage, purely for the abstraction.
loadedSifImage was only ever used in one way, as an essentially
procedural step; that is now served by the convertSIFToElements
function, rather than being split between the loadedSifImage constructor
and the original tarball creation method.
(convertSIFToElements might eventually return a struct with named
fields if there were many, but it doesn't make sense for newImageSource
to hold an object and fill it up one step at a time.)
Use the last modification time from the SIF header for OCI creation time
Add a TODO note
Add a note about (unsquashfs -o)
Add debug logs around long-running operations
... and make sure to include paths of the relevant files.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-01-07 07:27:36 +08:00
|
|
|
// createTarFromSIFInputs creates a tar file at tarPath, using a squashfs image at squashFSPath.
// It can also use extractedRootPath and scriptPath, which are allocated for its exclusive use,
// if necessary.
func createTarFromSIFInputs(ctx context.Context, tarPath, squashFSPath string, injectedScript []byte, extractedRootPath, scriptPath string) error {
	// It's safe for the Remove calls to happen even before we create the files, because tempDir is exclusive
	// for our use.
	defer os.RemoveAll(extractedRootPath)

	// Almost everything in extractedRootPath comes from squashFSPath.
	conversionCommand := fmt.Sprintf("unsquashfs -d %s -f %s && tar --acls --xattrs -C %s -cpf %s ./",
		extractedRootPath, squashFSPath, extractedRootPath, tarPath)
	script := "#!/bin/sh\n" + conversionCommand + "\n"
	if err := os.WriteFile(scriptPath, []byte(script), 0755); err != nil {
		return err
	}
	defer os.Remove(scriptPath)

Extensive refactoring to address review comments and hopefully simplify
Fix policy configuration identities in sif
- Actually allow something in ValidatePolicyConfigurationScope ;
SIF is one of the cases where it's actually a bit plausible
that a policy rejecting some filesystem sources might be desirable.
- Fix PolicyConfigurationNamespaces not to include the file name itself,
and "/"
Add tests for sifTransport and sifReference
The NewImage and NewImageSource tests are rather
pointless, but we don't want to require and invoke
fakeroot etc. on every unit test run, at least for now.
Use ref.file instead of ref.resolvedFile in newImageSource
Consistently with the dir: design, if the user specifies a relative
path, use it directly so that we don't introduce races against
changes to the directory structure.
Beautify sif_transport.go
- Use the usual order (transport implementation, followed by
reference implementation)
- Copy&paste more of the comments, to reinforce the contract
requirements.
Fix the package name directive
Reorganize imports
... to follow the usual convention.
Don't use pkg/errors in sif.
Mostly replace its uses with fmt.Errorf(...%w...).
Fix uses of fmt.Errorf
- Use %w instead of %v for error wrapping
- Use errors.New when the string is constant
Don't prefix wrapped error context with "error "
... to match most of c/image code, where we have previously
removed such prefixes.
Return true from HasThreadSafeGetBlob
... because that's the case for the current implementation,
although it makes no difference for the current c/image/copy
caller, when this source only provides one layer.
Rename sifImageSource.blobID to blobDigest
Rename sifImageSource.configID to configDigest
Remove workdir if newImageSource fails
Rename sifImageSource.blob* to layer*
... to differentiate the layer data from the config, which
is also a "blob" in the ImageSource naming.
Remove sifImageSource.diffID
The value only needs to be known inside newImageSource,
so pass it around in a variable/return value.
Remove unused sifImageSource.diffSize
... which allows us to make getLayerInfo
a function without a (partially-created) sifImageSource parent
object.
Move layerTime computation from createBlob to getBlobInfo
If anything, the latter is a bit more accurate (capturing
the time of the last update of the file we are creating, vs.
the time of the initial creation), but we want to eventually
that with a value from the SIF header anyway.
Remove sifImageSource.layerTime
It's only necessary in newImageSource, so use
a return value and a local variable for that.
Remove unused sifImageSource.layerType
Remove sifImageSource.configSize
This value is already stored in the sifImageSource.config slice,
so don't store a redundant copy.
Rename workdir to workDir
following usual Go patterns.
Rename tarpath to tarPath everywhere
Return layerDigest and layerSize from getBlobInfo
... instead of writing it to partially-initialized sifImageSource.
Provide a path to getBlobInfo instead of reading it from sifImageSource
This makes getBlobInfo independent of the partially-created sifImageSource.
Don't create a compressed layer from the SIF file
Compression is very costly, principially in CPU time.
Many use cases (notably import to c/storage) would
only end up decompressing the data again.
Those that do neeed the data compressed, like push
to a registry, can use the copy pipeline's streaming compression
implementation, often without needing to store the compressed
version in a temporary file.
So this is likely to improve both CPU time usage and (maximum) disk
space usage - at the very least against the current implementation
which doens't even remove the uncompressed version after creating
the compressed one :)
This is a minimal version of the change, we are now computing
the layer's digest twice. We'll fix that soon.
Rename tarPath to layerPath
... to be consistent with the other variables.
Don't compute the DiffID separately
It's the same value as the layer digest, now that
the layer is just the uncompressed tarball.
Rename fgz to f
The file is now expected to not be compressed.
Rename blobDigester to digester
... just to be a bit shorter.
Inline some single-use variables when building the manifest
Use a struct initializer instead of a set of assignments for config
Inline single-use variables when building a config
Also remove some fairly redundant comments.
Use a switch if sifImageSource.GetBlob
... to make the structure a tiny bit less repetitive.
Close the SIF image object in newImageSource
Nothing actually needs it afterwards.
Rename UnloadSIFImage() to Close() to indirectly silence
a linter about handling the error; it can't fail in practice,
and isn't quite worth handling.
Explicitly specify a MediaType field in the generated OCI manifest
... to follow best practices vs. schema confusion attacks
(although this generated manfiest is clearly not a schema confusion
attack).
Beautify loadedSifImage.Close
Turn loadedSifImage.GetConfig into CommandLine
- Make loadedSifImage independent of the OCI format details.
- Make it clear at the call site that only the command is actually
provided.
- Don't return an error value which is always nil, which makes the caller
simpler.
Remove lookup of the sif.DataEnvVar descriptor
It is unused in this codebase, and it's unclear
what, if anything, it is used for anywhere else.
Make SifImage.parseEnvironment and SifImage.parseRunscript stand-alone
... i.e. independent from SifImage, so that we can more
easily unit-test it.
The res *[]string parameter is rather ugly, but we'll
refactor it away soon enough.
Should not change behavior.
Split parseDefFile from SifImage.generateConfig
... to have an easily unit-testable bit of code.
Should not change behavior. This destructively assigns
to image.envlist and image.cmdlist instead of appending,
but it should be the only writer at that point.
Add a smoke test for parseDefFile
Use a state machine for parseDefFile
... instead of a nesting scanner.Scan() loops
and a goto.
Should not change behavior.
Remove a misleading comment
Now, with GetDescriptor, more than one matching
descriptor results in an error, so there isn't anything
to assume about a single value.
Move DataDeffile descriptor lookup into generateConfig
It's the only user of that data
Remove deffile and defReader from loadedSifImage
Turn them into trivial local variables in generateConfig()
The code remains a big convoluted, we'll clean that up soon.
Simplify generateConfig
Eliminate both deffile and defReader.
Pass %environment and %runscript to generateRunscript explicitly
That will eventually make it easier to unit-test
Return the generated script from generateRunscript instad of updating image
This makes generateRunscript stand-alone and easy to unit-test.
Add a smoke test for generateRunscript
It's not much, but better than nothing.
Use InjectedScript instead of Runscript for the script we generate
... everywhere, to differentiate that script from the %runscript
section contents.
Store injectedScript as a []byte instead of bytes.Buffer
No need to keep around the intermediate form, and this allows
us to change the implementation.
Use strings.Join and Sprintf instead of bytes.Buffer in generateInjectedScript
Assuming this is not performance-critical, the code is much shorter,
and clearly cannot fail (just like the previous version, which is documented
to panic rather than return the errors that version unnecessarily handled).
Note that this might change behavior for empty %environment or %runscript
sections: we now add extra empty lines. That shouldn't make a difference.
Remove the unnecessary error return value from generateInjectedScript
Remove loadedSifImage.envlist
All users are local to generateConfig.
Don't use loadedSifImage.cmdlist for storing %runscript
It's just a local value to generateConfig, and we no longer
use cmdlist for both %runscript and the final command line.
Replace loadedSifImage.cmdlist with a single command string
The array only ever has one element, so get rid of the array.
Beautify generateConfig
Always refer to environment and runscript in the same order.
Move generateInjectedScript after parseDefFile
First create the data, then consume it.
Simplify generateConfig
Only have the fallback to "bash" if no script is
available in a single place.
Rename resultDesc to desc
For such a short-lived variable we can have a shorter name.
Rename tempdir to tempDir throughout
... to follow Go conventions.
Remove a Sync call on the squashfs copy
We'd actually prefer that data not to hit the disk;
we want to remove it as soon as possible.
Instead, scope the deferred Close() so that it happens
before we consume the file.
Pass around squashFSPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Pass around tarPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Remove the generated tar file on failure
e.g. if creating it runs out of space.
Beautify SquashFSToTarLayer
Explicitly return values instead of relying on named return
values to make the data (in this case, error) flow more explicit.
Use Sprintf instead of string concatenation for the generated script
It is a bit more manageable that way.
Also actually start the script with a recognized shebang instead
of a newline.
Don't hard-code "squashfs-root" all over the place.
Pass extractedRootPath around instead, and use (unsquashfs -d) to
override the built-in default.
Inline a single-use cmd variable
Rename xcmd to cmd
... now that the cmd name is available.
Use explicit return statements instead of named return values
Make loadedImage always passed by reference
The struct contains a stateful *sif.FileImage, which
makes no sense to copy; so don't get into that habit
even in cases where it might be safe.
Use a constant for the /podman/runscript path
... instead of hard-coding it over the place, and even
assuming a specific directory structure.
Add more context to write failures
Don't write to stderr; return error output to the caller
Remove the generated script immediately after using it
Make the tar file creation cancellable using the provided context
Also add TODO notes in other places where we would prefer
the copy to be cancellable.
Rename runUnSquashFSTar to createTarFromSIFInputs
	// On top of squashFSPath, we only add injectedScript, if necessary.
	if err := writeInjectedScript(extractedRootPath, injectedScript); err != nil {
		return err
	}
	logrus.Debugf("Converting squashfs to tar, command: %s ...", conversionCommand)
	cmd := exec.CommandContext(ctx, "fakeroot", "--", scriptPath)
	output, err := cmd.CombinedOutput()
	if err != nil {
		return fmt.Errorf("converting image: %w, output: %s", err, string(output))
	}
	logrus.Debugf("... finished converting squashfs to tar")
	return nil
}
Extensive refactoring to address review comments and hopefully simplify
Fix policy configuration identities in sif
- Actually allow something in ValidatePolicyConfigurationScope ;
SIF is one of the cases where it's actually a bit plausible
that a policy rejecting some filesystem sources might be desirable.
- Fix PolicyConfigurationNamespaces not to include the file name itself,
and "/"
Add tests for sifTransport and sifReference
The NewImage and NewImageSource tests are rather
pointless, but we don't want to require and invoke
fakeroot etc. on every unit test run, at least for now.
Use ref.file instead of ref.resolvedFile in newImageSource
Consistently with the dir: design, if the user specifies a relative
path, use it directly so that we don't introduce races against
changes to the directory structure.
Beautify sif_transport.go
- Use the usual order (transport implementation, followed by
reference implementation)
- Copy&paste more of the comments, to reinforce the contract
requirements.
Fix the package name directive
Reorganize imports
... to follow the usual convention.
Don't use pkg/errors in sif.
Mostly replace its uses with fmt.Errorf(...%w...).
Fix uses of fmt.Errorf
- Use %w instead of %v for error wrapping
- Use errors.New when the string is constant
Don't prefix wrapped error context with "error "
... to match most of c/image code, where we have previously
removed such prefixes.
Return true from HasThreadSafeGetBlob
... because that's the case for the current implementation,
although it makes no difference for the current c/image/copy
caller, when this source only provides one layer.
Rename sifImageSource.blobID to blobDigest
Rename sifImageSource.configID to configDigest
Remove workdir if newImageSource fails
Rename sifImageSource.blob* to layer*
... to differentiate the layer data from the config, which
is also a "blob" in the ImageSource naming.
Remove sifImageSource.diffID
The value only needs to be known inside newImageSource,
so pass it around in a variable/return value.
Remove unused sifImageSource.diffSize
... which allows us to make getLayerInfo
a function without a (partially-created) sifImageSource parent
object.
Move layerTime computation from createBlob to getBlobInfo
If anything, the latter is a bit more accurate (capturing
the time of the last update of the file we are creating, vs.
the time of the initial creation), but we want to eventually
that with a value from the SIF header anyway.
Remove sifImageSource.layerTime
It's only necessary in newImageSource, so use
a return value and a local variable for that.
Remove unused sifImageSource.layerType
Remove sifImageSource.configSize
This value is already stored in the sifImageSource.config slice,
so don't store a redundant copy.
Rename workdir to workDir
following usual Go patterns.
Rename tarpath to tarPath everywhere
Return layerDigest and layerSize from getBlobInfo
... instead of writing it to partially-initialized sifImageSource.
Provide a path to getBlobInfo instead of reading it from sifImageSource
This makes getBlobInfo independent of the partially-created sifImageSource.
Don't create a compressed layer from the SIF file
Compression is very costly, principally in CPU time.
Many use cases (notably import to c/storage) would
only end up decompressing the data again.
Those that do need the data compressed, like push
to a registry, can use the copy pipeline's streaming compression
implementation, often without needing to store the compressed
version in a temporary file.
So this is likely to improve both CPU time usage and (maximum) disk
space usage - at the very least against the current implementation
which doesn't even remove the uncompressed version after creating
the compressed one :)
This is a minimal version of the change; we are now computing
the layer's digest twice. We'll fix that soon.
Rename tarPath to layerPath
... to be consistent with the other variables.
Don't compute the DiffID separately
It's the same value as the layer digest, now that
the layer is just the uncompressed tarball.
Rename fgz to f
The file is now expected to not be compressed.
Rename blobDigester to digester
... just to be a bit shorter.
Inline some single-use variables when building the manifest
Use a struct initializer instead of a set of assignments for config
Inline single-use variables when building a config
Also remove some fairly redundant comments.
Use a switch in sifImageSource.GetBlob
... to make the structure a tiny bit less repetitive.
Close the SIF image object in newImageSource
Nothing actually needs it afterwards.
Rename UnloadSIFImage() to Close() to indirectly silence
a linter about handling the error; it can't fail in practice,
and isn't quite worth handling.
Explicitly specify a MediaType field in the generated OCI manifest
... to follow best practices vs. schema confusion attacks
(although this generated manifest is clearly not a schema confusion
attack).
Beautify loadedSifImage.Close
Turn loadedSifImage.GetConfig into CommandLine
- Make loadedSifImage independent of the OCI format details.
- Make it clear at the call site that only the command is actually
provided.
- Don't return an error value which is always nil, which makes the caller
simpler.
Remove lookup of the sif.DataEnvVar descriptor
It is unused in this codebase, and it's unclear
what, if anything, it is used for anywhere else.
Make SifImage.parseEnvironment and SifImage.parseRunscript stand-alone
... i.e. independent from SifImage, so that we can more
easily unit-test it.
The res *[]string parameter is rather ugly, but we'll
refactor it away soon enough.
Should not change behavior.
Split parseDefFile from SifImage.generateConfig
... to have an easily unit-testable bit of code.
Should not change behavior. This destructively assigns
to image.envlist and image.cmdlist instead of appending,
but it should be the only writer at that point.
Add a smoke test for parseDefFile
Use a state machine for parseDefFile
... instead of nested scanner.Scan() loops
and a goto.
Should not change behavior.
Remove a misleading comment
Now, with GetDescriptor, more than one matching
descriptor results in an error, so there isn't anything
to assume about a single value.
Move DataDeffile descriptor lookup into generateConfig
It's the only user of that data
Remove deffile and defReader from loadedSifImage
Turn them into trivial local variables in generateConfig()
The code remains a bit convoluted, we'll clean that up soon.
Simplify generateConfig
Eliminate both deffile and defReader.
Pass %environment and %runscript to generateRunscript explicitly
That will eventually make it easier to unit-test
Return the generated script from generateRunscript instead of updating image
This makes generateRunscript stand-alone and easy to unit-test.
Add a smoke test for generateRunscript
It's not much, but better than nothing.
Use InjectedScript instead of Runscript for the script we generate
... everywhere, to differentiate that script from the %runscript
section contents.
Store injectedScript as a []byte instead of bytes.Buffer
No need to keep around the intermediate form, and this allows
us to change the implementation.
Use strings.Join and Sprintf instead of bytes.Buffer in generateInjectedScript
Assuming this is not performance-critical, the code is much shorter,
and clearly cannot fail (just like the previous version, which is documented
to panic rather than return the errors that version unnecessarily handled).
Note that this might change behavior for empty %environment or %runscript
sections: we now add extra empty lines. That shouldn't make a difference.
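A sketch of the strings.Join + Sprintf approach, not the real generateInjectedScript. The name buildScript and the "#!/bin/sh" shebang are assumptions of this example. It also shows where the extra empty lines for empty sections come from.

```go
package main

import (
	"fmt"
	"strings"
)

// buildScript joins the environment and runscript lines into one script.
// Neither strings.Join nor fmt.Sprintf returns an error, so there is nothing to handle.
func buildScript(environment, runscript []string) []byte {
	return []byte(fmt.Sprintf("#!/bin/sh\n%s\n%s\n",
		strings.Join(environment, "\n"),
		strings.Join(runscript, "\n")))
}

func main() {
	fmt.Printf("%q\n", buildScript([]string{"export A=1"}, []string{"echo hi"}))
	// With empty sections, each %s expands to "" but its trailing newline remains.
	fmt.Printf("%q\n", buildScript(nil, nil))
}
```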
Remove the unnecessary error return value from generateInjectedScript
Remove loadedSifImage.envlist
All users are local to generateConfig.
Don't use loadedSifImage.cmdlist for storing %runscript
It's just a local value to generateConfig, and we no longer
use cmdlist for both %runscript and the final command line.
Replace loadedSifImage.cmdlist with a single command string
The array only ever has one element, so get rid of the array.
Beautify generateConfig
Always refer to environment and runscript in the same order.
Move generateInjectedScript after parseDefFile
First create the data, then consume it.
Simplify generateConfig
Keep the fallback to "bash", used when no script is
available, in a single place.
Rename resultDesc to desc
For such a short-lived variable we can have a shorter name.
Rename tempdir to tempDir throughout
... to follow Go conventions.
Remove a Sync call on the squashfs copy
We'd actually prefer that data not to hit the disk;
we want to remove it as soon as possible.
Instead, scope the deferred Close() so that it happens
before we consume the file.
Pass around squashFSPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Pass around tarPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Remove the generated tar file on failure
e.g. if creating it runs out of space.
Beautify SquashFSToTarLayer
Explicitly return values instead of relying on named return
values to make the data (in this case, error) flow more explicit.
Use Sprintf instead of string concatenation for the generated script
It is a bit more manageable that way.
Also actually start the script with a recognized shebang instead
of a newline.
Don't hard-code "squashfs-root" all over the place.
Pass extractedRootPath around instead, and use (unsquashfs -d) to
override the built-in default.
Inline a single-use cmd variable
Rename xcmd to cmd
... now that the cmd name is available.
Use explicit return statements instead of named return values
Make loadedImage always passed by reference
The struct contains a stateful *sif.FileImage, which
makes no sense to copy; so don't get into that habit
even in cases where it might be safe.
Use a constant for the /podman/runscript path
... instead of hard-coding it all over the place, and even
assuming a specific directory structure.
Add more context to write failures
Don't write to stderr; return error output to the caller
Remove the generated script immediately after using it
Make the tar file creation cancellable using the provided context
Also add TODO notes in other places where we would prefer
the copy to be cancellable.
Rename runUnSquashFSTar to createTarFromSIFInputs
We are going to have it handle the injectedScript
as well.
Pass scriptPath to exec.Command instead of hard-coding a constant
This allows us not to care about the working directory of the
script, as well.
Move the scriptPath decision to SquashFSToTarLayer
That's the only place that is aware of tempDir now.
Pass injectedScript to writeInjectedScript
... to make it independent of loadedSifImage; it
will go away entirely soon.
Call writeInjectedScript from createTarFromSIFInputs
... so that createTarFromSIFInputs is responsible for both
creating and consuming extractedRootPath, without any external
interference.
Move the cleanup of extractedRootPath to createTarFromSIFInputs
... to make it a tiny bit more self-contained, now that it
handles the injectedScript as well.
Add a comment to more clearly document the allocation of paths in SquashFSToTarLayer
Remove loadedSifImage.rootfs
Instead, determine the value in SquashFSToTarLayer.
This means that we now try to interpret the deffile before checking
for rootfs presence, changing the possible order of errors. That shouldn't
be much of a difference for valid images.
Rename generateConfig to processDefFile
Have processDefFile return the values instead of writing to loadedSifImage
We will eliminate the loadedSifImage members entirely soon.
Pass sif.FileImage to processDefFile explicitly
... instead of using the image.fimg member.
Rename SquashFSToTarLayer to convertSIFToElements
We are going to have it return other values as well.
Return also the command line from convertSIFToElements
This turns it into the central point of the conversion
process, instead of the fairly ambient loadedSifImage object.
Call processDefFile only in convertSIFToElements
This allows us to remove loadedSifImage.injectedScript and
loadedSifImage.command, making loadedSifImage finally
a trivial wrapper around sif.FileImage - and we'll eliminate
that wrapper next.
Inline loadedSifImage.GetSIFID
We don't really need that abstraction.
Inline loadedSifImage.GetSIFArch
We don't really need that abstraction.
Make convertSIFToElements a stand-alone function
Eliminating the last non-trivial user of loadedSifImage.
Eliminate loadedSifImage
Finally, eliminate the loadedSifImage type entirely.
It doesn't really make sense to inject a layer of abstraction
between sifImageSource and sif.FileImage, purely for the abstraction.
loadedSifImage was only ever used in one way, as an essentially
procedural step; that is now served by the convertSIFToElements
function, rather than being split between the loadedSifImage constructor
and the original tarball creation method.
(convertSIFToElements might eventually return a struct with named
fields if there were many, but it doesn't make sense for newImageSource
to hold an object and fill it up one step at a time.)
Use the last modification time from the SIF header for OCI creation time
Add a TODO note
Add a note about (unsquashfs -o)
Add debug logs around long-running operations
... and make sure to include paths of the relevant files.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-01-07 07:27:36 +08:00
// convertSIFToElements processes sifImage and creates/returns
2022-01-21 20:57:07 +08:00
// the relevant elements for constructing an OCI-like image:
2022-01-07 07:27:36 +08:00
// - A path to a tar file containing a root filesystem,
// - A command to run.
// The returned tar file path is inside tempDir, which can be assumed to be empty
// at start, and is exclusively used by the current process (i.e. it is safe
// to use hard-coded relative paths within it).
func convertSIFToElements(ctx context.Context, sifImage *sif.FileImage, tempDir string) (string, []string, error) {
2022-04-14 01:33:42 +08:00
// We could allocate unique names for all of these using os.{CreateTemp,MkdirTemp}, but tempDir is exclusive,
|
Extensive refactoring to address review comments and hopefully simplify
Fix policy configuration identities in sif
- Actually allow something in ValidatePolicyConfigurationScope ;
SIF is one of the cases where it's actually a bit plausible
that a policy rejecting some filesystem sources might be desirable.
- Fix PolicyConfigurationNamespaces not to include the file name itself,
and "/"
Add tests for sifTransport and sifReference
The NewImage and NewImageSource tests are rather
pointless, but we don't want to require and invoke
fakeroot etc. on every unit test run, at least for now.
Use ref.file instead of ref.resolvedFile in newImageSource
Consistently with the dir: design, if the user specifies a relative
path, use it directly so that we don't introduce races against
changes to the directory structure.
Beautify sif_transport.go
- Use the usual order (transport implementation, followed by
reference implementation)
- Copy&paste more of the comments, to reinforce the contract
requirements.
Fix the package name directive
Reorganize imports
... to follow the usual convention.
Don't use pkg/errors in sif.
Mostly replace its uses with fmt.Errorf(...%w...).
Fix uses of fmt.Errorf
- Use %w instead of %v for error wrapping
- Use errors.New when the string is constant
Don't prefix wrapped error context with "error "
... to match most of c/image code, where we have previously
removed such prefixes.
Return true from HasThreadSafeGetBlob
... because that's the case for the current implementation,
although it makes no difference for the current c/image/copy
caller, when this source only provides one layer.
Rename sifImageSource.blobID to blobDigest
Rename sifImageSource.configID to configDigest
Remove workdir if newImageSource fails
Rename sifImageSource.blob* to layer*
... to differentiate the layer data from the config, which
is also a "blob" in the ImageSource naming.
Remove sifImageSource.diffID
The value only needs to be known inside newImageSource,
so pass it around in a variable/return value.
Remove unused sifImageSource.diffSize
... which allows us to make getLayerInfo
a function without a (partially-created) sifImageSource parent
object.
Move layerTime computation from createBlob to getBlobInfo
If anything, the latter is a bit more accurate (capturing
the time of the last update of the file we are creating, vs.
the time of the initial creation), but we want to eventually
that with a value from the SIF header anyway.
Remove sifImageSource.layerTime
It's only necessary in newImageSource, so use
a return value and a local variable for that.
Remove unused sifImageSource.layerType
Remove sifImageSource.configSize
This value is already stored in the sifImageSource.config slice,
so don't store a redundant copy.
Rename workdir to workDir
following usual Go patterns.
Rename tarpath to tarPath everywhere
Return layerDigest and layerSize from getBlobInfo
... instead of writing it to partially-initialized sifImageSource.
Provide a path to getBlobInfo instead of reading it from sifImageSource
This makes getBlobInfo independent of the partially-created sifImageSource.
Don't create a compressed layer from the SIF file
Compression is very costly, principially in CPU time.
Many use cases (notably import to c/storage) would
only end up decompressing the data again.
Those that do neeed the data compressed, like push
to a registry, can use the copy pipeline's streaming compression
implementation, often without needing to store the compressed
version in a temporary file.
So this is likely to improve both CPU time usage and (maximum) disk
space usage - at the very least against the current implementation
which doens't even remove the uncompressed version after creating
the compressed one :)
This is a minimal version of the change, we are now computing
the layer's digest twice. We'll fix that soon.
Rename tarPath to layerPath
... to be consistent with the other variables.
Don't compute the DiffID separately
It's the same value as the layer digest, now that
the layer is just the uncompressed tarball.
Rename fgz to f
The file is now expected to not be compressed.
Rename blobDigester to digester
... just to be a bit shorter.
Inline some single-use variables when building the manifest
Use a struct initializer instead of a set of assignments for config
Inline single-use variables when building a config
Also remove some fairly redundant comments.
Use a switch if sifImageSource.GetBlob
... to make the structure a tiny bit less repetitive.
Close the SIF image object in newImageSource
Nothing actually needs it afterwards.
Rename UnloadSIFImage() to Close() to indirectly silence
a linter about handling the error; it can't fail in practice,
and isn't quite worth handling.
Explicitly specify a MediaType field in the generated OCI manifest
... to follow best practices vs. schema confusion attacks
(although this generated manfiest is clearly not a schema confusion
attack).
Beautify loadedSifImage.Close
Turn loadedSifImage.GetConfig into CommandLine
- Make loadedSifImage independent of the OCI format details.
- Make it clear at the call site that only the command is actually
provided.
- Don't return an error value which is always nil, which makes the caller
simpler.
Remove lookup of the sif.DataEnvVar descriptor
It is unused in this codebase, and it's unclear
what, if anything, it is used for anywhere else.
Make SifImage.parseEnvironment and SifImage.parseRunscript stand-alone
... i.e. independent from SifImage, so that we can more
easily unit-test it.
The res *[]string parameter is rather ugly, but we'll
refactor it away soon enough.
Should not change behavior.
Split parseDefFile from SifImage.generateConfig
... to have an easily unit-testable bit of code.
Should not change behavior. This destructively assigns
to image.envlist and image.cmdlist instead of appending,
but it should be the only writer at that point.
Add a smoke test for parseDefFile
Use a state machine for parseDefFile
... instead of a nesting scanner.Scan() loops
and a goto.
Should not change behavior.
Remove a misleading comment
Now, with GetDescriptor, more than one matching
descriptor results in an error, so there isn't anything
to assume about a single value.
Move DataDeffile descriptor lookup into generateConfig
It's the only user of that data
Remove deffile and defReader from loadedSifImage
Turn them into trivial local variables in generateConfig()
The code remains a big convoluted, we'll clean that up soon.
Simplify generateConfig
Eliminate both deffile and defReader.
Pass %environment and %runscript to generateRunscript explicitly
That will eventually make it easier to unit-test
Return the generated script from generateRunscript instad of updating image
This makes generateRunscript stand-alone and easy to unit-test.
Add a smoke test for generateRunscript
It's not much, but better than nothing.
Use InjectedScript instead of Runscript for the script we generate
... everywhere, to differentiate that script from the %runscript
section contents.
Store injectedScript as a []byte instead of bytes.Buffer
No need to keep around the intermediate form, and this allows
us to change the implementation.
Use strings.Join and Sprintf instead of bytes.Buffer in generateInjectedScript
Assuming this is not performance-critical, the code is much shorter,
and clearly cannot fail (just like the previous version, which is documented
to panic rather than return the errors that version unnecessarily handled).
Note that this might change behavior for empty %environment or %runscript
sections: we now add extra empty lines. That shouldn't make a difference.
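
A rough sketch of what the strings.Join and Sprintf approach can look like follows. The function name `generateScript` and the `#!/bin/sh` shebang are assumptions for illustration only; the real generateInjectedScript may lay out the script differently.

```go
package main

import (
	"fmt"
	"strings"
)

// generateScript is a hypothetical sketch, not the c/image implementation:
// it assembles a shell script from the two definition-file sections.
// Neither strings.Join nor fmt.Sprintf returns an error, so there is
// nothing to propagate to the caller.
func generateScript(environment, runscript []string) []byte {
	script := fmt.Sprintf("#!/bin/sh\n%s\n%s\n",
		strings.Join(environment, "\n"),
		strings.Join(runscript, "\n"))
	return []byte(script)
}

func main() {
	fmt.Printf("%q\n", generateScript([]string{"export A=1"}, []string{"echo hi"}))
	// With empty sections the fixed "\n" separators remain, which is where
	// the extra empty lines come from.
	fmt.Printf("%q\n", generateScript(nil, nil))
}
```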
Remove the unnecessary error return value from generateInjectedScript
Remove loadedSifImage.envlist
All users are local to generateConfig.
Don't use loadedSifImage.cmdlist for storing %runscript
It's just a local value to generateConfig, and we no longer
use cmdlist for both %runscript and the final command line.
Replace loadedSifImage.cmdlist with a single command string
The array only ever has one element, so get rid of the array.
Beautify generateConfig
Always refer to environment and runscript in the same order.
Move generateInjectedScript after parseDefFile
First create the data, then consume it.
Simplify generateConfig
Have the fallback to "bash" (used if no script is
available) in a single place only.
Rename resultDesc to desc
For such a short-lived variable we can have a shorter name.
Rename tempdir to tempDir throughout
... to follow Go conventions.
Remove a Sync call on the squashfs copy
We'd actually prefer that data not to hit the disk;
we want to remove it as soon as possible.
Instead, scope the deferred Close() so that it happens
before we consume the file.
Pass around squashFSPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Pass around tarPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Remove the generated tar file on failure
e.g. if creating it runs out of space.
Beautify SquashFSToTarLayer
Explicitly return values instead of relying on named return
values to make the data (in this case, error) flow more explicit.
Use Sprintf instead of string concatenation for the generated script
It is a bit more manageable that way.
Also actually start the script with a recognized shebang instead
of a newline.
Don't hard-code "squashfs-root" all over the place.
Pass extractedRootPath around instead, and use (unsquashfs -d) to
override the built-in default.
Inline a single-use cmd variable
Rename xcmd to cmd
... now that the cmd name is available.
Use explicit return statements instead of named return values
Make loadedImage always passed by reference
The struct contains a stateful *sif.FileImage, which
makes no sense to copy; so don't get into that habit
even in cases where it might be safe.
Use a constant for the /podman/runscript path
... instead of hard-coding it all over the place, and even
assuming a specific directory structure.
Add more context to write failures
Don't write to stderr; return error output to the caller
Remove the generated script immediately after using it
Make the tar file creation cancellable using the provided context
Also add TODO notes in other places where we would prefer
the copy to be cancellable.
Rename runUnSquashFSTar to createTarFromSIFInputs
We are going to have it handle the injectedScript
as well.
Pass scriptPath to exec.Command instead of hard-coding a constant
This allows us not to care about the working directory of the
script, as well.
Move the scriptPath decision to SquashFSToTarLayer
That's the only place that is aware of tempDir now.
Pass injectedScript to writeInjectedScript
... to make it independent of loadedSifImage; it
will go away entirely soon.
Call writeInjectedScript from createTarFromSIFInputs
... so that createTarFromSIFInputs is responsible for both
creating and consuming extractedRootPath, without any external
interference.
Move the cleanup of extractedRootPath to createTarFromSIFInputs
... to make it a tiny bit more self-contained, now that it
handles the injectedScript as well.
Add a comment to more clearly document the allocation of paths in SquashFSToTarLayer
Remove loadedSifImage.rootfs
Instead, determine the value in SquashFSToTarLayer.
This means that we now try to interpret the deffile before checking
for rootfs presence, changing the possible order of errors. That shouldn't
be much of a difference for valid images.
Rename generateConfig to processDefFile
Have processDefFile return the values instead of writing to loadedSifImage
We will eliminate the loadedSifImage members entirely soon.
Pass sif.FileImage to processDefFile explicitly
... instead of using the image.fimg member.
Rename SquashFSToTarLayer to convertSIFToElements
We are going to have it return other values as well.
Return also the command line from convertSIFToElements
This turns it into the central point of the conversion
process, instead of the fairly ambient loadedSifImage object.
Call processDefFile only in convertSIFToElements
This allows us to remove loadedSifImage.injectedScript and
loadedSifImage.command, making loadedSifImage finally
a trivial wrapper around sif.FileImage - and we'll eliminate
that wrapper next.
Inline loadedSifImage.GetSIFID
We don't really need that abstraction.
Inline loadedSifImage.GetSIFArch
We don't really need that abstraction.
Make convertSIFToElements a stand-alone function
Eliminating the last non-trivial user of loadedSifImage.
Eliminate loadedSifImage
Finally, eliminate the loadedSifImage type entirely.
It doesn't really make sense to inject a layer of abstraction
between sifImageSource and sif.FileImage, purely for the abstraction.
loadedSifImage was only ever used in one way, as an essentially
procedural step; that is now served by the convertSIFToElements
function, rather than being split between the loadedSifImage constructor
and the original tarball creation method.
(convertSIFToElements might eventually return a struct with named
fields if there were many, but it doesn't make sense for newImageSource
to hold an object and fill it up one step at a time.)
Use the last modification time from the SIF header for OCI creation time
Add a TODO note
Add a note about (unsquashfs -o)
Add debug logs around long-running operations
... and make sure to include paths of the relevant files.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-01-07 07:27:36 +08:00
// so we can just hard-code a set of unique values here.
// We create and/or manage cleanup of these two paths.
squashFSPath := filepath.Join(tempDir, "rootfs.squashfs")
tarPath := filepath.Join(tempDir, "rootfs.tar")
// We only allocate these paths, the user is responsible for cleaning them up.
extractedRootPath := filepath.Join(tempDir, "rootfs")
scriptPath := filepath.Join(tempDir, "script")

succeeded := false
// It's safe for the Remove calls to happen even before we create the files, because tempDir is exclusive
// for our use.
// Ideally we would remove squashFSPath immediately after creating extractedRootPath, but we need
// to run both creation and consumption of extractedRootPath in the same fakeroot context.
// So, overall, this process requires at least 2 compressed copies (SIF and squashFSPath) and 2
// uncompressed copies (extractedRootPath and tarPath) of the data, all using up space at the same time.
// That's rather unsatisfactory, ideally we would be streaming the data directly from a squashfs parser
2022-01-21 20:57:07 +08:00
// reading from the SIF file to a tarball, for 1 compressed and 1 uncompressed copy.
2022-01-07 07:27:36 +08:00
defer os.Remove(squashFSPath)
defer func() {
	if !succeeded {
		os.Remove(tarPath)
	}
}()

command, injectedScript, err := processDefFile(sifImage)
2021-09-27 22:02:53 +08:00
if err != nil {
function, rather than being split between the loadedSifImage constructor
and the original tarball creation method.
(convertSIFToElements might eventually return a struct with named
fields if there were many, but it doesn't make sense for newImageSource
to hold an object and fill it up one step at a time.)
Use the last modification time from the SIF header for OCI creation time
Add a TODO note
Add a note about (unsquashfs -o)
Add debug logs around long-running operations
... and make sure to include paths of the relevant files.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-01-07 07:27:36 +08:00
|
|
|
		return "", nil, err
	}
	rootFS, err := sifImage.GetDescriptor(sif.WithPartitionType(sif.PartPrimSys))
	if err != nil {
		return "", nil, fmt.Errorf("looking up rootfs from SIF file: %w", err)
	}
	// TODO: We'd prefer not to make a full copy of the file here; unsquashfs ≥ 4.4
	// has an -o option that allows extracting a squashfs from the SIF file directly,
	// but that version is not currently available in RHEL 8.
	logrus.Debugf("Creating a temporary squashfs image %s ...", squashFSPath)
	if err := func() (retErr error) { // A scope for defer
		f, err := os.Create(squashFSPath)
		if err != nil {
			return err
		}
		// since we are writing to this file, make sure we handle err on Close()
		defer func() {
			closeErr := f.Close()
			if retErr == nil {
				retErr = closeErr
			}
		}()
|
2022-01-07 07:27:36 +08:00
	// TODO: This can take quite some time, and should ideally be cancellable using ctx.Done().
	if _, err := io.CopyN(f, rootFS.GetReader(), rootFS.Size()); err != nil {
		return err
	}
	return nil
}(); err != nil {
	return "", nil, err
2021-09-27 22:02:53 +08:00
}
2022-01-07 07:27:36 +08:00
logrus.Debugf("... finished creating a temporary squashfs image")

if err := createTarFromSIFInputs(ctx, tarPath, squashFSPath, injectedScript, extractedRootPath, scriptPath); err != nil {
	return "", nil, err
2021-09-27 22:02:53 +08:00
}
Extensive refactoring to address review comments and hopefully simplify
Fix policy configuration identities in sif
- Actually allow something in ValidatePolicyConfigurationScope ;
SIF is one of the cases where it's actually a bit plausible
that a policy rejecting some filesystem sources might be desirable.
- Fix PolicyConfigurationNamespaces not to include the file name itself,
and "/"
Add tests for sifTransport and sifReference
The NewImage and NewImageSource tests are rather
pointless, but we don't want to require and invoke
fakeroot etc. on every unit test run, at least for now.
Use ref.file instead of ref.resolvedFile in newImageSource
Consistently with the dir: design, if the user specifies a relative
path, use it directly so that we don't introduce races against
changes to the directory structure.
Beautify sif_transport.go
- Use the usual order (transport implementation, followed by
reference implementation)
- Copy&paste more of the comments, to reinforce the contract
requirements.
Fix the package name directive
Reorganize imports
... to follow the usual convention.
Don't use pkg/errors in sif.
Mostly replace its uses with fmt.Errorf(...%w...).
Fix uses of fmt.Errorf
- Use %w instead of %v for error wrapping
- Use errors.New when the string is constant
Don't prefix wrapped error context with "error "
... to match most of c/image code, where we have previously
removed such prefixes.
Return true from HasThreadSafeGetBlob
... because that's the case for the current implementation,
although it makes no difference for the current c/image/copy
caller, when this source only provides one layer.
Rename sifImageSource.blobID to blobDigest
Rename sifImageSource.configID to configDigest
Remove workdir if newImageSource fails
Rename sifImageSource.blob* to layer*
... to differentiate the layer data from the config, which
is also a "blob" in the ImageSource naming.
Remove sifImageSource.diffID
The value only needs to be known inside newImageSource,
so pass it around in a variable/return value.
Remove unused sifImageSource.diffSize
... which allows us to make getLayerInfo
a function without a (partially-created) sifImageSource parent
object.
Move layerTime computation from createBlob to getBlobInfo
If anything, the latter is a bit more accurate (capturing
the time of the last update of the file we are creating, vs.
the time of the initial creation), but we want to eventually
that with a value from the SIF header anyway.
Remove sifImageSource.layerTime
It's only necessary in newImageSource, so use
a return value and a local variable for that.
Remove unused sifImageSource.layerType
Remove sifImageSource.configSize
This value is already stored in the sifImageSource.config slice,
so don't store a redundant copy.
Rename workdir to workDir
following usual Go patterns.
Rename tarpath to tarPath everywhere
Return layerDigest and layerSize from getBlobInfo
... instead of writing it to partially-initialized sifImageSource.
Provide a path to getBlobInfo instead of reading it from sifImageSource
This makes getBlobInfo independent of the partially-created sifImageSource.
Don't create a compressed layer from the SIF file
Compression is very costly, principially in CPU time.
Many use cases (notably import to c/storage) would
only end up decompressing the data again.
Those that do neeed the data compressed, like push
to a registry, can use the copy pipeline's streaming compression
implementation, often without needing to store the compressed
version in a temporary file.
So this is likely to improve both CPU time usage and (maximum) disk
space usage - at the very least against the current implementation
which doens't even remove the uncompressed version after creating
the compressed one :)
This is a minimal version of the change, we are now computing
the layer's digest twice. We'll fix that soon.
Rename tarPath to layerPath
... to be consistent with the other variables.
Don't compute the DiffID separately
It's the same value as the layer digest, now that
the layer is just the uncompressed tarball.
Rename fgz to f
The file is now expected to not be compressed.
Rename blobDigester to digester
... just to be a bit shorter.
Inline some single-use variables when building the manifest
Use a struct initializer instead of a set of assignments for config
Inline single-use variables when building a config
Also remove some fairly redundant comments.
Use a switch if sifImageSource.GetBlob
... to make the structure a tiny bit less repetitive.
Close the SIF image object in newImageSource
Nothing actually needs it afterwards.
Rename UnloadSIFImage() to Close() to indirectly silence
a linter about handling the error; it can't fail in practice,
and isn't quite worth handling.
Explicitly specify a MediaType field in the generated OCI manifest
... to follow best practices vs. schema confusion attacks
(although this generated manfiest is clearly not a schema confusion
attack).
Beautify loadedSifImage.Close
Turn loadedSifImage.GetConfig into CommandLine
- Make loadedSifImage independent of the OCI format details.
- Make it clear at the call site that only the command is actually
provided.
- Don't return an error value which is always nil, which makes the caller
simpler.
Remove lookup of the sif.DataEnvVar descriptor
It is unused in this codebase, and it's unclear
what, if anything, it is used for anywhere else.
Make SifImage.parseEnvironment and SifImage.parseRunscript stand-alone
... i.e. independent from SifImage, so that we can more
easily unit-test it.
The res *[]string parameter is rather ugly, but we'll
refactor it away soon enough.
Should not change behavior.
Split parseDefFile from SifImage.generateConfig
... to have an easily unit-testable bit of code.
Should not change behavior. This destructively assigns
to image.envlist and image.cmdlist instead of appending,
but it should be the only writer at that point.
Add a smoke test for parseDefFile
Use a state machine for parseDefFile
... instead of nested scanner.Scan() loops
and a goto.
Should not change behavior.
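A rough sketch of the state-machine shape, assuming a single pass over the
definition file. parseDefFileSketch and the state names are hypothetical, and
the real parseDefFile may treat section headers differently:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseState names the section the scanner is currently inside.
type parseState int

const (
	stateOther parseState = iota
	stateEnvironment
	stateRunscript
)

// parseDefFileSketch is a hypothetical stand-in for parseDefFile: it collects
// the lines of the %environment and %runscript sections in one loop, with no
// nested scanner.Scan() loops and no goto.
func parseDefFileSketch(def string) ([]string, []string, error) {
	var environment, runscript []string
	state := stateOther
	scanner := bufio.NewScanner(strings.NewReader(def))
	for scanner.Scan() {
		line := scanner.Text()
		trimmed := strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(trimmed, "%environment"):
			state = stateEnvironment
		case strings.HasPrefix(trimmed, "%runscript"):
			state = stateRunscript
		case strings.HasPrefix(trimmed, "%"):
			state = stateOther // Any other section header ends the current one.
		default:
			switch state {
			case stateEnvironment:
				environment = append(environment, line)
			case stateRunscript:
				runscript = append(runscript, line)
			}
		}
	}
	if err := scanner.Err(); err != nil {
		return nil, nil, fmt.Errorf("reading definition file: %w", err)
	}
	return environment, runscript, nil
}

func main() {
	def := "Bootstrap: docker\n%environment\nexport A=1\n%runscript\necho hi\n%labels\nfoo bar\n"
	environment, runscript, err := parseDefFileSketch(def)
	fmt.Println(environment, runscript, err)
}
```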
Remove a misleading comment
Now, with GetDescriptor, more than one matching
descriptor results in an error, so there isn't anything
to assume about a single value.
Move DataDeffile descriptor lookup into generateConfig
It's the only user of that data
Remove deffile and defReader from loadedSifImage
Turn them into trivial local variables in generateConfig()
The code remains a bit convoluted; we'll clean that up soon.
Simplify generateConfig
Eliminate both deffile and defReader.
Pass %environment and %runscript to generateRunscript explicitly
That will eventually make it easier to unit-test
Return the generated script from generateRunscript instead of updating image
This makes generateRunscript stand-alone and easy to unit-test.
Add a smoke test for generateRunscript
It's not much, but better than nothing.
Use InjectedScript instead of Runscript for the script we generate
... everywhere, to differentiate that script from the %runscript
section contents.
Store injectedScript as a []byte instead of bytes.Buffer
No need to keep around the intermediate form, and this allows
us to change the implementation.
Use strings.Join and Sprintf instead of bytes.Buffer in generateInjectedScript
Assuming this is not performance-critical, the code is much shorter,
and clearly cannot fail (just like the previous version, which is documented
to panic rather than return the errors that version unnecessarily handled).
Note that this might change behavior for empty %environment or %runscript
sections: we now add extra empty lines. That shouldn't make a difference.
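A sketch of the strings.Join + Sprintf approach. The function name and the
exact script template (including the shebang line) are assumptions for
illustration, not the template the real code emits:

```go
package main

import (
	"fmt"
	"strings"
)

// generateInjectedScriptSketch is a hypothetical stand-in for
// generateInjectedScript. It cannot fail, so it has no error return value.
func generateInjectedScriptSketch(environment, runscript []string) []byte {
	script := fmt.Sprintf("#!/bin/bash\n%s\n%s\n",
		strings.Join(environment, "\n"),
		strings.Join(runscript, "\n"))
	return []byte(script)
}

func main() {
	// With empty sections, the template still contributes its own newlines,
	// which is where the extra empty lines come from.
	fmt.Printf("%q\n", generateInjectedScriptSketch(nil, nil))
	fmt.Printf("%q\n", generateInjectedScriptSketch(
		[]string{"export A=1"}, []string{"echo hi"}))
}
```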
Remove the unnecessary error return value from generateInjectedScript
Remove loadedSifImage.envlist
All users are local to generateConfig.
Don't use loadedSifImage.cmdlist for storing %runscript
It's just a local value to generateConfig, and we no longer
use cmdlist for both %runscript and the final command line.
Replace loadedSifImage.cmdlist with a single command string
The array only ever has one element, so get rid of the array.
Beautify generateConfig
Always refer to environment and runscript in the same order.
Move generateInjectedScript after parseDefFile
First create the data, then consume it.
Simplify generateConfig
Only have the fallback to "bash" if no script is
available in a single place.
Rename resultDesc to desc
For such a short-lived variable we can have a shorter name.
Rename tempdir to tempDir throughout
... to follow Go conventions.
Remove a Sync call on the squashfs copy
We'd actually prefer that data not to hit the disk;
we want to remove it as soon as possible.
Instead, scope the deferred Close() so that it happens
before we consume the file.
Pass around squashFSPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Pass around tarPath in a variable.
... instead of providing an ambient constant for the relative
path.
That will make it clearer which code uses that file.
Remove the generated tar file on failure
e.g. if creating it runs out of space.
Beautify SquashFSToTarLayer
Explicitly return values instead of relying on named return
values to make the data (in this case, error) flow more explicit.
Use Sprintf instead of string concatenation for the generated script
It is a bit more manageable that way.
Also actually start the script with a recognized shebang instead
of a newline.
Don't hard-code "squashfs-root" all over the place.
Pass extractedRootPath around instead, and use (unsquashfs -d) to
override the built-in default.
Inline a single-use cmd variable
Rename xcmd to cmd
... now that the cmd name is available.
Use explicit return statements instead of named return values
Make loadedImage always passed by reference
The struct contains a stateful *sif.FileImage, which
makes no sense to copy; so don't get into that habit
even in cases where it might be safe.
Use a constant for the /podman/runscript path
... instead of hard-coding it all over the place, and even
assuming a specific directory structure.
Add more context to write failures
Don't write to stderr; return error output to the caller
Remove the generated script immediately after using it
Make the tar file creation cancellable using the provided context
Also add TODO notes in other places where we would prefer
the copy to be cancellable.
Rename runUnSquashFSTar to createTarFromSIFInputs
We are going to have it handle the injectedScript
as well.
Pass scriptPath to exec.Command instead of hard-coding a constant
This allows us not to care about the working directory of the
script, as well.
Move the scriptPath decision to SquashFSToTarLayer
That's the only place that is aware of tempDir now.
Pass injectedScript to writeInjectedScript
... to make it independent of loadedSifImage; it
will go away entirely soon.
Call writeInjectedScript from createTarFromSIFInputs
... so that createTarFromSIFInputs is responsible for both
creating and consuming extractedRootPath, without any external
interference.
Move the cleanup of extractedRootPath to createTarFromSIFInputs
... to make it a tiny bit more self-contained, now that it
handles the injectedScript as well.
Add a comment to more clearly document the allocation of paths in SquashFSToTarLayer
Remove loadedSifImage.rootfs
Instead, determine the value in SquashFSToTarLayer.
This means that we now try to interpret the deffile before checking
for rootfs presence, changing the possible order of errors. That shouldn't
be much of a difference for valid images.
Rename generateConfig to processDefFile
Have processDefFile return the values instead of writing to loadedSifImage
We will eliminate the loadedSifImage members entirely soon.
Pass sif.FileImage to processDefFile explicitly
... instead of using the image.fimg member.
Rename SquashFSToTarLayer to convertSIFToElements
We are going to have it return other values as well.
Return also the command line from convertSIFToElements
This turns it into the central point of the conversion
process, instead of the fairly ambient loadedSifImage object.
Call processDefFile only in convertSIFToElements
This allows us to remove loadedSifImage.injectedScript and
loadedSifImage.command, making loadedSifImage finally
a trivial wrapper around sif.FileImage - and we'll eliminate
that wrapper next.
Inline loadedSifImage.GetSIFID
We don't really need that abstraction.
Inline loadedSifImage.GetSIFArch
We don't really need that abstraction.
Make convertSIFToElements a stand-alone function
Eliminating the last non-trivial user of loadedSifImage.
Eliminate loadedSifImage
Finally, eliminate the loadedSifImage type entirely.
It doesn't really make sense to inject a layer of abstraction
between sifImageSource and sif.FileImage, purely for the abstraction.
loadedSifImage was only ever used in one way, as an essentially
procedural step; that is now served by the convertSIFToElements
function, rather than being split between the loadedSifImage constructor
and the original tarball creation method.
(convertSIFToElements might eventually return a struct with named
fields if there were many, but it doesn't make sense for newImageSource
to hold an object and fill it up one step at a time.)
Use the last modification time from the SIF header for OCI creation time
Add a TODO note
Add a note about (unsquashfs -o)
Add debug logs around long-running operations
... and make sure to include paths of the relevant files.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-01-07 07:27:36 +08:00
|
|
|
succeeded = true
|
|
|
|
|
return tarPath, []string{command}, nil
|
2021-09-27 22:02:53 +08:00
|
|
|
}
|