* Add necessary data types/tables for pipeline version. Mostly based
on Yang's branch at https://github.com/IronPan/pipelines/tree/kfpci/.
Backward compatible.
* Modified comment
* Modify API converter in accordance with the new pipeline (version) definition
* Change pipeline_store for DefaultVersionId field
* Add pipeline spec to pipeline version
* fix model converter
* fix a comment
* Add foreign key, pagination of list request, refactor code source
* Refactor code source
* Foreign key
* Change code source and package source type
* Fix ; separator
* Add versions table and modify existing pipeline apis
* Remove api pipeline definition change and leave it for a later PR
* Add comment
* Make schema changing and data backfilling a single transaction
* Tolerate null default version id in code
* fix status
* Revise delete pipeline func
* Use raw query to migrate data
* No need to update versions status
* rename and minor changes
* accidentally removed a where clause
* Fix a model name prefix
* Refine comments
* Revise if condition
* Address comments
* address more comments
* Rearrange pipeline and version related parts inside CreatePipeline, to make them more separate.
* Add package url to pipeline version. Required when calling CreatePipelineVersionRequest
* Single code source url; remove pipeline id as sorting field; reformat
* resolve remote branch and local branch diff
* remove unused func
* Remove an empty line
* Fix bug where source and variables are not accessible to visualization
* Updated snapshot
* Removed test_generate_test_visualization_html_from_notebook
* Added test cases to ensure roc_curve, table, and tfdv visualizations can be generated
* Made test requirements identical to normal requirements
* Fixed source links
* Updated test_server.py to use table visualization
* Update .travis.yml
* Add logging to debug travis tests
* Add tensorflow back to requirements.txt
* Updated .travis.yml and requirements.txt, also added comment that specifies required libraries to run tests
* Testing TFDV visualization with different source
* Changed remote paths to be local due to timeout issues
* Removed visualization tests due to continued failure
* Reverted .gitignore and removed tensorflow from text_exporter pip install command
* Moved where dependencies are installed in .travis.yaml
* Revert "Made test requirements identical to normal requirements"
This reverts commit 7f11c43c44.
* Added pip install requirements to .travis file
* Removed new unit test and requirements.txt install
* Cleaned up tests and re-added test.py predefined visualization
* Cleanup
* Created visualization_api_test.go
* Updated BUILD.bazel files
* Removed clean_up from e2e test
* Revert "Removed clean_up from e2e test"
This reverts commit 82fd4f5a00.
* Update e2e tests to build visualizationserver and viewer-crd
* Fix bug where wrong image is set
* Fixed incorrect image names
* Fixed additional instance of incorrect image names
* Updated Dockerfile.visualization to take advantage of caching and switched base image
* Removed tensorflow from requirements.txt and added new package to third_party_licenses.csv
* Added developer_guide.md for Python based visualizations
* Changed md file name to be README and added link to documentation page
* Updated README.md to match syntax of #1878
* Added architecture and known limitations sections to documentation
* Addressed PR comments
* Address offline feedback from @SinaChavoshi
* Removed limitation
#1951 changes how arguments are passed from the API server to the Python service. This now allows for multi-line comment support.
* Addressed PR comments
* Add new template files
* Add statement to change template used depending on type of visualization
Now, non-custom visualizations will not show stdout and stderr messages to a user.
* Removed new template files
* Removed unused custom.css style file
* Added simpler way to hide logging for non-custom visualizations
* Set hide_logging based on if a cell is based on a file or custom code
* Updated exporter unit tests
* Removed deprecated logic to set template type based on visualization type
* Fixed test_create_cell_from_args_with_multiple_args and removed test.py due to changes made to create_cell_from_file function
* Change how arguments are checked and provided for Python service
* Arguments no longer require a source if the type is specified to be custom
* If no source is provided for a custom visualization, it will no longer be provided to the Python service
* Added unit test to test that an empty source can be provided alongside custom visualizations
* Added support for custom code to be used to generate visualizations within Python service
* Added unit tests to cover support of custom visualizations
* Fixed logic that handles source addition and validation in API
* Formatted visualization_server_test.go
* Moved self.maxDiff to setup function
* Removed unused import
* Simplified how arguments are passed from API to Python service
Arguments are no longer manually converted to command line arguments to be passed to the Python service. Instead, they are converted to x-www-form-urlencoded arguments which is sent to the Python service and then converted to a dictionary by the Python service.
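A minimal standard-library sketch of that round trip (the real encoding happens in the Go API server; the argument names and values here are illustrative):

```python
from urllib.parse import urlencode, parse_qs

# Arguments as held by the API server (names are illustrative).
arguments = {"source": "gs://ml-pipeline/data.csv", "type": "roc_curve"}

# Encoded as application/x-www-form-urlencoded for the HTTP request body.
body = urlencode(arguments)

# The Python service parses the body back into a dictionary.
decoded = {key: values[0] for key, values in parse_qs(body).items()}
```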
* Made @staticmethods private functions
* Added table and tfdv visualization
Also fixed issue surrounding ApiVisualizationType enum
* Fixed table visualization
* Removed byte limit
* Fixed issue where headers would not properly be applied
* Fixed issue where table would not be interactive
* Updated table visualization to reflect changes made to dependency injection
* Fixed bug where checking if headers is provided to table visualizations could crash visualization
* Added TFMA visualization
* Updated new visualizations to match syntax of #1878
* Updated test snapshots to account for TFMA visualization
* Small if statement syntax changes
* Add flake8 noqa comments to table.py and tfma.py
* Add visualization-server service to lightweight deployment
* Addressed PR suggestions
* Added field to determine if visualization service is active and fixed unit tests for visualization_server.go
* Additional small fixes
* port change from 88888 -> 8888
* version change from 0.1.15 -> 0.1.26
* removed visualization-server from base/kustomization.yaml
* Fixed visualization_server_test.go to reflect new changes
* Changed implementation to be fail fast
* Changed host name to be constant provided by environment
* Added retry and extracted isVisualizationServiceAlive logic to function
* Fixed deployment.yaml file
* Fixed serviceURL configuration issue
serviceURL is now properly obtained from the environment; the service IP address and port are used rather than the service name and namespace.
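A sketch of building a service URL from the host/port environment variables that Kubernetes injects for each service; the exact variable names below are assumptions for illustration, not the server's actual code (which is Go):

```python
import os

# Kubernetes injects <SERVICE_NAME>_SERVICE_HOST and _SERVICE_PORT for every
# service in the namespace; these names and values are illustrative defaults.
os.environ.setdefault("ML_PIPELINE_VISUALIZATIONSERVER_SERVICE_HOST", "10.0.0.5")
os.environ.setdefault("ML_PIPELINE_VISUALIZATIONSERVER_SERVICE_PORT", "8888")

# Build the URL from IP and port instead of "<name>.<namespace>" DNS.
service_url = "http://{}:{}".format(
    os.environ["ML_PIPELINE_VISUALIZATIONSERVER_SERVICE_HOST"],
    os.environ["ML_PIPELINE_VISUALIZATIONSERVER_SERVICE_PORT"],
)
```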
* Added log message to indicate when visualization service is unreachable
* Addressed PR comments
* Removed _HTTP
* Lint Python code for undefined names
* Lint Python code for undefined names
* Exclude tfdv.py to workaround an overzealous pytest
* Fixup for tfdv.py
* Fixup for tfdv.py
* Fixup for tfdv.py
Gorm doesn't automatically change the type of a column. This change introduced a column type change which might not take effect for an existing cluster during upgrade.
4e43750c9d (diff-c4afa92d7e54eecff0a482cf57490aa8R40)
/assign @hongye-sun
The data stored in artifact storage is usually small, so using multi-part upload is not strictly a requirement.
Change the default to true to better support more platforms out of the box.
* Changed way visualization variables are passed from request to NotebookNode
Visualization variables are now saved to a json file and loaded by a NotebookNode upon execution.
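A rough sketch of the save-and-load round trip; the variable names and file location are illustrative, not the path the service actually uses:

```python
import json
import os
import tempfile

# Variables as received in the visualization request (illustrative).
variables = {"source": "gs://ml-pipeline/data.csv", "headers": ["a", "b"]}

# Save the variables to a JSON file before the notebook runs.
path = os.path.join(tempfile.mkdtemp(), "variables.json")
with open(path, "w") as f:
    json.dump(variables, f)

# A generated cell in the NotebookNode loads the file at execution time.
with open(path) as f:
    loaded = json.load(f)
```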
* Updated roc_curve visualization to reflect changes made to dependency injection
* Fixed bug where checking if is_generated is provided to roc_curve visualization would crash the visualization
Also changed ' -> "
* Changed text_exporter to always sort variables by key for testing
* Addressed PR suggestions
This would improve the list-runs call, which filters on [ResourceType, ReferenceUUID, ReferenceType].
We've seen the list-runs call take a long time when the resource_references table is large.
```
SELECT
subq.*,
CONCAT("[", GROUP_CONCAT(r.Payload SEPARATOR ", "), "]") AS refs
FROM
(
SELECT
rd.*,
CONCAT("[", GROUP_CONCAT(m.Payload SEPARATOR ", "), "]") AS metrics
FROM
(
SELECT
UUID,
DisplayName,
Name,
StorageState,
Namespace,
Description,
CreatedAtInSec,
ScheduledAtInSec,
FinishedAtInSec,
Conditions,
PipelineId,
PipelineSpecManifest,
WorkflowSpecManifest,
Parameters,
pipelineRuntimeManifest,
WorkflowRuntimeManifest
FROM
run_details
WHERE
UUID in
(
SELECT
ResourceUUID
FROM
resource_references as rf
WHERE
(
rf.ResourceType = 'Run'
AND rf.ReferenceUUID = '488b0263-f4ee-4398-b7dc-768ffe967372'
AND rf.ReferenceType = 'Experiment'
)
)
AND StorageState <> 'STORAGESTATE_ARCHIVED'
ORDER BY
CreatedAtInSec DESC,
UUID DESC LIMIT 6
)
AS rd
LEFT JOIN
run_metrics AS m
ON rd.UUID = m.RunUUID
GROUP BY
rd.UUID
)
AS subq
LEFT JOIN
(
SELECT
*
FROM
resource_references
WHERE
ResourceType = 'Run'
)
AS r
ON subq.UUID = r.ResourceUUID
GROUP BY
subq.UUID
ORDER BY
CreatedAtInSec DESC,
UUID DESC
```
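The three equality filters above suggest a composite index on resource_references. A hedged sketch using sqlite3 for illustration (the production store is MySQL, whose index syntax is similar but whose planner output differs; the index name `idx_reference_filter` is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE resource_references (
           ResourceUUID TEXT, ResourceType TEXT,
           ReferenceUUID TEXT, ReferenceType TEXT, Payload TEXT)"""
)
# Composite index covering the three equality filters in the subquery.
conn.execute(
    """CREATE INDEX idx_reference_filter
       ON resource_references (ResourceType, ReferenceUUID, ReferenceType)"""
)
# The query plan confirms the filter is served by an index search
# rather than a full table scan.
plan = conn.execute(
    """EXPLAIN QUERY PLAN
       SELECT ResourceUUID FROM resource_references
       WHERE ResourceType = 'Run'
         AND ReferenceUUID = '488b0263-f4ee-4398-b7dc-768ffe967372'
         AND ReferenceType = 'Experiment'"""
).fetchall()
```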
/assign @hongye-sun
* Fix the broken sample path in API
Related change b476a848d9
/hold
Need to investigate why the test doesn't catch it.
/assign @numerology @hongye-sun
* Update sample_config.json
* InputPath -> Source
* Changed name of data path/pattern variable from InputPath to Source to improve consistency with current visualization method
* Updated unit tests to reflect name change
* Regenerated swagger definitions to reflect name change
* Readded test that was removed with previous commit
It was deleted by mistake
* Add env var for single part support.
AddFile will have the option to send a single part, for backends that don't support multipart upload.
To use it, set the ObjectStoreConfig.Disable.Multipart value to true.
* few changes
* remove redundant import
* remove newlines
* Setup initial server with roc_curve visualization
* Created Dockerfile.visualization
* Fixed import issue
* Changed implementation of generate_html_from_notebook to allow template type to be specified
* Added tfdv.py
* Added unit tests for exporter.py
* Deleted __init__.py
* visualizations/ -> visualization/
* Added requirements.txt and updated Dockerfile.visualization to use it
* Updated .travis.yml to run python visualization unit tests
* Fixed travis file path issue
* Continued testing to fix travis test issues
* Removed jupyter from pip3 install
Previously included to ensure python3 kernel was accessible to jupyter_client.
* Updated requirements.txt to include ipykernel
* Removed maxDiff limit for all python tests
* Sorted keys within args dictionary to ensure tests do not fail due to dictionary order
* Created requirements-test.txt
* Added input_path argument support for python service
Also adds a check for a missing input_path argument, returning a 400 error if the argument is missing.
* Updated Copyright in Dockerfile.visualization
* Updated snapshot to include all tests
* Added types, additional comments, and TemplateType enum
Also made additional style changes
* Formatted template files
* Addressed most feedback made by @kevinbache
* Revert "Formatted template files"
This reverts commit a7afd7b8af. This was done due to issues faced by the templating engine.
* Fixed comment placement and switched os -> Path
* Changed way exporter is implemented to use importlib
* Reverted to str.format due to a Python compatibility issue
Python 3.6 introduced support for f-strings; using them causes the tests to fail when run in a Python 3.5 environment.
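For illustration, the 3.5-compatible form looks like the following (the variable and message are made up, not the service's actual strings):

```python
name = "roc_curve"
# str.format works on Python 3.5; the f-string equivalent,
# f"Generating {name} visualization", is a SyntaxError there.
message = "Generating {} visualization".format(name)
```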
* Added unit tests for tornado web server
* Added license script for open source compliance
* Added line between file comment and license to match exporter.py
* Updated server structure
* Created Exporter class
* Introduced ability to specify visualization timeout (default is 100 seconds)
* Added more comments
* Broke up post function in VisualizationHandler to call multiple function rather than handling all logic within post function
* Updated imports
* Updated tests
* Addressed additional feedback from @kevinbache
* Fixed snapshot for test_exporter
* Comments -> Docstring Comments and other small fixes
* Fixed missing and incorrect typings
* shutdown_kernel is now private method of Exporter class
* Added missing and updated docstring comments in server.py
* Resolved latency issue with visualization server
The issue stemmed from recreating an exporter object for every request; it was resolved by creating a global exporter.
* Added visualization server
* Updated function names, added comments, and made serviceURL a property of VisualizationService
* Added unit tests for visualization.go
* Updated BUILD.bazel
* Addressed PR comments made by @IronPan
* Wrapped input_path argument value in string when generating python arguments
* GenerateVisualizationFromRequest -> generateVisualizationFromRequest
* Addressed additional PR feedback from @IronPan
* Removed getArgumentsAsJSONFromRequest
* Removed createPythonArgumentsFromRequest
* Moved `fmt.Sprintf` to generateVisualizationFromRequest
* Updated tests to reflect changes
* Added missing word to comment
* Update persistent agent to only store the Argo spec
Currently, when persisting the runs spawned from a job, PersistentAgent stores more information than needed in the pipeline manifest, and also misses the TypeMetadata. This results in storing lots of runtime information that isn't needed.
Example WorkflowSpec
```
"metadata":{
"name":"fffpw4fh-2-2911767673",
"namespace":"kubeflow",
"selfLink":"/apis/argoproj.io/v1alpha1/namespaces/kubeflow/workflows/fffpw4fh-2-2911767673",
"uid":"de23bd5c-a8c1-11e9-a176-42010a800233",
"resourceVersion":"3975687",
"generation":1,
"creationTimestamp":"2019-07-17T18:37:02Z",
"labels":{
"scheduledworkflows.kubeflow.org/isOwnedByScheduledWorkflow":"true",
"scheduledworkflows.kubeflow.org/scheduledWorkflowName":"fffpw4fh",
"scheduledworkflows.kubeflow.org/workflowEpoch":"1563388612",
"scheduledworkflows.kubeflow.org/workflowIndex":"2",
"workflows.argoproj.io/phase":"Running"
},
"ownerReferences":[
{
"apiVersion":"kubeflow.org/v1beta1",
"kind":"ScheduledWorkflow",
"name":"fffpw4fh",
"uid":"91039a28-a8c1-11e9-a176-42010a800233",
"controller":true,
"blockOwnerDeletion":true
}
]
},
```
* Update workflow_test.go
* Update workflow.go
* Update resource_manager_test.go
* Update resource_manager.go
* Update workflow_test.go
* viewer controller is now namespaced, so no ClusterRole is needed
* our default namespaced install (kubeflow namespace) can also use Role instead of ClusterRole
* Viewer CRD controller running under namespace
* Change docker file and add manifest deployment yaml to support the new flag namespace
* Change docker file to support new flag namespace for viewer crd controller
* Modify kustomization.yaml and namespaced-install.yaml
* Change file name from ml-pipeline-viewer-crd-deployment to ml-pipeline-viewer-crd-deployment-patch
* Fix typo
* Remove some duplicate configs in namespaced-install
* Fix API package names and regenerate checked-in proto files. Also bump version of GRPC gateway used.
* Fix BUILD.bazel file for api as well.
* Update Bazel version