tenzen-y steps down from Katib approver role (#2561)

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
Bump golang.org/x/crypto from 0.31.0 to 0.35.0 (#2543)
2025-07-28 13:27:49 +00:00 · 2025-07-16 20:08:39 +00:00 · 2025-07-16 19:17:38 +00:00 · 2025-07-15 16:29:38 +00:00 · 2025-07-13 22:06:21 +00:00 · 2025-06-26 14:13:16 +00:00
1040 changed files with 71286 additions and 290333 deletions
--- a/.dockerignore
+++ b/.dockerignore
@@ -4,5 +4,3 @@ docs
 manifests
 pkg/ui/*/frontend/node_modules
 pkg/ui/*/frontend/build
 pkg/new-ui/*/frontend/node_modules
 pkg/new-ui/*/frontend/build
--- a/.flake8
+++ b/.flake8
@@ -0,0 +1,4 @@
 [flake8]
 max-line-length = 100
 # E203 is ignored to avoid conflicts with Black's formatting, as it's not PEP 8 compliant
 extend-ignore = W503, E203
--- a/.github/ISSUE_TEMPLATE/bug_report.md
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -1,25 +0,0 @@
 ---
 name: Bug report
 about: Tell us about a problem you are experiencing
 ---
 /kind bug
 **What steps did you take and what happened:**
 [A clear and concise description of what the bug is.]
 **What did you expect to happen:**
 **Anything else you would like to add:**
 [Miscellaneous information that will assist in solving the issue.]
 **Environment:**
 - Kubeflow version (`kfctl version`):
 - Minikube version (`minikube version`):
 - Kubernetes version: (use `kubectl version`):
 - OS (e.g. from `/etc/os-release`):
--- a/.github/ISSUE_TEMPLATE/bug_report.yaml
+++ b/.github/ISSUE_TEMPLATE/bug_report.yaml
@@ -0,0 +1,50 @@
 name: Bug Report
 description: Tell us about a problem you are experiencing with Katib
 labels: ["kind/bug", "lifecycle/needs-triage"]
 body:
  - type: markdown
    attributes:
      value: |
        Thanks for taking the time to fill out this Katib bug report!
  - type: textarea
    id: problem
    attributes:
      label: What happened?
      description: |
        Please provide as much info as possible. Not doing so may result in your bug not being
        addressed in a timely manner.
    validations:
      required: true
  - type: textarea
    id: expected
    attributes:
      label: What did you expect to happen?
    validations:
      required: true
  - type: textarea
    id: environment
    attributes:
      label: Environment
      value: |
        Kubernetes version:
        ```bash
        $ kubectl version
        ```
        Katib controller version:
        ```bash
        $ kubectl get pods -n kubeflow -l katib.kubeflow.org/component=controller -o jsonpath="{.items[*].spec.containers[*].image}"
        ```
        Katib Python SDK version:
        ```bash
        $ pip show kubeflow-katib
        ```
    validations:
      required: true
  - type: input
    id: votes
    attributes:
      label: Impacted by this bug?
      value: Give it a 👍 We prioritize the issues with most 👍
--- a/.github/ISSUE_TEMPLATE/config.yml
+++ b/.github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,12 @@
 blank_issues_enabled: true
 contact_links:
  - name: Katib Documentation
    url: https://www.kubeflow.org/docs/components/katib/
    about: Much help can be found in the docs
  - name: Kubeflow Katib Slack Channel
    url: https://www.kubeflow.org/docs/about/community/#kubeflow-slack-channels
    about: Ask the Katib community on CNCF Slack
  - name: Kubeflow Katib Community Meeting
    url: https://bit.ly/2PWVCkV
    about: Join the Kubeflow AutoML working group meeting
--- a/.github/ISSUE_TEMPLATE/feature_request.md
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -1,14 +0,0 @@
 ---
 name: Feature enhancement request
 about: Suggest an idea for this project
 ---
 /kind feature
 **Describe the solution you'd like**
 [A clear and concise description of what you want to happen.]
 **Anything else you would like to add:**
 [Miscellaneous information that will assist in solving the issue.]
--- a/.github/ISSUE_TEMPLATE/feature_request.yaml
+++ b/.github/ISSUE_TEMPLATE/feature_request.yaml
@@ -0,0 +1,28 @@
 name: Feature Request
 description: Suggest an idea for Katib
 labels: ["kind/feature", "lifecycle/needs-triage"]
 body:
  - type: markdown
    attributes:
      value: |
        Thanks for taking the time to fill out this Katib feature request!
  - type: textarea
    id: feature
    attributes:
      label: What you would like to be added?
      description: |
        A clear and concise description of what you want to add to Katib.
        Please consider writing a Katib enhancement proposal if it is a large feature request.
    validations:
      required: true
  - type: textarea
    id: rationale
    attributes:
      label: Why is this needed?
    validations:
      required: true
  - type: input
    id: votes
    attributes:
      label: Love this feature?
      value: Give it a 👍 We prioritize the features with most 👍
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -1,6 +1,6 @@
 <!--  Thanks for sending a pull request! Here are some tips for you:
 1. If this is your first time, check our contributor guidelines https://www.kubeflow.org/docs/about/contributing
-2. To know more about Katib components, check developer guide https://github.com/kubeflow/katib/blob/master/docs/developer-guide.md
+2. To know more about Katib components, check developer guide https://github.com/kubeflow/katib/blob/master/CONTRIBUTING.md
 3. If you want *faster* PR reviews, check how: https://git.k8s.io/community/contributors/guide/pull-requests.md#best-practices-for-faster-reviews
 -->
--- a/.github/stale.yml
+++ b/.github/stale.yml
@@ -1,20 +0,0 @@
 # Configuration for stale probot https://probot.github.io/apps/stale/
 # Number of days of inactivity before an issue becomes stale
 daysUntilStale: 90
 # Number of days of inactivity before a stale issue is closed
 daysUntilClose: 20
 # Issues with these labels will never be considered stale
 exemptLabels:
  - lifecycle/frozen
 # Label to use when marking an issue as stale
 staleLabel: lifecycle/stale
 # Comment to post when marking an issue as stale. Set to `false` to disable
 markComment: >
  This issue has been automatically marked as stale because it has not had
  recent activity. It will be closed if no further activity occurs. Thank you
  for your contributions.
 # Comment to post when closing a stale issue. Set to `false` to disable
 closeComment: >
  This issue has been automatically closed because it has not had recent
  activity. Please comment "/reopen" to reopen it.
--- a/.github/workflows/build-and-publish-images.yaml
+++ b/.github/workflows/build-and-publish-images.yaml
@@ -0,0 +1,81 @@
 # Reusable workflows for publishing Katib images.
 name: Build and Publish Images
 on:
  workflow_call:
    inputs:
      component-name:
        required: true
        type: string
      platforms:
        required: true
        type: string
      dockerfile:
        required: true
        type: string
    secrets:
      DOCKERHUB_USERNAME:
        required: false
      DOCKERHUB_TOKEN:
        required: false
 jobs:
  build-and-publish:
    name: Build and Publish Images
    runs-on: ubuntu-22.04
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set Publish Condition
        id: publish-condition
        shell: bash
        run: |
          if [[ "${{ github.repository }}" == 'kubeflow/katib' && \
                ( "${{ github.ref }}" == 'refs/heads/master' || \
                  "${{ github.ref }}" =~ ^refs/heads/release- || \
                  "${{ github.ref }}" =~ ^refs/tags/v ) ]]; then
            echo "should_publish=true" >> $GITHUB_OUTPUT
          else
            echo "should_publish=false" >> $GITHUB_OUTPUT
          fi
      - name: GHCR Login
        if: steps.publish-condition.outputs.should_publish == 'true'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: DockerHub Login
        if: steps.publish-condition.outputs.should_publish == 'true'
        uses: docker/login-action@v3
        with:
          registry: docker.io
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - name: Publish Component ${{ inputs.component-name }}
        if: steps.publish-condition.outputs.should_publish == 'true'
        id: publish
        uses: ./.github/workflows/template-publish-image
        with:
          image: |
            ghcr.io/kubeflow/katib/${{ inputs.component-name }}
            docker.io/kubeflowkatib/${{ inputs.component-name }}
          dockerfile: ${{ inputs.dockerfile }}
          platforms: ${{ inputs.platforms }}
          push: true
      - name: Test Build For Component ${{ inputs.component-name }}
        if: steps.publish.outcome == 'skipped'
        uses: ./.github/workflows/template-publish-image
        with:
          image: |
            ghcr.io/kubeflow/katib/${{ inputs.component-name }}
            docker.io/kubeflowkatib/${{ inputs.component-name }}
          dockerfile: ${{ inputs.dockerfile }}
          platforms: ${{ inputs.platforms }}
          push: false
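Review note on the "Set Publish Condition" step above: images are pushed only from the `kubeflow/katib` repository on `master`, `release-*` branches, or `v*` tags; every other ref falls through to the build-only step. A minimal sketch for checking that logic locally follows. The function name `should_publish` and its two positional arguments are hypothetical stand-ins for `github.repository` and `github.ref`, not part of the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the workflow's publish condition.
# $1 stands in for github.repository, $2 for github.ref.
should_publish() {
  local repository="$1" ref="$2"
  if [[ "$repository" == 'kubeflow/katib' && \
        ( "$ref" == 'refs/heads/master' || \
          "$ref" =~ ^refs/heads/release- || \
          "$ref" =~ ^refs/tags/v ) ]]; then
    echo "true"
  else
    echo "false"
  fi
}

# A push to master in the upstream repository qualifies for publishing.
should_publish "kubeflow/katib" "refs/heads/master"
# A pull request merge ref does not, so only the test build would run.
should_publish "kubeflow/katib" "refs/pull/1/merge"
```

Forks never match the repository check, which is consistent with the Docker Hub secrets being declared `required: false`.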
--- a/.github/workflows/e2e-test-darts-cifar10.yaml
+++ b/.github/workflows/e2e-test-darts-cifar10.yaml
@@ -0,0 +1,38 @@
 name: E2E Test with darts-cnn-cifar10
 on:
  pull_request:
    paths-ignore:
      - "pkg/ui/v1beta1/frontend/**"
 concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
 jobs:
  e2e:
    runs-on: ubuntu-22.04
    timeout-minutes: 120
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Test Env
        uses: ./.github/workflows/template-setup-e2e-test
        with:
          kubernetes-version: ${{ matrix.kubernetes-version }}
          python-version: "3.11"
      - name: Run e2e test with ${{ matrix.experiments }} experiments
        uses: ./.github/workflows/template-e2e-test
        with:
          experiments: ${{ matrix.experiments }}
          # Comma Delimited
          trial-images: darts-cnn-cifar10-cpu
    strategy:
      fail-fast: false
      matrix:
        kubernetes-version: ["v1.29.2", "v1.30.7", "v1.31.3"]
        # Comma Delimited
        experiments: ["darts-cpu"]
--- a/.github/workflows/e2e-test-enas-cifar10.yaml
+++ b/.github/workflows/e2e-test-enas-cifar10.yaml
@@ -0,0 +1,38 @@
 name: E2E Test with enas-cnn-cifar10
 on:
  pull_request:
    paths-ignore:
      - "pkg/ui/v1beta1/frontend/**"
 concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
 jobs:
  e2e:
    runs-on: ubuntu-22.04
    timeout-minutes: 120
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Test Env
        uses: ./.github/workflows/template-setup-e2e-test
        with:
          kubernetes-version: ${{ matrix.kubernetes-version }}
          python-version: "3.8"
      - name: Run e2e test with ${{ matrix.experiments }} experiments
        uses: ./.github/workflows/template-e2e-test
        with:
          experiments: ${{ matrix.experiments }}
          # Comma Delimited
          trial-images: enas-cnn-cifar10-cpu
    strategy:
      fail-fast: false
      matrix:
        kubernetes-version: ["v1.29.2", "v1.30.7", "v1.31.3"]
        # Comma Delimited
        experiments: ["enas-cpu"]
--- a/.github/workflows/e2e-test-pytorch-mnist.yaml
+++ b/.github/workflows/e2e-test-pytorch-mnist.yaml
@@ -0,0 +1,46 @@
 name: E2E Test with pytorch-mnist
 on:
  pull_request:
    paths-ignore:
      - "pkg/ui/v1beta1/frontend/**"
 concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
 jobs:
  e2e:
    runs-on: ubuntu-22.04
    timeout-minutes: 120
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Test Env
        uses: ./.github/workflows/template-setup-e2e-test
        with:
          kubernetes-version: ${{ matrix.kubernetes-version }}
          python-version: "3.10"
      - name: Run e2e test with ${{ matrix.experiments }} experiments
        uses: ./.github/workflows/template-e2e-test
        with:
          experiments: ${{ matrix.experiments }}
          training-operator: true
          # Comma Delimited
          trial-images: pytorch-mnist-cpu
    strategy:
      fail-fast: false
      matrix:
        kubernetes-version: ["v1.29.2", "v1.30.7", "v1.31.3"]
        # Comma Delimited
        experiments:
          # suggestion-hyperopt
          - "long-running-resume,from-volume-resume,median-stop"
          # others
          - "grid,bayesian-optimization,tpe,multivariate-tpe,cma-es,hyperband"
          - "hyperopt-distribution,optuna-distribution"
          - "file-metrics-collector,pytorchjob-mnist"
          - "median-stop-with-json-format,file-metrics-collector-with-json-format"
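Review note on the matrix above: each Kubernetes version is combined with each comma-delimited experiment group, so this workflow fans out into one job per pair. The sketch below enumerates those pairs; the function name `list_jobs` is hypothetical, and the values are copied from the matrix in this file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints one line per matrix job (version + experiment group).
list_jobs() {
  local versions=("v1.29.2" "v1.30.7" "v1.31.3")
  local experiment_groups=(
    "long-running-resume,from-volume-resume,median-stop"
    "grid,bayesian-optimization,tpe,multivariate-tpe,cma-es,hyperband"
    "hyperopt-distribution,optuna-distribution"
    "file-metrics-collector,pytorchjob-mnist"
    "median-stop-with-json-format,file-metrics-collector-with-json-format"
  )
  local version group
  for version in "${versions[@]}"; do
    for group in "${experiment_groups[@]}"; do
      echo "${version} ${group}"
    done
  done
}

# 3 versions x 5 groups: count the resulting jobs.
list_jobs | wc -l
```

With `fail-fast: false`, a failure in one of these jobs does not cancel the others.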
--- a/.github/workflows/e2e-test-simple-pbt.yaml
+++ b/.github/workflows/e2e-test-simple-pbt.yaml
@@ -0,0 +1,38 @@
 name: E2E Test with simple-pbt
 on:
  pull_request:
    paths-ignore:
      - "pkg/ui/v1beta1/frontend/**"
 concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
 jobs:
  e2e:
    runs-on: ubuntu-22.04
    timeout-minutes: 120
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Test Env
        uses: ./.github/workflows/template-setup-e2e-test
        with:
          kubernetes-version: ${{ matrix.kubernetes-version }}
      - name: Run e2e test with ${{ matrix.experiments }} experiments
        uses: ./.github/workflows/template-e2e-test
        with:
          experiments: ${{ matrix.experiments }}
          # Comma Delimited
          trial-images: simple-pbt
    strategy:
      fail-fast: false
      matrix:
        # Detail: https://hub.docker.com/r/kindest/node
        kubernetes-version: ["v1.29.2", "v1.30.7", "v1.31.3"]
        # Comma Delimited
        experiments: ["simple-pbt"]
--- a/.github/workflows/e2e-test-tf-mnist-with-summaries.yaml
+++ b/.github/workflows/e2e-test-tf-mnist-with-summaries.yaml
@@ -0,0 +1,38 @@
 name: E2E Test with tf-mnist-with-summaries
 on:
  pull_request:
    paths-ignore:
      - "pkg/ui/v1beta1/frontend/**"
 concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
 jobs:
  e2e:
    runs-on: ubuntu-22.04
    timeout-minutes: 120
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Test Env
        uses: ./.github/workflows/template-setup-e2e-test
        with:
          kubernetes-version: ${{ matrix.kubernetes-version }}
      - name: Run e2e test with ${{ matrix.experiments }} experiments
        uses: ./.github/workflows/template-e2e-test
        with:
          experiments: ${{ matrix.experiments }}
          training-operator: true
          # Comma Delimited
          trial-images: tf-mnist-with-summaries
    strategy:
      fail-fast: false
      matrix:
        kubernetes-version: ["v1.29.2", "v1.30.7", "v1.31.3"]
        # Comma Delimited
        experiments: ["tfjob-mnist-with-summaries"]
--- a/.github/workflows/e2e-test-tune-api.yaml
+++ b/.github/workflows/e2e-test-tune-api.yaml
@@ -0,0 +1,40 @@
 name: E2E Test with tune API
 on:
  pull_request:
    paths-ignore:
      - "pkg/ui/v1beta1/frontend/**"
 concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
 jobs:
  e2e:
    runs-on: ubuntu-22.04
    timeout-minutes: 120
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Test Env
        uses: ./.github/workflows/template-setup-e2e-test
        with:
          kubernetes-version: ${{ matrix.kubernetes-version }}
      - name: Install Katib SDK with extra requires
        shell: bash
        run: |
          pip install --prefer-binary -e 'sdk/python/v1beta1[huggingface]'
      - name: Run e2e test with tune API
        uses: ./.github/workflows/template-e2e-test
        with:
          tune-api: true
          training-operator: true
    strategy:
      fail-fast: false
      matrix:
        # Detail: https://hub.docker.com/r/kindest/node
        kubernetes-version: ["v1.29.2", "v1.30.7", "v1.31.3"]
--- a/.github/workflows/e2e-test-ui-random-search-postgres.yaml
+++ b/.github/workflows/e2e-test-ui-random-search-postgres.yaml
@@ -0,0 +1,35 @@
 name: E2E Test with Katib UI, random search, and postgres
 on:
  - pull_request
 concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
 jobs:
  e2e:
    runs-on: ubuntu-22.04
    timeout-minutes: 120
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Test Env
        uses: ./.github/workflows/template-setup-e2e-test
        with:
          kubernetes-version: ${{ matrix.kubernetes-version }}
      - name: Run e2e test with ${{ matrix.experiments }} experiments
        uses: ./.github/workflows/template-e2e-test
        with:
          experiments: random
          # Comma Delimited
          trial-images: pytorch-mnist-cpu
          katib-ui: true
          database-type: postgres
    strategy:
      fail-fast: false
      matrix:
        kubernetes-version: ["v1.29.2", "v1.30.7", "v1.31.3"]
--- a/.github/workflows/free-up-disk-space/action.yaml
+++ b/.github/workflows/free-up-disk-space/action.yaml
@@ -0,0 +1,49 @@
 name: Free-Up Disk Space
 description: Remove Non-Essential Tools And Move Docker Data Directory to /mnt/docker
 runs:
  using: composite
  steps:
    # This step is a Workaround to avoid the "No space left on device" error.
    # ref: https://github.com/actions/runner-images/issues/2840
    - name: Remove unnecessary files
      shell: bash
      run: |
        echo "Disk usage before cleanup:"
        df -hT
        sudo rm -rf /usr/share/dotnet
        sudo rm -rf /opt/ghc
        sudo rm -rf /usr/local/share/boost
        sudo rm -rf "$AGENT_TOOLSDIRECTORY"
        sudo rm -rf /usr/local/lib/android
        sudo rm -rf /usr/local/share/powershell
        sudo rm -rf /usr/share/swift
        echo "Disk usage after cleanup:"
        df -hT
    - name: Prune docker images
      shell: bash
      run: |
        docker image prune -a -f
        docker system df
        df -hT
    - name: Move docker data directory
      shell: bash
      run: |
        echo "Stopping docker service ..."
        sudo systemctl stop docker
        DOCKER_DEFAULT_ROOT_DIR=/var/lib/docker
        DOCKER_ROOT_DIR=/mnt/docker
        echo "Moving ${DOCKER_DEFAULT_ROOT_DIR} -> ${DOCKER_ROOT_DIR}"
        sudo mv ${DOCKER_DEFAULT_ROOT_DIR} ${DOCKER_ROOT_DIR}
        echo "Creating symlink ${DOCKER_DEFAULT_ROOT_DIR} -> ${DOCKER_ROOT_DIR}"
        sudo ln -s ${DOCKER_ROOT_DIR} ${DOCKER_DEFAULT_ROOT_DIR}
        echo "$(sudo ls -l ${DOCKER_DEFAULT_ROOT_DIR})"
        echo "Starting docker service ..."
        sudo systemctl daemon-reload
        sudo systemctl start docker
        echo "Docker service status:"
        sudo systemctl --no-pager -l -o short status docker
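Review note on the "Move docker data directory" step above: the pattern is to move the directory to the larger volume and leave a symlink at the old path, so anything that still reads the default location keeps working. A minimal sketch of that move-and-symlink pattern follows, run against a temporary directory and without `sudo` or Docker. The function name `relocate_dir` and the paths are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move a directory and leave a symlink at the old path.
relocate_dir() {
  local src="$1" dst="$2"
  mv "$src" "$dst"
  ln -s "$dst" "$src"
}

# Demonstrate on throwaway paths instead of /var/lib/docker and /mnt/docker.
workdir="$(mktemp -d)"
mkdir -p "$workdir/default-root" "$workdir/mnt"
echo "layer-data" > "$workdir/default-root/layer"

relocate_dir "$workdir/default-root" "$workdir/mnt/docker"

# The old path is now a symlink and still resolves to the moved content.
ls -l "$workdir/default-root"
cat "$workdir/default-root/layer"
```

In the action itself the Docker service is stopped before the move and restarted afterwards, which this sketch does not model.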
--- a/.github/workflows/publish-algorithm-images.yaml
+++ b/.github/workflows/publish-algorithm-images.yaml
@@ -0,0 +1,42 @@
 name: Publish AutoML Algorithm Images
 on:
  push:
  pull_request:
    paths-ignore:
      - "pkg/ui/v1beta1/frontend/**"
 jobs:
  algorithm:
    name: Publish Image
    uses: ./.github/workflows/build-and-publish-images.yaml
    with:
      component-name: ${{ matrix.component-name }}
      platforms: linux/amd64,linux/arm64
      dockerfile: ${{ matrix.dockerfile }}
    secrets:
      DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_USERNAME }}
      DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
    strategy:
      fail-fast: false
      matrix:
        include:
          - component-name: suggestion-hyperopt
            dockerfile: cmd/suggestion/hyperopt/v1beta1/Dockerfile
          - component-name: suggestion-hyperband
            dockerfile: cmd/suggestion/hyperband/v1beta1/Dockerfile
          - component-name: suggestion-skopt
            dockerfile: cmd/suggestion/skopt/v1beta1/Dockerfile
          - component-name: suggestion-goptuna
            dockerfile: cmd/suggestion/goptuna/v1beta1/Dockerfile
          - component-name: suggestion-optuna
            dockerfile: cmd/suggestion/optuna/v1beta1/Dockerfile
          - component-name: suggestion-pbt
            dockerfile: cmd/suggestion/pbt/v1beta1/Dockerfile
          - component-name: suggestion-enas
            dockerfile: cmd/suggestion/nas/enas/v1beta1/Dockerfile
          - component-name: suggestion-darts
            dockerfile: cmd/suggestion/nas/darts/v1beta1/Dockerfile
          - component-name: earlystopping-medianstop
            dockerfile: cmd/earlystopping/medianstop/v1beta1/Dockerfile
--- a/.github/workflows/publish-conformance-images.yaml
+++ b/.github/workflows/publish-conformance-images.yaml
@@ -0,0 +1,24 @@
 name: Publish Katib Conformance Test Images
 on:
  - push
  - pull_request
 jobs:
  core:
    name: Publish Image
    uses: ./.github/workflows/build-and-publish-images.yaml
    with:
      component-name: ${{ matrix.component-name }}
      platforms: linux/amd64,linux/arm64
      dockerfile: ${{ matrix.dockerfile }}
    secrets:
      DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_USERNAME }}
      DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
    strategy:
      fail-fast: false
      matrix:
        include:
          - component-name: katib-conformance
            dockerfile: Dockerfile.conformance
--- a/.github/workflows/publish-core-images.yaml
+++ b/.github/workflows/publish-core-images.yaml
@@ -0,0 +1,32 @@
 name: Publish Katib Core Images
 on:
  - push
  - pull_request
 jobs:
  core:
    name: Publish Image
    uses: ./.github/workflows/build-and-publish-images.yaml
    with:
      component-name: ${{ matrix.component-name }}
      platforms: linux/amd64,linux/arm64
      dockerfile: ${{ matrix.dockerfile }}
    secrets:
      DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_USERNAME }}
      DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
    strategy:
      fail-fast: false
      matrix:
        include:
          - component-name: katib-controller
            dockerfile: cmd/katib-controller/v1beta1/Dockerfile
          - component-name: katib-db-manager
            dockerfile: cmd/db-manager/v1beta1/Dockerfile
          - component-name: katib-ui
            dockerfile: cmd/ui/v1beta1/Dockerfile
          - component-name: file-metrics-collector
            dockerfile: cmd/metricscollector/v1beta1/file-metricscollector/Dockerfile
          - component-name: tfevent-metrics-collector
            dockerfile: cmd/metricscollector/v1beta1/tfevent-metricscollector/Dockerfile
--- a/.github/workflows/publish-trial-images.yaml
+++ b/.github/workflows/publish-trial-images.yaml
@@ -0,0 +1,48 @@
 name: Publish Trial Images
 on:
  push:
  pull_request:
    paths-ignore:
      - "pkg/ui/v1beta1/frontend/**"
 jobs:
  trial:
    name: Publish Image
    uses: ./.github/workflows/build-and-publish-images.yaml
    with:
      component-name: ${{ matrix.trial-name }}
      platforms: ${{ matrix.platforms }}
      dockerfile: ${{ matrix.dockerfile }}
    secrets:
      DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_USERNAME }}
      DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
    strategy:
      fail-fast: false
      matrix:
        include:
          - trial-name: pytorch-mnist-cpu
            platforms: linux/amd64,linux/arm64
            dockerfile: examples/v1beta1/trial-images/pytorch-mnist/Dockerfile.cpu
          - trial-name: pytorch-mnist-gpu
            platforms: linux/amd64
            dockerfile: examples/v1beta1/trial-images/pytorch-mnist/Dockerfile.gpu
          - trial-name: tf-mnist-with-summaries
            platforms: linux/amd64,linux/arm64
            dockerfile: examples/v1beta1/trial-images/tf-mnist-with-summaries/Dockerfile
          - trial-name: enas-cnn-cifar10-gpu
            platforms: linux/amd64
            dockerfile: examples/v1beta1/trial-images/enas-cnn-cifar10/Dockerfile.gpu
          - trial-name: enas-cnn-cifar10-cpu
            platforms: linux/amd64,linux/arm64
            dockerfile: examples/v1beta1/trial-images/enas-cnn-cifar10/Dockerfile.cpu
          - trial-name: darts-cnn-cifar10-cpu
            platforms: linux/amd64,linux/arm64
            dockerfile: examples/v1beta1/trial-images/darts-cnn-cifar10/Dockerfile.cpu
          - trial-name: darts-cnn-cifar10-gpu
            platforms: linux/amd64
            dockerfile: examples/v1beta1/trial-images/darts-cnn-cifar10/Dockerfile.gpu
          - trial-name: simple-pbt
            platforms: linux/amd64,linux/arm64
            dockerfile: examples/v1beta1/trial-images/simple-pbt/Dockerfile
--- a/.github/workflows/stale.yaml
+++ b/.github/workflows/stale.yaml
@@ -0,0 +1,42 @@
 # This workflow warns and then closes issues and PRs that have had no activity for a specified amount of time.
 #
 # You can adjust the behavior by modifying this file.
 # For more information, see:
 # https://github.com/actions/stale
 name: Mark stale issues and pull requests
 on:
  schedule:
    - cron: "0 */5 * * *"
 jobs:
  stale:
    runs-on: ubuntu-22.04
    permissions:
      issues: write
      pull-requests: write
    steps:
      - uses: actions/stale@v5
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          days-before-stale: 90
          days-before-close: 20
          stale-issue-message: >
            This issue has been automatically marked as stale because it has not had
            recent activity. It will be closed if no further activity occurs. Thank you
            for your contributions.
          close-issue-message: >
            This issue has been automatically closed because it has not had recent
            activity. Please comment "/reopen" to reopen it.
          stale-issue-label: lifecycle/stale
          exempt-issue-labels: lifecycle/frozen
          stale-pr-message: >
            This pull request has been automatically marked as stale because it has not had
            recent activity. It will be closed if no further activity occurs. Thank you
            for your contributions.
          close-pr-message: >
            This pull request has been automatically closed because it has not had recent
            activity. Please comment "/reopen" to reopen it.
          stale-pr-label: lifecycle/stale
          exempt-pr-labels: lifecycle/frozen
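Review note on the schedule and thresholds above: the cron expression `0 */5 * * *` fires at minute 0 of every fifth hour counted from midnight, and an item with no activity is closed after the stale period plus the close period. The sketch below works out both; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: hours (UTC) matched by the "*/5" hour field.
stale_run_hours() {
  local hour
  for ((hour = 0; hour < 24; hour += 5)); do
    echo "$hour"
  done
}

# Hypothetical helper: total days of inactivity before an item is closed.
days_until_close() {
  local days_before_stale="$1" days_before_close="$2"
  echo $(( days_before_stale + days_before_close ))
}

stale_run_hours
days_until_close 90 20
```

The step pattern restarts at midnight, so the gap between the 20:00 run and the next 00:00 run is four hours rather than five.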
--- a/.github/workflows/template-e2e-test/action.yaml
+++ b/.github/workflows/template-e2e-test/action.yaml
@@ -0,0 +1,49 @@
 # Composite action for e2e tests.
 name: Run E2E Test
 description: Run e2e test using the minikube cluster
 inputs:
  experiments:
    required: false
    description: comma delimited experiment name
    default: ""
  training-operator:
    required: false
    description: whether to deploy training-operator or not
    default: false
  trial-images:
    required: false
    description: comma delimited trial image name
    default: ""
  katib-ui:
    required: true
    description: whether to deploy katib-ui or not
    default: false
  database-type:
    required: false
    description: mysql or postgres
    default: mysql
  tune-api:
    required: true
    description: whether to execute tune-api test or not
    default: false
 runs:
  using: composite
  steps:
    - name: Setup Minikube Cluster
      shell: bash
      run: ./test/e2e/v1beta1/scripts/gh-actions/setup-minikube.sh ${{ inputs.katib-ui }} ${{ inputs.tune-api }} ${{ inputs.trial-images }} ${{ inputs.experiments }}
    - name: Setup Katib
      shell: bash
      run: ./test/e2e/v1beta1/scripts/gh-actions/setup-katib.sh ${{ inputs.katib-ui }} ${{ inputs.training-operator }} ${{ inputs.database-type }}
    - name: Run E2E Experiment
      shell: bash
      run: |
        if "${{ inputs.tune-api }}"; then
          ./test/e2e/v1beta1/scripts/gh-actions/run-e2e-tune-api.sh
        else
          ./test/e2e/v1beta1/scripts/gh-actions/run-e2e-experiment.sh ${{ inputs.experiments }}
        fi
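Review note on the script invocations above: the inputs are passed as unquoted positional arguments, and several of them default to an empty string. When an empty value is substituted into the command line without quotes, it contributes no argument at all, so later arguments shift left. The sketch below shows the same effect with shell variables as an analogy. The helper `count_args` is hypothetical, and whether the called scripts depend on argument positions is an assumption, since their contents are not part of this diff.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many positional arguments it received.
count_args() {
  echo "$#"
}

trial_images=""        # stands in for an input left at its empty default
experiments="random"

# Unquoted: the empty value vanishes, so "random" becomes the 3rd argument.
count_args false false $trial_images $experiments

# Quoted: the empty value is preserved as an empty 3rd argument.
count_args false false "$trial_images" "$experiments"
```

The `if "${{ inputs.tune-api }}"; then` line works because the substituted value is `true` or `false`, both of which are shell commands.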
--- a/.github/workflows/template-publish-image/action.yaml
+++ b/.github/workflows/template-publish-image/action.yaml
@@ -0,0 +1,62 @@
 # Composite action for publishing Katib images.
 name: Build And Publish Container Images
 description: Build MultiPlatform Supporting Container Images
 inputs:
  image:
    required: true
    description: image tag
  dockerfile:
    required: true
    description: path for dockerfile
  platforms:
    required: true
    description: linux/amd64 or linux/amd64,linux/arm64
  push:
    required: true
    description: whether to push container images or not
 runs:
  using: composite
  steps:
      # This step is a Workaround to avoid the "No space left on device" error.
      # ref: https://github.com/actions/runner-images/issues/2840
    - name: Remove unnecessary files
      shell: bash
      run: |
        sudo rm -rf /usr/share/dotnet
        sudo rm -rf /opt/ghc
        sudo rm -rf "/usr/local/share/boost"
        sudo rm -rf "$AGENT_TOOLSDIRECTORY"
        sudo rm -rf /usr/local/lib/android
        sudo rm -rf /usr/local/share/powershell
        sudo rm -rf /usr/share/swift
        echo "Disk usage after cleanup:"
        df -h
    - name: Set up QEMU
      uses: docker/setup-qemu-action@v3
    - name: Set Up Docker Buildx
      uses: docker/setup-buildx-action@v3
    - name: Add Docker Tags
      id: meta
      uses: docker/metadata-action@v5
      with:
        images: ${{ inputs.image }}
        tags: |
          type=raw,latest
          type=sha,prefix=v1beta1-
    - name: Build and Push
      uses: docker/build-push-action@v5
      with:
        context: .
        file: ${{ inputs.dockerfile }}
        push: ${{ inputs.push }}
        tags: ${{ steps.meta.outputs.tags }}
        cache-from: type=gha
        cache-to: type=gha,mode=max,ignore-error=true
        platforms: ${{ inputs.platforms }}
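Review note on the "Add Docker Tags" step above: each image gets a `latest` tag plus a tag derived from the commit SHA with the `v1beta1-` prefix. The sketch below approximates the resulting references. It assumes the metadata action's default short SHA of 7 characters, and the function name `image_tags` and the sample SHA are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: approximates the tags produced for one image name.
# Assumption: the SHA-based tag uses the first 7 characters of the commit SHA.
image_tags() {
  local image="$1" sha="$2"
  echo "${image}:latest"
  echo "${image}:v1beta1-${sha:0:7}"
}

# Sample SHA for illustration only.
image_tags "ghcr.io/kubeflow/katib/katib-controller" \
  "0123456789abcdef0123456789abcdef01234567"
```

Because the `image` input lists both a ghcr.io and a docker.io name, the same pair of tags would be produced for each registry.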
--- a/.github/workflows/template-setup-e2e-test/action.yaml
+++ b/.github/workflows/template-setup-e2e-test/action.yaml
@@ -0,0 +1,48 @@
 # Composite action to setup e2e tests.
 name: Setup E2E Test
 description: setup env for e2e test using the minikube cluster
 inputs:
  kubernetes-version:
    required: true
    description: kubernetes version
  python-version:
    required: false
    description: Python version
     # Latest supported version
    default: "3.10"
 runs:
  using: composite
  steps:
    # This step is a Workaround to avoid the "No space left on device" error.
    # ref: https://github.com/actions/runner-images/issues/2840
    - name: Free-Up Disk Space
      uses: ./.github/workflows/free-up-disk-space
    - name: Setup kubectl
      uses: azure/setup-kubectl@v4
      with:
        version: ${{ inputs.kubernetes-version }}
    - name: Setup Minikube Cluster
      uses: medyagh/setup-minikube@v0.0.18
      with:
        network-plugin: cni
        cni: flannel
        driver: none
        kubernetes-version: ${{ inputs.kubernetes-version }}
        minikube-version: 1.34.0
        start-args: --wait-timeout=120s
    - name: Setup Docker Buildx
      uses: docker/setup-buildx-action@v3
    - name: Setup Python
      uses: actions/setup-python@v5
      with:
        python-version: ${{ inputs.python-version }}
    - name: Install Katib SDK
      shell: bash
      run: pip install --prefer-binary -e sdk/python/v1beta1
--- a/.github/workflows/test-charmed-katib.yaml
+++ b/.github/workflows/test-charmed-katib.yaml
@@ -1,110 +0,0 @@
 name: Charmed Katib
 on:
  - push
  - pull_request
 jobs:
  lint:
    name: Lint
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v2
      - name: Install dependencies
        run: |
          sudo apt-get install python3-setuptools
          sudo pip3 install black flake8
      - name: Check black
        run: black --check operators
      - name: Check flake8
        run: cd operators && flake8
  build:
    name: Test
    runs-on: ubuntu-latest
    steps:
      - name: Check out repo
        uses: actions/checkout@v2
      - uses: balchua/microk8s-actions@v0.2.2
        with:
          channel: "1.20/stable"
          addons: '["dns", "storage", "rbac"]'
      - name: Install dependencies
        run: |
          set -eux
          sudo apt update
          sudo apt install -y python3-pip
          sudo snap install charm --classic
          sudo snap install juju --classic
          sudo snap install juju-helpers --classic
          sudo snap install juju-wait --classic
          sudo pip3 install charmcraft==1.0.0
      - name: Build Docker images
        run: |
          set -eux
          images=("katib-controller" "katib-ui" "katib-db-manager")
          folders=("katib-controller" "ui" "db-manager")
          for idx in {0..2}; do
            docker build . \
                -t docker.io/kubeflowkatib/${images[$idx]}:latest \
                -f cmd/${folders[$idx]}/v1beta1/Dockerfile
            docker save docker.io/kubeflowkatib/${images[$idx]} > ${images[$idx]}.tar
            microk8s ctr image import ${images[$idx]}.tar
          done
      - name: Deploy Katib
        run: |
          set -eux
          cd operators/
          git clone git://git.launchpad.net/canonical-osm
          cp -r canonical-osm/charms/interfaces/juju-relation-mysql mysql
          sg microk8s -c 'juju bootstrap microk8s uk8s'
          juju add-model kubeflow
          juju bundle deploy -b bundle-edge.yaml --build
          juju wait -wvt 300
      - name: Test Katib
        run: |
          set -eux
          kubectl apply -f examples/v1beta1/random-example.yaml
      - name: Get pod statuses
        run: kubectl get all -A
        if: failure()
      - name: Get juju status
        run: juju status
        if: failure()
      - name: Get katib-controller workload logs
        run: kubectl logs --tail 100 -nkubeflow -ljuju-app=katib-controller
        if: failure()
      - name: Get katib-controller operator logs
        run: kubectl logs --tail 100 -nkubeflow -ljuju-operator=katib-controller
        if: failure()
      - name: Get katib-ui workload logs
        run: kubectl logs --tail 100 -nkubeflow -ljuju-app=katib-ui
        if: failure()
      - name: Get katib-ui operator logs
        run: kubectl logs --tail 100 -nkubeflow -ljuju-operator=katib-ui
        if: failure()
      - name: Get katib-db-manager workload logs
        run: kubectl logs --tail 100 -nkubeflow -ljuju-app=katib-db-manager
        if: failure()
      - name: Get katib-db-manager operator logs
        run: kubectl logs --tail 100 -nkubeflow -ljuju-operator=katib-db-manager
        if: failure()
--- a/.github/workflows/test-go.yaml
+++ b/.github/workflows/test-go.yaml
@ -1,52 +1,79 @@
 name: Go Test
 on:
-  - push
+  pull_request:
-  - pull_request
+    paths-ignore:
      - "pkg/ui/v1beta1/frontend/**"
 concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
 jobs:
-  test:
+  generatetests:
-    name: Test
+    name: Generate And Format Test
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-22.04
    env:
      GOPATH: ${{ github.workspace }}/go
    defaults:
      run:
        working-directory: ${{ env.GOPATH }}/src/github.com/kubeflow/katib
    steps:
      - name: Check out code
-        uses: actions/checkout@v2
+        uses: actions/checkout@v4
        with:
          path: ${{ env.GOPATH }}/src/github.com/kubeflow/katib
      - name: Setup Go
-        uses: actions/setup-go@v2
+        uses: actions/setup-go@v5
        with:
-          go-version: 1.15.13
+          go-version-file: ${{ env.GOPATH }}/src/github.com/kubeflow/katib/go.mod
          cache-dependency-path: ${{ env.GOPATH }}/src/github.com/kubeflow/katib/go.sum
-      # Verify that go.mod and go.sum is synchronized
+      - name: Check Go Modules, Generated Go/Python codes, and Format
-      - name: Check Go modules
+        run: make check
-        run: |
+
-          if [[ ! -z $(go mod tidy && git diff --exit-code) ]]; then
+  unittests:
-            echo "Please run "go mod tidy" to sync Go modules"
+    name: Unit Test
-            exit 1
+    runs-on: ubuntu-22.04
-          fi
+    env:
      GOPATH: ${{ github.workspace }}/go
    defaults:
      run:
        working-directory: ${{ env.GOPATH }}/src/github.com/kubeflow/katib
    steps:
      - name: Check out code
        uses: actions/checkout@v4
        with:
          path: ${{ env.GOPATH }}/src/github.com/kubeflow/katib
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version-file: ${{ env.GOPATH }}/src/github.com/kubeflow/katib/go.mod
          cache-dependency-path: ${{ env.GOPATH }}/src/github.com/kubeflow/katib/go.sum
      - name: Run Go test
-        run: |
+        run: go mod download && make test ENVTEST_K8S_VERSION=${{ matrix.kubernetes-version }}
-          go mod download
-          curl -L -O "https://github.com/kubernetes-sigs/kubebuilder/releases/download/v2.3.0/kubebuilder_2.3.0_linux_amd64.tar.gz"
-          tar -zxvf kubebuilder_2.3.0_linux_amd64.tar.gz
-          sudo mv kubebuilder_2.3.0_linux_amd64 /usr/local/kubebuilder
-          export PATH=$PATH:/usr/local/kubebuilder/bin
-          make check
-          make test
      - name: Coveralls report
        uses: shogo82148/actions-goveralls@v1
        with:
          path-to-profile: coverage.out
          working-directory: ${{ env.GOPATH }}/src/github.com/kubeflow/katib
          parallel: true
    strategy:
      fail-fast: false
      matrix:
        # Detail: `setup-envtest list`
        kubernetes-version: ["1.29.3", "1.30.0", "1.31.0"]
  # notifies that all test jobs are finished.
  finish:
    needs: unittests
    runs-on: ubuntu-22.04
    steps:
      - uses: shogo82148/actions-goveralls@v1
        with:
          parallel-finished: true
--- a/.github/workflows/test-lint.yaml
+++ b/.github/workflows/test-lint.yaml
@ -0,0 +1,30 @@
 name: Lint Files
 on:
  pull_request:
    paths-ignore:
      - "pkg/ui/v1beta1/frontend/**"
 concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
 jobs:
  lint:
    name: Lint
    runs-on: ubuntu-22.04
    steps:
      - name: Check out code
        uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: 3.9
      - name: Check shell scripts
        run: make shellcheck
      - name: Run pre-commit
        uses: pre-commit/action@v3.0.1
--- a/.github/workflows/test-node.yaml
+++ b/.github/workflows/test-node.yaml
@ -1,24 +1,101 @@
 name: Frontend Test
 on:
-  - push
+  pull_request:
-  - pull_request
+    paths:
      - pkg/ui/v1beta1/frontend/**
 concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
 jobs:
  test:
-    name: Test
+    name: Code format and lint
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-22.04
    steps:
      - name: Check out code
-        uses: actions/checkout@v2
+        uses: actions/checkout@v4
      - name: Setup Node
-        uses: actions/setup-node@v2
+        uses: actions/setup-node@v4
        with:
-          node-version: 12.18.1
+          node-version: 16.20.2
-      - name: Run Node test
+      - name: Format katib code
        run: |
-          npm install --global prettier@2.2.0
+          npm install prettier --prefix ./pkg/ui/v1beta1/frontend
          make prettier-check
      - name: Lint katib code
        run: |
          cd pkg/ui/v1beta1/frontend
          npm run lint-check
  frontend-unit-tests:
    name: Frontend Unit Tests
    runs-on: ubuntu-22.04
    steps:
      - name: Check out code
        uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: 16.20.2
      - name: Fetch Kubeflow and install common code dependencies
        run: |
          COMMIT=$(cat pkg/ui/v1beta1/frontend/COMMIT)
          cd /tmp && git clone https://github.com/kubeflow/kubeflow.git
          cd kubeflow
          git checkout $COMMIT
          cd components/crud-web-apps/common/frontend/kubeflow-common-lib
          npm i
          npm run build
          npm link ./dist/kubeflow
      - name: Install KWA dependencies
        run: |
          cd pkg/ui/v1beta1/frontend
          npm i
          npm link kubeflow
      - name: Run unit tests
        run: |
          cd pkg/ui/v1beta1/frontend
          npm run test:prod
  frontend-ui-tests:
    name: UI tests with Cypress
    runs-on: ubuntu-22.04
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup node version to 16
        uses: actions/setup-node@v4
        with:
          node-version: 16
      - name: Fetch Kubeflow and install common code dependencies
        run: |
          COMMIT=$(cat pkg/ui/v1beta1/frontend/COMMIT)
          cd /tmp && git clone https://github.com/kubeflow/kubeflow.git
          cd kubeflow
          git checkout $COMMIT
          cd components/crud-web-apps/common/frontend/kubeflow-common-lib
          npm i
          npm run build
          npm link ./dist/kubeflow
      - name: Install KWA dependencies
        run: |
          cd pkg/ui/v1beta1/frontend
          npm i
          npm link kubeflow
      - name: Serve UI & run Cypress tests in Chrome and Firefox
        run: |
          cd pkg/ui/v1beta1/frontend
          npm run start & npx wait-on http://localhost:4200
          npm run ui-test-ci-all
--- a/.github/workflows/test-python.yaml
+++ b/.github/workflows/test-python.yaml
@ -0,0 +1,47 @@
 name: Python Test
 on:
  pull_request:
    paths-ignore:
      - "pkg/ui/v1beta1/frontend/**"
 concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
 jobs:
  test:
    name: Test
    runs-on: ubuntu-22.04
    steps:
      - name: Check out code
        uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: 3.11
      - name: Run Python test
        run: make pytest
  # The skopt service doesn't work appropriately with Python 3.11.
  # So, we need to run the test with Python 3.9.
  # TODO (tenzen-y): Once we stop supporting skopt, we can remove this test.
  # REF: https://github.com/kubeflow/katib/issues/2280
  test-skopt:
    name: Test Skopt
    runs-on: ubuntu-22.04
    steps:
      - name: Check out code
        uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: 3.9
      - name: Run Python test
        run: make pytest-skopt
--- a/.gitignore
+++ b/.gitignore
@ -6,6 +6,7 @@ __pycache__/
 *.egg-info
 build/
 *.charm
 test/unit/v1beta1/metricscollector/testdata
 # SDK generator JAR file
 hack/gen-python-sdk/openapi-generator-cli.jar
@ -21,6 +22,7 @@ bin
 *.dll
 *.so
 *.dylib
 pkg/metricscollector/v1beta1/file-metricscollector/testdata
 ## Test binary, build with `go test -c`
 *.test
@ -76,3 +78,6 @@ $RECYCLE.BIN/
 ## Vendor dir
 vendor
 # Jupyter Notebooks.
 **/.ipynb_checkpoints
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@ -0,0 +1,38 @@
 repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v2.3.0
    hooks:
      - id: check-yaml
        args: [--allow-multiple-documents]
      - id: check-json
  - repo: https://github.com/pycqa/isort
    rev: 5.11.5
    hooks:
      - id: isort
        name: isort
        entry: isort --profile black
  - repo: https://github.com/psf/black
    rev: 24.2.0
    hooks:
      - id: black
        files: (sdk|examples|pkg)/.*
  - repo: https://github.com/pycqa/flake8
    rev: 7.1.1
    hooks:
      - id: flake8
        files: (sdk|examples|pkg)/.*
 exclude: |
  (?x)^(
    .*zz_generated.deepcopy.*|
    .*pb.go|
    pkg/apis/manager/.*pb2(?:_grpc)?.py(?:i)?|
    pkg/apis/v1beta1/openapi_generated.go|
    pkg/mock/.*|
    pkg/client/controller/.*|
    sdk/python/v1beta1/kubeflow/katib/configuration.py|
    sdk/python/v1beta1/kubeflow/katib/rest.py|
    sdk/python/v1beta1/kubeflow/katib/__init__.py|
    sdk/python/v1beta1/kubeflow/katib/exceptions.py|
    sdk/python/v1beta1/kubeflow/katib/api_client.py|
    sdk/python/v1beta1/kubeflow/katib/models/.*
  )$
--- a/ADOPTERS.md
+++ b/ADOPTERS.md
@ -5,12 +5,16 @@ please add yourself into the following list by a pull request.
 Please keep the list in alphabetical order.
 | Organization                                     | Contact                                              | Description of Use                                                   |
-| ------------ | ------- | ------------------ |
+|--------------------------------------------------|------------------------------------------------------|----------------------------------------------------------------------|
 | [Akuity](https://akuity.io/)                     | [@terrytangyuan](https://github.com/terrytangyuan)   |                                                                      |
 | [Ant Group](https://www.antgroup.com/)           | [@ohmystack](https://github.com/ohmystack)           | Automatic training in Ant Group internal AI Platform                 |
 | [babylon health](https://www.babylonhealth.com/) | [@jeremievallee](https://github.com/jeremievallee)   | Hyperparameter tuning for AIR internal AI Platform                   |
 | [caicloud](https://caicloud.io/)                 | [@gaocegege](https://github.com/gaocegege)           | Hyperparameter tuning in Caicloud Cloud-Native AI Platform           |
 | [canonical](https://ubuntu.com/)                 | [@RFMVasconcelos](https://github.com/rfmvasconcelos) | Hyperparameter tuning for customer projects in Defense and Fintech   |
 | [CERN](https://home.cern/)                       | [@d-gol](https://github.com/d-gol)                   | Hyperparameter tuning within the ML platform on private cloud   |
 | [cisco](https://cisco.com/)                      | [@ramdootp](https://github.com/ramdootp)             | Hyperparameter tuning for conversational AI interface using Rasa     |
 | [cubonacci](https://www.cubonacci.com)           | [@janvdvegt](https://github.com/janvdvegt)           | Hyperparameter tuning within the Cubonacci machine learning platform |
 | [CyberAgent](https://www.cyberagent.co.jp/en/)   | [@tenzen-y](https://github.com/tenzen-y)             | Experiment in CyberAgent internal ML Platform on Private Cloud       |
 | [fuzhi](http://www.fuzhi.ai/)                    | [@planck0591](https://github.com/planck0591)         | Experiment and Trial in autoML Platform                              |
 | [karrot](https://uk.karrotmarket.com/)           | [@muik](https://github.com/muik)                     | Hyperparameter tuning in Karrot ML Platform                          |
 | [PITS Global Data Recovery Services](https://www.pitsdatarecovery.net/) | [@pheianox](https://github.com/pheianox) | CyberAgent and ML Platform |
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
--- a/CITATION.cff
+++ b/CITATION.cff
@ -0,0 +1,43 @@
 cff-version: 1.2.0
 message: "If you use Katib in your scientific publication, please cite it as below."
 authors:
  - family-names: "George"
    given-names: "Johnu"
  - family-names: "Gao"
    given-names: "Ce"
  - family-names: "Liu"
    given-names: "Richard"
  - family-names: "Liu"
    given-names: "Hou Gang"
  - family-names: "Tang"
    given-names: "Yuan"
  - family-names: "Pydipaty"
    given-names: "Ramdoot"
  - family-names: "Saha"
    given-names: "Amit Kumar"
 title: "Katib"
 type: software
 repository-code: "https://github.com/kubeflow/katib"
 preferred-citation:
  type: misc
  title: "A Scalable and Cloud-Native Hyperparameter Tuning System"
  authors:
    - family-names: "George"
      given-names: "Johnu"
    - family-names: "Gao"
      given-names: "Ce"
    - family-names: "Liu"
      given-names: "Richard"
    - family-names: "Liu"
      given-names: "Hou Gang"
    - family-names: "Tang"
      given-names: "Yuan"
    - family-names: "Pydipaty"
      given-names: "Ramdoot"
    - family-names: "Saha"
      given-names: "Amit Kumar"
  year: 2020
  url: "https://arxiv.org/abs/2006.02085"
  identifiers:
    - type: "other"
      value: "arXiv:2006.02085"
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@ -0,0 +1,167 @@
 # Developer Guide
 This developer guide is for people who want to contribute to the Katib project.
 If you're interested in using Katib in your machine learning project,
 see the following guides:
 - [Getting started with Katib](https://kubeflow.org/docs/components/katib/hyperparameter/).
 - [How to configure Katib Experiment](https://kubeflow.org/docs/components/katib/experiment/).
 - [Katib architecture and concepts](https://www.kubeflow.org/docs/components/katib/reference/architecture/)
  for hyperparameter tuning and neural architecture search.
 ## Requirements
 - [Go](https://golang.org/) (1.22 or later)
 - [Docker](https://docs.docker.com/) (24.0 or later)
 - [Docker Buildx](https://docs.docker.com/build/buildx/) (0.8.0 or later)
 - [Java](https://docs.oracle.com/javase/8/docs/technotes/guides/install/install_overview.html) (8 or later)
 - [Python](https://www.python.org/) (3.11 or later)
 - [kustomize](https://kustomize.io/) (4.0.5 or later)
 - [pre-commit](https://pre-commit.com/)
 ## Build from source code
 **Note** that your Docker Desktop should
 [enable containerd image store](https://docs.docker.com/desktop/containerd/#enable-the-containerd-image-store)
 to build multi-arch images. Build the source code as follows:
 ```bash
 make build REGISTRY=<image-registry> TAG=<image-tag>
 ```
 If you are using an Apple Silicon machine and encounter the "rosetta error: bss_size overflow," go to Docker Desktop -> General and uncheck "Use Rosetta for x86_64/amd64 emulation on Apple Silicon."
 To use your custom images for the Katib components, modify the
 [Kustomization file](https://github.com/kubeflow/katib/blob/master/manifests/v1beta1/installs/katib-standalone/kustomization.yaml)
 and [Katib Config](https://github.com/kubeflow/katib/blob/master/manifests/v1beta1/installs/katib-standalone/katib-config.yaml).
 You can deploy Katib v1beta1 manifests into a Kubernetes cluster as follows:
 ```bash
 make deploy
 ```
 You can undeploy Katib v1beta1 manifests from a Kubernetes cluster as follows:
 ```bash
 make undeploy
 ```
 ## Technical and style guide
 The following guidelines apply primarily to Katib,
 but other projects like [Training Operator](https://github.com/kubeflow/training-operator) might also adhere to them.
 ## Go Development
 When coding:
 - Follow [effective go](https://go.dev/doc/effective_go) guidelines.
 - Run [`make check`](https://github.com/kubeflow/katib/blob/46173463027e4fd2e604e25d7075b2b31a702049/Makefile#L31) locally
  to verify that your changes follow best practices before submitting PRs.
 Testing:
 - Use [`cmp.Diff`](https://pkg.go.dev/github.com/google/go-cmp/cmp#Diff) instead of `reflect.DeepEqual` to produce readable diffs when a comparison fails.
 - Define test cases as maps instead of slices to avoid depending on the execution order.
  The map key should be the test case name.
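The map-keyed layout can be sketched as follows. This is a minimal, standard-library-only illustration, not code from the Katib repository: `clamp`, `clampCase`, `clampCases`, and `failedCases` are hypothetical names. Because the values are plain integers, the sketch compares them with `!=`; for structs or nested values, `cmp.Diff` would take the place of that comparison.

```go
package main

import "fmt"

// clamp is a hypothetical function under test; it is not part of Katib.
func clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// clampCase describes one input and the expected result.
type clampCase struct {
	v, lo, hi int
	want      int
}

// clampCases is keyed by test case name. Go randomizes map iteration
// order, so no case can silently rely on another case having run first.
var clampCases = map[string]clampCase{
	"value below the range is raised to lo":  {v: -3, lo: 0, hi: 10, want: 0},
	"value above the range is lowered to hi": {v: 42, lo: 0, hi: 10, want: 10},
	"value inside the range is unchanged":    {v: 7, lo: 0, hi: 10, want: 7},
}

// failedCases returns one message per failing case. In a real _test.go
// file the loop body would sit inside t.Run(name, ...) and report
// mismatches through t.Errorf instead of collecting strings.
func failedCases(cases map[string]clampCase) []string {
	var failures []string
	for name, tc := range cases {
		if got := clamp(tc.v, tc.lo, tc.hi); got != tc.want {
			failures = append(failures, fmt.Sprintf("%s: got %d, want %d", name, got, tc.want))
		}
	}
	return failures
}

func main() {
	fmt.Printf("%d failing case(s)\n", len(failedCases(clampCases)))
}
```

Using the case name as the map key also makes duplicate names a compile-time error, which a slice of named structs would not catch.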
 ## Modify controller APIs
 If you want to modify Katib controller APIs, you have to
 regenerate the deepcopy, clientset, listers, informers, OpenAPI, and Python SDK code for the changed APIs.
 You can update the necessary files as follows:
 ```bash
 make generate
 ```
 ## Controller Flags
 Below is a list of command-line flags accepted by Katib controller:
 | Name         | Type   | Default | Description                                                                                                                      |
 | ------------ | ------ | ------- | -------------------------------------------------------------------------------------------------------------------------------- |
 | katib-config | string | ""      | The katib-controller will load its initial configuration from this file. Omit this flag to use the default configuration values. |
 ## DB Manager Flags
 Below is a list of command-line flags accepted by Katib DB Manager:
 | Name            | Type          | Default      | Description                                                         |
 | --------------- | ------------- | -------------| ------------------------------------------------------------------- |
 | connect-timeout | time.Duration | 60s          | Timeout before calling error during database connection             |
 | listen-address  | string        | 0.0.0.0:6789 | The network interface or IP address to receive incoming connections |
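As a rough illustration of how the two flags in this table could be declared, the sketch below uses Go's standard `flag` package with the names, types, and defaults listed above. It is an assumption about the shape of the code, not the DB Manager's actual implementation; `dbManagerFlags` and `parseDBManagerFlags` are hypothetical names.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"time"
)

// dbManagerFlags holds the two documented DB Manager flags.
type dbManagerFlags struct {
	connectTimeout time.Duration
	listenAddress  string
}

// parseDBManagerFlags parses args (without the program name) using the
// flag names and defaults from the table above.
func parseDBManagerFlags(args []string) (dbManagerFlags, error) {
	var f dbManagerFlags
	fs := flag.NewFlagSet("katib-db-manager", flag.ContinueOnError)
	fs.DurationVar(&f.connectTimeout, "connect-timeout", 60*time.Second,
		"Timeout before calling error during database connection")
	fs.StringVar(&f.listenAddress, "listen-address", "0.0.0.0:6789",
		"The network interface or IP address to receive incoming connections")
	if err := fs.Parse(args); err != nil {
		return dbManagerFlags{}, err
	}
	return f, nil
}

func main() {
	f, err := parseDBManagerFlags(os.Args[1:])
	if err != nil {
		// The flag package has already printed the error and usage.
		os.Exit(2)
	}
	fmt.Printf("connect-timeout=%s listen-address=%s\n", f.connectTimeout, f.listenAddress)
}
```

Because `connect-timeout` is a `time.Duration`, values are written in Go duration syntax, for example `--connect-timeout=90s` or `--connect-timeout=2m`.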
 ## Katib admission webhooks
 Katib uses three [Kubernetes admission webhooks](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/).
 1. `validator.experiment.katib.kubeflow.org` -
   [Validating admission webhook](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#validatingadmissionwebhook)
   to validate the Katib Experiment before it is created.
 1. `defaulter.experiment.katib.kubeflow.org` -
   [Mutating admission webhook](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#mutatingadmissionwebhook)
   to set the [default values](../pkg/apis/controller/experiments/v1beta1/experiment_defaults.go)
   in the Katib Experiment before it is created.
 1. `mutator.pod.katib.kubeflow.org` - Mutating admission webhook to inject the metrics
   collector sidecar container into the training pod. Learn more about Katib's
   metrics collector in the
   [Kubeflow documentation](https://www.kubeflow.org/docs/components/katib/user-guides/metrics-collector/).
 You can find the YAMLs for the Katib webhooks
 [here](../manifests/v1beta1/components/webhook/webhooks.yaml).
 **Note:** If you are using a private Kubernetes cluster, you have to allow traffic
 via `TCP:8443` by specifying a firewall rule, and you have to update the master
 plane CIDR source range to use the Katib webhooks.
 ### Katib cert generator
 Katib Controller has the internal `cert-generator` to generate certificates for the webhooks.
 Once Katib is deployed in the Kubernetes cluster, the `cert-generator` follows these steps:
 - Generate the self-signed certificate and private key.
 - Update a Kubernetes Secret with the self-signed TLS certificate and private key.
 - Patch the webhooks with the `CABundle`.
 Once the `cert-generator` has finished, the Katib controller starts registering controllers such as `experiment-controller` with the manager.
 You can find the `cert-generator` source code [here](../pkg/certgenerator/v1beta1).
 **Note:** Katib also supports [cert-manager](https://cert-manager.io/) for generating the admission webhook certificates instead of the `cert-generator`.
 You can find the installation with the cert-manager [here](../manifests/v1beta1/installs/katib-cert-manager).
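The first `cert-generator` step, generating a self-signed certificate and private key, can be sketched with the Go standard library alone. This is an illustration under assumptions, not the code in `pkg/certgenerator/v1beta1`: the ECDSA P-256 key type, the validity period, the DNS name, and the function name `newSelfSignedCert` are all chosen here for the example. The returned certificate PEM is the kind of value that would then be stored in the Secret and used as the webhooks' `CABundle`.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"os"
	"time"
)

// newSelfSignedCert returns a PEM-encoded self-signed certificate and its
// private key, valid for dnsName for the given duration.
func newSelfSignedCert(dnsName string, validFor time.Duration) (certPEM, keyPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	// Random serial number below 2^127.
	serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 127))
	if err != nil {
		return nil, nil, err
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber:          serial,
		Subject:               pkix.Name{CommonName: dnsName},
		DNSNames:              []string{dnsName},
		NotBefore:             now,
		NotAfter:              now.Add(validFor),
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
		IsCA:                  true,
	}
	// Self-signed: the template is its own parent and is signed by its own key.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

func main() {
	// The DNS name is a placeholder, not taken from the Katib manifests.
	certPEM, _, err := newSelfSignedCert("webhook.example.svc", 24*time.Hour)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("generated %d bytes of certificate PEM\n", len(certPEM))
}
```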
 ## Implement a new algorithm and use it in Katib
 Please see [new-algorithm-service.md](./new-algorithm-service.md).
 ## Katib UI documentation
 Please see [Katib UI README](../pkg/ui/v1beta1).
 ## Design proposals
 Please see [proposals](./proposals).
 ## Code Style
 ### pre-commit
 Make sure to install [pre-commit](https://pre-commit.com/) (`pip install
 pre-commit`) and run `pre-commit install` from the root of the repository at
 least once before creating git commits.
 The pre-commit [hooks](../.pre-commit-config.yaml) ensure code quality and
 consistency. They are executed in CI. PRs that fail to comply with the hooks
 will not be able to pass the corresponding CI gate. The hooks are only executed
 against staged files unless you run `pre-commit run --all-files`, in which case
 they are executed against every file in the repository.
 Specific programmatically generated files listed in the `exclude` field in
 [.pre-commit-config.yaml](../.pre-commit-config.yaml) are deliberately excluded
 from the hooks.
--- a/Dockerfile.conformance
+++ b/Dockerfile.conformance
@ -0,0 +1,32 @@
 # Copyright 2023 The Kubeflow Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 #      http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # Dockerfile for building the source code of conformance tests
 FROM python:3.10-slim
 WORKDIR /kubeflow/katib
 COPY sdk/ /kubeflow/katib/sdk/
 COPY examples/ /kubeflow/katib/examples/
 COPY test/ /kubeflow/katib/test/
 COPY pkg/ /kubeflow/katib/pkg/
 COPY conformance/run.sh .
 # Add test script.
 RUN chmod +x run.sh
 RUN pip install --prefer-binary -e sdk/python/v1beta1
 ENTRYPOINT [ "./run.sh" ]
--- a/154
+++ b/154
@ -1,75 +1,130 @@
-HAS_LINT := $(shell command -v golint;)
+HAS_LINT := $(shell command -v golangci-lint;)
 HAS_YAMLLINT := $(shell command -v yamllint;)
 HAS_SHELLCHECK := $(shell command -v shellcheck;)
 HAS_SETUP_ENVTEST := $(shell command -v setup-envtest;)
 HAS_MOCKGEN := $(shell command -v mockgen;)
 COMMIT := v1beta1-$(shell git rev-parse --short=7 HEAD)
-KATIB_REGISTRY := docker.io/kubeflowkatib
+KATIB_REGISTRY := ghcr.io/kubeflow/katib
 CPU_ARCH ?= linux/amd64,linux/arm64
 ENVTEST_K8S_VERSION ?= 1.31
 MOCKGEN_VERSION ?= $(shell grep 'go.uber.org/mock' go.mod | cut -d ' ' -f 2)
 GO_VERSION=$(shell grep '^go' go.mod | cut -d ' ' -f 2)
 GOPATH ?= $(shell go env GOPATH)
 TEST_TENSORFLOW_EVENT_FILE_PATH ?= $(CURDIR)/test/unit/v1beta1/metricscollector/testdata/tfevent-metricscollector/logs
 # Run tests
 .PHONY: test
-test:
+test: envtest
-	go test ./pkg/... ./cmd/... -coverprofile coverage.out
+	KUBEBUILDER_ASSETS="$(shell setup-envtest use $(ENVTEST_K8S_VERSION) -p path)" go test ./pkg/... ./cmd/... -coverprofile coverage.out
-check: generate fmt vet lint
+envtest:
 ifndef HAS_SETUP_ENVTEST
 	go install sigs.k8s.io/controller-runtime/tools/setup-envtest@release-0.19
 	$(info "setup-envtest has been installed")
 endif
 	$(info "setup-envtest has already installed")
 check: generated-codes go-mod fmt vet lint
 fmt:
 	hack/verify-gofmt.sh
 lint:
 ifndef HAS_LINT
-	go get -u golang.org/x/lint/golint
+	go install github.com/golangci/golangci-lint/cmd/golangci-lint@v1.64.7
-	echo "installing golint"
+	$(info "golangci-lint has been installed")
 endif
-	hack/verify-golint.sh
+	hack/verify-golangci-lint.sh
 yamllint:
 ifndef HAS_YAMLLINT
 	pip install --prefer-binary yamllint
 	$(info "yamllint has been installed")
 endif
 	hack/verify-yamllint.sh
 vet:
 	go vet ./pkg/... ./cmd/...
 shellcheck:
 ifndef HAS_SHELLCHECK
 	bash hack/install-shellcheck.sh
 	$(info "shellcheck has been installed")
 endif
 	hack/verify-shellcheck.sh
 update:
 	hack/update-gofmt.sh
 # Deploy Katib v1beta1 manifests using Kustomize into a k8s cluster.
 deploy:
-	bash scripts/v1beta1/deploy.sh
+	bash scripts/v1beta1/deploy.sh $(WITH_DATABASE_TYPE)
 # Undeploy Katib v1beta1 manifests using Kustomize from a k8s cluster
 undeploy:
 	bash scripts/v1beta1/undeploy.sh
 generated-codes: generate
 ifneq ($(shell bash hack/verify-generated-codes.sh '.'; echo $$?),0)
 	$(error 'Please run "make generate" to generate codes')
 endif
 go-mod: sync-go-mod
 ifneq ($(shell bash hack/verify-generated-codes.sh 'go.*'; echo $$?),0)
 	$(error 'Please run "go mod tidy -go $(GO_VERSION)" to sync Go modules')
 endif
 sync-go-mod:
 	go mod tidy -go $(GO_VERSION)
 .PHONY: go-mod-download
 go-mod-download:
 	go mod download
 CONTROLLER_GEN = $(shell pwd)/bin/controller-gen
 .PHONY: controller-gen
 controller-gen:
 	@GOBIN=$(shell pwd)/bin GO111MODULE=on go install sigs.k8s.io/controller-tools/cmd/controller-gen@v0.16.5
 # Run this if you update any existing controller APIs.
-# 1. Genereate deepcopy, clientset, listers, informers for the APIs (hack/update-codegen.sh)
+# 1. Generate deepcopy, clientset, listers, informers for the APIs (hack/update-codegen.sh)
 # 2. Generate open-api for the APIs (hack/update-openapigen)
 # 3. Generate Python SDK for Katib (hack/gen-python-sdk/gen-sdk.sh)
 # 4. Generate gRPC manager APIs (pkg/apis/manager/v1beta1/build.sh and pkg/apis/manager/health/build.sh)
-generate:
+# 5. Generate Go mock codes
-ifndef GOPATH
+generate: go-mod-download controller-gen
-	$(error GOPATH not defined, please define GOPATH. Run "go help gopath" to learn more about GOPATH)
+ifndef HAS_MOCKGEN
 	go install go.uber.org/mock/mockgen@$(MOCKGEN_VERSION)
 	$(info "mockgen has been installed")
 endif
 	go generate ./pkg/... ./cmd/...
 	hack/gen-python-sdk/gen-sdk.sh
-	cd ./pkg/apis/manager/v1beta1 && ./build.sh
+	hack/update-proto.sh
-	cd ./pkg/apis/manager/health && ./build.sh
+	hack/update-mockgen.sh
 # Build images for the Katib v1beta1 components.
 build: generate
-ifeq ($(and $(REGISTRY),$(TAG)),)
+ifeq ($(and $(REGISTRY),$(TAG),$(CPU_ARCH)),)
-	$(error REGISTRY and TAG must be set. Usage: make build REGISTRY=<registry> TAG=<tag>)
+	$(error REGISTRY and TAG must be set. Usage: make build REGISTRY=<registry> TAG=<tag> CPU_ARCH=<cpu-architecture>)
 endif
-	bash scripts/v1beta1/build.sh $(REGISTRY) $(TAG)
+	bash scripts/v1beta1/build.sh $(REGISTRY) $(TAG) $(CPU_ARCH)
 # Build and push Katib images from the latest master commit.
 push-latest: generate
-	bash scripts/v1beta1/build.sh $(KATIB_REGISTRY) latest
+	bash scripts/v1beta1/build.sh $(KATIB_REGISTRY) latest $(CPU_ARCH)
-	bash scripts/v1beta1/build.sh $(KATIB_REGISTRY) $(COMMIT)
+	bash scripts/v1beta1/build.sh $(KATIB_REGISTRY) $(COMMIT) $(CPU_ARCH)
 	bash scripts/v1beta1/push.sh $(KATIB_REGISTRY) latest
 	bash scripts/v1beta1/push.sh $(KATIB_REGISTRY) $(COMMIT)
 # Build and push Katib images for the given tag.
-push-tag: generate
+push-tag:
 ifeq ($(TAG),)
 	$(error TAG must be set. Usage: make push-tag TAG=<release-tag>)
 endif
-	bash scripts/v1beta1/build.sh $(KATIB_REGISTRY) $(TAG)
+	bash scripts/v1beta1/build.sh $(KATIB_REGISTRY) $(TAG) $(CPU_ARCH)
 	bash scripts/v1beta1/build.sh $(KATIB_REGISTRY) $(COMMIT)
 	bash scripts/v1beta1/push.sh $(KATIB_REGISTRY) $(TAG)
 	bash scripts/v1beta1/push.sh $(KATIB_REGISTRY) $(COMMIT)
 # Release a new version of Katib.
 release:
@ -78,6 +133,15 @@ ifeq ($(and $(BRANCH),$(TAG)),)
 endif
 	bash scripts/v1beta1/release.sh $(BRANCH) $(TAG)
 # Update all Katib images.
 update-images:
 ifeq ($(and $(OLD_PREFIX),$(NEW_PREFIX),$(TAG)),)
 	$(error OLD_PREFIX, NEW_PREFIX, and TAG must be set. \
 	Usage: make update-images OLD_PREFIX=<old-prefix> NEW_PREFIX=<new-prefix> TAG=<tag> \
 	For more information, check this file: scripts/v1beta1/update-images.sh)
 endif
 	bash scripts/v1beta1/update-images.sh $(OLD_PREFIX) $(NEW_PREFIX) $(TAG)
 # Prettier UI format check for Katib v1beta1.
 prettier-check:
 	npm run format:check --prefix pkg/ui/v1beta1/frontend
@ -85,3 +149,45 @@ prettier-check:
 # Update boilerplate for the source code.
 update-boilerplate:
 	./hack/boilerplate/update-boilerplate.sh
 prepare-pytest:
 	pip install --prefer-binary -r test/unit/v1beta1/requirements.txt
 	pip install --prefer-binary -r cmd/suggestion/hyperopt/v1beta1/requirements.txt
 	pip install --prefer-binary -r cmd/suggestion/optuna/v1beta1/requirements.txt
 	pip install --prefer-binary -r cmd/suggestion/hyperband/v1beta1/requirements.txt
 	pip install --prefer-binary -r cmd/suggestion/nas/enas/v1beta1/requirements.txt
 	pip install --prefer-binary -r cmd/suggestion/nas/darts/v1beta1/requirements.txt
 	pip install --prefer-binary -r cmd/suggestion/pbt/v1beta1/requirements.txt
 	pip install --prefer-binary -r cmd/earlystopping/medianstop/v1beta1/requirements.txt
 	pip install --prefer-binary -r cmd/metricscollector/v1beta1/tfevent-metricscollector/requirements.txt
 	# `TypeIs` was introduced in typing-extensions 4.10.0, and torch 2.6.0 requires typing-extensions>=4.10.0.
 	# REF: https://github.com/kubeflow/katib/pull/2504
 	# TODO (tenzen-y): Once we upgrade the libraries that depend on typing-extensions==4.5.0, we can remove this line.
 	pip install typing-extensions==4.10.0
 prepare-pytest-testdata:
 ifeq ("$(wildcard $(TEST_TENSORFLOW_EVENT_FILE_PATH))", "")
 	python examples/v1beta1/trial-images/tf-mnist-with-summaries/mnist.py --epochs 5 --batch-size 200 --log-path $(TEST_TENSORFLOW_EVENT_FILE_PATH)
 endif
 # TODO(Electronic-Waste): Remove the import rewrite when protobuf supports `python_package` option.
 # REF: https://github.com/protocolbuffers/protobuf/issues/7061
 pytest: prepare-pytest prepare-pytest-testdata
 	pytest ./test/unit/v1beta1/suggestion --ignore=./test/unit/v1beta1/suggestion/test_skopt_service.py
 	pytest ./test/unit/v1beta1/earlystopping
 	pytest ./test/unit/v1beta1/metricscollector
 	cp ./pkg/apis/manager/v1beta1/python/api_pb2.py ./sdk/python/v1beta1/kubeflow/katib/katib_api_pb2.py
 	cp ./pkg/apis/manager/v1beta1/python/api_pb2_grpc.py ./sdk/python/v1beta1/kubeflow/katib/katib_api_pb2_grpc.py
 	sed -i "s/api_pb2/kubeflow\.katib\.katib_api_pb2/g" ./sdk/python/v1beta1/kubeflow/katib/katib_api_pb2_grpc.py
 	pytest ./sdk/python/v1beta1/kubeflow/katib
 	rm ./sdk/python/v1beta1/kubeflow/katib/katib_api_pb2.py ./sdk/python/v1beta1/kubeflow/katib/katib_api_pb2_grpc.py
 # The skopt service doesn't work appropriately with Python 3.11.
 # So, we need to run the test with Python 3.9.
 # TODO (tenzen-y): Once we stop supporting skopt, we can remove this test.
 # REF: https://github.com/kubeflow/katib/issues/2280
 pytest-skopt:
 	pip install six
 	pip install --prefer-binary -r test/unit/v1beta1/requirements.txt
 	pip install --prefer-binary -r cmd/suggestion/skopt/v1beta1/requirements.txt
 	pytest ./test/unit/v1beta1/suggestion/test_skopt_service.py
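The `pytest` target above copies the generated gRPC stubs into the SDK package and rewrites their import with `sed` before running the SDK tests. A minimal Python sketch of the same literal substitution follows. It assumes the generated stub imports the message module as `import api_pb2 as api__pb2`; the sample line is illustrative and not copied from the repository.

```python
def rewrite_stub_imports(source: str) -> str:
    """Apply the same literal substitution as the Makefile's sed command."""
    # Every occurrence of "api_pb2" is replaced; the alias "api__pb2"
    # (double underscore) does not contain that substring and is left alone.
    return source.replace("api_pb2", "kubeflow.katib.katib_api_pb2")


# Hypothetical line from a generated *_pb2_grpc.py file.
sample = "import api_pb2 as api__pb2\n"
rewritten = rewrite_stub_imports(sample)
print(rewritten, end="")
```

Because the substitution is global, any other mention of `api_pb2` in the copied file would be rewritten as well, which is why the Makefile applies it only to the temporary copy under `sdk/python/v1beta1/kubeflow/katib`.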
--- a/6
+++ b/6
@ -1,8 +1,10 @@
 approvers:
  - andreyvelich
  - gaocegege
  - hougangliu
  - johnugeorge
 reviewers:
  - anencore94
  - c-bata
-  - sperlingxx
+  - Electronic-Waste
 emeritus_approvers:
  - tenzen-y
--- a/2
+++ b/2
@ -1,3 +1,3 @@
-version: "1"
+version: "3"
 domain: kubeflow.org
 repo: github.com/kubeflow/katib
--- a/README.md
+++ b/README.md
@ -1,15 +1,18 @@
 # Kubeflow Katib
 [![Build Status](https://github.com/kubeflow/katib/actions/workflows/test-go.yaml/badge.svg?branch=master)](https://github.com/kubeflow/katib/actions/workflows/test-go.yaml?branch=master)
 [![Coverage Status](https://coveralls.io/repos/github/kubeflow/katib/badge.svg?branch=master)](https://coveralls.io/github/kubeflow/katib?branch=master)
 [![Go Report Card](https://goreportcard.com/badge/github.com/kubeflow/katib)](https://goreportcard.com/report/github.com/kubeflow/katib)
 [![Releases](https://img.shields.io/github/release-pre/kubeflow/katib.svg?sort=semver)](https://github.com/kubeflow/katib/releases)
 [![Slack Status](https://img.shields.io/badge/slack-join_chat-white.svg?logo=slack&style=social)](https://www.kubeflow.org/docs/about/community/#kubeflow-slack-channels)
 [![OpenSSF Best Practices](https://www.bestpractices.dev/projects/9941/badge)](https://www.bestpractices.dev/projects/9941)
 <h1 align="center">
    <img src="./docs/images/logo-title.png" alt="logo" width="200">
  <br>
 </h1>
-[![Build Status](https://travis-ci.com/kubeflow/katib.svg?branch=master)](https://travis-ci.com/kubeflow/katib)
+Kubeflow Katib is a Kubernetes-native project for automated machine learning (AutoML).
 [![Coverage Status](https://coveralls.io/repos/github/kubeflow/katib/badge.svg?branch=master)](https://coveralls.io/github/kubeflow/katib?branch=master)
 [![Go Report Card](https://goreportcard.com/badge/github.com/kubeflow/katib)](https://goreportcard.com/report/github.com/kubeflow/katib)
 [![Releases](https://img.shields.io/github/release-pre/kubeflow/katib.svg?sort=semver)](https://github.com/kubeflow/katib/releases)
 [![Slack Status](https://img.shields.io/badge/slack-join_chat-white.svg?logo=slack&style=social)](https://kubeflow.slack.com/archives/C018PMV53NW)
 Katib is a Kubernetes-native project for automated machine learning (AutoML).
 Katib supports
 [Hyperparameter Tuning](https://en.wikipedia.org/wiki/Hyperparameter_optimization),
 [Early Stopping](https://en.wikipedia.org/wiki/Early_stopping) and
@ -17,309 +20,187 @@ Katib supports
 Katib is the project which is agnostic to machine learning (ML) frameworks.
 It can tune hyperparameters of applications written in any language of the
-users’ choice and natively supports many ML frameworks, such as TensorFlow,
+users’ choice and natively supports many ML frameworks, such as
-MXNet, PyTorch, XGBoost, and others.
+[TensorFlow](https://www.tensorflow.org/), [PyTorch](https://pytorch.org/), [XGBoost](https://xgboost.readthedocs.io/en/latest/), and others.
-## Getting Started
+Katib can perform training jobs using any Kubernetes
-
+[Custom Resources](https://www.kubeflow.org/docs/components/katib/trial-template/)
-Follow the
+with out of the box support for [Kubeflow Training Operator](https://github.com/kubeflow/training-operator),
-[getting-started guide](https://www.kubeflow.org/docs/components/katib/hyperparameter/)
+[Argo Workflows](https://github.com/argoproj/argo-workflows), [Tekton Pipelines](https://github.com/tektoncd/pipeline)
-on the Kubeflow website.
+and many more.
 ## Name
 Katib stands for `secretary` in Arabic.
-## Concepts in Katib
+## Search Algorithms
-For a detailed description of the concepts in Katib and AutoML, check the
+Katib supports several search algorithms. Follow the
-[Kubeflow documentation](https://www.kubeflow.org/docs/components/katib/overview/).
+[Kubeflow documentation](https://www.kubeflow.org/docs/components/katib/user-guides/hp-tuning/configure-algorithm/#hp-tuning-algorithms)
 to learn more about each algorithm, and check
 [this guide](https://www.kubeflow.org/docs/components/katib/user-guides/hp-tuning/configure-algorithm/#use-custom-algorithm-in-katib)
 to implement your custom algorithm.
-Katib has the concepts of `Experiment`, `Suggestion`, `Trial` and `Worker Job`.
+<table>
  <tbody>
    <tr align="center">
      <td>
        <b>Hyperparameter Tuning</b>
      </td>
      <td>
        <b>Neural Architecture Search</b>
      </td>
      <td>
        <b>Early Stopping</b>
      </td>
    </tr>
    <tr align="center">
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#random-search">Random Search</a>
      </td>
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#neural-architecture-search-based-on-enas">ENAS</a>
      </td>
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/early-stopping/#median-stopping-rule">Median Stop</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#grid-search">Grid Search</a>
      </td>
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#differentiable-architecture-search-darts">DARTS</a>
      </td>
      <td>
      </td>
    </tr>
    <tr align="center">
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#bayesian-optimization">Bayesian Optimization</a>
      </td>
      <td>
      </td>
      <td>
      </td>
    </tr>
    <tr align="center">
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#tree-of-parzen-estimators-tpe">TPE</a>
      </td>
      <td>
      </td>
      <td>
      </td>
    </tr>
    <tr align="center">
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#multivariate-tpe">Multivariate TPE</a>
      </td>
      <td>
      </td>
      <td>
      </td>
    </tr>
    <tr align="center">
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#covariance-matrix-adaptation-evolution-strategy-cma-es">CMA-ES</a>
      </td>
      <td>
      </td>
      <td>
      </td>
    </tr>
    <tr align="center">
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#sobols-quasirandom-sequence">Sobol's Quasirandom Sequence</a>
      </td>
      <td>
      </td>
      <td>
      </td>
    </tr>
    <tr align="center">
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#hyperband">HyperBand</a>
      </td>
      <td>
      </td>
      <td>
      </td>
    </tr>
    <tr align="center">
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#pbt">Population Based Training</a>
      </td>
      <td>
      </td>
      <td>
      </td>
    </tr>
  </tbody>
 </table>
-### Experiment
+To perform the above algorithms Katib supports the following frameworks:
-An `Experiment` represents a single optimization run over a feasible space.
+- [Goptuna](https://github.com/c-bata/goptuna)
-Each `Experiment` contains a configuration:
+- [Hyperopt](https://github.com/hyperopt/hyperopt)
 - [Optuna](https://github.com/optuna/optuna)
 - [Scikit Optimize](https://github.com/scikit-optimize/scikit-optimize)
-1. **Objective**: What you want to optimize.
+## Prerequisites
 2. **Search Space**: Constraints for configurations describing the feasible space.
 3. **Search Algorithm**: How to find the optimal configurations.
-Katib `Experiment` is defined as a CRD. Check the detailed guide to
+Please check [the official Kubeflow documentation](https://www.kubeflow.org/docs/components/katib/installation/#prerequisites)
-[configuring and running a Katib `Experiment`](https://kubeflow.org/docs/components/katib/experiment/)
+for prerequisites to install Katib.
 in the Kubeflow docs.
 ### Suggestion
 A `Suggestion` is a set of hyperparameter values that the hyperparameter tuning
 process has proposed. Katib creates a `Trial` to evaluate
 the suggested set of values.
 Katib `Suggestion` is defined as a CRD.
 ### Trial
 A `Trial` is one iteration of the hyperparameter tuning process.
 A `Trial` corresponds to one worker job instance with a list of parameter
 assignments. The list of parameter assignments corresponds to a `Suggestion`.
 Each `Experiment` runs several `Trials`. The `Experiment` runs the `Trials` until
 it reaches either the objective or the configured maximum number of `Trials`.
 Katib `Trial` is defined as a CRD.
 ### Worker Job
 The `Worker Job` is the process that runs to evaluate a `Trial` and calculate
 its objective value.
 The `Worker Job` can be any type of Kubernetes resource or
 [Kubernetes CRD](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/).
 Follow the [`Trial` template guide](https://www.kubeflow.org/docs/components/katib/trial-template/#custom-resource)
 to support your own Kubernetes resource in Katib.
 Katib has these CRD examples in upstream:
 - [Kubernetes `Job`](https://kubernetes.io/docs/concepts/workloads/controllers/job/)
 - [Kubeflow `TFJob`](https://www.kubeflow.org/docs/components/training/tftraining/)
 - [Kubeflow `PyTorchJob`](https://www.kubeflow.org/docs/components/training/pytorch/)
 - [Kubeflow `MPIJob`](https://www.kubeflow.org/docs/components/training/mpi/)
 - [Kubeflow `XGBoostJob`](https://github.com/kubeflow/xgboost-operator)
 - [Tekton `Pipelines`](./examples/v1beta1/tekton)
 - [Argo `Workflows`](./examples/v1beta1/argo)
 Thus, Katib supports multiple frameworks with the help of different job kinds.
 ### Search Algorithms
 Katib currently supports several search algorithms. Follow the
 [Kubeflow documentation](https://www.kubeflow.org/docs/components/katib/experiment/#search-algorithms-in-detail)
 to know more about each algorithm.
 #### Hyperparameter Tuning
 - [Random Search](https://en.wikipedia.org/wiki/Hyperparameter_optimization#Random_search)
 - [Tree of Parzen Estimators (TPE)](https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf)
 - [Multivariate TPE](https://tech.preferred.jp/en/blog/multivariate-tpe-makes-optuna-even-more-powerful/)
 - [Grid Search](https://en.wikipedia.org/wiki/Hyperparameter_optimization#Grid_search)
 - [Hyperband](https://arxiv.org/pdf/1603.06560.pdf)
 - [Bayesian Optimization](https://arxiv.org/pdf/1012.2599.pdf)
 - [Covariance Matrix Adaptation Evolution Strategy (CMA-ES)](https://arxiv.org/abs/1604.00772)
 - [Sobol's Quasirandom Sequence](https://dl.acm.org/doi/10.1145/641876.641879)
 #### Neural Architecture Search
 - [Efficient Neural Architecture Search (ENAS)](https://github.com/kubeflow/katib/tree/master/pkg/suggestion/v1beta1/nas/enas)
 - [Differentiable Architecture Search (DARTS)](https://github.com/kubeflow/katib/tree/master/pkg/suggestion/v1beta1/nas/darts)
 ## Components in Katib
 Katib consists of several components as shown below. Each component is running
 on Kubernetes as a deployment. Each component communicates with others via GRPC
 and the API is defined at `pkg/apis/manager/v1beta1/api.proto`.
 - Katib main components:
  - `katib-db-manager` - the GRPC API server of Katib which is the DB Interface.
  - `katib-mysql` - the data storage backend of Katib using mysql.
  - `katib-ui` - the user interface of Katib.
  - `katib-controller` - the controller for the Katib CRDs in Kubernetes.
 ## Web UI
 Katib provides a Web UI.
 During 1.3 we've worked on a new iteration of the UI, which is rewritten in
 Angular and is utilizing the common code of the other Kubeflow [dashboards](https://github.com/kubeflow/kubeflow/tree/master/components/crud-web-apps).
 The users are currently able to list, delete and create Experiments in their
 cluster via this new UI as well as inspect the owned Trials. Two important
 missing features are the ability to edit the Trial template ConfigMaps
 and view Neural Architecture Search models. Check [this Project](https://github.com/kubeflow/katib/projects/1)
 to monitor the current progress.
 ![katibui](./docs/images/katib-ui.png)
 To use the old Katib UI you can update the Katib image `newName` with the previous
 image tag `docker.io/kubeflowkatib/katib-ui:v0.11.1` in the [Kustomize](./manifests/v1beta1/installs/katib-standalone/kustomization.yaml#L29)
 manifests.
 ## GRPC API documentation
 Check the [Katib v1beta1 API reference docs](https://www.kubeflow.org/docs/reference/katib/v1beta1/katib/).
 ## Installation
-For standard installation of Katib with support for all job operators,
+Please follow [the Kubeflow Katib guide](https://www.kubeflow.org/docs/components/katib/installation/#installing-katib)
-install Kubeflow.
+for the detailed instructions on how to install Katib.
 Follow the documentation:
- [Kubeflow installation guide](https://www.kubeflow.org/docs/started/getting-started/)
+### Installing the Control Plane
 - [Kubeflow Katib guides](https://www.kubeflow.org/docs/components/katib/).
-If you install Katib with other Kubeflow components,
+Run the following command to install the latest stable release of Katib control plane:
 you can't submit Katib jobs in Kubeflow namespace. Check the
 [Kubeflow documentation](https://www.kubeflow.org/docs/components/katib/hyperparameter/#example-using-random-algorithm)
 to know more about it.
 Alternatively, if you want to install Katib manually with TF and PyTorch
 operators support, follow these steps:
 Create Kubeflow namespace:
 ```
-kubectl create namespace kubeflow
+kubectl apply -k "github.com/kubeflow/katib.git/manifests/v1beta1/installs/katib-standalone?ref=v0.17.0"
 ```
-Clone Kubeflow manifest repository:
+Run the following command to install the latest changes of Katib control plane:
 ```
-git clone -b v1.2-branch git@github.com:kubeflow/manifests.git
+kubectl apply -k "github.com/kubeflow/katib.git/manifests/v1beta1/installs/katib-standalone?ref=master"
 Set `MANIFESTS_DIR` to the cloned folder.
 export MANIFESTS_DIR=<cloned-folder>
 ```
-### TF operator
+For the Katib Experiments check the [complete examples list](./examples/v1beta1).
-For installing TF operator, run the following:
+### Installing the Python SDK
-```
+Katib implements [a Python SDK](https://pypi.org/project/kubeflow-katib/) to simplify creation of
-cd "${MANIFESTS_DIR}/tf-training/tf-job-crds/base"
+hyperparameter tuning jobs for Data Scientists.
-kustomize build . | kubectl apply -f -
+
-cd "${MANIFESTS_DIR}/tf-training/tf-job-operator/base"
+Run the following command to install the latest stable release of Katib SDK:
-kustomize build . | kubectl apply -f -
+
 ```sh
 pip install -U kubeflow-katib
 ```
-### PyTorch operator
+## Getting Started
-For installing PyTorch operator, run the following:
+Please refer to [the getting started guide](https://www.kubeflow.org/docs/components/katib/getting-started/#getting-started-with-katib-python-sdk)
-
+to quickly create your first hyperparameter tuning Experiment using the Python SDK.
 ```
 cd "${MANIFESTS_DIR}/pytorch-job/pytorch-job-crds/base"
 kustomize build . | kubectl apply -f -
 cd "${MANIFESTS_DIR}/pytorch-job/pytorch-operator/base/"
 kustomize build . | kubectl apply -f -
 ```
 ### Katib
 Note that your [kustomize](https://kustomize.io/) version should be >= 3.2.
 To install Katib run:
 ```
 git clone git@github.com:kubeflow/katib.git
 make deploy
 ```
 Check if all components are running successfully:
 ```
 kubectl get pods -n kubeflow
 ```
 Expected output:
 ```
 NAME                                READY   STATUS    RESTARTS   AGE
 katib-controller-858d6cc48c-df9jc   1/1     Running   1          20m
 katib-db-manager-7966fbdf9b-w2tn8   1/1     Running   0          20m
 katib-mysql-7f8bc6956f-898f9        1/1     Running   0          20m
 katib-ui-7cf9f967bf-nm72p           1/1     Running   0          20m
 pytorch-operator-55f966b548-9gq9v   1/1     Running   0          20m
 tf-job-operator-796b4747d8-4fh82    1/1     Running   0          21m
 ```
 ### Running examples
 After deploying everything, you can run examples to verify the installation.
 This is an example for TF operator:
 ```
 kubectl create -f https://raw.githubusercontent.com/kubeflow/katib/master/examples/v1beta1/tfjob-example.yaml
 ```
 This is an example for PyTorch operator:
 ```
 kubectl create -f https://raw.githubusercontent.com/kubeflow/katib/master/examples/v1beta1/pytorchjob-example.yaml
 ```
 Check the
 [Kubeflow documentation](https://www.kubeflow.org/docs/components/katib/hyperparameter/#example-using-random-algorithm)
 how to monitor your `Experiment` status.
 You can view your results in Katib UI.
 If you used standard installation, access the Katib UI via Kubeflow dashboard.
 Otherwise, port-forward the `katib-ui`:
 ```
 kubectl -n kubeflow port-forward svc/katib-ui 8080:80
 ```
 You can access the Katib UI using this URL: `http://localhost:8080/katib/`.
 ### Katib SDK
 Katib supports Python SDK:
 - Check the [Katib v1beta1 SDK documentation](https://github.com/kubeflow/katib/tree/master/sdk/python/v1beta1).
 Run `make generate` to update Katib SDK.
 ### Cleanups
 To delete installed TF and PyTorch operator run `kubectl delete -f`
 on the respective folders.
 To delete Katib run `make undeploy`.
 ## Quick Start
 Please follow the
 [Kubeflow documentation](https://www.kubeflow.org/docs/components/katib/hyperparameter/#examples)
 to submit your first Katib experiment.
 ## Community
-We are always growing our community and invite new users and AutoML enthusiasts
+The following links provide information on how to get involved in the community:
 to contribute to the Katib project. The following links provide information
 about getting involved in the community:
- If you use Katib, please update [the adopters list](ADOPTERS.md).
+- Attend [the bi-weekly AutoML and Training Working Group](https://bit.ly/2PWVCkV)
-
+  community meeting.
- Subscribe
+- Join our [`#kubeflow-katib`](https://www.kubeflow.org/docs/about/community/#kubeflow-slack-channels)
-  [to the calendar](https://calendar.google.com/calendar/u/0/r?cid=ZDQ5bnNpZWZzbmZna2Y5MW8wdThoMmpoazRAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ)
+  Slack channel.
-  to attend the AutoML WG community meeting.
+- Check out [who is using Katib](ADOPTERS.md) and [presentations about Katib project](docs/presentations.md).
 - Check
  [the AutoML WG meeting notes](https://docs.google.com/document/d/1MChKfzrKAeFRtYqypFbMXL6ZIc_OgijjkvbqmwRV-64/edit).
 - Join
  [the AutoML WG Slack channel](https://kubeflow.slack.com/archives/C018PMV53NW).
 - Learn more about Katib in
  [the presentations and demos list](./docs/presentations.md).
 ### Blog posts
 - [Kubeflow Katib: Scalable, Portable and Cloud Native System for AutoML](https://blog.kubeflow.org/katib/)
  (by Andrey Velichkevich)
 ### Events
 - [AutoML and Training WG Summit. 16th of July 2021](https://docs.google.com/document/d/1vGluSPHmAqEr8k9Dmm82RcQ-MVnqbYYSfnjMGB-aPuo/edit?usp=sharing)
 ## Contributing
-Please feel free to test the system!
+Please refer to the [CONTRIBUTING guide](CONTRIBUTING.md).
 [developer-guide.md](./docs/developer-guide.md) is a good starting point
 for developers.
 ## Citation
--- a/ROADMAP.md
+++ b/ROADMAP.md
@ -1,3 +1,45 @@
 # Katib 2022/2023 Roadmap
 ## AutoML Features
 - Support advanced HyperParameter tuning algorithms:
  - Population Based Training (PBT) - [#1382](https://github.com/kubeflow/katib/issues/1382)
  - Tree of Parzen Estimators (TPE)
  - Multivariate TPE
  - Sobol’s Quasirandom Sequence
  - Asynchronous Successive Halving - [ASHA](https://arxiv.org/pdf/1810.05934.pdf)
 - Support multi-objective optimization - [#1549](https://github.com/kubeflow/katib/issues/1549)
 - Support various HP distributions (log-uniform, uniform, normal) - [#1207](https://github.com/kubeflow/katib/issues/1207)
 - Support Auto Model Compression - [#460](https://github.com/kubeflow/katib/issues/460)
 - Support Auto Feature Engineering - [#475](https://github.com/kubeflow/katib/issues/475)
 - Improve Neural Architecture Search design
 ## Backend and API Enhancements
 - Conformance tests for Katib - [#2044](https://github.com/kubeflow/katib/issues/2044)
 - Support push-based metrics collection in Katib - [#577](https://github.com/kubeflow/katib/issues/577)
 - Support PostgreSQL as a Katib DB - [#915](https://github.com/kubeflow/katib/issues/915)
 - Improve Katib scalability - [#1847](https://github.com/kubeflow/katib/issues/1847)
 - Promote Katib APIs to the `v1` version
 - Support multiple CRD versions (`v1beta1`, `v1`) with conversion webhook
 ## Improve Katib User Experience
 - Simplify Katib Experiment creation with Katib SDK - [#1951](https://github.com/kubeflow/katib/pull/1951)
 - Fully migrate to a new Katib UI - [Project 1](https://github.com/kubeflow/katib/projects/1)
 - Expose Trial logs in Katib UI - [#971](https://github.com/kubeflow/katib/issues/971)
 - Enhance Katib UI visualization metrics for AutoML Experiments
 - Improve Katib Config UX - [#2150](https://github.com/kubeflow/katib/issues/2150)
 ## Integration with Kubeflow Components
 - Kubeflow Pipeline as a Katib Trial target - [#1914](https://github.com/kubeflow/katib/issues/1914)
 - Improve data passing when Katib Experiment is part of Kubeflow Pipeline - [#1846](https://github.com/kubeflow/katib/issues/1846)
 # History
 # Katib 2021 Roadmap
 ## New Features
@ -24,8 +66,6 @@
 - Support multiple CRD version with conversion webhook
 - MLMD integration with Katib Experiments
 # History
 # Katib 2020 Roadmap
 ## New Features
--- a/SECURITY.md
+++ b/SECURITY.md
@ -0,0 +1,64 @@
 # Security Policy
 ## Supported Versions
 Kubeflow Katib versions are expressed as `vX.Y.Z`, where X is the major version,
 Y is the minor version, and Z is the patch version, following the
 [Semantic Versioning](https://semver.org/) terminology.
 The Kubeflow Katib project maintains release branches for the most recent two minor releases.
 Applicable fixes, including security fixes, may be backported to those two release branches,
 depending on severity and feasibility.
 Users are encouraged to stay updated with the latest releases to benefit from security patches and
 improvements.
 ## Reporting a Vulnerability
 We're extremely grateful for security researchers and users that report vulnerabilities to the
 Kubeflow Open Source Community. All reports are thoroughly investigated by Kubeflow projects owners.
 You can use the following ways to report security vulnerabilities privately:
 - Using the Kubeflow Katib repository [GitHub Security Advisory](https://github.com/kubeflow/katib/security/advisories/new).
 - Using our private Kubeflow Steering Committee mailing list: ksc@kubeflow.org.
 Please provide detailed information to help us understand and address the issue promptly.
 ## Disclosure Process
 **Acknowledgment**: We will acknowledge receipt of your report within 10 business days.
 **Assessment**: The Kubeflow projects owners will investigate the reported issue to determine its
 validity and severity.
 **Resolution**: If the issue is confirmed, we will work on a fix and prepare a release.
 **Notification**: Once a fix is available, we will notify the reporter and coordinate a public
 disclosure.
 **Public Disclosure**: Details of the vulnerability and the fix will be published in the project's
 release notes and communicated through appropriate channels.
 ## Prevention Mechanisms
 Kubeflow Katib employs several measures to prevent security issues:
 **Code Reviews**: All code changes are reviewed by maintainers to ensure code quality and security.
 **Dependency Management**: Regular updates and monitoring of dependencies (e.g. Dependabot) to
 address known vulnerabilities.
 **Continuous Integration**: Automated testing and security checks are integrated into the CI/CD pipeline.
 **Image Scanning**: Container images are scanned for vulnerabilities.
 ## Communication Channels
 For general questions, please use the following resources:
 - Kubeflow [Slack channels](https://www.kubeflow.org/docs/about/community/#kubeflow-slack-channels).
 - Kubeflow discuss [mailing list](https://www.kubeflow.org/docs/about/community/#kubeflow-mailing-list).
 Please **do not report** security vulnerabilities through public channels.
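The "Supported Versions" policy above says fixes may be backported to the two most recent minor release branches. A minimal sketch of how that set could be derived from a list of tags is shown here. The policy does not define the selection procedure, so the rule used (highest two `major.minor` pairs among plain `vX.Y.Z` tags) and the tag list are assumptions for illustration.

```python
import re


def supported_minor_series(tags, keep=2):
    """Return the `keep` most recent (major, minor) pairs found in `tags`."""
    pattern = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")
    series = set()
    for tag in tags:
        match = pattern.match(tag)
        if match:  # skip pre-releases and anything not shaped like vX.Y.Z
            series.add((int(match.group(1)), int(match.group(2))))
    return sorted(series, reverse=True)[:keep]


# Hypothetical tag list, for illustration only.
tags = ["v0.15.0", "v0.16.0", "v0.16.1", "v0.17.0", "v0.17.0-rc.0"]
print(supported_minor_series(tags))
```

Comparing integer tuples rather than strings keeps `v0.9.0` ordered before `v0.10.0`, which plain string sorting would get wrong.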
--- a/cmd/cert-generator/v1beta1/Dockerfile
+++ b/cmd/cert-generator/v1beta1/Dockerfile
@ -1,13 +0,0 @@
 FROM alpine:3.12.4
 ARG KUBECTL_VERSION="v1.19.3"
 RUN apk add --update openssl
 RUN wget https://storage.googleapis.com/kubernetes-release/release/$KUBECTL_VERSION/bin/linux/amd64/kubectl \
  && chmod +x ./kubectl && mv ./kubectl /usr/local/bin/kubectl
 COPY ./hack/cert-generator.sh /app/cert-generator.sh
 RUN chmod +x /app/cert-generator.sh
 WORKDIR /app
 ENTRYPOINT ["sh", "./cert-generator.sh"]
--- a/cmd/db-manager/v1beta1/Dockerfile
+++ b/cmd/db-manager/v1beta1/Dockerfile
@ -1,6 +1,8 @@
 # Build the Katib DB manager.
 FROM golang:alpine AS build-env
 ARG TARGETARCH
 WORKDIR /go/src/github.com/kubeflow/katib
 # Download packages.
@ -13,29 +15,10 @@ COPY cmd/ cmd/
 COPY pkg/ pkg/
 # Build the binary.
-RUN if [ "$(uname -m)" = "ppc64le" ]; then \
-    CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le go build -a -o katib-db-manager ./cmd/db-manager/v1beta1; \
-    elif [ "$(uname -m)" = "aarch64" ]; then \
-    CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -a -o katib-db-manager ./cmd/db-manager/v1beta1; \
-    else \
-    CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -a -o katib-db-manager ./cmd/db-manager/v1beta1; \
-    fi
+RUN CGO_ENABLED=0 GOOS=linux GOARCH="${TARGETARCH}" go build -a -o katib-db-manager ./cmd/db-manager/v1beta1
 # Add GRPC health probe.
 RUN GRPC_HEALTH_PROBE_VERSION=v0.3.1 && \
    if [ "$(uname -m)" = "ppc64le" ]; then \
    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-ppc64le; \
    elif [ "$(uname -m)" = "aarch64" ]; then \
    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-arm64; \
    else \
    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64; \
    fi && \
    chmod +x /bin/grpc_health_probe
 # Copy the db-manager into a thin image.
-FROM alpine:3.7
+FROM alpine:3.15
 WORKDIR /app
 COPY --from=build-env /bin/grpc_health_probe /bin/
 COPY --from=build-env /go/src/github.com/kubeflow/katib/katib-db-manager /app/
 ENTRYPOINT ["./katib-db-manager"]
 CMD ["-w", "kubernetes"]
--- a/cmd/db-manager/v1beta1/main.go
+++ b/cmd/db-manager/v1beta1/main.go
@ -1,5 +1,5 @@
 /*
-Copyright 2021 The Kubeflow Authors.
+Copyright 2022 The Kubeflow Authors.
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@ -22,19 +22,21 @@ import (
 	"fmt"
 	"net"
 	"os"
 	"time"
 	health_pb "github.com/kubeflow/katib/pkg/apis/manager/health"
 	api_pb "github.com/kubeflow/katib/pkg/apis/manager/v1beta1"
 	db "github.com/kubeflow/katib/pkg/db/v1beta1"
 	"github.com/kubeflow/katib/pkg/db/v1beta1/common"
-	"k8s.io/klog"
+	"k8s.io/klog/v2"
 	"google.golang.org/grpc"
 	"google.golang.org/grpc/reflection"
 )
 const (
-	port = "0.0.0.0:6789"
+	defaultListenAddress  = "0.0.0.0:6789"
 	defaultConnectTimeout = time.Second * 60
 )
 var dbIf common.KatibDBInterface
@ -87,25 +89,30 @@ func (s *server) Check(ctx context.Context, in *health_pb.HealthCheckRequest) (*
 }
 func main() {
 	var connectTimeout time.Duration
 	var listenAddress string
 	flag.DurationVar(&connectTimeout, "connect-timeout", defaultConnectTimeout, "Timeout before calling error during database connection. (e.g. 120s)")
 	flag.StringVar(&listenAddress, "listen-address", defaultListenAddress, "The network interface or IP address to receive incoming connections. (e.g. 0.0.0.0:6789)")
 	flag.Parse()
 	var err error
 	dbNameEnvName := common.DBNameEnvName
 	dbName := os.Getenv(dbNameEnvName)
 	if dbName == "" {
 		klog.Fatal("DB_NAME env is not set. Exiting")
 	}
-	dbIf, err = db.NewKatibDBInterface(dbName)
+	dbIf, err = db.NewKatibDBInterface(dbName, connectTimeout)
 	if err != nil {
 		klog.Fatalf("Failed to open db connection: %v", err)
 	}
 	dbIf.DBInit()
-	listener, err := net.Listen("tcp", port)
+	listener, err := net.Listen("tcp", listenAddress)
 	if err != nil {
 		klog.Fatalf("Failed to listen: %v", err)
 	}
 	size := 1<<31 - 1
-	klog.Infof("Start Katib manager: %s", port)
+	klog.Infof("Start Katib manager: %s", listenAddress)
 	s := grpc.NewServer(grpc.MaxRecvMsgSize(size), grpc.MaxSendMsgSize(size))
 	api_pb.RegisterDBManagerServer(s, &server{})
 	health_pb.RegisterHealthServer(s, &server{})
--- a/cmd/db-manager/v1beta1/main_test.go
+++ b/cmd/db-manager/v1beta1/main_test.go
@ -1,5 +1,5 @@
 /*
-Copyright 2021 The Kubeflow Authors.
+Copyright 2022 The Kubeflow Authors.
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@ -20,7 +20,7 @@ import (
 	"context"
 	"testing"
-	"github.com/golang/mock/gomock"
+	"go.uber.org/mock/gomock"
 	health_pb "github.com/kubeflow/katib/pkg/apis/manager/health"
 	api_pb "github.com/kubeflow/katib/pkg/apis/manager/v1beta1"
--- a/cmd/earlystopping/medianstop/v1beta1/Dockerfile
+++ b/cmd/earlystopping/medianstop/v1beta1/Dockerfile
@ -1,22 +1,24 @@
-FROM python:3.6
+FROM python:3.11-slim
 ARG TARGETARCH
 ENV TARGET_DIR /opt/katib
 ENV EARLY_STOPPING_DIR cmd/earlystopping/medianstop/v1beta1
 ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python
-RUN if [ "$(uname -m)" = "ppc64le" ] || [ "$(uname -m)" = "aarch64" ]; then \
+RUN if [ "${TARGETARCH}" = "ppc64le" ] || [ "${TARGETARCH}" = "arm64" ]; then \
    apt-get -y update && \
    apt-get -y install gfortran libopenblas-dev liblapack-dev && \
-  pip install cython; \
+    apt-get clean && \
    rm -rf /var/lib/apt/lists/*; \
  fi
 ADD ./pkg/ ${TARGET_DIR}/pkg/
 ADD ./${EARLY_STOPPING_DIR}/ ${TARGET_DIR}/${EARLY_STOPPING_DIR}/
 WORKDIR  ${TARGET_DIR}/${EARLY_STOPPING_DIR}
 RUN pip install --no-cache-dir -r requirements.txt
 WORKDIR  ${TARGET_DIR}/${EARLY_STOPPING_DIR}
 RUN pip install --prefer-binary --no-cache-dir -r requirements.txt
 RUN chgrp -R 0 ${TARGET_DIR} \
  && chmod -R g+rwX ${TARGET_DIR}
 ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python
 ENTRYPOINT ["python", "main.py"]
--- a/cmd/earlystopping/medianstop/v1beta1/main.py
+++ b/cmd/earlystopping/medianstop/v1beta1/main.py
@ -1,4 +1,4 @@
-# Copyright 2021 The Kubeflow Authors.
+# Copyright 2022 The Kubeflow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@ -12,12 +12,14 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 import grpc
 import time
 import logging
 import time
 from concurrent import futures
 import grpc
 from pkg.apis.manager.v1beta1.python import api_pb2_grpc
 from pkg.earlystopping.v1beta1.medianstop.service import MedianStopService
 from concurrent import futures
 _ONE_DAY_IN_SECONDS = 60 * 60 * 24
 DEFAULT_PORT = "0.0.0.0:6788"
--- a/cmd/earlystopping/medianstop/v1beta1/requirements.txt
+++ b/cmd/earlystopping/medianstop/v1beta1/requirements.txt
@ -1,4 +1,5 @@
-grpcio==1.23.0
+grpcio>=1.64.1
-protobuf==3.9.1
+protobuf>=4.21.12,<5
 googleapis-common-protos==1.6.0
-kubernetes==11.0.0
+kubernetes==22.6.0
 cython>=0.29.24
--- a/cmd/katib-controller/v1beta1/Dockerfile
+++ b/cmd/katib-controller/v1beta1/Dockerfile
@ -1,6 +1,8 @@
 # Build the Katib controller.
 FROM golang:alpine AS build-env
 ARG TARGETARCH
 WORKDIR /go/src/github.com/kubeflow/katib
 # Download packages.
@ -13,16 +15,10 @@ COPY cmd/ cmd/
 COPY pkg/ pkg/
 # Build the binary.
-RUN if [ "$(uname -m)" = "ppc64le" ]; then \
-   CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le go build -a -o katib-controller  ./cmd/katib-controller/v1beta1; \
-   elif [ "$(uname -m)" = "aarch64" ]; then \
-   CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -a -o katib-controller  ./cmd/katib-controller/v1beta1; \
-   else \
-   CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -a -o katib-controller  ./cmd/katib-controller/v1beta1; \
-   fi
+RUN CGO_ENABLED=0 GOOS=linux GOARCH=${TARGETARCH} go build -a -o katib-controller ./cmd/katib-controller/v1beta1
 # Copy the controller-manager into a thin image.
-FROM alpine:3.7
+FROM alpine:3.15
 WORKDIR /app
 COPY --from=build-env /go/src/github.com/kubeflow/katib/katib-controller .
 ENTRYPOINT ["./katib-controller"]
--- a/cmd/katib-controller/v1beta1/main.go
+++ b/cmd/katib-controller/v1beta1/main.go
@ -1,5 +1,5 @@
 /*
-Copyright 2021 The Kubeflow Authors.
+Copyright 2022 The Kubeflow Authors.
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@ -24,59 +24,75 @@ import (
 	"os"
 	"github.com/spf13/viper"
 	"k8s.io/apimachinery/pkg/runtime"
 	_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
 	"sigs.k8s.io/controller-runtime/pkg/client/config"
 	"sigs.k8s.io/controller-runtime/pkg/healthz"
 	logf "sigs.k8s.io/controller-runtime/pkg/log"
 	"sigs.k8s.io/controller-runtime/pkg/log/zap"
 	"sigs.k8s.io/controller-runtime/pkg/manager"
 	"sigs.k8s.io/controller-runtime/pkg/manager/signals"
 	metricsserver "sigs.k8s.io/controller-runtime/pkg/metrics/server"
 	"sigs.k8s.io/controller-runtime/pkg/webhook"
 	configv1beta1 "github.com/kubeflow/katib/pkg/apis/config/v1beta1"
 	apis "github.com/kubeflow/katib/pkg/apis/controller"
-	controller "github.com/kubeflow/katib/pkg/controller.v1beta1"
+	cert "github.com/kubeflow/katib/pkg/certgenerator/v1beta1"
 	"github.com/kubeflow/katib/pkg/controller.v1beta1"
 	"github.com/kubeflow/katib/pkg/controller.v1beta1/consts"
-	trialutil "github.com/kubeflow/katib/pkg/controller.v1beta1/trial/util"
+	"github.com/kubeflow/katib/pkg/util/v1beta1/katibconfig"
-	webhook "github.com/kubeflow/katib/pkg/webhook/v1beta1"
+	webhookv1beta1 "github.com/kubeflow/katib/pkg/webhook/v1beta1"
 	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
 	clientgoscheme "k8s.io/client-go/kubernetes/scheme"
 )
 var (
 	scheme = runtime.NewScheme()
 	log    = logf.Log.WithName("entrypoint")
 )
 func init() {
 	utilruntime.Must(apis.AddToScheme(scheme))
 	utilruntime.Must(configv1beta1.AddToScheme(scheme))
 	utilruntime.Must(clientgoscheme.AddToScheme(scheme))
 }
 func main() {
 	logf.SetLogger(zap.New())
 	log := logf.Log.WithName("entrypoint")
 	var experimentSuggestionName string
 	var metricsAddr string
 	var webhookPort int
 	var injectSecurityContext bool
 	var enableGRPCProbeInSuggestion bool
 	var trialResources trialutil.GvkListFlag
 	flag.StringVar(&experimentSuggestionName, "experiment-suggestion-name",
 		"default", "The implementation of suggestion interface in experiment controller (default)")
 	flag.StringVar(&metricsAddr, "metrics-addr", ":8080", "The address the metric endpoint binds to.")
 	flag.BoolVar(&injectSecurityContext, "webhook-inject-securitycontext", false, "Inject the securityContext of container[0] in the sidecar")
 	flag.BoolVar(&enableGRPCProbeInSuggestion, "enable-grpc-probe-in-suggestion", true, "enable grpc probe in suggestions")
 	flag.Var(&trialResources, "trial-resources", "The list of resources that can be used as trial template, in the form: Kind.version.group (e.g. TFJob.v1.kubeflow.org)")
 	flag.IntVar(&webhookPort, "webhook-port", 8443, "The port number to be used for admission webhook server.")
 	// TODO (andreyvelich): Currently it is not possible to set different webhook service name.
 	// flag.StringVar(&serviceName, "webhook-service-name", "katib-controller", "The service name which will be used in webhook")
 	// TODO (andreyvelich): Currently it is not possible to store webhook cert in the local file system.
 	// flag.BoolVar(&certLocalFS, "cert-localfs", false, "Store the webhook cert in local file system")
 	var katibConfigFile string
 	flag.StringVar(&katibConfigFile, "katib-config", "",
 		"The katib-controller will load its initial configuration from this file. "+
 			"Omit this flag to use the default configuration values. ")
 	flag.Parse()
 	initConfig, err := katibconfig.GetInitConfigData(scheme, katibConfigFile)
 	if err != nil {
 		log.Error(err, "Failed to get KatibConfig")
 		os.Exit(1)
 	}
 	// Set the config in viper.
-	viper.Set(consts.ConfigExperimentSuggestionName, experimentSuggestionName)
+	viper.Set(consts.ConfigExperimentSuggestionName, initConfig.ControllerConfig.ExperimentSuggestionName)
-	viper.Set(consts.ConfigInjectSecurityContext, injectSecurityContext)
+	viper.Set(consts.ConfigInjectSecurityContext, initConfig.ControllerConfig.InjectSecurityContext)
-	viper.Set(consts.ConfigEnableGRPCProbeInSuggestion, enableGRPCProbeInSuggestion)
+	viper.Set(consts.ConfigEnableGRPCProbeInSuggestion, initConfig.ControllerConfig.EnableGRPCProbeInSuggestion)
-	viper.Set(consts.ConfigTrialResources, trialResources)
+
 	trialGVKs, err := katibconfig.TrialResourcesToGVKs(initConfig.ControllerConfig.TrialResources)
 	if err != nil {
 		log.Error(err, "Failed to parse trialResources")
 		os.Exit(1)
 	}
 	viper.Set(consts.ConfigTrialResources, trialGVKs)
 	log.Info("Config:",
 		consts.ConfigExperimentSuggestionName,
 		viper.GetString(consts.ConfigExperimentSuggestionName),
 		"webhook-port",
-		webhookPort,
+		initConfig.ControllerConfig.WebhookPort,
 		"metrics-addr",
-		metricsAddr,
+		initConfig.ControllerConfig.MetricsAddr,
 		"healthz-addr",
 		initConfig.ControllerConfig.HealthzAddr,
 		consts.ConfigInjectSecurityContext,
 		viper.GetBool(consts.ConfigInjectSecurityContext),
 		consts.ConfigEnableGRPCProbeInSuggestion,
@ -94,7 +110,13 @@ func main() {
 	// Create a new katib controller to provide shared dependencies and start components
 	mgr, err := manager.New(cfg, manager.Options{
-		MetricsBindAddress: metricsAddr,
+		Metrics: metricsserver.Options{
 			BindAddress: initConfig.ControllerConfig.MetricsAddr,
 		},
 		HealthProbeBindAddress: initConfig.ControllerConfig.HealthzAddr,
 		LeaderElection:         initConfig.ControllerConfig.EnableLeaderElection,
 		LeaderElectionID:       initConfig.ControllerConfig.LeaderElectionID,
 		Scheme:                 scheme,
 	})
 	if err != nil {
 		log.Error(err, "Failed to create the manager")
@ -103,11 +125,50 @@ func main() {
 	log.Info("Registering Components.")
-	// Setup Scheme for all resources
+	// Create a webhook server.
-	if err := apis.AddToScheme(mgr.GetScheme()); err != nil {
+	hookServer := webhook.NewServer(webhook.Options{
-		log.Error(err, "Unable to add APIs to scheme")
+		Port:    *initConfig.ControllerConfig.WebhookPort,
 		CertDir: consts.CertDir,
 	})
 	ctx := signals.SetupSignalHandler()
 	certsReady := make(chan struct{})
 	defer close(certsReady)
 	// The setupControllers will register controllers to the manager
 	// after generated certs for the admission webhooks.
 	go setupControllers(mgr, certsReady, hookServer)
 	if initConfig.CertGeneratorConfig.Enable {
 		if err = cert.AddToManager(mgr, initConfig.CertGeneratorConfig, certsReady); err != nil {
 			log.Error(err, "Failed to set up cert-generator")
 		}
 	} else {
 		certsReady <- struct{}{}
 	}
 	log.Info("Setting up health checker.")
 	if err := mgr.AddReadyzCheck("readyz", hookServer.StartedChecker()); err != nil {
 		log.Error(err, "Unable to add readyz endpoint to the manager")
 		os.Exit(1)
 	}
 	if err = mgr.AddHealthzCheck("healthz", healthz.Ping); err != nil {
 		log.Error(err, "Add webhook server health checker to the manager failed")
 		os.Exit(1)
 	}
 	// Start the Cmd
 	log.Info("Starting the manager.")
 	if err = mgr.Start(ctx); err != nil {
 		log.Error(err, "Unable to run the manager")
 		os.Exit(1)
 	}
 }
 func setupControllers(mgr manager.Manager, certsReady chan struct{}, hookServer webhook.Server) {
 	// The certsReady blocks to register controllers until generated certs.
 	<-certsReady
 	log.Info("Certs ready")
 	// Setup all Controllers
 	log.Info("Setting up controller.")
@ -117,15 +178,8 @@ func main() {
 	}
 	log.Info("Setting up webhooks.")
-	if err := webhook.AddToManager(mgr, webhookPort); err != nil {
+	if err := webhookv1beta1.AddToManager(mgr, hookServer); err != nil {
 		log.Error(err, "Unable to register webhooks to the manager")
 		os.Exit(1)
 	}
 	// Start the Cmd
 	log.Info("Starting the Cmd.")
 	if err := mgr.Start(signals.SetupSignalHandler()); err != nil {
 		log.Error(err, "Unable to run the manager")
 		os.Exit(1)
 	}
 }
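Reviewer note on the main.go change above: controller and webhook registration now waits on the certsReady channel, which is signalled either by the cert generator or directly by main when the generator is disabled. A minimal sketch of that gating pattern follows, assuming a simplified flow; run, certGeneratorEnabled and the event strings are hypothetical stand-ins and not Katib code.

```go
package main

import "fmt"

// run models the startup ordering: the goroutine that registers controllers
// blocks until a signal arrives on certsReady. All names here are
// hypothetical; the real code passes the channel to the cert generator.
func run(certGeneratorEnabled bool) []string {
	var events []string
	certsReady := make(chan struct{})
	done := make(chan struct{})

	go func() {
		defer close(done)
		// Block until certificates are reported as ready.
		<-certsReady
		events = append(events, "controllers registered")
	}()

	if certGeneratorEnabled {
		// Stand-in for the cert generator finishing its work.
		events = append(events, "certs generated")
	} else {
		// Certificates are assumed to be provided externally.
		events = append(events, "cert generation skipped")
	}
	// The unbuffered send completes only once the goroutine receives it.
	certsReady <- struct{}{}
	<-done
	return events
}

func main() {
	fmt.Println(run(true))
	fmt.Println(run(false))
}
```

Because the channel is unbuffered, the send in the else branch of the real code can only complete while setupControllers is already running in its own goroutine, which is why that goroutine is started before the branch.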
--- a/cmd/metricscollector/v1beta1/file-metricscollector/Dockerfile
+++ b/cmd/metricscollector/v1beta1/file-metricscollector/Dockerfile
@ -1,6 +1,8 @@
 # Build the Katib file metrics collector.
 FROM golang:alpine AS build-env
 ARG TARGETARCH
 WORKDIR /go/src/github.com/kubeflow/katib
 # Download packages.
@ -13,16 +15,10 @@ COPY cmd/ cmd/
 COPY pkg/ pkg/
 # Build the binary.
-RUN if [ "$(uname -m)" = "ppc64le" ]; then \
-   CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le go build -a -o file-metricscollector ./cmd/metricscollector/v1beta1/file-metricscollector; \
-   elif [ "$(uname -m)" = "aarch64" ]; then \
-   CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -a -o file-metricscollector ./cmd/metricscollector/v1beta1/file-metricscollector; \
-   else \
-   CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -a -o file-metricscollector ./cmd/metricscollector/v1beta1/file-metricscollector; \
-   fi
+RUN CGO_ENABLED=0 GOOS=linux GOARCH=${TARGETARCH} go build -a -o file-metricscollector ./cmd/metricscollector/v1beta1/file-metricscollector
 # Copy the file metrics collector into a thin image.
-FROM alpine:3.7
+FROM alpine:3.15
 WORKDIR /app
 COPY --from=build-env /go/src/github.com/kubeflow/katib/file-metricscollector .
 ENTRYPOINT ["./file-metricscollector"]
--- a/cmd/metricscollector/v1beta1/file-metricscollector/main.go
+++ b/cmd/metricscollector/v1beta1/file-metricscollector/main.go
@ -1,5 +1,5 @@
 /*
-Copyright 2021 The Kubeflow Authors.
+Copyright 2022 The Kubeflow Authors.
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@ -39,19 +39,21 @@ package main
 import (
 	"context"
 	"encoding/json"
 	"flag"
 	"fmt"
 	"io/ioutil"
 	"os"
 	"path/filepath"
 	"regexp"
 	"strconv"
 	"strings"
 	"time"
-	"github.com/hpcloud/tail"
+	"github.com/nxadm/tail"
-	psutil "github.com/shirou/gopsutil/process"
+	psutil "github.com/shirou/gopsutil/v3/process"
 	"google.golang.org/grpc"
-	"k8s.io/klog"
+	"google.golang.org/grpc/credentials/insecure"
 	"k8s.io/klog/v2"
 	commonv1beta1 "github.com/kubeflow/katib/pkg/apis/controller/common/v1beta1"
 	api "github.com/kubeflow/katib/pkg/apis/manager/v1beta1"
@ -102,6 +104,7 @@ var (
 	earlyStopServiceAddr = flag.String("s-earlystop", "", "Katib Early Stopping service endpoint")
 	trialName            = flag.String("t", "", "Trial Name")
 	metricsFilePath      = flag.String("path", "", "Metrics File Path")
 	metricsFileFormat    = flag.String("format", "", "Metrics File Format")
 	metricNames          = flag.String("m", "", "Metric names")
 	objectiveType        = flag.String("o-type", "", "Objective type")
 	metricFilters        = flag.String("f", "", "Metric filters")
@ -131,13 +134,17 @@ func printMetricsFile(mFile string) {
 	checkMetricFile(mFile)
 	// Print lines from metrics file.
-	t, _ := tail.TailFile(mFile, tail.Config{Follow: true})
+	t, err := tail.TailFile(mFile, tail.Config{Follow: true, ReOpen: true})
 	if err != nil {
 		klog.Errorf("Failed to open metrics file: %v", err)
 	}
 	for line := range t.Lines {
 		klog.Info(line.Text)
 	}
 }
-func watchMetricsFile(mFile string, stopRules stopRulesFlag, filters []string) {
+func watchMetricsFile(mFile string, stopRules stopRulesFlag, filters []string, fileFormat commonv1beta1.FileFormat) {
 	// metricStartStep is the dict where key = metric name, value = start step.
 	// We should apply early stopping rule only if metric is reported at least "start_step" times.
@ -148,9 +155,6 @@ func watchMetricsFile(mFile string, stopRules stopRulesFlag, filters []string) {
 		}
 	}
 	// First metric is objective in metricNames array.
 	objMetric := strings.Split(*metricNames, ";")[0]
 	objType := commonv1beta1.ObjectiveType(*objectiveType)
 	// For objective metric we calculate best optimal value from the recorded metrics.
 	// This is workaround for Median Stop algorithm.
 	// TODO (andreyvelich): Think about it, maybe define latest, max or min strategy type in stop-rule as well ?
@ -159,8 +163,10 @@ func watchMetricsFile(mFile string, stopRules stopRulesFlag, filters []string) {
 	// Check that metric file exists.
 	checkMetricFile(mFile)
-	// Get Main proccess.
+	// Get Main process.
-	_, mainProcPid, err := common.GetMainProcesses(mFile)
+	// Extract the metric file dir path based on the file name.
 	mDirPath, _ := filepath.Split(mFile)
 	_, mainProcPid, err := common.GetMainProcesses(mDirPath)
 	if err != nil {
 		klog.Fatalf("GetMainProcesses failed: %v", err)
 	}
@ -169,9 +175,6 @@ func watchMetricsFile(mFile string, stopRules stopRulesFlag, filters []string) {
 		klog.Fatalf("Failed to create new Process from pid %v, error: %v", mainProcPid, err)
 	}
 	// Get list of regular expressions from filters.
 	metricRegList := filemc.GetFilterRegexpList(filters)
 	// Start watch log lines.
 	t, _ := tail.TailFile(mFile, tail.Config{Follow: true})
 	for line := range t.Lines {
@ -179,6 +182,12 @@ func watchMetricsFile(mFile string, stopRules stopRulesFlag, filters []string) {
 		// Print log line
 		klog.Info(logText)
 		switch fileFormat {
 		case commonv1beta1.TextFormat:
 			// Get list of regular expressions from filters.
 			var metricRegList []*regexp.Regexp
 			metricRegList = filemc.GetFilterRegexpList(filters)
 			// Check if log line contains metric from stop rules.
 			isRuleLine := false
 			for _, rule := range stopRules {
@ -212,45 +221,43 @@ func watchMetricsFile(mFile string, stopRules stopRulesFlag, filters []string) {
 						if metricName != rule.Name {
 							continue
 						}
-
+						stopRules, optimalObjValue = updateStopRules(stopRules, optimalObjValue, metricValue, metricStartStep, rule, idx)
 					// Calculate optimalObjValue.
 					if metricName == objMetric {
 						if optimalObjValue == nil {
 							optimalObjValue = &metricValue
 						} else if objType == commonv1beta1.ObjectiveTypeMaximize && metricValue > *optimalObjValue {
 							optimalObjValue = &metricValue
 						} else if objType == commonv1beta1.ObjectiveTypeMinimize && metricValue < *optimalObjValue {
 							optimalObjValue = &metricValue
 					}
 						// Assign best optimal value to metric value.
 						metricValue = *optimalObjValue
 				}
-
+			}
-					// Reduce steps if appropriate metric is reported.
+		case commonv1beta1.JsonFormat:
-					// Once rest steps are empty we apply early stopping rule.
+			var logJsonObj map[string]interface{}
-					if _, ok := metricStartStep[metricName]; ok {
+			if err = json.Unmarshal([]byte(logText), &logJsonObj); err != nil {
-						metricStartStep[metricName]--
+				klog.Fatalf("Failed to unmarshal logs in %v format, log: %s, error: %v", commonv1beta1.JsonFormat, logText, err)
-						if metricStartStep[metricName] != 0 {
+			}
 			// Check if log line contains metric from stop rules.
 			isRuleLine := false
 			for _, rule := range stopRules {
 				if _, exist := logJsonObj[rule.Name]; exist {
 					isRuleLine = true
 					break
 				}
 			}
 			// If log line doesn't contain appropriate metric, continue track file.
 			if !isRuleLine {
 				continue
 			}
 					}
-					ruleValue, err := strconv.ParseFloat(rule.Value, 64)
+			// stopRules contains array of EarlyStoppingRules that has not been reached yet.
 			// After rule is reached we delete appropriate element from the array.
 			for idx, rule := range stopRules {
 				value, exist := logJsonObj[rule.Name].(string)
 				if !exist {
 					continue
 				}
 				metricValue, err := strconv.ParseFloat(strings.TrimSpace(value), 64)
 				if err != nil {
-						klog.Fatalf("Unable to parse value %v to float for rule metric %v", rule.Value, rule.Name)
+					klog.Fatalf("Unable to parse value %v to float for metric %v", value, rule.Name)
 					}
 					// Metric value can be equal, less or greater than stop rule.
 					// Deleting suitable stop rule from the array.
 					if rule.Comparison == commonv1beta1.ComparisonTypeEqual && metricValue == ruleValue {
 						stopRules = deleteStopRule(stopRules, idx)
 					} else if rule.Comparison == commonv1beta1.ComparisonTypeLess && metricValue < ruleValue {
 						stopRules = deleteStopRule(stopRules, idx)
 					} else if rule.Comparison == commonv1beta1.ComparisonTypeGreater && metricValue > ruleValue {
 						stopRules = deleteStopRule(stopRules, idx)
 					}
 				}
 				stopRules, optimalObjValue = updateStopRules(stopRules, optimalObjValue, metricValue, metricStartStep, rule, idx)
 			}
 		default:
 			klog.Fatalf("Format must be set to %v or %v", commonv1beta1.TextFormat, commonv1beta1.JsonFormat)
 		}
 		// If stopRules array is empty, Trial is early stopped.
@ -266,12 +273,12 @@ func watchMetricsFile(mFile string, stopRules stopRulesFlag, filters []string) {
 				klog.Fatalf("Create mark file %v error: %v", markFile, err)
 			}
-			err = ioutil.WriteFile(markFile, []byte(common.TrainingEarlyStopped), 0)
+			err = os.WriteFile(markFile, []byte(common.TrainingEarlyStopped), 0)
 			if err != nil {
 				klog.Fatalf("Write to file %v error: %v", markFile, err)
 			}
-			// Get child proccess from main PID.
+			// Get child process from main PID.
 			childProc, err := mainProc.Children()
 			if err != nil {
 				klog.Fatalf("Get children processes for main PID: %v failed: %v", mainProcPid, err)
@ -289,9 +296,9 @@ func watchMetricsFile(mFile string, stopRules stopRulesFlag, filters []string) {
 			}
 			// Report metrics to DB.
-			reportMetrics(filters)
+			reportMetrics(filters, fileFormat)
-			// Wait until main proccess is completed.
+			// Wait until main process is completed.
 			timeout := 60 * time.Second
 			endTime := time.Now().Add(timeout)
 			isProcRunning := true
@ -304,11 +311,10 @@ func watchMetricsFile(mFile string, stopRules stopRulesFlag, filters []string) {
 			}
 			// Create connection and client for Early Stopping service.
-			conn, err := grpc.Dial(*earlyStopServiceAddr, grpc.WithInsecure())
+			conn, err := grpc.NewClient(*earlyStopServiceAddr, grpc.WithTransportCredentials(insecure.NewCredentials()))
 			if err != nil {
 				klog.Fatalf("Could not connect to Early Stopping service, error: %v", err)
 			}
 			defer conn.Close()
 			c := api.NewEarlyStoppingClient(conn)
 			setTrialStatusReq := &api.SetTrialStatusRequest{
@ -320,11 +326,63 @@ func watchMetricsFile(mFile string, stopRules stopRulesFlag, filters []string) {
 			if err != nil {
 				klog.Fatalf("Set Trial status error: %v", err)
 			}
 			conn.Close()
 			klog.Infof("Trial status is successfully updated")
 		}
 	}
 }
 func updateStopRules(
 	stopRules []commonv1beta1.EarlyStoppingRule,
 	optimalObjValue *float64,
 	metricValue float64,
 	metricStartStep map[string]int,
 	rule commonv1beta1.EarlyStoppingRule,
 	ruleIdx int,
 ) ([]commonv1beta1.EarlyStoppingRule, *float64) {
 	// First metric is objective in metricNames array.
 	objMetric := strings.Split(*metricNames, ";")[0]
 	objType := commonv1beta1.ObjectiveType(*objectiveType)
 	// Calculate optimalObjValue.
 	if rule.Name == objMetric {
 		if optimalObjValue == nil {
 			optimalObjValue = &metricValue
 		} else if objType == commonv1beta1.ObjectiveTypeMaximize && metricValue > *optimalObjValue {
 			optimalObjValue = &metricValue
 		} else if objType == commonv1beta1.ObjectiveTypeMinimize && metricValue < *optimalObjValue {
 			optimalObjValue = &metricValue
 		}
 		// Assign best optimal value to metric value.
 		metricValue = *optimalObjValue
 	}
 	// Reduce steps if appropriate metric is reported.
 	// Once rest steps are empty we apply early stopping rule.
 	if _, ok := metricStartStep[rule.Name]; ok {
 		metricStartStep[rule.Name]--
 		if metricStartStep[rule.Name] != 0 {
 			return stopRules, optimalObjValue
 		}
 	}
 	ruleValue, err := strconv.ParseFloat(rule.Value, 64)
 	if err != nil {
 		klog.Fatalf("Unable to parse value %v to float for rule metric %v", rule.Value, rule.Name)
 	}
 	// Metric value can be equal, less or greater than stop rule.
 	// Deleting suitable stop rule from the array.
 	if rule.Comparison == commonv1beta1.ComparisonTypeEqual && metricValue == ruleValue {
 		return deleteStopRule(stopRules, ruleIdx), optimalObjValue
 	} else if rule.Comparison == commonv1beta1.ComparisonTypeLess && metricValue < ruleValue {
 		return deleteStopRule(stopRules, ruleIdx), optimalObjValue
 	} else if rule.Comparison == commonv1beta1.ComparisonTypeGreater && metricValue > ruleValue {
 		return deleteStopRule(stopRules, ruleIdx), optimalObjValue
 	}
 	return stopRules, optimalObjValue
 }
 func deleteStopRule(stopRules []commonv1beta1.EarlyStoppingRule, idx int) []commonv1beta1.EarlyStoppingRule {
@ -346,9 +404,11 @@ func main() {
 		filters = strings.Split(*metricFilters, ";")
 	}
 	fileFormat := commonv1beta1.FileFormat(*metricsFileFormat)
 	// If stop rule is set we need to parse metrics during run.
 	if len(stopRules) != 0 {
-		go watchMetricsFile(*metricsFilePath, stopRules, filters)
+		go watchMetricsFile(*metricsFilePath, stopRules, filters, fileFormat)
 	} else {
 		go printMetricsFile(*metricsFilePath)
 	}
@ -367,13 +427,13 @@ func main() {
 	// If training was not early stopped, report the metrics.
 	if !isEarlyStopped {
-		reportMetrics(filters)
+		reportMetrics(filters, fileFormat)
 	}
 }
-func reportMetrics(filters []string) {
+func reportMetrics(filters []string, fileFormat commonv1beta1.FileFormat) {
-	conn, err := grpc.Dial(*dbManagerServiceAddr, grpc.WithInsecure())
+	conn, err := grpc.NewClient(*dbManagerServiceAddr, grpc.WithTransportCredentials(insecure.NewCredentials()))
 	if err != nil {
 		klog.Fatalf("Could not connect to DB manager service, error: %v", err)
 	}
@ -384,7 +444,7 @@ func reportMetrics(filters []string) {
 	if len(*metricNames) != 0 {
 		metricList = strings.Split(*metricNames, ";")
 	}
-	olog, err := filemc.CollectObservationLog(*metricsFilePath, metricList, filters)
+	olog, err := filemc.CollectObservationLog(*metricsFilePath, metricList, filters, fileFormat)
 	if err != nil {
 		klog.Fatalf("Failed to collect logs: %v", err)
 	}
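Reviewer note on the JSON branch added to watchMetricsFile above: each log line is unmarshalled into a map, the metric is read with a string type assertion, parsed to a float, and compared against the stop rule. The sketch below illustrates that flow under simplifying assumptions. The rule type, the comparison strings and both helper functions are hypothetical and stand in for commonv1beta1.EarlyStoppingRule and updateStopRules; start-step handling and the optimal objective value are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// rule is a hypothetical, reduced stand-in for an early stopping rule.
type rule struct {
	Name       string
	Value      string
	Comparison string // assumed values: "equal", "less", "greater"
}

// metricFromJSONLine reads the metric called name from one JSON log line.
// ok is false when the key is missing or its value is not a JSON string,
// mirroring the string type assertion used in the diff.
func metricFromJSONLine(line, name string) (value float64, ok bool, err error) {
	var obj map[string]interface{}
	if err := json.Unmarshal([]byte(line), &obj); err != nil {
		return 0, false, err
	}
	raw, isString := obj[name].(string)
	if !isString {
		return 0, false, nil
	}
	v, err := strconv.ParseFloat(strings.TrimSpace(raw), 64)
	if err != nil {
		// Report the raw text that failed to parse, not the zero result.
		return 0, false, fmt.Errorf("parse %q for metric %s: %w", raw, name, err)
	}
	return v, true, nil
}

// ruleReached reports whether metric satisfies the rule's comparison.
func ruleReached(r rule, metric float64) (bool, error) {
	target, err := strconv.ParseFloat(r.Value, 64)
	if err != nil {
		return false, fmt.Errorf("parse rule value %q for %s: %w", r.Value, r.Name, err)
	}
	switch r.Comparison {
	case "equal":
		return metric == target, nil
	case "less":
		return metric < target, nil
	case "greater":
		return metric > target, nil
	}
	return false, nil
}

func main() {
	r := rule{Name: "accuracy", Value: "0.9", Comparison: "greater"}
	v, ok, err := metricFromJSONLine(`{"epoch": "4", "accuracy": "0.93"}`, r.Name)
	if err != nil || !ok {
		fmt.Println("metric not found:", err)
		return
	}
	reached, _ := ruleReached(r, v)
	fmt.Println(v, reached)
}
```

One consequence of the string assertion is that a line such as {"accuracy": 0.93}, with a numeric rather than string value, is skipped in this sketch because the assertion fails.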
--- a/cmd/metricscollector/v1beta1/tfevent-metricscollector/Dockerfile
+++ b/cmd/metricscollector/v1beta1/tfevent-metricscollector/Dockerfile
@ -1,7 +1,24 @@
-FROM tensorflow/tensorflow:1.11.0
+FROM python:3.11-slim
-RUN pip install rfc3339 grpcio googleapis-common-protos
+
-ADD . /usr/src/app/github.com/kubeflow/katib
+ARG TARGETARCH
-WORKDIR /usr/src/app/github.com/kubeflow/katib/cmd/metricscollector/v1beta1/tfevent-metricscollector/
+ENV TARGET_DIR /opt/katib
-RUN pip install --no-cache-dir -r requirements.txt
+ENV METRICS_COLLECTOR_DIR cmd/metricscollector/v1beta1/tfevent-metricscollector
-ENV PYTHONPATH /usr/src/app/github.com/kubeflow/katib:/usr/src/app/github.com/kubeflow/katib/pkg/apis/manager/v1beta1/python:/usr/src/app/github.com/kubeflow/katib/pkg/metricscollector/v1beta1/tfevent-metricscollector/:/usr/src/app/github.com/kubeflow/katib/pkg/metricscollector/v1beta1/common/
+ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/metricscollector/v1beta1/tfevent-metricscollector/::${TARGET_DIR}/pkg/metricscollector/v1beta1/common/
 ADD ./pkg/ ${TARGET_DIR}/pkg/
 ADD ./${METRICS_COLLECTOR_DIR}/ ${TARGET_DIR}/${METRICS_COLLECTOR_DIR}/
 WORKDIR  ${TARGET_DIR}/${METRICS_COLLECTOR_DIR}
 RUN if [ "${TARGETARCH}" = "arm64" ]; then \
    apt-get -y update && \
    apt-get -y install gfortran libpcre3 libpcre3-dev && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*; \
    fi
 RUN pip install --prefer-binary --no-cache-dir -r requirements.txt
 RUN chgrp -R 0 ${TARGET_DIR} \
  && chmod -R g+rwX ${TARGET_DIR}
 ENTRYPOINT ["python", "main.py"]
--- a/cmd/metricscollector/v1beta1/tfevent-metricscollector/Dockerfile.aarch64
+++ b/cmd/metricscollector/v1beta1/tfevent-metricscollector/Dockerfile.aarch64
@ -1,28 +0,0 @@
 FROM ubuntu:18.04
 RUN apt-get update \
    && apt-get -y install software-properties-common \
    autoconf \
    automake \
    build-essential \
    cmake \
    pkg-config \
    wget \
    python-pip \
    libhdf5-dev \
    libhdf5-serial-dev \
    hdf5-tools\
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
 RUN wget https://github.com/lhelontra/tensorflow-on-arm/releases/download/v1.11.0/tensorflow-1.11.0-cp27-none-linux_aarch64.whl \
    && pip install tensorflow-1.11.0-cp27-none-linux_aarch64.whl \
    && rm tensorflow-1.11.0-cp27-none-linux_aarch64.whl \
    && rm -rf .cache
 RUN pip install rfc3339 grpcio googleapis-common-protos jupyter
 ADD . /usr/src/app/github.com/kubeflow/katib
 WORKDIR /usr/src/app/github.com/kubeflow/katib/cmd/metricscollector/v1beta1/tfevent-metricscollector/
 RUN pip install --no-cache-dir -r requirements.txt
 ENV PYTHONPATH /usr/src/app/github.com/kubeflow/katib:/usr/src/app/github.com/kubeflow/katib/pkg/apis/manager/v1beta1/python:/usr/src/app/github.com/kubeflow/katib/pkg/metricscollector/v1beta1/tfevent-metricscollector/:/usr/src/app/github.com/kubeflow/katib/pkg/metricscollector/v1beta1/common/
 ENTRYPOINT ["python", "main.py"]
--- a/cmd/metricscollector/v1beta1/tfevent-metricscollector/Dockerfile.ppc64le
+++ b/cmd/metricscollector/v1beta1/tfevent-metricscollector/Dockerfile.ppc64le
@ -1,7 +1,6 @@
-FROM ibmcom/tensorflow-ppc64le:1.14.0-py3
+FROM ibmcom/tensorflow-ppc64le:2.2.0-py3
 RUN pip install rfc3339 grpcio googleapis-common-protos
 ADD . /usr/src/app/github.com/kubeflow/katib
 WORKDIR /usr/src/app/github.com/kubeflow/katib/cmd/metricscollector/v1beta1/tfevent-metricscollector/
-RUN pip install --no-cache-dir -r requirements.txt
+RUN pip install --prefer-binary --no-cache-dir -r requirements.txt
 ENV PYTHONPATH /usr/src/app/github.com/kubeflow/katib:/usr/src/app/github.com/kubeflow/katib/pkg/apis/manager/v1beta1/python:/usr/src/app/github.com/kubeflow/katib/pkg/metricscollector/v1beta1/tfevent-metricscollector/:/usr/src/app/github.com/kubeflow/katib/pkg/metricscollector/v1beta1/common/
 ENTRYPOINT ["python", "main.py"]
--- a/cmd/metricscollector/v1beta1/tfevent-metricscollector/main.py
+++ b/cmd/metricscollector/v1beta1/tfevent-metricscollector/main.py
@ -1,4 +1,4 @@
-# Copyright 2021 The Kubeflow Authors.
+# Copyright 2022 The Kubeflow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@ -12,13 +12,15 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 import grpc
 import argparse
 from logging import INFO, StreamHandler, getLogger
 import api_pb2
-from pns import WaitMainProcesses
+import api_pb2_grpc
 import const
 import grpc
 from pns import WaitMainProcesses
 from tfevent_loader import MetricsCollector
 from logging import getLogger, StreamHandler, INFO
 timeout_in_seconds = 60
@ -55,25 +57,28 @@ if __name__ == '__main__':
    wait_all_processes = opt.wait_all_processes.lower() == "true"
    db_manager_server = opt.db_manager_server_addr.split(':')
    if len(db_manager_server) != 2:
-        raise Exception("Invalid Katib DB manager service address: %s" %
+        raise Exception(
-                        opt.db_manager_server_addr)
+            f"Invalid Katib DB manager service address: {opt.db_manager_server_addr}"
        )
    WaitMainProcesses(
        pool_interval=opt.poll_interval,
        timout=opt.timeout,
        wait_all=wait_all_processes,
-        completed_marked_dir=opt.metrics_file_dir)
+        completed_marked_dir=opt.metrics_file_dir,
    )
-    mc = MetricsCollector(opt.metric_names.split(';'))
+    mc = MetricsCollector(opt.metric_names.split(";"))
    observation_log = mc.parse_file(opt.metrics_file_dir)
-    channel = grpc.beta.implementations.insecure_channel(
+    with grpc.insecure_channel(opt.db_manager_server_addr) as channel:
-        db_manager_server[0], int(db_manager_server[1]))
+        stub = api_pb2_grpc.DBManagerStub(channel)
-
+        logger.info(
-    with api_pb2.beta_create_DBManager_stub(channel) as client:
+            f"In {opt.trial_name} {str(len(observation_log.metric_logs))} metrics will be reported."
-        logger.info("In " + opt.trial_name + " " +
+        )
-                    str(len(observation_log.metric_logs)) + " metrics will be reported.")
+        stub.ReportObservationLog(
-        client.ReportObservationLog(api_pb2.ReportObservationLogRequest(
+            api_pb2.ReportObservationLogRequest(
-            trial_name=opt.trial_name,
+                trial_name=opt.trial_name, observation_log=observation_log
-            observation_log=observation_log
+            ),
-        ), timeout=timeout_in_seconds)
+            timeout=timeout_in_seconds,
        )
--- a/cmd/metricscollector/v1beta1/tfevent-metricscollector/requirements.txt
+++ b/cmd/metricscollector/v1beta1/tfevent-metricscollector/requirements.txt
@ -1 +1,6 @@
-psutil==5.6.6
+psutil==5.9.4
 rfc3339>=6.2
 grpcio>=1.64.1
 googleapis-common-protos==1.6.0
 tensorflow==2.16.1
 protobuf>=4.21.12,<5
--- a/cmd/new-ui/v1beta1/Dockerfile
+++ b/cmd/new-ui/v1beta1/Dockerfile
@ -1,63 +0,0 @@
 # --- Clone the kubeflow/kubeflow code ---
 FROM ubuntu AS fetch-kubeflow-kubeflow
 RUN apt-get update && apt-get install git -y
 WORKDIR /kf
 RUN git clone https://github.com/kubeflow/kubeflow.git && \
    cd kubeflow && \
    git checkout 24bcb8e
 # --- Build the frontend kubeflow library ---
 FROM node:12 AS frontend-kubeflow-lib
 WORKDIR /src
 ARG LIB=/kf/kubeflow/components/crud-web-apps/common/frontend/kubeflow-common-lib
 COPY --from=fetch-kubeflow-kubeflow $LIB/package*.json ./
 RUN npm ci
 COPY --from=fetch-kubeflow-kubeflow $LIB/ ./
 RUN npm run build
 # --- Build the frontend ---
 FROM node:12 AS frontend
 WORKDIR /src
 COPY ./pkg/new-ui/v1beta1/frontend/package*.json ./
 RUN npm ci
 COPY ./pkg/new-ui/v1beta1/frontend/ .
 COPY --from=frontend-kubeflow-lib /src/dist/kubeflow/ ./node_modules/kubeflow/
 RUN npm run build:prod
 # --- Build the backend ---
 FROM golang:alpine AS go-build
 WORKDIR /go/src/github.com/kubeflow/katib
 # Download packages.
 COPY go.mod .
 COPY go.sum .
 RUN go mod download -x
 # Copy sources.
 COPY cmd/ cmd/
 COPY pkg/ pkg/
 # Build the binary.
 RUN if [ "$(uname -m)" = "ppc64le" ]; then \
    CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le go build -a -o katib-ui  ./cmd/new-ui/v1beta1; \
    elif [ "$(uname -m)" = "aarch64" ]; then \
    CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -a -o katib-ui  ./cmd/new-ui/v1beta1; \
    else \
    CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -a -o katib-ui  ./cmd/new-ui/v1beta1; \
    fi
 # --- Compose the web app ---
 FROM alpine:3.7
 WORKDIR /app
 COPY --from=go-build /go/src/github.com/kubeflow/katib/katib-ui /app/
 COPY --from=frontend /src/dist/static /app/build/static/
 ENTRYPOINT ["./katib-ui"]
--- a/cmd/new-ui/v1beta1/main.go
+++ b/cmd/new-ui/v1beta1/main.go
@ -1,74 +0,0 @@
 /*
 Copyright 2021 The Kubeflow Authors.
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at
    http://www.apache.org/licenses/LICENSE-2.0
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 */
 package main
 import (
 	"flag"
 	"fmt"
 	"log"
 	"net/http"
 	_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
 	common_v1beta1 "github.com/kubeflow/katib/pkg/common/v1beta1"
 	ui "github.com/kubeflow/katib/pkg/new-ui/v1beta1"
 )
 var (
 	port, host, buildDir, dbManagerAddr *string
 )
 func init() {
 	port = flag.String("port", "8080", "The port to listen to for incoming HTTP connections")
 	host = flag.String("host", "0.0.0.0", "The host to listen to for incoming HTTP connections")
 	buildDir = flag.String("build-dir", "/app/build", "The dir of frontend")
 	dbManagerAddr = flag.String("db-manager-address", common_v1beta1.GetDBManagerAddr(), "The address of Katib DB manager")
 }
 func main() {
 	flag.Parse()
 	kuh := ui.NewKatibUIHandler(*dbManagerAddr)
 	log.Printf("Serving the frontend dir %s", *buildDir)
 	frontend := http.FileServer(http.Dir(*buildDir))
 	http.HandleFunc("/katib/", kuh.ServeIndex(*buildDir))
 	http.Handle("/katib/static/", http.StripPrefix("/katib/", frontend))
 	http.HandleFunc("/katib/fetch_experiments/", kuh.FetchAllExperiments)
 	http.HandleFunc("/katib/create_experiment/", kuh.CreateExperiment)
 	http.HandleFunc("/katib/delete_experiment/", kuh.DeleteExperiment)
 	http.HandleFunc("/katib/fetch_experiment/", kuh.FetchExperiment)
 	http.HandleFunc("/katib/fetch_suggestion/", kuh.FetchSuggestion)
 	http.HandleFunc("/katib/fetch_hp_job_info/", kuh.FetchHPJobInfo)
 	http.HandleFunc("/katib/fetch_hp_job_trial_info/", kuh.FetchHPJobTrialInfo)
 	http.HandleFunc("/katib/fetch_nas_job_info/", kuh.FetchNASJobInfo)
 	http.HandleFunc("/katib/fetch_trial_templates/", kuh.FetchTrialTemplates)
 	http.HandleFunc("/katib/add_template/", kuh.AddTemplate)
 	http.HandleFunc("/katib/edit_template/", kuh.EditTemplate)
 	http.HandleFunc("/katib/delete_template/", kuh.DeleteTemplate)
 	http.HandleFunc("/katib/fetch_namespaces", kuh.FetchNamespaces)
 	log.Printf("Serving at %s:%s", *host, *port)
 	if err := http.ListenAndServe(fmt.Sprintf("%s:%s", *host, *port), nil); err != nil {
 		panic(err)
 	}
 }
--- a/cmd/suggestion/chocolate/v1beta1/Dockerfile
+++ b/cmd/suggestion/chocolate/v1beta1/Dockerfile
@ -1,31 +0,0 @@
 FROM python:3.6
 ENV TARGET_DIR /opt/katib
 ENV SUGGESTION_DIR cmd/suggestion/chocolate/v1beta1
 RUN if [ "$(uname -m)" = "ppc64le" ] || [ "$(uname -m)" = "aarch64" ]; then \
    apt-get -y update && \
    apt-get -y install gfortran libopenblas-dev liblapack-dev && \
    pip install cython 'numpy>=1.13.3'; \
    fi
 RUN GRPC_HEALTH_PROBE_VERSION=v0.3.1 && \
    if [ "$(uname -m)" = "ppc64le" ]; then \
    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-ppc64le; \
    elif [ "$(uname -m)" = "aarch64" ]; then \
    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-arm64; \
    else \
    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64; \
    fi && \
    chmod +x /bin/grpc_health_probe
 ADD ./pkg/ ${TARGET_DIR}/pkg/
 ADD ./${SUGGESTION_DIR}/ ${TARGET_DIR}/${SUGGESTION_DIR}/
 WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
 RUN pip install --no-cache-dir -r requirements.txt
 RUN chgrp -R 0 ${TARGET_DIR} \
  && chmod -R g+rwX ${TARGET_DIR}
 ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
 ENTRYPOINT ["python", "main.py"]
--- a/cmd/suggestion/chocolate/v1beta1/requirements.txt
+++ b/cmd/suggestion/chocolate/v1beta1/requirements.txt
@ -1,11 +0,0 @@
 grpcio==1.23.0
 cloudpickle==0.5.6
 numpy>=1.13.3
 scikit-learn>=0.19.0
 scipy>=0.19.1
 forestci==0.3
 protobuf==3.9.1
 googleapis-common-protos==1.6.0
 SQLAlchemy==1.3.8
 git+https://github.com/AIworx-Labs/chocolate@master
 ghalton>=0.6
--- a/cmd/suggestion/goptuna/v1beta1/Dockerfile
+++ b/cmd/suggestion/goptuna/v1beta1/Dockerfile
@ -1,6 +1,8 @@
 # Build the Goptuna Suggestion.
 FROM golang:alpine AS build-env
+ARG TARGETARCH
 WORKDIR /go/src/github.com/kubeflow/katib
 # Download packages.
@ -13,32 +15,15 @@ COPY cmd/ cmd/
 COPY pkg/ pkg/
 # Build the binary.
-RUN if [ "$(uname -m)" = "ppc64le" ]; then \
+RUN CGO_ENABLED=0 GOOS=linux GOARCH=${TARGETARCH} go build -a -o goptuna-suggestion ./cmd/suggestion/goptuna/v1beta1
-  CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le go build -a -o goptuna-suggestion ./cmd/suggestion/goptuna/v1beta1; \
-  elif [ "$(uname -m)" = "aarch64" ]; then \
-  CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -a -o goptuna-suggestion ./cmd/suggestion/goptuna/v1beta1; \
-  else \
-  CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -a -o goptuna-suggestion ./cmd/suggestion/goptuna/v1beta1; \
-  fi
-# Add GRPC health probe.
-RUN GRPC_HEALTH_PROBE_VERSION=v0.3.1 && \
-  if [ "$(uname -m)" = "ppc64le" ]; then \
-  wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-ppc64le; \
-  elif [ "$(uname -m)" = "aarch64" ]; then \
-  wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-arm64; \
-  else \
-  wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64; \
-  fi && \
-  chmod +x /bin/grpc_health_probe
 # Copy the Goptuna suggestion into a thin image.
-FROM alpine:3.7
+FROM alpine:3.15
 ENV TARGET_DIR /opt/katib
 WORKDIR ${TARGET_DIR}
-COPY --from=build-env /bin/grpc_health_probe /bin/
+
 COPY --from=build-env /go/src/github.com/kubeflow/katib/goptuna-suggestion ${TARGET_DIR}/
 RUN chgrp -R 0 ${TARGET_DIR} \
--- a/cmd/suggestion/goptuna/v1beta1/main.go
+++ b/cmd/suggestion/goptuna/v1beta1/main.go
@ -1,5 +1,5 @@
 /*
-Copyright 2021 The Kubeflow Authors.
+Copyright 2022 The Kubeflow Authors.
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@ -24,7 +24,7 @@ import (
 	api_v1_beta1 "github.com/kubeflow/katib/pkg/apis/manager/v1beta1"
 	suggestion "github.com/kubeflow/katib/pkg/suggestion/v1beta1/goptuna"
 	"google.golang.org/grpc"
-	"k8s.io/klog"
+	"k8s.io/klog/v2"
 )
 const (
--- a/cmd/suggestion/hyperband/v1beta1/Dockerfile
+++ b/cmd/suggestion/hyperband/v1beta1/Dockerfile
@ -1,32 +1,24 @@
-FROM python:3.6
+FROM python:3.11-slim
+ARG TARGETARCH
 ENV TARGET_DIR /opt/katib
 ENV SUGGESTION_DIR cmd/suggestion/hyperband/v1beta1
+ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
-RUN if [ "$(uname -m)" = "ppc64le" ] || [ "$(uname -m)" = "aarch64" ]; then \
+RUN if [ "${TARGETARCH}" = "ppc64le" ] || [ "${TARGETARCH}" = "arm64" ]; then \
    apt-get -y update && \
    apt-get -y install gfortran libopenblas-dev liblapack-dev && \
-    pip install cython; \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*; \
    fi
-RUN GRPC_HEALTH_PROBE_VERSION=v0.3.1 && \
-    if [ "$(uname -m)" = "ppc64le" ]; then \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-ppc64le; \
-    elif [ "$(uname -m)" = "aarch64" ]; then \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-arm64; \
-    else \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64; \
-    fi && \
-    chmod +x /bin/grpc_health_probe
 ADD ./pkg/ ${TARGET_DIR}/pkg/
 ADD ./${SUGGESTION_DIR}/ ${TARGET_DIR}/${SUGGESTION_DIR}/
-WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
-RUN pip install --no-cache-dir -r requirements.txt
+WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
+RUN pip install --prefer-binary --no-cache-dir -r requirements.txt
 RUN chgrp -R 0 ${TARGET_DIR} \
  && chmod -R g+rwX ${TARGET_DIR}
-ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
 ENTRYPOINT ["python", "main.py"]
--- a/cmd/suggestion/hyperband/v1beta1/main.py
+++ b/cmd/suggestion/hyperband/v1beta1/main.py
@ -1,4 +1,4 @@
-# Copyright 2021 The Kubeflow Authors.
+# Copyright 2022 The Kubeflow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@ -12,13 +12,15 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import grpc
 import time
-from pkg.apis.manager.v1beta1.python import api_pb2_grpc
-from pkg.apis.manager.health.python import health_pb2_grpc
-from pkg.suggestion.v1beta1.hyperband.service import HyperbandService
 from concurrent import futures
+import grpc
+from pkg.apis.manager.health.python import health_pb2_grpc
+from pkg.apis.manager.v1beta1.python import api_pb2_grpc
+from pkg.suggestion.v1beta1.hyperband.service import HyperbandService
 _ONE_DAY_IN_SECONDS = 60 * 60 * 24
 DEFAULT_PORT = "0.0.0.0:6789"
--- a/cmd/suggestion/hyperband/v1beta1/requirements.txt
+++ b/cmd/suggestion/hyperband/v1beta1/requirements.txt
@ -1,8 +1,9 @@
-grpcio==1.23.0
+grpcio>=1.64.1
 cloudpickle==0.5.6
-numpy>=1.13.3
+numpy>=1.25.2
-scikit-learn>=0.19.0
+scikit-learn>=0.24.0
-scipy>=0.19.1
+scipy>=1.5.4
 forestci==0.3
-protobuf==3.9.1
+protobuf>=4.21.12,<5
 googleapis-common-protos==1.6.0
+cython>=0.29.24
--- a/cmd/suggestion/hyperopt/v1beta1/Dockerfile
+++ b/cmd/suggestion/hyperopt/v1beta1/Dockerfile
@ -1,33 +1,24 @@
-FROM python:3.6
+FROM python:3.11-slim
+ARG TARGETARCH
 ENV TARGET_DIR /opt/katib
 ENV SUGGESTION_DIR cmd/suggestion/hyperopt/v1beta1
+ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
-RUN if [ "$(uname -m)" = "ppc64le" ] || [ "$(uname -m)" = "aarch64" ]; then \
+RUN if [ "${TARGETARCH}" = "ppc64le" ]; then \
    apt-get -y update && \
    apt-get -y install gfortran libopenblas-dev liblapack-dev && \
-    pip install cython; \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*; \
    fi
-RUN GRPC_HEALTH_PROBE_VERSION=v0.3.1 && \
-    if [ "$(uname -m)" = "ppc64le" ]; then \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-ppc64le; \
-    elif [ "$(uname -m)" = "aarch64" ]; then \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-arm64; \
-    else \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64; \
-    fi && \
-    chmod +x /bin/grpc_health_probe
 ADD ./pkg/ ${TARGET_DIR}/pkg/
 ADD ./${SUGGESTION_DIR}/ ${TARGET_DIR}/${SUGGESTION_DIR}/
-WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
-RUN pip install --no-cache-dir -r requirements.txt
+WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
+RUN pip install --prefer-binary --no-cache-dir -r requirements.txt
 RUN chgrp -R 0 ${TARGET_DIR} \
  && chmod -R g+rwX ${TARGET_DIR}
-ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
 ENTRYPOINT ["python", "main.py"]
--- a/cmd/suggestion/hyperopt/v1beta1/main.py
+++ b/cmd/suggestion/hyperopt/v1beta1/main.py
@ -1,4 +1,4 @@
-# Copyright 2021 The Kubeflow Authors.
+# Copyright 2022 The Kubeflow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@ -12,13 +12,15 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import grpc
 import time
-from pkg.apis.manager.v1beta1.python import api_pb2_grpc
-from pkg.apis.manager.health.python import health_pb2_grpc
-from pkg.suggestion.v1beta1.hyperopt.service import HyperoptService
 from concurrent import futures
+import grpc
+from pkg.apis.manager.health.python import health_pb2_grpc
+from pkg.apis.manager.v1beta1.python import api_pb2_grpc
+from pkg.suggestion.v1beta1.hyperopt.service import HyperoptService
 _ONE_DAY_IN_SECONDS = 60 * 60 * 24
 DEFAULT_PORT = "0.0.0.0:6789"
--- a/cmd/suggestion/hyperopt/v1beta1/requirements.txt
+++ b/cmd/suggestion/hyperopt/v1beta1/requirements.txt
@ -1,9 +1,10 @@
-grpcio==1.23.0
+grpcio>=1.64.1
 cloudpickle==0.5.6
-numpy>=1.13.3
+numpy>=1.25.2
-scikit-learn>=0.19.0
+scikit-learn>=0.24.0
-scipy>=0.19.1
+scipy>=1.5.4
 forestci==0.3
-protobuf==3.9.1
+protobuf>=4.21.12,<5
 googleapis-common-protos==1.6.0
-hyperopt==0.2.3
+hyperopt==0.2.5
+cython>=0.29.24
--- a/cmd/suggestion/nas/darts/v1beta1/Dockerfile
+++ b/cmd/suggestion/nas/darts/v1beta1/Dockerfile
@ -1,33 +1,24 @@
-FROM python:3.6
+FROM python:3.11-slim
+ARG TARGETARCH
 ENV TARGET_DIR /opt/katib
 ENV SUGGESTION_DIR cmd/suggestion/nas/darts/v1beta1
+ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
-RUN if [ "$(uname -m)" = "ppc64le" ] || [ "$(uname -m)" = "aarch64" ]; then \
+RUN if [ "${TARGETARCH}" = "ppc64le" ]; then \
    apt-get -y update && \
    apt-get -y install gfortran libopenblas-dev liblapack-dev && \
-    pip install cython; \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*; \
    fi
-RUN GRPC_HEALTH_PROBE_VERSION=v0.3.1 && \
-    if [ "$(uname -m)" = "ppc64le" ]; then \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-ppc64le; \
-    elif [ "$(uname -m)" = "aarch64" ]; then \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-arm64; \
-    else \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64; \
-    fi && \
-    chmod +x /bin/grpc_health_probe
 ADD ./pkg/ ${TARGET_DIR}/pkg/
 ADD ./${SUGGESTION_DIR}/ ${TARGET_DIR}/${SUGGESTION_DIR}/
-WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
-RUN pip install --no-cache-dir -r requirements.txt
+WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
+RUN pip install --prefer-binary --no-cache-dir -r requirements.txt
 RUN chgrp -R 0 ${TARGET_DIR} \
  && chmod -R g+rwX ${TARGET_DIR}
-ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
 ENTRYPOINT ["python", "main.py"]
--- a/cmd/suggestion/nas/darts/v1beta1/main.py
+++ b/cmd/suggestion/nas/darts/v1beta1/main.py
@ -1,4 +1,4 @@
-# Copyright 2021 The Kubeflow Authors.
+# Copyright 2022 The Kubeflow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@ -12,13 +12,14 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import grpc
-from concurrent import futures
 import time
-from pkg.apis.manager.v1beta1.python import api_pb2_grpc
-from pkg.apis.manager.health.python import health_pb2_grpc
-from pkg.suggestion.v1beta1.nas.darts.service import DartsService
+from concurrent import futures
+import grpc
+from pkg.apis.manager.health.python import health_pb2_grpc
+from pkg.apis.manager.v1beta1.python import api_pb2_grpc
+from pkg.suggestion.v1beta1.nas.darts.service import DartsService
 _ONE_DAY_IN_SECONDS = 60 * 60 * 24
 DEFAULT_PORT = "0.0.0.0:6789"
--- a/cmd/suggestion/nas/darts/v1beta1/requirements.txt
+++ b/cmd/suggestion/nas/darts/v1beta1/requirements.txt
@ -1,3 +1,4 @@
-grpcio==1.23.0
+grpcio>=1.64.1
-protobuf==3.9.1
+protobuf>=4.21.12,<5
 googleapis-common-protos==1.6.0
+cython>=0.29.24
--- a/cmd/suggestion/nas/enas/v1beta1/Dockerfile
+++ b/cmd/suggestion/nas/enas/v1beta1/Dockerfile
@ -1,29 +1,24 @@
-FROM python:3.6
+FROM python:3.11-slim
+ARG TARGETARCH
 ENV TARGET_DIR /opt/katib
 ENV SUGGESTION_DIR cmd/suggestion/nas/enas/v1beta1
+ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
-RUN if [ "$(uname -m)" = "ppc64le" ]; then \
+RUN if [ "${TARGETARCH}" = "ppc64le" ]; then \
    apt-get -y update && \
    apt-get -y install gfortran libopenblas-dev liblapack-dev && \
-    pip install cython; \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*; \
    fi
-RUN GRPC_HEALTH_PROBE_VERSION=v0.3.1 && \
-    if [ "$(uname -m)" = "ppc64le" ]; then \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-ppc64le; \
-    else \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64; \
-    fi && \
-    chmod +x /bin/grpc_health_probe
 ADD ./pkg/ ${TARGET_DIR}/pkg/
 ADD ./${SUGGESTION_DIR}/ ${TARGET_DIR}/${SUGGESTION_DIR}/
-WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
-RUN pip install --no-cache-dir -r requirements.txt
+WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
+RUN pip install --prefer-binary --no-cache-dir -r requirements.txt
 RUN chgrp -R 0 ${TARGET_DIR} \
  && chmod -R g+rwX ${TARGET_DIR}
-ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
 ENTRYPOINT ["python", "main.py"]
--- a/cmd/suggestion/nas/enas/v1beta1/Dockerfile.aarch64
+++ b/cmd/suggestion/nas/enas/v1beta1/Dockerfile.aarch64
@ -1,58 +0,0 @@
 FROM golang:alpine AS build-env
 # The GOPATH in the image is /go.
 ADD . /go/src/github.com/kubeflow/katib
 RUN if [ "$(uname -m)" = "ppc64le" ] || [ "$(uname -m)" = "aarch64" ]; then \
    apk --update add git gcc musl-dev && \
    go get github.com/grpc-ecosystem/grpc-health-probe && \
    mv $GOPATH/bin/grpc-health-probe /bin/grpc_health_probe && \
    chmod +x /bin/grpc_health_probe; \
    else \
    GRPC_HEALTH_PROBE_VERSION=v0.3.1 && \
    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64 && \
    chmod +x /bin/grpc_health_probe; \
    fi
 FROM python:3.7-slim-buster
 ENV TARGET_DIR /opt/katib
 ENV SUGGESTION_DIR cmd/suggestion/nas/enas/v1beta1
 RUN apt-get update \
    && apt-get -y install software-properties-common \
    autoconf \
    automake \
    build-essential \
    cmake \
    libtool \
    pkg-config \
    wget \
    gfortran \
    libopenblas-dev \
    liblapack-dev \
    libhdf5-dev \
    libhdf5-serial-dev \
    hdf5-tools \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
 RUN pip install cython numpy
 RUN wget https://github.com/lhelontra/tensorflow-on-arm/releases/download/v1.14.0-buster/tensorflow-1.14.0-cp37-none-linux_aarch64.whl \
    && pip install tensorflow-1.14.0-cp37-none-linux_aarch64.whl \
    && rm tensorflow-1.14.0-cp37-none-linux_aarch64.whl \
    && rm -rf .cache
 RUN pip install 'grpcio==1.23.0' 'protobuf==3.9.1' 'googleapis-common-protos==1.6.0'
 COPY --from=build-env /bin/grpc_health_probe /bin/
 ADD ./pkg/ ${TARGET_DIR}/pkg/
 ADD ./${SUGGESTION_DIR}/ ${TARGET_DIR}/${SUGGESTION_DIR}/
 WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
 RUN chgrp -R 0 ${TARGET_DIR} \
    && chmod -R g+rwX ${TARGET_DIR}
 ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
 ENTRYPOINT ["python", "main.py"]
--- a/cmd/suggestion/nas/enas/v1beta1/main.py
+++ b/cmd/suggestion/nas/enas/v1beta1/main.py
@ -1,4 +1,4 @@
-# Copyright 2021 The Kubeflow Authors.
+# Copyright 2022 The Kubeflow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@ -12,15 +12,15 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import grpc
-from concurrent import futures
 import time
+from concurrent import futures
+import grpc
-from pkg.apis.manager.v1beta1.python import api_pb2_grpc
 from pkg.apis.manager.health.python import health_pb2_grpc
+from pkg.apis.manager.v1beta1.python import api_pb2_grpc
 from pkg.suggestion.v1beta1.nas.enas.service import EnasService
 _ONE_DAY_IN_SECONDS = 60 * 60 * 24
 DEFAULT_PORT = "0.0.0.0:6789"
--- a/cmd/suggestion/nas/enas/v1beta1/requirements.txt
+++ b/cmd/suggestion/nas/enas/v1beta1/requirements.txt
@ -1,4 +1,5 @@
-grpcio==1.23.0
-protobuf==3.9.1
+grpcio>=1.64.1
 googleapis-common-protos==1.6.0
-tensorflow==1.15.4
+cython>=0.29.24
+tensorflow==2.16.1
+protobuf>=4.21.12,<5
--- a/cmd/suggestion/optuna/v1beta1/Dockerfile
+++ b/cmd/suggestion/optuna/v1beta1/Dockerfile
@ -1,31 +1,24 @@
-FROM python:3.9
+FROM python:3.11-slim
+ARG TARGETARCH
 ENV TARGET_DIR /opt/katib
 ENV SUGGESTION_DIR cmd/suggestion/optuna/v1beta1
+ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
-RUN if [ "$(uname -m)" = "ppc64le" ] || [ "$(uname -m)" = "aarch64" ]; then \
+RUN if [ "${TARGETARCH}" = "ppc64le" ]; then \
    apt-get -y update && \
-    apt-get -y install gfortran libopenblas-dev liblapack-dev; \
+    apt-get -y install gfortran libopenblas-dev liblapack-dev && \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*; \
    fi
-RUN GRPC_HEALTH_PROBE_VERSION=v0.3.1 && \
-    if [ "$(uname -m)" = "ppc64le" ]; then \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-ppc64le; \
-    elif [ "$(uname -m)" = "aarch64" ]; then \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-arm64; \
-    else \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64; \
-    fi && \
-    chmod +x /bin/grpc_health_probe
 ADD ./pkg/ ${TARGET_DIR}/pkg/
 ADD ./${SUGGESTION_DIR}/ ${TARGET_DIR}/${SUGGESTION_DIR}/
-WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
-RUN pip install --no-cache-dir -r requirements.txt
+WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
+RUN pip install --prefer-binary --no-cache-dir -r requirements.txt
 RUN chgrp -R 0 ${TARGET_DIR} \
  && chmod -R g+rwX ${TARGET_DIR}
-ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
 ENTRYPOINT ["python", "main.py"]
--- a/cmd/suggestion/optuna/v1beta1/main.py
+++ b/cmd/suggestion/optuna/v1beta1/main.py
@ -1,4 +1,4 @@
-# Copyright 2021 The Kubeflow Authors.
+# Copyright 2022 The Kubeflow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@ -12,13 +12,15 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import grpc
 import time
-from pkg.apis.manager.v1beta1.python import api_pb2_grpc
-from pkg.apis.manager.health.python import health_pb2_grpc
-from pkg.suggestion.v1beta1.optuna.service import OptunaService
 from concurrent import futures
+import grpc
+from pkg.apis.manager.health.python import health_pb2_grpc
+from pkg.apis.manager.v1beta1.python import api_pb2_grpc
+from pkg.suggestion.v1beta1.optuna.service import OptunaService
 _ONE_DAY_IN_SECONDS = 60 * 60 * 24
 DEFAULT_PORT = "0.0.0.0:6789"
--- a/cmd/suggestion/optuna/v1beta1/requirements.txt
+++ b/cmd/suggestion/optuna/v1beta1/requirements.txt
@ -1,4 +1,4 @@
-grpcio==1.39.0
+grpcio>=1.64.1
-protobuf==3.17.3
+protobuf>=4.21.12,<5
 googleapis-common-protos==1.53.0
-optuna>=2.8.0
+optuna==3.3.0
--- a/cmd/suggestion/pbt/v1beta1/Dockerfile
+++ b/cmd/suggestion/pbt/v1beta1/Dockerfile
@ -0,0 +1,24 @@
 FROM python:3.11-slim
 ARG TARGETARCH
 ENV TARGET_DIR /opt/katib
 ENV SUGGESTION_DIR cmd/suggestion/pbt/v1beta1
 ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
 RUN if [ "${TARGETARCH}" = "ppc64le" ]; then \
    apt-get -y update && \
    apt-get -y install gfortran libopenblas-dev liblapack-dev && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*; \
    fi
 ADD ./pkg/ ${TARGET_DIR}/pkg/
 ADD ./${SUGGESTION_DIR}/ ${TARGET_DIR}/${SUGGESTION_DIR}/
 WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
 RUN pip install --prefer-binary --no-cache-dir -r requirements.txt
 RUN chgrp -R 0 ${TARGET_DIR} \
  && chmod -R g+rwX ${TARGET_DIR}
 ENTRYPOINT ["python", "main.py"]
--- a/cmd/suggestion/chocolate/v1beta1/main.py
+++ b/cmd/suggestion/pbt/v1beta1/main.py
@ -1,4 +1,4 @@
-# Copyright 2021 The Kubeflow Authors.
+# Copyright 2022 The Kubeflow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@ -12,22 +12,25 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import grpc
 import time
-from pkg.apis.manager.v1beta1.python import api_pb2_grpc
-from pkg.apis.manager.health.python import health_pb2_grpc
-from pkg.suggestion.v1beta1.chocolate.service import ChocolateService
 from concurrent import futures
+import grpc
+from pkg.apis.manager.health.python import health_pb2_grpc
+from pkg.apis.manager.v1beta1.python import api_pb2_grpc
+from pkg.suggestion.v1beta1.pbt.service import PbtService
 _ONE_DAY_IN_SECONDS = 60 * 60 * 24
 DEFAULT_PORT = "0.0.0.0:6789"
 def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
-    service = ChocolateService()
+    service = PbtService()
    api_pb2_grpc.add_SuggestionServicer_to_server(service, server)
    health_pb2_grpc.add_HealthServicer_to_server(service, server)
    server.add_insecure_port(DEFAULT_PORT)
    print("Listening...")
    server.start()
--- a/cmd/suggestion/pbt/v1beta1/requirements.txt
+++ b/cmd/suggestion/pbt/v1beta1/requirements.txt
@ -0,0 +1,4 @@
 grpcio>=1.64.1
 protobuf>=4.21.12,<5
 googleapis-common-protos==1.53.0
 numpy==1.25.2
--- a/cmd/suggestion/skopt/v1beta1/Dockerfile
+++ b/cmd/suggestion/skopt/v1beta1/Dockerfile
@ -1,32 +1,24 @@
-FROM python:3.6
+FROM python:3.10-slim
+ARG TARGETARCH
 ENV TARGET_DIR /opt/katib
 ENV SUGGESTION_DIR cmd/suggestion/skopt/v1beta1
+ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
-RUN if [ "$(uname -m)" = "ppc64le" ] || [ "$(uname -m)" = "aarch64" ]; then \
+RUN if [ "${TARGETARCH}" = "ppc64le" ]; then \
    apt-get -y update && \
    apt-get -y install gfortran libopenblas-dev liblapack-dev && \
-    pip install cython; \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*; \
    fi
-RUN GRPC_HEALTH_PROBE_VERSION=v0.3.1 && \
-    if [ "$(uname -m)" = "ppc64le" ]; then \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-ppc64le; \
-    elif [ "$(uname -m)" = "aarch64" ]; then \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-arm64; \
-    else \
-    wget -qO/bin/grpc_health_probe https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64; \
-    fi && \
-    chmod +x /bin/grpc_health_probe
 ADD ./pkg/ ${TARGET_DIR}/pkg/
 ADD ./${SUGGESTION_DIR}/ ${TARGET_DIR}/${SUGGESTION_DIR}/
-WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
-RUN pip install --no-cache-dir -r requirements.txt
+WORKDIR  ${TARGET_DIR}/${SUGGESTION_DIR}
+RUN pip install --prefer-binary --no-cache-dir -r requirements.txt
 RUN chgrp -R 0 ${TARGET_DIR} \
  && chmod -R g+rwX ${TARGET_DIR}
-ENV PYTHONPATH ${TARGET_DIR}:${TARGET_DIR}/pkg/apis/manager/v1beta1/python:${TARGET_DIR}/pkg/apis/manager/health/python
 ENTRYPOINT ["python", "main.py"]
--- a/cmd/suggestion/skopt/v1beta1/main.py
+++ b/cmd/suggestion/skopt/v1beta1/main.py
@ -1,4 +1,4 @@
-# Copyright 2021 The Kubeflow Authors.
+# Copyright 2022 The Kubeflow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@ -12,13 +12,15 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import grpc
 import time
-from pkg.apis.manager.v1beta1.python import api_pb2_grpc
-from pkg.apis.manager.health.python import health_pb2_grpc
-from pkg.suggestion.v1beta1.skopt.service import SkoptService
 from concurrent import futures
+import grpc
+from pkg.apis.manager.health.python import health_pb2_grpc
+from pkg.apis.manager.v1beta1.python import api_pb2_grpc
+from pkg.suggestion.v1beta1.skopt.service import SkoptService
 _ONE_DAY_IN_SECONDS = 60 * 60 * 24
 DEFAULT_PORT = "0.0.0.0:6789"
--- a/cmd/suggestion/skopt/v1beta1/requirements.txt
+++ b/cmd/suggestion/skopt/v1beta1/requirements.txt
@ -1,9 +1,13 @@
-grpcio==1.23.0
+grpcio>=1.64.1
 cloudpickle==0.5.6
-numpy>=1.13.3
-scikit-learn==0.22.0
-scipy>=0.19.1
+# This is a workaround to avoid the following error.
+# AttributeError: module 'numpy' has no attribute 'int'
+# See more: https://github.com/numpy/numpy/pull/22607
+numpy==1.23.5
+scikit-learn>=0.24.0, <=1.3.0
+scipy>=1.5.4
 forestci==0.3
-protobuf==3.9.1
+protobuf>=4.21.12,<5
 googleapis-common-protos==1.6.0
-scikit-optimize==0.5.2
+scikit-optimize>=0.9.0
+cython>=0.29.24
--- a/cmd/ui/v1beta1/Dockerfile
+++ b/cmd/ui/v1beta1/Dockerfile
@ -1,15 +1,56 @@
-# Build the Katib UI.
-FROM node:12.18.1 AS npm-build
-# Build frontend.
-ADD /pkg/ui/v1beta1/frontend /frontend
-RUN cd /frontend && npm ci
-RUN cd /frontend && npm run build
-RUN rm -rf /frontend/node_modules
-# Build backend.
+# --- Clone the kubeflow/kubeflow code ---
+FROM alpine/git AS fetch-kubeflow-kubeflow
+WORKDIR /kf
+COPY ./pkg/ui/v1beta1/frontend/COMMIT ./
+RUN git clone https://github.com/kubeflow/kubeflow.git && \
+    COMMIT=$(cat ./COMMIT) && \
+    cd kubeflow && \
+    git checkout $COMMIT
+# --- Build the frontend kubeflow library ---
+FROM node:16-alpine AS frontend-kubeflow-lib
+WORKDIR /src
+ARG LIB=/kf/kubeflow/components/crud-web-apps/common/frontend/kubeflow-common-lib
+COPY --from=fetch-kubeflow-kubeflow $LIB/package*.json ./
+RUN npm config set fetch-retry-mintimeout 200000 && \
+    npm config set fetch-retry-maxtimeout 1200000 && \
+    npm config get registry && \
+    npm config set registry https://registry.npmjs.org/ && \
+    npm config delete https-proxy && \
+    npm config set loglevel verbose && \
+    npm cache clean --force && \
+    npm ci --force --prefer-offline --no-audit
+COPY --from=fetch-kubeflow-kubeflow $LIB/ ./
+RUN npm run build
+# --- Build the frontend ---
+FROM node:16-alpine AS frontend
+WORKDIR /src
+COPY ./pkg/ui/v1beta1/frontend/package*.json ./
+RUN npm config set fetch-retry-mintimeout 200000 && \
+    npm config set fetch-retry-maxtimeout 1200000 && \
+    npm config get registry && \
+    npm config set registry https://registry.npmjs.org/ && \
+    npm config delete https-proxy && \
+    npm config set loglevel verbose && \
+    npm cache clean --force && \
+    npm ci --force --prefer-offline --no-audit
+COPY ./pkg/ui/v1beta1/frontend/ .
+COPY --from=frontend-kubeflow-lib /src/dist/kubeflow/ ./node_modules/kubeflow/
+RUN npm run build:prod
+# --- Build the backend ---
 FROM golang:alpine AS go-build
+ARG TARGETARCH
 WORKDIR /go/src/github.com/kubeflow/katib
 # Download packages.
@ -22,17 +63,11 @@ COPY cmd/ cmd/
 COPY pkg/ pkg/
 # Build the binary.
-RUN if [ "$(uname -m)" = "ppc64le" ]; then \
+RUN CGO_ENABLED=0 GOOS=linux GOARCH=${TARGETARCH} go build -a -o katib-ui  ./cmd/ui/v1beta1
    CGO_ENABLED=0 GOOS=linux GOARCH=ppc64le go build -a -o katib-ui  ./cmd/ui/v1beta1; \
    elif [ "$(uname -m)" = "aarch64" ]; then \
    CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -a -o katib-ui  ./cmd/ui/v1beta1; \
    else \
    CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -a -o katib-ui  ./cmd/ui/v1beta1; \
    fi
-# Copy the backend and frontend into a thin image.
+# --- Compose the web app ---
-FROM alpine:3.7
+FROM alpine:3.15
 WORKDIR /app
 COPY --from=go-build /go/src/github.com/kubeflow/katib/katib-ui /app/
-COPY --from=npm-build /frontend/build /app/build
+COPY --from=frontend /src/dist/static /app/build/static/
 ENTRYPOINT ["./katib-ui"]
--- a/cmd/ui/v1beta1/main.go
+++ b/cmd/ui/v1beta1/main.go
@ -1,5 +1,5 @@
 /*
-Copyright 2021 The Kubeflow Authors.
+Copyright 2022 The Kubeflow Authors.
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@ -33,7 +33,7 @@ var (
 )
 func init() {
-	port = flag.String("port", "80", "The port to listen to for incoming HTTP connections")
+	port = flag.String("port", "8080", "The port to listen to for incoming HTTP connections")
 	host = flag.String("host", "0.0.0.0", "The host to listen to for incoming HTTP connections")
 	buildDir = flag.String("build-dir", "/app/build", "The dir of frontend")
 	dbManagerAddr = flag.String("db-manager-address", common_v1beta1.GetDBManagerAddr(), "The address of Katib DB manager")
@ -45,17 +45,17 @@ func main() {
 	log.Printf("Serving the frontend dir %s", *buildDir)
 	frontend := http.FileServer(http.Dir(*buildDir))
-	http.Handle("/katib/", http.StripPrefix("/katib/", frontend))
+	http.HandleFunc("/katib/", kuh.ServeIndex(*buildDir))
 	http.Handle("/katib/static/", http.StripPrefix("/katib/", frontend))
-	http.HandleFunc("/katib/fetch_experiments/", kuh.FetchAllExperiments)
+	http.HandleFunc("/katib/fetch_experiments/", kuh.FetchExperiments)
-	http.HandleFunc("/katib/submit_yaml/", kuh.SubmitYamlJob)
+	http.HandleFunc("/katib/create_experiment/", kuh.CreateExperiment)
 	http.HandleFunc("/katib/submit_hp_job/", kuh.SubmitParamsJob)
 	http.HandleFunc("/katib/submit_nas_job/", kuh.SubmitParamsJob)
 	http.HandleFunc("/katib/delete_experiment/", kuh.DeleteExperiment)
 	http.HandleFunc("/katib/fetch_experiment/", kuh.FetchExperiment)
 	http.HandleFunc("/katib/fetch_trial/", kuh.FetchTrial)
 	http.HandleFunc("/katib/fetch_suggestion/", kuh.FetchSuggestion)
 	http.HandleFunc("/katib/fetch_hp_job_info/", kuh.FetchHPJobInfo)
@ -67,6 +67,7 @@ func main() {
 	http.HandleFunc("/katib/edit_template/", kuh.EditTemplate)
 	http.HandleFunc("/katib/delete_template/", kuh.DeleteTemplate)
 	http.HandleFunc("/katib/fetch_namespaces", kuh.FetchNamespaces)
 	http.HandleFunc("/katib/fetch_trial_logs/", kuh.FetchTrialLogs)
 	log.Printf("Serving at %s:%s", *host, *port)
 	if err := http.ListenAndServe(fmt.Sprintf("%s:%s", *host, *port), nil); err != nil {
--- a/conformance/run.sh
+++ b/conformance/run.sh
@ -0,0 +1,13 @@
 #!/bin/sh
 # Run conformance test and generate test report.
 python test/e2e/v1beta1/scripts/gh-actions/run-e2e-experiment.py --experiment-path examples/v1beta1/hp-tuning/random.yaml --namespace kf-conformance \
 --trial-pod-labels '{"sidecar.istio.io/inject": "false"}' | tee /tmp/katib-conformance.log
 # Create the done file.
 touch /tmp/katib-conformance.done
 echo "Done..."
 # Keep the container running so the test logs can be downloaded.
 while true; do sleep 10000; done
--- a/docs/README.md
+++ b/docs/README.md
@ -0,0 +1,5 @@
 # Katib Documentation
 Welcome to Kubeflow Katib!
 The Katib documentation is available on [kubeflow.org](https://www.kubeflow.org/docs/components/katib/).
--- a/docs/developer-guide.md
+++ b/docs/developer-guide.md
@ -1,150 +0,0 @@
 # Table of Contents
 - [Table of Contents](#table-of-contents)
 - [Developer Guide](#developer-guide)
  - [Requirements](#requirements)
  - [Build from source code](#build-from-source-code)
  - [Modify controller APIs](#modify-controller-apis)
  - [Controller Flags](#controller-flags)
  - [Workflow design](#workflow-design)
  - [Katib admission webhooks](#katib-admission-webhooks)
    - [Katib cert generator](#katib-cert-generator)
  - [Implement a new algorithm and use it in Katib](#implement-a-new-algorithm-and-use-it-in-katib)
  - [Algorithm settings documentation](#algorithm-settings-documentation)
  - [Katib UI documentation](#katib-ui-documentation)
  - [Design proposals](#design-proposals)
 Created by [gh-md-toc](https://github.com/ekalinin/github-markdown-toc)
 # Developer Guide
 This developer guide is for people who want to contribute to the Katib project.
 If you're interested in using Katib in your machine learning project,
 see the following user guides:
 - [Concepts](https://www.kubeflow.org/docs/components/katib/overview/)
  in Katib, hyperparameter tuning, and neural architecture search.
 - [Getting started with Katib](https://kubeflow.org/docs/components/katib/hyperparameter/).
 - Detailed guide to [configuring and running a Katib
  experiment](https://kubeflow.org/docs/components/katib/experiment/).
 ## Requirements
 - [Go](https://golang.org/) (1.13 or later)
 - [Docker](https://docs.docker.com/) (17.05 or later)
 - [kustomize](https://kustomize.io/) (3.2 or later)
 ## Build from source code
 Build Katib from the source code as follows:
 ```bash
 make build REGISTRY=<image-registry> TAG=<image-tag>
 ```
 To use your custom images for the Katib components, modify the
 [Kustomization file](https://github.com/kubeflow/katib/blob/master/manifests/v1beta1/installs/katib-standalone/kustomization.yaml)
 and the [Katib config patch](https://github.com/kubeflow/katib/blob/master/manifests/v1beta1/installs/katib-standalone/katib-config-patch.yaml).
 You can deploy Katib v1beta1 manifests into a k8s cluster as follows:
 ```bash
 make deploy
 ```
 You can undeploy Katib v1beta1 manifests from a k8s cluster as follows:
 ```bash
 make undeploy
 ```
 ## Modify controller APIs
 If you want to modify Katib controller APIs, you have to
 generate deepcopy, clientset, listers, informers, open-api and Python SDK with the changed APIs.
 You can update the necessary files as follows:
 ```bash
 make generate
 ```
 ## Controller Flags
 Below is a list of command-line flags accepted by Katib controller:
 | Name                            | Type                      | Default   | Description                                                                                                            |
 | ------------------------------- | ------------------------- | --------- | ---------------------------------------------------------------------------------------------------------------------- |
 | enable-grpc-probe-in-suggestion | bool                      | true      | Enable grpc probe in suggestions                                                                                       |
 | experiment-suggestion-name      | string                    | "default" | The implementation of suggestion interface in experiment controller                                                    |
 | metrics-addr                    | string                    | ":8080"   | The address the metric endpoint binds to                                                                               |
 | trial-resources                 | []schema.GroupVersionKind | null      | The list of resources that can be used as trial template, in the form: Kind.version.group (e.g. TFJob.v1.kubeflow.org) |
 | webhook-inject-securitycontext  | bool                      | false     | Inject the securityContext of container[0] in the sidecar                                                              |
 | webhook-port                    | int                       | 8443      | The port number to be used for admission webhook server                                                                |
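The `Kind.version.group` form accepted by `trial-resources` can be illustrated with a short Python sketch. The helper name `parse_gvk` and the `GroupVersionKind` tuple are hypothetical and are not part of Katib; the controller is written in Go and parses this flag with its own code. The sketch assumes that neither the kind nor the version contains a dot, so the first two dot-separated fields are the kind and version and the remainder is the group.

```python
from typing import NamedTuple


class GroupVersionKind(NamedTuple):
    group: str
    version: str
    kind: str


def parse_gvk(value: str) -> GroupVersionKind:
    """Split a 'Kind.version.group' string into its parts (hypothetical helper)."""
    # Split on the first two dots only, because the group itself contains dots.
    parts = value.split(".", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected Kind.version.group, got {value!r}")
    kind, version, group = parts
    return GroupVersionKind(group=group, version=version, kind=kind)


gvk = parse_gvk("TFJob.v1.kubeflow.org")
print(gvk)
```

Splitting at most twice keeps a dotted group such as `kubeflow.org` intact instead of breaking it into further fields.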
 ## Workflow design
 Please see [workflow-design.md](./workflow-design.md).
 ## Katib admission webhooks
 Katib uses three [Kubernetes admission webhooks](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/).
 1. `validator.experiment.katib.kubeflow.org` -
   [Validating admission webhook](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#validatingadmissionwebhook)
   to validate the Katib Experiment before the creation.
 1. `defaulter.experiment.katib.kubeflow.org` -
   [Mutating admission webhook](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#mutatingadmissionwebhook)
   to set the [default values](../pkg/apis/controller/experiments/v1beta1/experiment_defaults.go)
   in the Katib Experiment before the creation.
 1. `mutator.pod.katib.kubeflow.org` - Mutating admission webhook to inject the metrics
   collector sidecar container into the training pod. Learn more about Katib's
   metrics collector in the
   [Kubeflow documentation](https://www.kubeflow.org/docs/components/katib/experiment/#metrics-collector).
 You can find the YAMLs for the Katib webhooks
 [here](../manifests/v1beta1/components/webhook/webhooks.yaml).
 **Note:** If you are using a private Kubernetes cluster, you have to add a firewall rule
 that allows traffic via `TCP:8443`, and you have to update the master
 plane CIDR source range, in order to use the Katib webhooks.
 ### Katib cert generator
 Katib uses the custom `cert-generator` [Kubernetes Job](https://kubernetes.io/docs/concepts/workloads/controllers/job/)
 to generate certificates for the webhooks.
 Once Katib is deployed in the Kubernetes cluster, the `cert-generator` Job follows these steps:
 - Generate a certificate using [`openssl`](https://www.openssl.org/).
 - Create a Kubernetes [Certificate Signing Request](https://kubernetes.io/docs/reference/access-authn-authz/certificate-signing-requests/)
  to approve and sign the certificate.
 - Create a Kubernetes Secret with the signed certificate. The Secret is named
  `katib-webhook-cert` and has the `cert-generator` Job's `ownerReference`, so it is
  cleaned up once Katib is uninstalled.
  Once the Secret is created, the Katib controller Deployment spawns the Pod,
  since the controller mounts the `katib-webhook-cert` Secret volume.
 - Patch the webhooks with the `CABundle`.
 You can find the `cert-generator` source code [here](../hack/cert-generator.sh).
 ## Implement a new algorithm and use it in Katib
 Please see [new-algorithm-service.md](./new-algorithm-service.md).
 ## Algorithm settings documentation
 Please see [algorithm-settings.md](./algorithm-settings.md).
 ## Katib UI documentation
 Please see [Katib UI README](https://github.com/kubeflow/katib/tree/master/pkg/ui/v1beta1).
 ## Design proposals
 Please see [proposals](./proposals).
--- a/docs/images-location.md
+++ b/docs/images-location.md
@ -0,0 +1,351 @@
 # Katib Images Location
 Here you can find the location for images that are used in Katib.
 ## Katib Components Images
 The following table shows images for the
 [Katib components](https://www.kubeflow.org/docs/components/katib/reference/architecture/#katib-control-plane-components).
 <table>
  <tbody>
    <tr align="center">
      <td>
        <b>Image Name</b>
      </td>
      <td>
        <b>Description</b>
      </td>
      <td>
        <b>Location</b>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/katib-controller</code>
      </td>
      <td>
        Katib Controller
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/tree/master/cmd/katib-controller/v1beta1/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/katib-ui</code>
      </td>
      <td>
        Katib User Interface
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/tree/master/cmd/ui/v1beta1/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/katib-db-manager</code>
      </td>
      <td>
        Katib DB Manager
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/tree/master/cmd/db-manager/v1beta1/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>docker.io/mysql</code>
      </td>
      <td>
        Katib MySQL DB
      </td>
      <td>
        <a href="https://github.com/docker-library/mysql/blob/c506174eab8ae160f56483e8d72410f8f1e1470f/8.0/Dockerfile.debian">Dockerfile</a>
      </td>
    </tr>
  </tbody>
 </table>
 ## Katib Metrics Collectors Images
 The following table shows images for the
 [Katib Metrics Collectors](https://www.kubeflow.org/docs/components/katib/user-guides/metrics-collector/).
 <table>
  <tbody>
    <tr align="center">
      <td>
        <b>Image Name</b>
      </td>
      <td>
        <b>Description</b>
      </td>
      <td>
        <b>Location</b>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/file-metrics-collector</code>
      </td>
      <td>
        File Metrics Collector
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/cmd/metricscollector/v1beta1/file-metricscollector/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/tfevent-metrics-collector</code>
      </td>
      <td>
        TensorFlow Event Metrics Collector
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/cmd/metricscollector/v1beta1/tfevent-metricscollector/Dockerfile">Dockerfile</a>
      </td>
    </tr>
  </tbody>
 </table>
 ## Katib Suggestions and Early Stopping Images
 The following table shows images for the
 [Katib Suggestion services](https://www.kubeflow.org/docs/components/katib/reference/architecture/#suggestion)
 and the [Katib Early Stopping algorithms](https://www.kubeflow.org/docs/components/katib/user-guides/early-stopping/#early-stopping-algorithms).
 <table>
  <tbody>
    <tr align="center">
      <td>
        <b>Image Name</b>
      </td>
      <td>
        <b>Description</b>
      </td>
      <td>
        <b>Location</b>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/suggestion-hyperopt</code>
      </td>
      <td>
        <a href="https://github.com/hyperopt/hyperopt">Hyperopt</a> Suggestion
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/cmd/suggestion/hyperopt/v1beta1/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/suggestion-skopt</code>
      </td>
      <td>
        <a href="https://github.com/scikit-optimize/scikit-optimize">Skopt</a> Suggestion
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/cmd/suggestion/skopt/v1beta1/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/suggestion-optuna</code>
      </td>
      <td>
        <a href="https://github.com/optuna/optuna">Optuna</a> Suggestion
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/cmd/suggestion/optuna/v1beta1/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/suggestion-goptuna</code>
      </td>
      <td>
        <a href="https://github.com/c-bata/goptuna">Goptuna</a> Suggestion
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/cmd/suggestion/goptuna/v1beta1/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/suggestion-hyperband</code>
      </td>
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#hyperband">Hyperband</a> Suggestion
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/cmd/suggestion/hyperband/v1beta1/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/suggestion-enas</code>
      </td>
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#enas">ENAS</a> Suggestion
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/cmd/suggestion/nas/enas/v1beta1/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/suggestion-darts</code>
      </td>
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/experiment/#differentiable-architecture-search-darts">DARTS</a> Suggestion
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/cmd/suggestion/nas/darts/v1beta1/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/earlystopping-medianstop</code>
      </td>
      <td>
        <a href="https://www.kubeflow.org/docs/components/katib/early-stopping/#median-stopping-rule">Median Stopping Rule</a>
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/cmd/earlystopping/medianstop/v1beta1/Dockerfile">Dockerfile</a>
      </td>
    </tr>
  </tbody>
 </table>
 ## Training Containers Images
 The following table shows images for training containers which are used in the
 [Katib Trials](https://www.kubeflow.org/docs/components/katib/reference/architecture/#trial).
 <table>
  <tbody>
    <tr align="center">
      <td>
        <b>Image Name</b>
      </td>
      <td>
        <b>Description</b>
      </td>
      <td>
        <b>Location</b>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/pytorch-mnist-cpu</code>
      </td>
      <td>
        PyTorch MNIST example that prints metrics to a file or StdOut, with CPU support
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/examples/v1beta1/trial-images/pytorch-mnist/Dockerfile.cpu">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/pytorch-mnist-gpu</code>
      </td>
      <td>
        PyTorch MNIST example that prints metrics to a file or StdOut, with GPU support
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/examples/v1beta1/trial-images/pytorch-mnist/Dockerfile.gpu">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/tf-mnist-with-summaries</code>
      </td>
      <td>
        TensorFlow MNIST example with saving metrics in the summaries
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/examples/v1beta1/trial-images/tf-mnist-with-summaries/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/xgboost-lightgbm</code>
      </td>
      <td>
        Distributed LightGBM example for XGBoostJob
      </td>
      <td>
        <a href="https://github.com/kubeflow/xgboost-operator/blob/9c8c97d0125a8156f12b8ef5b93f99e709fb57ea/config/samples/lightgbm-dist/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>docker.io/kubeflow/mpi-horovod-mnist</code>
      </td>
      <td>
        Distributed Horovod example for MPIJob
      </td>
      <td>
        <a href="https://github.com/kubeflow/mpi-operator/blob/947d396a9caf70d3c94bf587d5e5da32b70f0f53/examples/horovod/Dockerfile.cpu">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>docker.io/inaccel/jupyter:lab</code>
      </td>
      <td>
        FPGA XGBoost with parameter tuning
      </td>
      <td>
        <a href="https://github.com/inaccel/jupyter/blob/master/lab/Dockerfile">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/enas-cnn-cifar10-gpu</code>
      </td>
      <td>
        Keras CIFAR-10 CNN example for ENAS with GPU support
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/examples/v1beta1/trial-images/enas-cnn-cifar10/Dockerfile.gpu">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/enas-cnn-cifar10-cpu</code>
      </td>
      <td>
        Keras CIFAR-10 CNN example for ENAS with CPU support
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/examples/v1beta1/trial-images/enas-cnn-cifar10/Dockerfile.cpu">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/darts-cnn-cifar10-gpu</code>
      </td>
      <td>
        PyTorch CIFAR-10 CNN example for DARTS with GPU support
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/examples/v1beta1/trial-images/darts-cnn-cifar10/Dockerfile.gpu">Dockerfile</a>
      </td>
    </tr>
    <tr align="center">
      <td>
        <code>ghcr.io/kubeflow/katib/darts-cnn-cifar10-cpu</code>
      </td>
      <td>
        PyTorch CIFAR-10 CNN example for DARTS with CPU support
      </td>
      <td>
        <a href="https://github.com/kubeflow/katib/blob/master/examples/v1beta1/trial-images/darts-cnn-cifar10/Dockerfile.cpu">Dockerfile</a>
      </td>
    </tr>
 </table>
--- a/docs/images/SystemFlow.png
+++ b/docs/images/SystemFlow.png
--- a/docs/images/katib-ui.png
+++ b/docs/images/katib-ui.png
--- a/docs/images/katib-workflow.png
+++ b/docs/images/katib-workflow.png
--- a/docs/new-algorithm-service.md
+++ b/docs/new-algorithm-service.md
@ -1,235 +0,0 @@
 # Document about how to add a new algorithm in Katib
 ## Implement a new algorithm and use it in Katib
 ### Implement the algorithm
 The design of Katib follows the `ask-and-tell` pattern:
 > They often follow a pattern a bit like this: 1. ask for a new set of parameters 1. walk to the Experiment and program in the new parameters 1. observe the outcome of running the Experiment 1. walk back to your laptop and tell the optimizer about the outcome 1. go to step 1
 When an Experiment is created, one algorithm service is created for it. Katib then asks for new sets of parameters via the `GetSuggestions` GRPC call, creates new trials from those sets, and observes the outcome. When the trials finish, Katib reports their metrics to the algorithm and asks for another batch of parameter sets.
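The ask-and-tell loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Katib code: `RandomOptimizer`, `run_trial`, and the objective function are all hypothetical, and a plain function call stands in for the `GetSuggestions` GRPC round trip and the trial execution.

```python
import random


class RandomOptimizer:
    """Toy ask-and-tell optimizer that samples uniformly (illustration only)."""

    def __init__(self, low, high, seed=0):
        self.low = low
        self.high = high
        self.rng = random.Random(seed)
        self.history = []  # (parameter, metric) pairs reported via tell()

    def ask(self, count):
        # Propose `count` new parameter values.
        return [self.rng.uniform(self.low, self.high) for _ in range(count)]

    def tell(self, parameter, metric):
        # Record the observed outcome of one finished trial.
        self.history.append((parameter, metric))

    def best(self):
        # Return the (parameter, metric) pair with the largest metric.
        return max(self.history, key=lambda pair: pair[1])


def run_trial(x):
    # Stand-in for a training job; the metric peaks at x == 2.
    return -(x - 2.0) ** 2


optimizer = RandomOptimizer(low=0.0, high=5.0)
for _ in range(10):                      # each round stands in for one suggestion request
    for x in optimizer.ask(count=3):     # ask for new parameter sets
        optimizer.tell(x, run_trial(x))  # run the trial and report its metric

print(optimizer.best())
```

A real algorithm would use the reported history inside `ask` to choose better candidates; the random sampler ignores it, which keeps the control flow easy to follow.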
 The new algorithm needs to implement `Suggestion` service defined in [api.proto](../pkg/apis/manager/v1beta1/api.proto). One sample algorithm looks like:
 ```python
 from pkg.apis.manager.v1beta1.python import api_pb2
 from pkg.apis.manager.v1beta1.python import api_pb2_grpc
 from pkg.suggestion.v1beta1.internal.search_space import HyperParameter, HyperParameterSearchSpace
 from pkg.suggestion.v1beta1.internal.trial import Trial, Assignment
 from pkg.suggestion.v1beta1.hyperopt.base_service import BaseHyperoptService
 from pkg.suggestion.v1beta1.base_health_service import HealthServicer
 # Inherit SuggestionServicer and implement GetSuggestions.
 class HyperoptService(
        api_pb2_grpc.SuggestionServicer, HealthServicer):
    def ValidateAlgorithmSettings(self, request, context):
        # Optional, it is used to validate algorithm settings defined by users.
        pass
    def GetSuggestions(self, request, context):
        # Convert the Experiment in GRPC request to the search space.
        # search_space example:
        #   HyperParameterSearchSpace(
        #       goal: MAXIMIZE,
        #       params: [HyperParameter(name: param-1, type: INTEGER, min: 1, max: 5, step: 0),
        #                HyperParameter(name: param-2, type: CATEGORICAL, list: cat1, cat2, cat3),
        #                HyperParameter(name: param-3, type: DISCRETE, list: 3, 2, 6),
        #                HyperParameter(name: param-4, type: DOUBLE, min: 1, max: 5, step: )]
        #   )
        search_space = HyperParameterSearchSpace.convert(request.experiment)
        # Convert the trials in GRPC request to the trials in algorithm side.
        # trials example:
        #   [Trial(
        #       assignment: [Assignment(name=param-1, value=2),
        #                    Assignment(name=param-2, value=cat1),
        #                    Assignment(name=param-3, value=2),
        #                    Assignment(name=param-4, value=3.44)],
        #       target_metric: Metric(name="metric-2" value="5643"),
        #       additional_metrics: [Metric(name=metric-1, value=435),
        #                            Metric(name=metric-3, value=5643)],
        #   Trial(
        #       assignment: [Assignment(name=param-1, value=3),
        #                    Assignment(name=param-2, value=cat2),
        #                    Assignment(name=param-3, value=6),
        #                    Assignment(name=param-4, value=4.44)],
        #       target_metric: Metric(name="metric-2" value="3242"),
        #       additional_metrics: [Metric(name=metric-1, value=123),
        #                            Metric(name=metric-3, value=543)],
        trials = Trial.convert(request.trials)
        #--------------------------------------------------------------
        # Your code here
        # Implement the logic to generate new assignments for the given request number.
        # For example, if request.request_number is 2, you should return:
        # [
        #   [Assignment(name=param-1, value=3),
        #    Assignment(name=param-2, value=cat2),
        #    Assignment(name=param-3, value=3),
        #    Assignment(name=param-4, value=3.22)
        #   ],
        #   [Assignment(name=param-1, value=4),
        #    Assignment(name=param-2, value=cat3),
        #    Assignment(name=param-3, value=2),
        #    Assignment(name=param-4, value=4.32)
        #   ],
        # ]
        list_of_assignments = your_logic(search_space, trials, request.request_number)
        #--------------------------------------------------------------
        # Convert list_of_assignments to the GetSuggestionsReply message.
        return api_pb2.GetSuggestionsReply(
            trials=Assignment.generate(list_of_assignments)
        )
 ```
 ### Make a GRPC server for the algorithm
 Create a package under [cmd/suggestion](../cmd/suggestion). Then create the main function and Dockerfile. The new GRPC server should serve in port 6789.
 Here is an example: [cmd/suggestion/hyperopt](../cmd/suggestion/hyperopt).
 Then build the Docker image.
 ### Use the algorithm in Katib
 Update the [Katib config](../manifests/v1beta1/components/controller/katib-config.yaml)
 and [Katib config patch](../manifests/v1beta1/installs/katib-standalone/katib-config-patch.yaml)
 with the new algorithm entity:
 ```diff
  suggestion: |-
    {
      "tpe": {
        "image": "docker.io/kubeflowkatib/suggestion-hyperopt"
      },
      "random": {
        "image": "docker.io/kubeflowkatib/suggestion-hyperopt"
      },
 +     "<new-algorithm-name>": {
 +       "image": "image built in the previous stage"
 +     }
    }
 ```
 Learn more about the Katib config in the
 [Kubeflow documentation](https://www.kubeflow.org/docs/components/katib/katib-config/).
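The config change shown in the diff block above amounts to adding one key to the `suggestion` JSON object. A minimal Python sketch of that edit follows; the helper `add_algorithm`, the algorithm name `my-algorithm`, and the image `example.com/my-suggestion:latest` are placeholders made up for illustration, and the JSON is an abbreviated copy of the snippet above rather than the full Katib config.

```python
import json

# Abbreviated copy of the `suggestion` value from the snippet above.
suggestion = json.loads("""
{
  "tpe": {"image": "docker.io/kubeflowkatib/suggestion-hyperopt"},
  "random": {"image": "docker.io/kubeflowkatib/suggestion-hyperopt"}
}
""")


def add_algorithm(config, name, image):
    """Return a copy of config with a new algorithm entry (hypothetical helper)."""
    if name in config:
        raise KeyError(f"algorithm {name!r} is already registered")
    updated = dict(config)
    updated[name] = {"image": image}
    return updated


# Placeholder name and image; substitute the values for your own algorithm.
updated = add_algorithm(suggestion, "my-algorithm", "example.com/my-suggestion:latest")
print(json.dumps(updated, indent=2))
```

Rejecting an existing name guards against silently overwriting the image of an algorithm that is already configured.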
 ### Contribute the algorithm to Katib
 If you want to contribute the algorithm to Katib, add a unit test and/or an
 e2e test for it in the CI and submit a PR.
 #### Unit Test
 Here is an example [test_hyperopt_service.py](../test/suggestion/v1beta1/test_hyperopt_service.py):
 ```python
 import grpc
 import grpc_testing
 import unittest
 from pkg.apis.manager.v1beta1.python import api_pb2_grpc
 from pkg.apis.manager.v1beta1.python import api_pb2
 from pkg.suggestion.v1beta1.hyperopt.service import HyperoptService
 class TestHyperopt(unittest.TestCase):
    def setUp(self):
        servicers = {
            api_pb2.DESCRIPTOR.services_by_name['Suggestion']: HyperoptService()
        }
        self.test_server = grpc_testing.server_from_dictionary(
            servicers, grpc_testing.strict_real_time())
 if __name__ == '__main__':
    unittest.main()
 ```
 You can set up the GRPC server using `grpc_testing`, then define your own test cases.
 #### E2E Test (Optional)
 E2e tests help Katib verify that the algorithm works well.
 Follow the steps below to add your algorithm (Suggestion) to the Katib CI
 (replace `<name>` with your Suggestion name):
 1. Submit a PR to add a new ECR private registry to the AWS
   [`ECR_Private_Registry_List`](https://github.com/kubeflow/testing/blob/master/aws/IaC/CDK/test-infra/config/static_config/ECR_Resources.py#L18).
   Registry name should follow the pattern: `katib/v1beta1/suggestion-<name>`
 1. Create a new Experiment YAML in the [examples/v1beta1](../examples/v1beta1)
   with the new algorithm.
 1. Update [`setup-katib.sh`](../test/scripts/v1beta1/setup-katib.sh)
   script to modify `katib-config.yaml` with the new test Suggestion image name.
   For example:
   ```sh
   sed -i -e "s@docker.io/kubeflowkatib/suggestion-<name>@${ECR_REGISTRY}/${REPO_NAME}/v1beta1/suggestion-<name>@" ${CONFIG_PATCH}
   ```
 1. Add two new steps in the CI workflow
   ([test/workflows/components/workflows-v1beta1.libsonnet](../test/workflows/components/workflows-v1beta1.libsonnet))
   to build and run the new Suggestion:
 ```diff
 . . .
                  {
                    name: "build-suggestion-hyperopt",
                    template: "build-suggestion-hyperopt",
                  },
                  {
                    name: "build-suggestion-chocolate",
                    template: "build-suggestion-chocolate",
                  },
 +                 {
 +                   name: "build-suggestion-<name>",
 +                   template: "build-suggestion-<name>",
 +                 },
 . . .
                  {
                    name: "run-tpe-e2e-tests",
                    template: "run-tpe-e2e-tests",
                  },
                  {
                    name: "run-grid-e2e-tests",
                    template: "run-grid-e2e-tests",
                  },
 +                 {
 +                   name: "run-<name>-e2e-tests",
 +                   template: "run-<name>-e2e-tests",
 +                 },
 . . .
            $.parts(namespace, name, overrides).e2e(prow_env, bucket).buildTemplate("build-suggestion-hyperopt", kanikoExecutorImage, [
              "/kaniko/executor",
              "--dockerfile=" + katibDir + "/cmd/suggestion/hyperopt/v1beta1/Dockerfile",
              "--context=dir://" + katibDir,
              "--destination=" + registry + "/katib/v1beta1/suggestion-hyperopt:$(PULL_BASE_SHA)",
            ]),  // build suggestion hyperopt
            $.parts(namespace, name, overrides).e2e(prow_env, bucket).buildTemplate("build-suggestion-chocolate", kanikoExecutorImage, [
              "/kaniko/executor",
              "--dockerfile=" + katibDir + "/cmd/suggestion/chocolate/v1beta1/Dockerfile",
              "--context=dir://" + katibDir,
              "--destination=" + registry + "/katib/v1beta1/suggestion-chocolate:$(PULL_BASE_SHA)",
            ]),  // build suggestion chocolate
 +           $.parts(namespace, name, overrides).e2e(prow_env, bucket).buildTemplate("build-suggestion-<name>", kanikoExecutorImage, [
 +             "/kaniko/executor",
 +             "--dockerfile=" + katibDir + "/cmd/suggestion/<name>/v1beta1/Dockerfile",
 +             "--context=dir://" + katibDir,
 +             "--destination=" + registry + "/katib/v1beta1/suggestion-<name>:$(PULL_BASE_SHA)",
 +           ]),  // build suggestion <name>
 . . .
            $.parts(namespace, name, overrides).e2e(prow_env, bucket).buildTemplate("run-tpe-e2e-tests", testWorkerImage, [
              "test/scripts/v1beta1/run-e2e-experiment.sh",
              "examples/v1beta1/tpe-example.yaml",
            ]),  // run TPE algorithm
            $.parts(namespace, name, overrides).e2e(prow_env, bucket).buildTemplate("run-grid-e2e-tests", testWorkerImage, [
              "test/scripts/v1beta1/run-e2e-experiment.sh",
              "examples/v1beta1/grid-example.yaml",
            ]),  // run grid algorithm
 +           $.parts(namespace, name, overrides).e2e(prow_env, bucket).buildTemplate("run-<name>-e2e-tests", testWorkerImage, [
 +             "test/scripts/v1beta1/run-e2e-experiment.sh",
 +             "examples/v1beta1/<name>-example.yaml",
 +           ]),  // run <name> algorithm
 . . .
 ```
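The `sed` substitution in the `setup-katib.sh` step above can be sketched in Python to check what the rewritten image reference looks like before touching the script. This is a minimal illustration, not part of the Katib test tooling: the function name `retarget_suggestion_image`, the sample registry host, and the sample config line are all hypothetical, and the match is done on the literal string rather than as a regular expression.

```python
def retarget_suggestion_image(config_text: str, name: str,
                              ecr_registry: str, repo_name: str) -> str:
    """Point the Suggestion image for `name` at the test registry.

    Hypothetical helper mirroring the sed expression
    s@docker.io/kubeflowkatib/suggestion-<name>@<registry>/<repo>/v1beta1/suggestion-<name>@
    """
    old = f"docker.io/kubeflowkatib/suggestion-{name}"
    new = f"{ecr_registry}/{repo_name}/v1beta1/suggestion-{name}"
    # sed without the `g` flag rewrites only the first match on each line,
    # so the replacement is limited to one occurrence per line here as well.
    return "\n".join(
        line.replace(old, new, 1) for line in config_text.split("\n")
    )


# Assumed sample input; the real katib-config.yaml layout may differ.
sample = "image: docker.io/kubeflowkatib/suggestion-hyperopt:latest\n"
print(retarget_suggestion_image(
    sample, "hyperopt", "registry.example.com", "katib"), end="")
```

The resulting path ends in `katib/v1beta1/suggestion-<name>`, which is the registry naming pattern required in the first step of the list.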
--- a/docs/presentations.md
+++ b/docs/presentations.md
@ -4,16 +4,22 @@ Below are the list of Katib presentations and demos. If you want to add your
 presentation or demo in this list please send a pull request. Please keep the
 list in reverse chronological order.
-| Presentation or Demo title | Presenters | Date |
+| Title | Presenters | Event | Date |
-| --- | --- | --- |
+| --- | --- | --- | --- |
-| [AutoML and Training WG Summit July 2021](https://youtube.com/playlist?list=PL2gwy7BdKoGd9HQBCz1iC7vyFVN7Wa9N2) | Kubeflow Community | 2021-07-16 |
+| [Hiding Kubernetes Complexity for ML Engineers Using Kubeflow](https://docs.google.com/presentation/d/1Fepo9TUgbsO7YpxenCq17Y9KKQU_VgqYjAVBFWAFIU4/edit?usp=sharing) | Andrey Velichkevich | RE-WORK MLOps Summit | 2022-11-10 |
-| [MLREPA 2021: MLOps and AutoML in Cloud-Native Way with Kubeflow and Katib](https://youtu.be/33VJ6KNBBvU) | Andrey Velichkevich | 2021-04-25 |
+| [Managing Thousands of Automatic Machine Learning Experiments with Argo and Katib](https://youtu.be/0jBNXZjQ01I) | Andrey Velichkevich, [Yuan Tang](https://terrytangyuan.github.io/about/) | ArgoCon | 2022-09-21 |
-| [KF Community: A Tour of Katib's new UI for Kubeflow 1.3](https://youtu.be/1DtjB_boWcQ) | Kimonas Sotirchos | 2021-03-30 |
+| [Cloud Native AutoML with Argo Workflows and Katib](https://youtu.be/KjHqmS4gIxM?t=181) | Andrey Velichkevich, Johnu George | Argo Community Meeting | 2022-02-16 |
-| [KF Community: New UI for Kubeflow components](https://youtu.be/OKqx3IS2_G4) | Stefano Fioravanzo | 2020-12-08 |
+| [When Machine Learning Toolkit for Kubernetes Meets PaddlePaddle](https://github.com/terrytangyuan/public-talks/tree/main/talks/when-machine-learning-toolkit-for-kubernetes-meets-paddlepaddle-wave-summit-2021) | [Yuan Tang](https://terrytangyuan.github.io/about/) | Wave Summit | 2021-12-12 |
-| [KF Community: Using Pipelines in Katib](https://youtu.be/BszcHMkGLgc) | Andrey Velichkevich | 2020-11-10 |
+| [Bridging into Python Ecosystem with Cloud-Native Distributed Machine Learning Pipelines](https://github.com/terrytangyuan/public-talks/tree/main/talks/bridging-into-python-ecosystem-with-cloud-native-distributed-machine-learning-pipelines-argocon-2021) | [Yuan Tang](https://terrytangyuan.github.io/about/) | ArgoCon | 2021-12-08 |
-| [KubeCon 2020: From Notebook to Kubeflow Pipelines with HP Tuning](https://youtu.be/QK0NxhyADpM) | Stefano Fioravanzo, Ilias Katsakioris | 2020-09-04 |
+| [Towards Cloud-Native Distributed Machine Learning Pipelines at Scale](https://github.com/terrytangyuan/public-talks/tree/main/talks/towards-cloud-native-distributed-machine-learning-pipelines-at-scale-pydata-global-2021) | [Yuan Tang](https://terrytangyuan.github.io/about/) | PyData | 2021-10-29 |
-| [Kubeflow Dojo: Distributed Training and HPO Deep Dive](https://youtu.be/KJFOlhD3L1E) | Andrew Butler, Qianyang Yu, Tommy Li, Animesh Singh | 2020-07-17 |
+| [AutoML and Training WG Summit July 2021](https://youtube.com/playlist?list=PL2gwy7BdKoGd9HQBCz1iC7vyFVN7Wa9N2) | Kubeflow Community | Kubeflow Summit | 2021-07-16 |
-| [Kubeflow 101: Hyperparameter Tuning with Katib](https://youtu.be/nIKVlosDvrc) | Stephanie Wong | 2020-06-21 |
+| [MLOps and AutoML in Cloud-Native Way with Kubeflow and Katib](https://youtu.be/33VJ6KNBBvU) | Andrey Velichkevich | MLREPA | 2021-04-25 |
-| [KubeCon 2019: Hyperparameter Tuning Using Kubeflow](https://youtu.be/OkAoiA6A2Ac) | Richard Liu, Johnu George | 2019-07-05 |
+| [A Tour of Katib's new UI for Kubeflow 1.3](https://youtu.be/1DtjB_boWcQ) | Kimonas Sotirchos | Kubeflow Community Meeting | 2021-03-30 |
-| [KF Community: Kubeflow Katib & Hyperparameter Tuning](https://youtu.be/1PKH_D6zjoM) | Richard Liu | 2019-03-29 |
+| [New UI for Kubeflow components](https://youtu.be/OKqx3IS2_G4) | Stefano Fioravanzo | Kubeflow Community Meeting | 2020-12-08 |
-| [KF Community: Neural Architecture Search System on Kubeflow](https://youtu.be/WAK37UW7spo) | Andrey Velichkevich, Kirill Prosvirov, Jinan Zhou, Anubhav Garg | 2019-03-26 |
+| [Using Pipelines in Katib](https://youtu.be/BszcHMkGLgc) | Andrey Velichkevich | Kubeflow Community Meeting | 2020-11-10 |
 | [From Notebook to Kubeflow Pipelines with HP Tuning](https://youtu.be/QK0NxhyADpM) | Stefano Fioravanzo, Ilias Katsakioris | KubeCon | 2020-09-04 |
 | [Distributed Training and HPO Deep Dive](https://youtu.be/KJFOlhD3L1E) | Andrew Butler, Qianyang Yu, Tommy Li, Animesh Singh | Kubeflow Dojo | 2020-07-17 |
 | [Hyperparameter Tuning with Katib](https://youtu.be/nIKVlosDvrc) | Stephanie Wong | Kubeflow 101 | 2020-06-21 |
 | [Hyperparameter Tuning Using Kubeflow](https://youtu.be/OkAoiA6A2Ac) | Richard Liu, Johnu George | KubeCon | 2019-07-05 |
 | [Kubeflow Katib & Hyperparameter Tuning](https://youtu.be/1PKH_D6zjoM) | Richard Liu | Kubeflow Community Meeting | 2019-03-29 |
 | [Neural Architecture Search System on Kubeflow](https://youtu.be/WAK37UW7spo) | Andrey Velichkevich, Kirill Prosvirov, Jinan Zhou, Anubhav Garg | Kubeflow Community Meeting | 2019-03-26 |
--- a/docs/proposals/1214-custom-crd-in-trial/README.md
+++ b/docs/proposals/1214-custom-crd-in-trial/README.md
@ -1,4 +1,4 @@
-# Support custom CRD in Trial Job proposal
+# KEP-1214: Support custom CRD in Trial Job proposal
 <!-- START doctoc generated TOC please keep comment here to allow auto update -->
 <!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
@ -124,7 +124,7 @@ For example, for TFJob:
 ```yaml
 . . .
 PrimaryPodLabel:
-  "job-role": "master"
+  "training.kubeflow.org/job-role": "master"
 . . .
 ```
@ -180,7 +180,7 @@ SucceededCondition: Succeeded
 Previously, we had problems with Istio sidecar containers,
 check [kubeflow/issue#1081](https://github.com/kubeflow/kubeflow/issues/4742).
 In some cases, it is unable to properly download datasets in training pod.
-It was fixed by adding annotation `sidecar.istio.io/inject: false` to appropriate Trial job in Katib controller.
+It was fixed by adding label `sidecar.istio.io/inject: false` to appropriate Trial job in Katib controller.
 Various CRD can have unified design and it is hard to understand where annotation must be specified
 to disable Istio injection for the running pods.