* Run CI on a single machine by generating cloudbuild files
Instead of running each CI test on a separate Google Cloud machine, generate a cloudbuild.yml file to test all affected packages on a single machine. The benefits of this strategy include the following:
* Fewer machines are required for a single test (2 instead of up to 12), so queuing should happen much less often.
* Packages only need to be built once, instead of once per machine.
* Multithreaded steps have access to more cores.
When CI is run, the root cloudbuild.yml is launched on Google Cloud. This file has only three steps: install yarn dependencies, run `yarn generate-cloudbuild-for-packages`, and run `run-build.sh`. The second step finds which packages were affected by the change and combines their cloudbuild.yml files into a single `cloudbuild_generated.yml` file, and the third step launches the generated build on Google Cloud. The steps in the generated build are set to `waitFor` each other according to the dependency tree in `scripts/package_dependencies.json`, and each package's `build-deps` step is removed in favor of waiting for its dependencies to be built.
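As an illustration, here is a minimal sketch of that merging step; it is not the real `scripts/generate_cloudbuild.js`, and the step shapes, the `-build` id suffix, and the helper names are assumptions:
```js
// Hypothetical sketch: merge per-package cloudbuild steps into one build,
// wiring `waitFor` from the dependency tree. `stepsByPackage` holds each
// package's parsed cloudbuild.yml steps; `deps` is the parsed
// scripts/package_dependencies.json (package -> direct dependencies).
function mergeCloudbuilds(affected, stepsByPackage, deps) {
  const steps = [];
  for (const pkg of affected) {
    // Only wait on dependencies that are themselves part of this build.
    const depBuilds = (deps[pkg] || [])
        .filter(d => affected.includes(d))
        .map(d => `${d}-build`);
    for (const step of stepsByPackage[pkg]) {
      if (step.id === 'build-deps') continue;  // superseded by waitFor below
      steps.push({
        ...step,
        id: `${pkg}-${step.id}`,
        // A step keeps waiting on its package's own earlier steps if it
        // already had a waitFor list; otherwise it waits on the builds of
        // the package's dependencies.
        waitFor: step.waitFor ? step.waitFor.map(id => `${pkg}-${id}`)
                              : depBuilds,
      });
    }
  }
  return {steps};
}
```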
Each package's cloudbuild file still works as expected, and there are two new yarn commands at the root:
* `yarn test-packages` tests all packages affected by a change. It performs the same steps as CI.
* `yarn generate-cloudbuild-for-packages [PACKAGE]...` generates the cloudbuild file to test each `PACKAGE` and packages it might affect. If given no packages, it detects what packages have been changed in the current PR.
* Verify the dependency graph is a valid graph
* Add dependencies to e2e
* Use the node:12 docker image for tfjs-node CI
tfjs-node builds N-API bindings that end-to-end tests need to use. End-to-end uses our custom docker image (gcr.io/learnjs-174218/release) since it has to build tfjs-backend-wasm when it's run by itself (which requires emscripten). Our custom image uses node 12, and is incompatible with node 10 N-API bindings. Eventually, we'll want to move everything that's using node:10 to node:12, but that's out of the scope of this commit.
* Remove TODO and allow changes to generate_cloudbuild.js to trigger a full rebuild
* Add tfjs-backend-cpu to tfjs-data in package_dependencies.json
tfjs-backend-cpu is listed in tfjs-data's package.json as a linked dependency, but was missing from package_dependencies.json.
* Don't have tfjs-backend-webgpu build its dependencies in CI
Move tfjs-backend-webgpu's dependency building to the 'build-deps' step so it can be removed during CI.
This commit also updates the test cloudbuild files that should have been updated in f9c1fdd814. It would probably be best to amend that commit, but last time I tried force-pushing in GitHub, I broke the PR I was working on.
* Revert changes in tfjs-backend-webgpu to master
tfjs-backend-webgpu is now pinned at v2.7.0 for all its tfjs dependencies.
* Add the update-cloudbuild-tests script to package.json
update-cloudbuild-tests updates the change detection tests for scripts/generate_cloudbuild.js. It should be run if and only if a change affects cloudbuild files or the dependency tree.
* Make tfjs-backend-webgpu tests wait for yarn to run
* Add a destination argument to generate_cloudbuild_for_packages.js
Co-authored-by: Ping Yu <4018+pyu10055@users.noreply.github.com>
FEATURE
INTERNAL
* Run CI on a single machine by generating cloudbuild files.
Instead of running each CI test on a separate Google Cloud machine, generate a cloudbuild.yml file to test all affected packages on a single machine. The benefits of this strategy include the following:
* Fewer machines are required for a single test (2 instead of up to 12), so less queueing is required.
* Packages only need to be built once, instead of once per machine.
* Multithreaded steps have access to more cores.
When CI is run, the root cloudbuild.yml is launched on Google Cloud. This file has only two steps: Install yarn dependencies and run `yarn test-affected-packages`. The latter step finds what packages were affected by the change and combines their cloudbuild.yml files into a single `cloudbuild_generated.yml` file. The steps in this file are set to `waitFor` each other according to the dependency tree `scripts/package_dependencies.json`, and each package's `build-deps` step is removed in favor of waiting for its dependencies to be built.
Each package's cloudbuild file still works as expected, and there are a few new yarn commands at the root:
`yarn test-packages` tests all packages affected by a change. It's run during CI.
`yarn test-packages [PACKAGE]...` tests a given set of packages, including all packages they might affect if changed.
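For reference, a minimal sketch of how "affected packages" can be computed; the shape of `scripts/package_dependencies.json` (package -> direct dependencies) is an assumption here:
```js
// Hypothetical sketch: a package is affected if it changed directly or if it
// (transitively) depends on a changed package.
const deps = require('./scripts/package_dependencies.json');

function affectedPackages(changed) {
  const affected = new Set(changed);
  let grew = true;
  while (grew) {
    grew = false;
    for (const [pkg, pkgDeps] of Object.entries(deps)) {
      if (!affected.has(pkg) && pkgDeps.some(d => affected.has(d))) {
        affected.add(pkg);
        grew = true;
      }
    }
  }
  return [...affected];
}

// e.g. affectedPackages(['tfjs-core']) should include every package that
// links against tfjs-core, such as tfjs-backend-cpu and tfjs-data.
```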
* Split generating the cloudbuild from running it
Google Cloud uses a separate image for node and for spawning more gcloud instances. The node image does not have the gcloud command, so we have to generate the cloudbuild file in one step and launch gcloud in another step.
Also fix the `rm` command in find_packages_with_diff.js to remove recursively.
* Add `yarn` to the `waitFor` field of webgpu tests
* Modularize Sum and Softmax. (#4148)
DEV
* Lint files
* Fix package dependencies for inference and vis
* modularize node kernels (#4154)
Modularizes onesLike, zerosLike, unpack/unstack, batchMatMul, slice, isFinite, isInf, IsNan, reciprocal, squaredDifference
DEV
* Enable incremental typescript builds (#4092)
FEATURE
INTERNAL
Enable incremental builds in the root tsconfig.json to increase build speed during development. For example, running `yarn build` in tfjs-backend-cpu takes ~43 seconds without incremental builds and ~30 seconds with incremental builds (after the first build).
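For reference, a minimal sketch of the relevant compiler options; the exact root tsconfig.json may differ, and the build-info path is illustrative:
```json
{
  "compilerOptions": {
    "incremental": true,
    "tsBuildInfoFile": "./.tsbuildinfo"
  }
}
```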
Co-authored-by: Ping Yu <4018+pyu10055@users.noreply.github.com>
* Fix typo in .github/stale.yml (#3642)
INTERNAL
* Address some of the comments
* Re-lint everything with VSCode.
* Hoist consts and use UPPER_SNAKE_CASE.
* Require each step to have an id and add ids to all steps in converter and vis.
* Use argparse for 'generate_cloudbuild_for_packages.js'
* Remove inference, vis, and react-native dependencies in package_dependencies.json
tfjs-inference, vis, and react-native use pinned versions of other tfjs packages, so they do not need to be tested when those pinned packages change.
* Fix generate_cloudbuild_for_packages
generate_cloudbuild_for_packages.js was using argparse functions from a later version of argparse that are not available in 1.0.10.
* [webgl] Modularize batchMatMul, sum. (#4140)
FEATURE
* Add Prod kernel to WASM backend. (#4138)
FEATURE
Co-authored-by: Ann Yuan <annyuan@gmail.com>
* Write tests for generate_cloudbuild.js
Write tests to detect if generate_cloudbuild's output changes. These tests are not unit tests, but they alert us if a change unexpectedly changes the project structure or the generation of cloudbuild files.
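A minimal sketch of such a change-detection test; the entry point, golden file name, and test runner wiring are assumptions, not the actual files in `scripts/`:
```js
const assert = require('assert');
const fs = require('fs');
// Hypothetical import; the real module and its exports may be named differently.
const {generateCloudbuild} = require('./generate_cloudbuild');

describe('generate_cloudbuild', () => {
  it('matches the checked-in golden cloudbuild for tfjs-core', () => {
    const generated = generateCloudbuild(['tfjs-core']);
    const golden = fs.readFileSync('./golden/tfjs-core_cloudbuild.yml', 'utf8');
    // A failure here means the project structure or cloudbuild generation
    // changed; regenerate the goldens with `yarn update-cloudbuild-tests`.
    assert.strictEqual(generated, golden);
  });
});
```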
Co-authored-by: Na Li <linazhao@google.com>
Co-authored-by: Yannick Assogba <yassogba@google.com>
Co-authored-by: Ping Yu <4018+pyu10055@users.noreply.github.com>
Co-authored-by: Rajeshwar Reddy T <43972606+rthadur@users.noreply.github.com>
Co-authored-by: Ann Yuan <annyuan@gmail.com>
Co-authored-by: miaowzhang <miaowzhang@gmail.com>
BREAKING
This moves the webgl backend out of tfjs-core into its own package. It can be used as a peer dependency alongside tfjs-core and is part of our larger work to modularise tfjs.
BREAKING
This moves the javascript cpu backend out of tfjs-core into its own package. It can be used as a peer dependency alongside tfjs-core and is part of our larger work to modularise tfjs.
DEV
1.4.0 release notes here: https://github.com/tensorflow/tfjs/releases/tag/tfjs-v1.4.0
This change:
- Moves the script out of tfjs.
- Adds support for the monorepo. This now has to filter commits for the folder name. In the future we can support changelogs for other directories.
Each of the following gets copied during the "prepare" phase, which runs on a local NPM install (but not when the package is installed as a dependency): https://docs.npmjs.com/misc/scripts (see the sketch after this list)
- tfjs-node/src => tfjs-node-gpu/src
- tfjs-node/binding => tfjs-node-gpu/binding
- tfjs-node/scripts => tfjs-node-gpu/scripts
- tfjs-node/binding.gyp => tfjs-node-gpu/binding.gyp
We do this with a copy because symlinks have the incorrect local directory structure.
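A minimal sketch of that copy, assuming it runs from the tfjs-node-gpu directory; the real prepare script's name and paths may differ:
```js
// Hypothetical prepare-time copy; `cp -r` is used because Node 12's fs has no
// built-in recursive copy.
const {execSync} = require('child_process');

// Copied from the sibling tfjs-node package so tfjs-node-gpu builds against
// the same sources.
const shared = ['src', 'binding', 'scripts', 'binding.gyp'];
for (const entry of shared) {
  execSync(`cp -r ../tfjs-node/${entry} ./${entry}`);
}
```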
This PR also fixes a bug where the node backend file caches the backend. Caching is incorrect because the backend may be switched out during unit tests, or there may accidentally be two copies of tfjs-node installed as dependencies.
DEV
The diff script now clones the current branch and commit, as well as master rewound to the merge base of the current branch and master, so that it only finds changes made after the branch was created (this only applies to branches on the main repo).
For forks, we diff against tensorflow/tfjs:master.
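A minimal sketch of that diff logic (not the actual diff script):
```js
const {execSync} = require('child_process');

// Diff the current commit against the merge base with master, so only
// changes made on this branch are considered.
function changedFiles() {
  const base = execSync('git merge-base master HEAD').toString().trim();
  return execSync(`git diff --name-only ${base} HEAD`)
      .toString().trim().split('\n');
}
```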
Test PRs:
- From a fork:master => https://github.com/tensorflow/tfjs/pull/1993
- From a fork:branch => https://github.com/tensorflow/tfjs/pull/2005
- This PR: from a branch that is behind master.
Note that this PR is behind the tfjs-node version bump from 1.2.8 => 1.2.9.
DEV
DEV
This PR also:
- Fixes the test-integration script by running it as a top-level build (for now just runs against core).
- Adds tfjs-data to the cloudbuild.yml
- Makes tfjs share tslint and tsconfig with the monorepo.
- Adds a directory entry for the tfjs cloudbuild.yml file.
Move core into its own folder, `tfjs-core`, and enable the monorepo build system, where each package's build runs only if the PR touches a file in that package's folder.
DEV
We need to wait for yarn in the parent folder because child directories will look in the parent's node_modules. Not waiting introduces a race condition that makes CI flaky.
DEV
DEV
This makes the directory structure align with the NPM package structure more closely.
This has the added effect of making syncing to google3 much saner (one BUILD rule for each of the top-level NPM packages).
DEV
We do this with a separate CI test, `yarn test-async-backends`, which creates a proxy object around a CPU backend to disallow top-level access to dataSync.
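A minimal sketch of that guard, assuming the harness intercepts the backend's synchronous read path (which `dataSync()` relies on); the wrapper name and error message are illustrative:
```js
// Hypothetical wrapper around the CPU backend used during
// `yarn test-async-backends`; the synchronous read path throws, so any
// top-level dataSync() in a test fails loudly.
function disallowSyncReads(backend) {
  return new Proxy(backend, {
    get(target, prop, receiver) {
      if (prop === 'readSync') {
        throw new Error(
            'Synchronous reads (dataSync) are not allowed in this test run; ' +
            'use data() instead.');
      }
      return Reflect.get(target, prop, receiver);
    }
  });
}
```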
This PR also
- Adds a SYNC_BACKEND predicate which looks up the backend in the registry (and initializes it), checks whether it is sync, and only runs sync tests in those environments (so they won't run in webgpu).
- Removes all the dataSync() calls in the backend_cpu and directly calls readSync (since the tests fail without that).
FEATURE
Some backends can only be initialized in an async way (e.g. WebGPU, Wasm etc). However the current backend registry requires that backends are initialized immediately.
This change relaxes that constraint, allowing `tf.registerBackend(factory)` where `factory` can return either a `KernelBackend`, or a `Promise<KernelBackend>`.
It also introduces `tf.ready(): Promise<void>` which users can call to make sure they await for their backend to initialize. If an op is called without awaiting for the backend to be ready, we throw an Error reminding the user to call `await tf.ready()`.
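A usage sketch of the relaxed registration API; `my-async-backend` and `createBackendAsync()` are illustrative stand-ins (WebGPU and WASM provide real async factories):
```js
const tf = require('@tensorflow/tfjs-core');

// The factory may now return a Promise<KernelBackend>.
tf.registerBackend('my-async-backend', async () => {
  const backend = await createBackendAsync();  // hypothetical async setup
  return backend;
});

async function main() {
  await tf.setBackend('my-async-backend');
  await tf.ready();  // resolves once the backend has finished initializing
  const y = tf.scalar(2).square();
  console.log(await y.data());
}
main();
```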
Also:
- Add CI build and lint for webgpu (no karma yet b/c it requires a custom build of Chrome)
- Allow tests in core to be included selectively in a karma run in webgpu
DEV
Add 2 cloud functions to achieve nightly builds of tfjs-core. If the build fails, we will get an email and a hangouts chat message.
### `trigger_nightly`
Programmatically triggers a Cloud Build on master. This function is called by Cloud Scheduler at 3am every day.
You can also trigger the function manually via the Cloud UI.
Command to re-deploy:
```sh
gcloud functions deploy nightly \
--runtime nodejs8 \
--trigger-topic nightly
```
If a build was triggered by nightly, there is a substitution variable `_NIGHTLY=true`.
You can forward the substitution as the `NIGHTLY` environment variable so the scripts can use it, by specifying `env: ['NIGHTLY=$_NIGHTLY']` in `cloudbuild.yml`.
E.g. `test-integration` uses the `NIGHTLY` bit to always run on nightly.
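For example, a script run by the build could branch on the forwarded variable like this (where exactly this check lives is up to each package):
```js
// NIGHTLY is forwarded from the _NIGHTLY substitution via cloudbuild.yml.
if (process.env.NIGHTLY === 'true') {
  // Run the longer nightly-only suite (e.g. test-integration).
} else {
  // Run the regular presubmit tests.
}
```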
### `send_email`
Sends an email and a chat message with the nightly build status. Every build sends a message to the `cloud-builds` topic with its build information. The `send_email` function is subscribed to that topic and ignores all builds (e.g. builds triggered by pull requests) **except** for the nightly build and sends an email to an internal mailing list with its build status around 3:10am.
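A minimal sketch of that filtering, assuming a Pub/Sub-triggered background function; the actual send_email function's mail and chat calls are omitted:
```js
// Hypothetical handler: Cloud Build publishes each build, base64-encoded,
// to the cloud-builds topic.
exports.send_email = async (pubSubMessage) => {
  const build = JSON.parse(
      Buffer.from(pubSubMessage.data, 'base64').toString());
  const isNightly =
      build.substitutions && build.substitutions._NIGHTLY === 'true';
  if (!isNightly) return;  // ignore PR builds and other triggers
  // ...send the email via Mailgun and post to the Hangouts webhook here.
};
```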
Command to re-deploy:
```sh
gcloud functions deploy send_email \
--runtime nodejs8 \
--stage-bucket learnjs-174218_cloudbuild \
--trigger-topic cloud-builds \
--set-env-vars MAILGUN_API_KEY="[API_KEY_HERE]",HANGOUTS_URL="[URL_HERE]"
```
### The pipeline
The pipeline looks like this:
1) At 3am, Cloud Scheduler writes to `nightly` topic
2) That triggers the `nightly` function, which starts a build programmatically
3) That build runs and writes its status to `cloud-builds` topic
4) That triggers the `send_email` function, which sends email and chat with the build status.
DEV
This adds a CI that measures the impact of the change against master for the bundle size.
The results below are for dropping backend CPU from index.ts.
```
yarn test-bundle-size
~~~~minified bundle~~~~
==> post-gzip
diff: -12.25 K (-10.96%)
master: 111.73 K
change: 99.48 K
==> pre-gzip
diff: -45.68 K (-9.83%)
master: 464.95 K
change: 419.26 K
~~~~unminified bundle~~~~
==> post-gzip
diff: -20.00 K (-8.67%)
master: 230.60 K
change: 210.60 K
==> pre-gzip
diff: -145.45 K (-9.91%)
master: 1468.07 K
change: 1322.62 K
~~~~esm bundle~~~~
==> post-gzip
diff: -12.31 K (-11.05%)
master: 111.39 K
change: 99.08 K
==> pre-gzip
diff: -45.67 K (-9.84%)
master: 464.35 K
change: 418.67 K
Done in 119.29s.
```
DEV
Add ability to visualize our browser bundle.
<img width="780" alt="Screen Shot 2019-04-06 at 1 15 05 PM" src="https://user-images.githubusercontent.com/2294279/55672856-0aa8eb00-586e-11e9-866c-148005669bce.png">
To produce a visualization, call `yarn rollup -c --visualize`; it will output `dist/tf-core.min.js.html` with the breakdown (a sketch of the wiring appears after the details below).
Details:
- Update rollup and its plugins to latest so we have this ability.
- Change the `build-npm.sh` script to visualize so we always publish the breakdown on npm. That way we can visit http://unpkg.com/@tensorflow/tfjs/dist/tf-core.min.js.html and see the breakdown for any published version.
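A minimal sketch of how the flag might be wired up in rollup.config.js; the plugin options and surrounding config are assumptions, not the exact setup:
```js
import visualizer from 'rollup-plugin-visualizer';

const plugins = [/* ...existing plugins... */];
// `yarn rollup -c --visualize` leaves --visualize on process.argv.
if (process.argv.includes('--visualize')) {
  plugins.push(visualizer({
    sourcemap: true,
    filename: 'dist/tf-core.min.js.html',  // the published breakdown
  }));
}

export default {
  input: 'src/index.ts',
  plugins,
  output: {file: 'dist/tf-core.min.js', format: 'umd', name: 'tf'},
};
```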
DEV
Make build logs viewable for those who join our mailing list by writing the logs to a public GCS bucket.
Also add a pull request template that informs people to join our mailing list if they want to see the build logs.