45 Commits

Author SHA1 Message Date
Googler 066f229e27 fix(rlhf): Supporting adapter only output for reward model training
PiperOrigin-RevId: 608740017
2024-02-20 14:27:59 -08:00
Googler fc183f3acb chore(components): Rename several `_implementation.llm` components
PiperOrigin-RevId: 607487816
2024-02-15 16:18:27 -08:00
Michael Hu 449c304686 fix(components): Use PipelineJob location in AutoSxS components, add init file
PiperOrigin-RevId: 607055407
2024-02-14 11:40:06 -08:00
Googler 0b75afdd8a chore(components): Update AutoSxS and RLHF image tags
PiperOrigin-RevId: 606330905
2024-02-12 12:30:46 -08:00
Michael Hu 1280753eb4 chore(components): Use new module for looking up ReFINED and AutoSxS image tags
PiperOrigin-RevId: 605457014
2024-02-08 16:14:35 -08:00
Googler d4c3f35797 feat(components): Add RLAIF pipeline to preview
PiperOrigin-RevId: 605396378
2024-02-08 12:36:38 -08:00
Googler 87db18e3a1 No public description
PiperOrigin-RevId: 605012378
2024-02-07 09:41:59 -08:00
Michael Hu b6247fb8e4 chore(components): Update component naming in AutoSxS implementation
PiperOrigin-RevId: 604678269
2024-02-06 09:49:04 -08:00
Michael Hu 14193def65 chore(components): Create module containing AutoSxS and RLHF image tag
PiperOrigin-RevId: 603765313
2024-02-02 12:58:35 -08:00
Michael Hu 8c7b5b2bf5 feat(components): Use a single inference component for AutoSxS
PiperOrigin-RevId: 601877680
2024-01-26 15:06:46 -08:00
Googler b9e08ded48 fix(components): Only run `preview.llm.bulk_inference` after tuning third-party models with RLHF
PiperOrigin-RevId: 601226133
2024-01-24 13:59:24 -08:00
Googler 60b66dca0f docs(components): Update AutoSxS pipeline to use "question_answering" as task name instead of "question_answer", where "question_answer" is still supported, but deprecated
chore(components): Update RLHF and AutoSxS image tags
PiperOrigin-RevId: 600532640
2024-01-22 12:17:59 -08:00
Googler 4bb3423889 feat(components): Support scheduling and labels in utils.build_payload
PiperOrigin-RevId: 599904659
2024-01-19 12:20:10 -08:00
Googler a66c5990e4 feat(components): Output errors as a separate table from Arbiter
PiperOrigin-RevId: 592769441
2023-12-21 00:03:12 -08:00
Michael Hu 14de087e74 No public description
PiperOrigin-RevId: 592702825
2023-12-20 17:53:15 -08:00
Googler 075d58f89f fix(components): Resolve unique model display name on each `preview.llm.rlhf_pipeline` run instead of reusing cached result
PiperOrigin-RevId: 591365087
2023-12-15 14:48:17 -08:00
Googler f51a930120 fix(components): Use `large_model_reference` as `model_reference_name` when uploading models from `preview.llm.rlhf_pipeline` instead of hardcoding value as `text-bison@001`
PiperOrigin-RevId: 591346782
2023-12-15 13:36:40 -08:00
Googler 3d62d26727 feat(component): Migrate AutoSxS pipeline to preview and move related files to _implementation/llm directory to help Model Eval team use side by side metrics as part of their pipeline
PiperOrigin-RevId: 590677273
2023-12-13 12:20:20 -08:00
Googler 227eab1c68 fix(components): Use `llama-2-7b` for the base reward model when tuning `llama-2-13` with the `preview.llm.rlhf_pipeline`
PiperOrigin-RevId: 589490121
2023-12-09 18:23:01 -08:00
Googler 685634d4a3 feat(components): Add `num_microbatches` to `_implementation.llm` training components
PiperOrigin-RevId: 589267101
2023-12-08 16:07:09 -08:00
Googler 9007fb0007 feat(components): Bump image tag used by `preview.llm` pipelines
PiperOrigin-RevId: 589253163
2023-12-08 15:10:38 -08:00
Googler 708b8bd623 fix(components): Append `tune-type` label when uploading models tuned by `preview.llm.rlhf_pipeline`
PiperOrigin-RevId: 588172611
2023-12-05 13:30:22 -08:00
Googler c23b720f10 feat(components): Group `preview.llm.rlhf_pipeline` components for more readability
PiperOrigin-RevId: 582733817
2023-11-15 10:56:39 -08:00
Googler bcd59220f4 feat(components): Group `preview.llm.rlhf_pipeline` components for more readability
PiperOrigin-RevId: 582394326
2023-11-14 11:52:02 -08:00
Googler a927984394 feat(components): Group `preview.llm.rlhf_pipeline` components for more readability
PiperOrigin-RevId: 581354060
2023-11-10 13:32:31 -08:00
Googler 4a5cbbfb8d feat(components): Update image tag used by RLHF components
PiperOrigin-RevId: 580698126
2023-11-08 16:27:14 -08:00
Googler a8dd3117d5 No public description
PiperOrigin-RevId: 579961750
2023-11-06 14:40:08 -08:00
Googler f67cbfa81f feat(components): Add ability to tune chat model with `preview.llm.rlhf_pipeline`
PiperOrigin-RevId: 578262705
2023-10-31 12:15:30 -07:00
Googler d8f2c140ce feat(components): Add chat dataset preprocessor to `preview.llm.infer_pipeline`
PiperOrigin-RevId: 577988605
2023-10-30 16:20:21 -07:00
Googler 99fd2017a7 feat(components): Add ability to preprocess chat llama datasets to `_implementation.llm.chat_dataset_preprocessor`
PiperOrigin-RevId: 575004978
2023-10-19 14:35:42 -07:00
Googler 0e240db397 No public description
PiperOrigin-RevId: 574969883
2023-10-19 12:38:31 -07:00
Googler 4d71fdac3f feat(components): Update image tag used by llm pipelines
PiperOrigin-RevId: 573014609
2023-10-12 14:35:05 -07:00
Googler 412216f832 feat(components): Add question_answer support for AutoSxS default instructions
PiperOrigin-RevId: 572677918
2023-10-11 13:56:06 -07:00
Googler 067033762d feat(components): internal change
PiperOrigin-RevId: 571455446
2023-10-06 15:56:19 -07:00
Googler b273aabb89 feat(components): Add LLM implementation component that uploads tensorboard metrics after training
PiperOrigin-RevId: 571359958
2023-10-06 09:36:57 -07:00
Googler 45fe8e8658 feat(components): Use 64 v3 TPUs for llm pipelines
PiperOrigin-RevId: 568282755
2023-09-25 11:49:51 -07:00
Googler b31d8a57ef feat(components): Update default image tag used by LLM implementation components
PiperOrigin-RevId: 566661112
2023-09-19 09:56:42 -07:00
Googler 6468b4db11 feat(components): Use t5-xl reward model when tuning t5-xxl
PiperOrigin-RevId: 565809352
2023-09-15 16:43:19 -07:00
Michael Hu 3b8cea060f fix(components): Have RLHF importer use default image if override is falsy
PiperOrigin-RevId: 563521264
2023-09-07 13:03:38 -07:00
Googler e21174f94a feat(components): Add sampling_strategy parameter to bulk inferrer to support different strategy. By default, we use greedy
PiperOrigin-RevId: 562860317
2023-09-05 12:43:17 -07:00
Googler 110e082481 feat(components): Update policy to reward model name mapping in function based component in _implementation/llm
PiperOrigin-RevId: 560479276
2023-08-27 03:45:37 -07:00
Googler 9ce2866527 feat(components): Update supported large model reference names that can be resolved by function based component in _implementation/llm
PiperOrigin-RevId: 559493244
2023-08-23 11:36:15 -07:00
Connor McCarthy 5b59e4a76f chore(components): add `__init__.py` file to gcpc `_implementation/llm`
PiperOrigin-RevId: 558589929
2023-08-20 10:36:23 -07:00
Connor McCarthy 2451b51070 chore(components): use type generics from typing module in GCPC
PiperOrigin-RevId: 558574225
2023-08-20 08:09:38 -07:00
Googler 611298a8ee feat(components): Add rlhf and infer pipelines to preview/llm. Add llm related components to _implementation/llm
PiperOrigin-RevId: 558430903
2023-08-19 11:38:14 -07:00