mirror of https://github.com/tensorflow/models.git
|
…
|
||
|---|---|---|
| .. | ||
| configs | ||
| modeling | ||
| tasks | ||
| utils | ||
| README.md | ||
| data_loader.py | ||
| train.py | ||
README.md
Language Modelling with Pixels (PIXEL)
TF2 implementation of PIXEL.
Setup
The current setup requires a numpyfied pytorch pixel model and preprocessed data. For the pixel model, we directly convert its state_dict and saved as numpy. For the preprocessed data, we run their pytorch implementation and save the pixel transformed data.
Let's put these data in the directory PATH_TO_PIXEL_DATA_DIR, then
, to convert the numpyfied model to a tensorflow checkpoint, run
python3 utils/convert_numpy_weights_to_tf.py $PATH_TO_PIXEL_DATA_DIR
This will create a pixel_encoder.ckpt. Denote the path to this checkpoint as
PATH_TO_PIXEL_ENCODER_CKPT.
Training
export PATH_TO_PIXEL_DATA_DIR=xxx
export PATH_TO_PIXEL_ENCODER_CKPT=xxx
PATH_TO_TRAINING_RECORD=$PATH_TO_PIXEL_DATA_DIR/train.tf_record-*-of-20 # path to the training record
PATH_TO_TESTING_RECORD=$PATH_TO_PIXEL_DATA_DIR/eval.tf_record # path to the evaluation record
TPU_NAME="<tpu-name>" # The name assigned while creating a Cloud TPU
MODEL_DIR=/tmp/pixel_sst2 # directory to store the experiment
# Now launch the experiment.
python3 -m official.projects.pixel.train \
--experiment=pixel_sst2_finetune \
--params_override="task.train_data.input_path=${PATH_TO_TRAINING_RECORD},task.validation_data.input_path=${PATH_TO_TESTING_RECORD},runtime.distribution_strategy=tpu,init_checkpoint=$PATH_TO_PIXEL_ENCODER_CKPT"
--mode=train_and_eval \
--tpu=$TPU_NAME \
--model_dir=$MODEL_DIR