We train AnyControl on MultiGen-20M and synthetic unaligned data. Prepare the training data by following the instructions below.
- (Optional) Install gsutil. Skip this step if you prefer to download MultiGen-20M in a browser.
mkdir tools && cd tools
curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-linux-x86_64.tar.gz
tar -xf google-cloud-cli-linux-x86_64.tar.gz
./google-cloud-sdk/install.sh
./google-cloud-sdk/bin/gcloud init
rm google-cloud-cli-linux-x86_64.tar.gz
cd ..
- (Optional) Install awscli. Skip this step if you have already downloaded Open Images or prefer to download it another way.
pip install awscli
- Install PowerPaint and download the PowerPaint-v1 model. The PowerPaint code is already included in data/PowerPaint at commit e037c3f2ff62e3fc55072ef91a891c85f419a0cb. Please go to PowerPaint for the latest version. To download the PowerPaint-v1 model, run
cd data
conda install git-lfs
git lfs install
git clone https://huggingface.co/JunhaoZhuang/PowerPaint-v1
cd ..
Step 0. Download MultiGen-20M.
cd AnyControl
sh scripts/download_multigen.sh
The folder structure of AnyControl/datasets/MultiGen-20M should be
MultiGen-20M
├── conditions
│ ├── group_0_canny
│ ├── group_0_depth
│ ├── ...
├── images
│ ├── aesthetics_6_plus_0
│ ├── aesthetics_6_plus_1
│ ├── ...
├── json_files
│ ├── aesthetics_plus_all_group_canny_all.json
│ ├── aesthetics_plus_all_group_depth_all.json
│ ├── ...
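Before moving on, it can help to confirm that the download produced the layout shown above. The following is a minimal sketch, not part of the AnyControl scripts: the helper name `missing_subdirs` is hypothetical, and it only checks the three top-level folders named in the tree, not the entries elided with `...`.

```python
from pathlib import Path

# Top-level folders named in the tree above; the "..." entries are not checked.
EXPECTED_SUBDIRS = ("conditions", "images", "json_files")


def missing_subdirs(root, expected=EXPECTED_SUBDIRS):
    """Return the expected subfolder names that are absent under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]
```

For example, `missing_subdirs("datasets/MultiGen-20M")`, run from inside AnyControl, should return an empty list once the download has finished.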
Step 1. Generate .jsonl file.
python scripts/genereate_jsonl.py --dataset MultiGen-20M
Step 0. Download COCO dataset.
cd AnyControl
sh scripts/download_coco.sh
The folder structure of AnyControl/datasets/MSCOCO should be
MSCOCO
├── train2017
├── annotations
│ ├── instances_train2017.json
│ ├── captions_train2017.json
│ ├── ...
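As a quick check that `captions_train2017.json` downloaded intact, the captions can be grouped by image. This is a sketch that assumes the standard COCO captions layout (a top-level `annotations` list whose entries carry `image_id` and `caption`). The function names are hypothetical, and how the AnyControl scripts actually consume this file is not stated here.

```python
import json
from collections import defaultdict


def captions_by_image(captions):
    """Group caption strings by image id from a COCO captions dictionary."""
    grouped = defaultdict(list)
    for ann in captions["annotations"]:
        grouped[ann["image_id"]].append(ann["caption"].strip())
    return dict(grouped)


def load_captions(path):
    """Read a COCO captions JSON file and group its captions by image id."""
    with open(path, encoding="utf-8") as f:
        return captions_by_image(json.load(f))
```

Calling `load_captions("datasets/MSCOCO/annotations/captions_train2017.json")` from inside AnyControl should yield a dictionary with several captions per image id.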
Step 1. Inpainting with PowerPaint.
python scripts/prepare_unaligned_coco.py
Step 2. Extract multiple conditions.
python scripts/prepare_conditions --dataset COCO
Step 3. Generate .jsonl file.
python scripts/genereate_jsonl.py --dataset COCO
Step 0. Download Open Images dataset.
cd AnyControl
sh scripts/download_openimages.sh
The folder structure of AnyControl/datasets/OpenImages should be
OpenImages
├── train
├── train-annotations-object-segmentation.csv
├── oidv7-class-descriptions-boxable.csv
├── ...
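The class description file maps Open Images label ids to readable names, which is handy when inspecting the segmentation annotations. The sketch below assumes each row holds a label id followed by a display name, and it skips a `LabelName` header row if one is present. The function name is hypothetical and is not taken from the AnyControl scripts.

```python
import csv


def read_class_descriptions(lines):
    """Map label ids to display names from an iterable of CSV lines."""
    mapping = {}
    for row in csv.reader(lines):
        # Skip blank or short rows and an optional header row.
        if len(row) < 2 or row[0] == "LabelName":
            continue
        mapping[row[0]] = row[1]
    return mapping
```

To use it on the downloaded file, open `datasets/OpenImages/oidv7-class-descriptions-boxable.csv` with `newline=""` and pass the file object to `read_class_descriptions`.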
Step 1. Generate captions for Open Images data with BLIP-2.
pip install salesforce-lavis
python scripts/prepare_openimages_captions.py
Step 2. Inpainting with PowerPaint.
python scripts/prepare_unaligned_openimages.py
Step 3. Extract multiple conditions.
python scripts/prepare_conditions --dataset OpenImages
Step 4. Generate .jsonl file.
python scripts/genereate_jsonl.py --dataset OpenImages
python scripts/genereate_jsonl.py --dataset OpenImages
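Each dataset ends with a .jsonl file, that is, JSON Lines: one JSON object per line. The following sketch illustrates the format only. The helper names are hypothetical, and the exact fields that `genereate_jsonl.py` emits are not documented here, so any record keys you pass in are placeholders.

```python
import json


def write_jsonl(records, path):
    """Write an iterable of dictionaries as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dictionaries."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Reading a generated file with `read_jsonl` and printing the first record is a quick way to see which fields the script actually produced.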