Resource: Dreambooth Training README

GitHub

README describing proper methods for constructing a dataset.

According to the README, regularization images allow for prior preservation of what a model learned from its existing dataset.

  • Regularization images should be generated from the base model before training.
  • The image class should be a generic term (e.g., person, cat, dog, man, woman).
  • Most experiments suggest that 200 to 300 regularization images per class improve training accuracy.
  • To include a class in a dataset, name the directory using the following convention: <number of repeats>_<data keyword> <class keyword>.
  • Different data keywords can be used for a single class, allowing for varied training data within that class (for example, training on two sets of images of different characters).
  • Important note: Previous experiments have shown that without regularization images, or with regularization images that were not generated by a model, the training data will overwrite the model's existing data, damaging the model's ability to generate images other than the training data.
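
The directory naming convention above can be sketched in Python as a small helper that builds and parses folder names. This is only an illustration: the function names, the `DatasetFolder` type, and the keywords `sks` and `abc` are hypothetical and not taken from the README, and the sketch assumes that the data keyword contains no spaces.

```python
import re
from typing import NamedTuple


class DatasetFolder(NamedTuple):
    """Parts of a '<number of repeats>_<data keyword> <class keyword>' name."""
    repeats: int
    data_keyword: str
    class_keyword: str


# Assumption: the data keyword has no spaces; everything after the first
# space is treated as the class keyword.
_FOLDER_PATTERN = re.compile(r"^(\d+)_(\S+) (.+)$")


def build_folder_name(repeats: int, data_keyword: str, class_keyword: str) -> str:
    """Build a directory name following the convention."""
    if repeats < 1:
        raise ValueError("repeats must be a positive integer")
    if not data_keyword or " " in data_keyword:
        raise ValueError("data keyword must be non-empty and contain no spaces")
    if not class_keyword:
        raise ValueError("class keyword must be non-empty")
    return f"{repeats}_{data_keyword} {class_keyword}"


def parse_folder_name(name: str) -> DatasetFolder:
    """Split a directory name back into repeats, data keyword, and class keyword."""
    match = _FOLDER_PATTERN.match(name)
    if match is None:
        raise ValueError(f"folder name does not follow the convention: {name!r}")
    return DatasetFolder(int(match.group(1)), match.group(2), match.group(3))


# Two characters trained under the same generic class "person",
# each with its own (hypothetical) data keyword.
folders = [
    build_folder_name(20, "sks", "person"),
    build_folder_name(20, "abc", "person"),
]
```

Here both folders share the class keyword `person` but use different data keywords, which matches the case of training two sets of images of different characters within one class.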