Data augmentation

DeepPoseKit uses imgaug to apply randomised geometric and photometric augmentations during training, improving robustness to lighting, perspective and noise.

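Augmentations are configured as a pipeline of imgaug operations. Geometric transforms (scaling, rotation, translation) are applied jointly to images and keypoints, so annotations stay aligned with the pixels they label. Photometric transforms (noise, brightness, contrast) change pixel values only and leave keypoint coordinates untouched.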
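To make the distinction concrete, here is a minimal NumPy-only sketch. It does not use imgaug or DeepPoseKit, and the helper names `flip_horizontal` and `brighten` are hypothetical, chosen only for illustration. It shows that a geometric transform (a horizontal flip) has to update the keypoint coordinates along with the pixels, while a photometric transform (a brightness multiply) changes pixel values only.

```python
import numpy as np


def flip_horizontal(image, keypoints):
    """Geometric transform: mirror the image and its (x, y) keypoints."""
    width = image.shape[1]
    flipped_image = image[:, ::-1].copy()
    flipped_keypoints = keypoints.astype(float).copy()
    # Pixel column x moves to column (width - 1 - x).
    flipped_keypoints[:, 0] = (width - 1) - flipped_keypoints[:, 0]
    return flipped_image, flipped_keypoints


def brighten(image, keypoints, factor=1.2):
    """Photometric transform: scale pixel values, leave keypoints as they are."""
    scaled = np.clip(image.astype(float) * factor, 0, 255).astype(np.uint8)
    return scaled, keypoints.copy()


# Toy 4x6 grayscale image with one bright pixel marking a keypoint at x=1, y=2.
image = np.zeros((4, 6), dtype=np.uint8)
keypoints = np.array([[1, 2]])  # (x, y) order
image[2, 1] = 200

flipped_image, flipped_keypoints = flip_horizontal(image, keypoints)
bright_image, bright_keypoints = brighten(image, keypoints)

print(flipped_keypoints)  # keypoint moved with the pixels
print(bright_keypoints)   # keypoint unchanged
```

In the flipped output the bright pixel and the keypoint land on the same location, which is the property a joint image-and-keypoint augmentation is meant to preserve.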

Recommended starter pipeline

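from imgaug import augmenters as iaa

augmenter = iaa.Sequential([
    # Geometric: applied to images and keypoints together.
    iaa.Affine(
        scale=(0.8, 1.2),        # zoom between 80% and 120%
        rotate=(-180, 180),      # any orientation, in degrees
        translate_percent={'x': (-0.05, 0.05), 'y': (-0.05, 0.05)},  # shift by up to 5% of image size
    ),
    # Photometric: applied to images only.
    iaa.AdditiveGaussianNoise(scale=(0, 0.05 * 255)),  # noise std up to 5% of the 0-255 range
    iaa.Multiply((0.8, 1.2)),        # brightness
    iaa.LinearContrast((0.7, 1.3)),  # contrast
])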

Plug it into the TrainingGenerator

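from deepposekit.io import TrainingGenerator

# data_generator is assumed to be an existing deepposekit.io.DataGenerator
# that wraps your annotated dataset.
train_generator = TrainingGenerator(
    generator=data_generator,
    augmenter=augmenter,  # the imgaug pipeline defined above
    downsample_factor=2,
    sigma=5,
)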

Sanity-check your pipeline

Always inspect a few augmented samples before starting a long training run. It is the cheapest way to catch a misconfigured rotation or an accidentally inverted contrast.
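Visual inspection can be complemented by a couple of quick numeric checks. The sketch below is an illustration only: it uses plain NumPy on toy arrays, and the helpers `fraction_in_frame` and `looks_inverted` are hypothetical names, not part of DeepPoseKit or imgaug. It assumes keypoints are stored as (x, y) pairs and that you have already pulled an image and its keypoints out of your generator.

```python
import numpy as np


def fraction_in_frame(keypoints, height, width):
    """Share of (x, y) keypoints that still lie inside the image bounds."""
    x, y = keypoints[..., 0], keypoints[..., 1]
    inside = (x >= 0) & (x < width) & (y >= 0) & (y < height)
    return float(inside.mean())


def looks_inverted(original, augmented):
    """True if pixel intensities are negatively correlated with the original.

    Only meaningful when `augmented` came from photometric transforms alone,
    so that pixels in both images still line up.
    """
    a = original.astype(float).ravel()
    b = augmented.astype(float).ravel()
    return bool(np.corrcoef(a, b)[0, 1] < 0)


# Toy data standing in for one augmented sample.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
inverted = 255 - original                          # simulated contrast bug
keypoints = np.array([[10.0, 20.0], [70.0, 5.0]])  # second point is off-frame

print(fraction_in_frame(keypoints, height=64, width=64))
print(looks_inverted(original, inverted))
```

A low in-frame fraction suggests the scale, rotation or translation ranges are too aggressive for your framing; a negative correlation after a photometric-only pipeline points to an inverted contrast or brightness setting.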