cvpods.data package¶

cvpods.data.build_dataset(config, dataset_names, transforms=[], is_train=True)[source]¶: dataset_names: List[str], in which elemements must be in format of “dataset_task_version”

cvpods.data.build_test_loader(cfg)[source]¶

Similar to build_train_loader. But this function uses the given dataset_name argument (instead of the names in cfg), and uses batch size 1.

Parameters: cfg – a cvpods config dict
Returns: DataLoader – a torch DataLoader, that loads the given detection dataset, with test-time transformation and batching.

cvpods.data.build_train_loader(cfg)[source]¶

A data loader is created by the following steps: 1. Use the dataset names in config to query DatasetCatalog, and obtain a list of dicts. 2. Start workers to work on the dicts. Each worker will:

Map each metadata dict into another format to be consumed by the model.

Batch them by simply putting dicts into a list.

The batched list[mapped_dict] is what this dataloader will return.

Parameters: cfg (config dict) – the config
Returns: an infinite iterator of training data

cvpods.data.build_transform_gens(pipelines)[source]¶

Create a list of TransformGen from config.

Transform list is a list of tuple which includes Transform name and parameters. :param pipelines: cfg.INPUT.TRAIN_PIPELINES and cfg.INPUT.TEST_PIPELINES are used here

Returns: list[TransformGen] – a list of several TransformGen.

class cvpods.data.ConcatDataset(datasets)[source]¶

Bases: torch.utils.data.dataset.ConcatDataset

A wrapper of concatenated dataset. Same as torch.utils.data.dataset.ConcatDataset, but concat the group flag for image aspect ratio. :param datasets: A list of datasets. :type datasets: list[Dataset]

class cvpods.data.RepeatDataset(dataset, times)[source]¶

Bases: object

A wrapper of repeated dataset. The length of repeated dataset will be times larger than the original dataset. This is useful when the data loading time is long but the dataset is small. Using RepeatDataset can reduce the data loading time between epochs. :param dataset: The dataset to be repeated. :type dataset: Dataset :param times: Repeat times. :type times: int

cvpods.data.catalog module¶

cvpods.data.detection_utils module¶

exception cvpods.data.detection_utils.SizeMismatchError[source]¶

Bases: ValueError

When loaded image has difference width/height compared with annotation.

cvpods.data.detection_utils.convert_PIL_to_numpy(image, format)[source]¶

Convert PIL image to numpy array of target format. :param image: a PIL image :type image: PIL.Image :param format: the format of output image :type format: str

Returns: (np.ndarray) – also see read_image

cvpods.data.detection_utils.convert_image_to_rgb(image, format)[source]¶

Convert an image from given format to RGB. :param image: an HWC image :type image: np.ndarray or Tensor :param format: the format of input image, also see read_image :type format: str

Returns: (np.ndarray) – (H,W,3) RGB image in 0-255 range, can be either float or uint8

cvpods.data.detection_utils.read_image(file_name, format=None)[source]¶

Read an image into the given format. Will apply rotation and flipping if the image has such exif information. :param file_name: image file path :type file_name: str :param format: one of the supported image modes in PIL, or “BGR” or “YUV-BT.601”. :type format: str

Returns

image (np.ndarray) –

an HWC image in the given format, which is 0-255, uint8 for: supported image modes in PIL or “BGR”; float (0-1 for Y) for YUV-BT.601.

cvpods.data.detection_utils.check_image_size(dataset_dict, image)[source]¶: Raise an error if the image does not match the size specified in the dict.

cvpods.data.detection_utils.transform_proposals(dataset_dict, image_shape, transforms, min_box_side_len, proposal_topk)[source]¶

Apply transformations to the proposals in dataset_dict, if any.

Parameters

dataset_dict (dict) – a dict read from the dataset, possibly contains fields “proposal_boxes”, “proposal_objectness_logits”, “proposal_bbox_mode”
image_shape (tuple) – height, width
transforms (TransformList) –
min_box_side_len (int) – keep proposals with at least this size
proposal_topk (int) – only keep top-K scoring proposals

The input dict is modified in-place, with abovementioned keys removed. A new key “proposals” will be added. Its value is an Instances object which contains the transformed proposals in its field “proposal_boxes” and “objectness_logits”.

cvpods.data.detection_utils.annotations_to_instances(annos, image_size, mask_format='polygon')[source]¶

Create an Instances object used by the models, from instance annotations in the dataset dict.

Parameters

annos (list[dict]) – a list of instance annotations in one image, each element for one instance.
image_size (tuple) – height, width

Returns

Instances – It will contain fields “gt_boxes”, “gt_classes”, “gt_masks”, “gt_keypoints”, if they can be obtained from annos. This is the format that builtin models expect.

cvpods.data.detection_utils.annotations_to_instances_rotated(annos, image_size)[source]¶

Create an Instances object used by the models, from instance annotations in the dataset dict. Compared to annotations_to_instances, this function is for rotated boxes only

Parameters

annos (list[dict]) – a list of instance annotations in one image, each element for one instance.
image_size (tuple) – height, width

Returns

Instances – Containing fields “gt_boxes”, “gt_classes”, if they can be obtained from annos. This is the format that builtin models expect.

cvpods.data.detection_utils.filter_empty_instances(instances, by_box=True, by_mask=True)[source]¶

Filter out empty instances in an Instances object.

Parameters

instances (Instances) –
by_box (bool) – whether to filter out instances with empty boxes
by_mask (bool) – whether to filter out instances with empty masks

Returns

Instances – the filtered instances.

cvpods.data.detection_utils.create_keypoint_hflip_indices(dataset_names, meta)[source]¶

Parameters: dataset_names (list[str]) – list of dataset names
Returns: ndarray[int] – a vector of size=#keypoints, storing the horizontally-flipped keypoint indices.

cvpods.data.detection_utils.gen_crop_transform_with_instance(crop_size, image_size, instance)[source]¶

Generate a CropTransform so that the cropping region contains the center of the given instance.

Parameters

crop_size (tuple) – h, w in pixels
image_size (tuple) – h, w
instance (dict) – an annotation dict of one instance, in cvpods’s dataset format.

cvpods.data.detection_utils.check_metadata_consistency(key, dataset_names, meta)[source]¶

Check that the datasets have consistent metadata.

Parameters

key (str) – a metadata key
dataset_names (list[str]) – a list of dataset names

Raises

AttributeError – if the key does not exist in the metadata
ValueError – if the given datasets do not have the same metadata values defined by key

cvpods.data.detection_utils.check_sample_valid(args)[source]¶

cvpods.data.detection_utils.imdecode(data, *, require_chl3=True, require_alpha=False)[source]¶: decode images in common formats (jpg, png, etc.) :param data: encoded image data :type data: bytes :param require_chl3: whether to convert gray image to 3-channel BGR image :param require_alpha: whether to add alpha channel to BGR image :rtype: numpy.ndarray

cvpods.data.datasets module¶

class cvpods.data.datasets.CityPersonsDataset(cfg, dataset_name, transforms=[], is_train=True)[source]¶

Bases: cvpods.data.base_dataset.BaseDataset

__getitem__(index)[source]¶: Load data, apply transforms, converto to Instances.

evaluate(predictions)[source]¶: Dataset must provide a evaluation function to evaluate model.

property ground_truth_annotations¶

class cvpods.data.datasets.CityScapesDataset(cfg, dataset_name, transforms=[], is_train=True)[source]¶

Bases: cvpods.data.base_dataset.BaseDataset

__getitem__(index)[source]¶: Load data, apply transforms, converto to Instances.

evaluate(predictions)[source]¶: Dataset must provide a evaluation function to evaluate model.

property ground_truth_annotations¶

class cvpods.data.datasets.COCODataset(cfg, dataset_name, transforms=[], is_train=True)[source]¶

Bases: cvpods.data.base_dataset.BaseDataset

__getitem__(index)[source]¶: Load data, apply transforms, converto to Instances.

evaluate(predictions)[source]¶: Dataset must provide a evaluation function to evaluate model.

property ground_truth_annotations¶

class cvpods.data.datasets.CrowdHumanDataset(cfg, dataset_name, transforms=[], is_train=True)[source]¶

Bases: cvpods.data.base_dataset.BaseDataset

__getitem__(index)[source]¶: Load data, apply transforms, converto to Instances.

evaluate(predictions)[source]¶: Dataset must provide a evaluation function to evaluate model.

property ground_truth_annotations¶

class cvpods.data.datasets.ImageNetDataset(cfg, dataset_name, transforms=[], is_train=True)[source]¶

Bases: cvpods.data.base_dataset.BaseDataset

__getitem__(index)[source]¶: Load data, apply transforms, converto to Instances.

class cvpods.data.datasets.ImageNetLTDataset(cfg, dataset_name, transforms=[], is_train=True)[source]¶

Bases: cvpods.data.base_dataset.BaseDataset

__getitem__(index)[source]¶: Load data, apply transforms, converto to Instances.

class cvpods.data.datasets.LVISDataset(cfg, dataset_name, transforms=[], is_train=True)[source]¶

Bases: cvpods.data.base_dataset.BaseDataset

__getitem__(index)[source]¶: Load data, apply transforms, converto to Instances.

evaluate(predictions)[source]¶: Dataset must provide a evaluation function to evaluate model.

property ground_truth_annotations¶

class cvpods.data.datasets.Objects365Dataset(cfg, dataset_name, transforms=[], is_train=True)[source]¶

Bases: cvpods.data.base_dataset.BaseDataset

__getitem__(index)[source]¶: Load data, apply transforms, converto to Instances.

evaluate(predictions)[source]¶: Dataset must provide a evaluation function to evaluate model.

property ground_truth_annotations¶

class cvpods.data.datasets.CIFAR10Dataset(cfg, dataset_name, transforms, is_train=True, **kwargs)[source]¶: Bases: torchvision.datasets.cifar.CIFAR10

class cvpods.data.datasets.STL10Datasets(cfg, dataset_name, transforms=[], is_train=True, **kwargs)[source]¶: Bases: torchvision.datasets.stl10.STL10

class cvpods.data.datasets.VOCDataset(cfg, dataset_name, transforms=[], is_train=True)[source]¶

Bases: cvpods.data.base_dataset.BaseDataset

__getitem__(index)[source]¶: Load data, apply transforms, converto to Instances.

class cvpods.data.datasets.WiderFaceDataset(cfg, dataset_name, transforms=[], is_train=True)[source]¶

Bases: cvpods.data.base_dataset.BaseDataset

__getitem__(index)[source]¶: Load data, apply transforms, converto to Instances.

evaluate(predictions)[source]¶: Dataset must provide a evaluation function to evaluate model.

property ground_truth_annotations¶

cvpods.data.samplers module¶

class cvpods.data.samplers.DistributedGroupSampler(dataset, samples_per_gpu=1, num_replicas=None, rank=None)[source]¶

Bases: torch.utils.data.sampler.Sampler

Sampler that restricts data loading to a subset of the dataset. It is especially useful in conjunction with torch.nn.parallel.DistributedDataParallel. In such case, each process can pass a DistributedSampler instance as a DataLoader sampler, and load a subset of the original dataset that is exclusive to it. .. note:

Dataset is assumed to be of constant size.

__init__(dataset, samples_per_gpu=1, num_replicas=None, rank=None)[source]¶

Parameters

dataset (Dataset) – Dataset used for sampling.
num_replicas (optional) – Number of processes participating in distributed training.
rank (optional) – Rank of the current process within num_replicas.

set_epoch(epoch)[source]¶

class cvpods.data.samplers.InferenceSampler(size: int)[source]¶

Bases: torch.utils.data.sampler.Sampler

Produce indices for inference. Inference needs to run on the __exact__ set of samples, therefore when the total number of samples is not divisible by the number of workers, this sampler produces different number of samples on different workers.

__init__(size: int)[source]¶

Parameters: size (int) – the total number of data of the underlying dataset to sample from

class cvpods.data.samplers.RepeatFactorTrainingSampler(dataset, repeat_thresh, shuffle=True, seed=None)[source]¶

Bases: torch.utils.data.sampler.Sampler

Similar to TrainingSampler, but suitable for training on class imbalanced datasets like LVIS. In each epoch, an image may appear multiple times based on its “repeat factor”. The repeat factor for an image is a function of the frequency the rarest category labeled in that image. The “frequency of category c” in [0, 1] is defined as the fraction of images in the training set (without repeats) in which category c appears.

See https://arxiv.org/abs/1908.03195 (>= v2) Appendix B.2.

__init__(dataset, repeat_thresh, shuffle=True, seed=None)[source]¶

Parameters

dataset (Dataset) – dataset used for sampling.
repeat_thresh (float) – frequency threshold below which data is repeated.
shuffle (bool) – whether to shuffle the indices or not.
seed (int) – the initial seed of the shuffle. Must be the same across all workers. If None, will use a random seed shared among workers (require synchronization among all workers).

cvpods.data.transforms module¶

class cvpods.data.transforms.ExpandTransform(left, top, ratio, mean=0, 0, 0)[source]¶

Bases: cvpods.data.transforms.transform.Transform

Expand the image and boxes according the specified expand ratio.

__init__(left, top, ratio, mean=0, 0, 0)[source]¶

Parameters

top (left,) – crop the image by img[top: top+h, left:left+w].
ratio (float) – image expand ratio.
mean (tuple) – mean value of dataset.

apply_image(img)[source]¶: Randomly place the original image on a canvas of ‘ratio’ x original image size filled with mean values. The ratio is in the range of ratio_range.

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

Apply expand transform on coordinates.

Parameters: coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns: ndarray – expand coordinates.

class cvpods.data.transforms.AffineTransform(src, dst, output_size, pad_value=[0, 0, 0])[source]¶

Bases: cvpods.data.transforms.transform.Transform

Augmentation from CenterNet

__init__(src, dst, output_size, pad_value=[0, 0, 0])[source]¶: output_size:(w, h)

apply_image(img: numpy.ndarray) → numpy.ndarray [source]¶

Apply AffineTransform for the image(s).

Parameters: img (ndarray) – of shape HxW, HxWxC, or NxHxWxC. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
Returns: ndarray – the image(s) after applying affine transform.

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

Affine the coordinates.

Parameters: coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns: ndarray – the flipped coordinates.

Note

The inputs are floating point coordinates, not pixel indices. Therefore they are flipped by (W - x, H - y), not (W - 1 - x, H 1 - y).

class cvpods.data.transforms.BlendTransform(src_image: numpy.ndarray, src_weight: float, dst_weight: float)[source]¶

Bases: cvpods.data.transforms.transform.Transform

Transforms pixel colors with PIL enhance functions.

__init__(src_image: numpy.ndarray, src_weight: float, dst_weight: float)[source]¶

Blends the input image (dst_image) with the src_image using formula: src_weight * src_image + dst_weight * dst_image

Parameters

src_image (ndarray) – Input image is blended with this image
src_weight (float) – Blend weighting of src_image
dst_weight (float) – Blend weighting of dst_image

apply_image(img: numpy.ndarray, interp: str = None) → numpy.ndarray [source]¶

Apply blend transform on the image(s).

Parameters

img (ndarray) – of shape NxHxWxC, or HxWxC or HxW. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
interp (str) – keep this option for consistency, perform blend would not require interpolation.

Returns

ndarray – blended image(s).

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶: Apply no transform on the coordinates.

apply_segmentation(segmentation: numpy.ndarray) → numpy.ndarray [source]¶: Apply no transform on the full-image segmentation.

class cvpods.data.transforms.IoUCropTransform(x0: int, y0: int, w: int, h: int)[source]¶

Bases: cvpods.data.transforms.transform.Transform

Perform crop operations on images.

This crop operation will checks whether the center of each instance’s bbox is in the cropped image.

__init__(x0: int, y0: int, w: int, h: int)[source]¶

Parameters: y0, w, h (x0,) – crop the image(s) by img[y0:y0+h, x0:x0+w].

apply_image(img: numpy.ndarray) → numpy.ndarray [source]¶

Crop the image(s).

Parameters: img (ndarray) – of shape NxHxWxC, or HxWxC or HxW. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
Returns: ndarray – cropped image(s).

apply_box(box: numpy.ndarray) → numpy.ndarray [source]¶

Apply the transform on an axis-aligned box. By default will transform the corner points and use their minimum/maximum to create a new axis-aligned box. Note that this default may change the size of your box, e.g. in rotations.

Parameters: box (ndarray) – Nx4 floating point array of XYXY format in absolute coordinates.
Returns: ndarray – box after apply the transformation.

Note

The coordinates are not pixel indices. Coordinates on an image of shape (H, W) are in range [0, W] or [0, H].

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

Apply crop transform on coordinates.

Parameters: coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns: ndarray – cropped coordinates.

apply_polygons(polygons: list) → list [source]¶

Apply crop transform on a list of polygons, each represented by a Nx2 array. It will crop the polygon with the box, therefore the number of points in the polygon might change.

Parameters: polygon (list[ndarray]) – each is a Nx2 floating point array of (x, y) format in absolute coordinates.
Returns: ndarray – cropped polygons.

class cvpods.data.transforms.CropTransform(x0: int, y0: int, w: int, h: int)[source]¶

Bases: cvpods.data.transforms.transform.Transform

Perform crop operations on images.

__init__(x0: int, y0: int, w: int, h: int)[source]¶

Parameters: y0, w, h (x0,) – crop the image(s) by img[y0:y0+h, x0:x0+w].

apply_image(img: numpy.ndarray) → numpy.ndarray [source]¶

Crop the image(s).

Parameters: img (ndarray) – of shape NxHxWxC, or HxWxC or HxW. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
Returns: ndarray – cropped image(s).

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

Apply crop transform on coordinates.

Parameters: coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns: ndarray – cropped coordinates.

apply_polygons(polygons: list) → list [source]¶

Apply crop transform on a list of polygons, each represented by a Nx2 array. It will crop the polygon with the box, therefore the number of points in the polygon might change.

Parameters: polygon (list[ndarray]) – each is a Nx2 floating point array of (x, y) format in absolute coordinates.
Returns: ndarray – cropped polygons.

class cvpods.data.transforms.CropPadTransform(x0: int, y0: int, w: int, h: int, new_w: int, new_h: int, img_value=None, seg_value=None)[source]¶

Bases: cvpods.data.transforms.transform.Transform

get_pad_offset(ori: int, tar: int)[source]¶

apply_image(img: numpy.ndarray) → numpy.ndarray [source]¶

Crop and Pad the image(s).

Parameters: img (ndarray) – of shape NxHxWxC, or HxWxC or HxW. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
Returns: ndarray – cropped and padded image(s).

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

Apply crop and pad transform on coordinates.

Parameters: coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns: ndarray – cropped and padded coordinates.

apply_polygons(polygons: list) → list [source]¶

Apply crop and pad transform on a list of polygons, each represented by a Nx2 array.

Parameters: polygon (list[ndarray]) – each is a Nx2 floating point array of (x, y) format in absolute coordinates.
Returns: ndarray – cropped and padded polygons.

apply_segmentation(segmentation: numpy.ndarray) → numpy.ndarray [source]¶

Apply crop and pad transform on the full-image segmentation.

Parameters: segmentation (ndarray) – of shape HxW. The array should have integer or bool dtype.
Returns: ndarray – cropped and padded segmentation.

class cvpods.data.transforms.GridSampleTransform(grid: numpy.ndarray, interp: str)[source]¶

Bases: cvpods.data.transforms.transform.Transform

__init__(grid: numpy.ndarray, interp: str)[source]¶

Parameters

grid (ndarray) – grid has x and y input pixel locations which are used to compute output. Grid has values in the range of [-1, 1], which is normalized by the input height and width. The dimension is N x H x W x 2.
interp (str) – interpolation methods. Options include nearest and bilinear.

apply_image(img: numpy.ndarray, interp: str = None) → numpy.ndarray [source]¶

Apply grid sampling on the image(s).

Parameters

img (ndarray) – of shape NxHxWxC, or HxWxC or HxW. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
interp (str) – interpolation methods. Options include nearest and bilinear.

Returns

ndarray – grid sampled image(s).

apply_coords(coords: numpy.ndarray)[source]¶: Not supported.

apply_segmentation(segmentation: numpy.ndarray) → numpy.ndarray [source]¶

Apply grid sampling on the full-image segmentation.

Parameters: segmentation (ndarray) – of shape HxW. The array should have integer or bool dtype.
Returns: ndarray – grid sampled segmentation.

class cvpods.data.transforms.RotationTransform(h, w, angle, expand=True, center=None, interp=None)[source]¶

Bases: cvpods.data.transforms.transform.Transform

This method returns a copy of this image, rotated the given number of degrees counter clockwise around its center.

__init__(h, w, angle, expand=True, center=None, interp=None)[source]¶

Parameters

w (h,) – original image size
angle (float) – degrees for rotation
expand (bool) – choose if the image should be resized to fit the whole rotated image (default), or simply cropped
center (tuple (width, height)) – coordinates of the rotation center if left to None, the center will be fit to the center of each image center has no effect if expand=True because it only affects shifting
interp – cv2 interpolation method, default cv2.INTER_LINEAR

apply_image(img, interp=None)[source]¶: img should be a numpy array, formatted as Height * Width * Nchannels

apply_coords(coords)[source]¶: coords should be a N * 2 array-like, containing N couples of (x, y) points

apply_segmentation(segmentation)[source]¶

create_rotation_matrix(offset=0)[source]¶

inverse()[source]¶: The inverse is to rotate it back with expand, and crop to get the original shape.

class cvpods.data.transforms.HFlipTransform(width: int)[source]¶

Bases: cvpods.data.transforms.transform.Transform

Perform horizontal flip.

apply_image(img: numpy.ndarray) → numpy.ndarray [source]¶

Flip the image(s).

Parameters: img (ndarray) – of shape HxW, HxWxC, or NxHxWxC. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
Returns: ndarray – the flipped image(s).

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

Flip the coordinates.

Parameters: coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns: ndarray – the flipped coordinates.

Note

The inputs are floating point coordinates, not pixel indices. Therefore they are flipped by (W - x, H - y), not (W - 1 - x, H 1 - y).

apply_rotated_box(rotated_boxes)¶

Apply the horizontal flip transform on rotated boxes.

Parameters: rotated_boxes (ndarray) – Nx5 floating point array of (x_center, y_center, width, height, angle_degrees) format in absolute coordinates.

class cvpods.data.transforms.VFlipTransform(height: int)[source]¶

Bases: cvpods.data.transforms.transform.Transform

Perform vertical flip.

apply_image(img: numpy.ndarray) → numpy.ndarray [source]¶

Flip the image(s).

Parameters: img (ndarray) – of shape HxW, HxWxC, or NxHxWxC. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
Returns: ndarray – the flipped image(s).

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

Flip the coordinates.

Parameters: coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns: ndarray – the flipped coordinates.

Note

The inputs are floating point coordinates, not pixel indices. Therefore they are flipped by (W - x, H - y), not (W - 1 - x, H - 1 - y).

class cvpods.data.transforms.NoOpTransform[source]¶

Bases: cvpods.data.transforms.transform.Transform

A transform that does nothing.

apply_image(img: numpy.ndarray) → numpy.ndarray [source]¶

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

apply_rotated_box(x)¶

class cvpods.data.transforms.ScaleTransform(h: int, w: int, new_h: int, new_w: int, interp: str = 'BILINEAR')[source]¶

Bases: cvpods.data.transforms.transform.Transform

Resize the image to a target size.

__init__(h: int, w: int, new_h: int, new_w: int, interp: str = 'BILINEAR')[source]¶

Parameters

w (h,) – original image size.
new_w (new_h,) – new image size.
interp (str) – the interpolation method. Options includes: * “NEAREST” * “BILINEAR” * “BICUBIC” * “LANCZOS” * “HAMMING” * “BOX”

apply_image(img: numpy.ndarray, interp: str = None) → numpy.ndarray [source]¶

Resize the image(s).

Parameters: img (ndarray) – of shape NxHxWxC, or HxWxC or HxW. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
Returns: ndarray – resized image(s).

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

Compute the coordinates after resize.

Parameters: coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns: ndarray – resized coordinates.

apply_segmentation(segmentation: numpy.ndarray) → numpy.ndarray [source]¶

Apply resize on the full-image segmentation.

Parameters: segmentation (ndarray) – of shape HxW. The array should have integer or bool dtype.
Returns: ndarray – resized segmentation.

class cvpods.data.transforms.DistortTransform(hue, saturation, exposure, image_format)[source]¶

Bases: cvpods.data.transforms.transform.Transform

Distort image w.r.t hue, saturation and exposure.

apply_image(img: numpy.ndarray) → numpy.ndarray [source]¶

Parameters: img (ndarray) – of shape HxW, HxWxC, or NxHxWxC. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
Returns: ndarray – the distorted image(s).

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

apply_segmentation(segmentation: numpy.ndarray) → numpy.ndarray [source]¶

class cvpods.data.transforms.Transform[source]¶

Bases: object

Base class for implementations of __deterministic__ transformations for image and other data structures. “Deterministic” requires that the output of all methods of this class are deterministic w.r.t their input arguments. In training, there should be a higher-level policy that generates (likely with random variations) these transform ops. Each transform op may handle several data types, e.g.: image, coordinates, segmentation, bounding boxes. Some of them have a default implementation, but can be overwritten if the default isn’t appropriate. The implementation of each method may choose to modify its input data in-place for efficient transformation.

abstract apply_image(img: numpy.ndarray)[source]¶

Apply the transform on an image.

Parameters: img (ndarray) – of shape NxHxWxC, or HxWxC or HxW. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
Returns: ndarray – image after apply the transformation.

abstract apply_coords(coords: numpy.ndarray)[source]¶

Apply the transform on coordinates.

Parameters: coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns: ndarray – coordinates after apply the transformation.

Note

The coordinates are not pixel indices. Coordinates on an image of shape (H, W) are in range [0, W] or [0, H].

apply_segmentation(segmentation: numpy.ndarray) → numpy.ndarray [source]¶

Apply the transform on a full-image segmentation. By default will just perform “apply_image”.

Parameters

segmentation (ndarray) – of shape HxW. The array should have integer
bool dtype. (or) –

Returns

ndarray – segmentation after apply the transformation.

apply_box(box: numpy.ndarray) → numpy.ndarray [source]¶

Apply the transform on an axis-aligned box. By default will transform the corner points and use their minimum/maximum to create a new axis-aligned box. Note that this default may change the size of your box, e.g. in rotations.

Parameters: box (ndarray) – Nx4 floating point array of XYXY format in absolute coordinates.
Returns: ndarray – box after apply the transformation.

Note

The coordinates are not pixel indices. Coordinates on an image of shape (H, W) are in range [0, W] or [0, H].

apply_polygons(polygons: list) → list [source]¶

Apply the transform on a list of polygons, each represented by a Nx2 array. By default will just transform all the points.

Parameters: polygon (list[ndarray]) – each is a Nx2 floating point array of (x, y) format in absolute coordinates.
Returns: list[ndarray] – polygon after apply the transformation.

Note

The coordinates are not pixel indices. Coordinates on an image of shape (H, W) are in range [0, W] or [0, H].

__call__(image, annotations=None, **kwargs)[source]¶: Apply transfrom to images and annotations (if exist)

classmethod register_type(data_type: str, func: Callable)[source]¶

Register the given function as a handler that this transform will use for a specific data type.

Parameters

data_type (str) – the name of the data type (e.g., box)
func (callable) – takes a transform and a data, returns the transformed data.

Examples:

def func(flip_transform, voxel_data):
    return transformed_voxel_data
HFlipTransform.register_type("voxel", func)

# ...
transform = HFlipTransform(...)
transform.apply_voxel(voxel_data)  # func will be called

class cvpods.data.transforms.TransformList(transforms: list)[source]¶

Bases: object

Maintain a list of transform operations which will be applied in sequence. .. attribute:: transforms

type

list[Transform]

__init__(transforms: list)[source]¶

Parameters: transforms (list[Transform]) – list of transforms to perform.

__getattr__(name: str)[source]¶

Parameters: name (str) – name of the attribute.

__add__(other: cvpods.data.transforms.transform.TransformList) → cvpods.data.transforms.transform.TransformList[source]¶

Parameters: other (TransformList) – transformation to add.
Returns: TransformList – list of transforms.

__iadd__(other: cvpods.data.transforms.transform.TransformList) → cvpods.data.transforms.transform.TransformList[source]¶

Parameters: other (TransformList) – transformation to add.
Returns: TransformList – list of transforms.

__radd__(other: cvpods.data.transforms.transform.TransformList) → cvpods.data.transforms.transform.TransformList[source]¶

Parameters: other (TransformList) – transformation to add.
Returns: TransformList – list of transforms.

insert(idx: int, other: cvpods.data.transforms.transform.TransformList) → cvpods.data.transforms.transform.TransformList[source]¶

Parameters

idx (int) – insert position.
other (TransformList) – transformation to insert.

Returns

None

class cvpods.data.transforms.ExtentTransform(src_rect, output_size, interp=2, fill=0)[source]¶

Bases: cvpods.data.transforms.transform.Transform

Extracts a subregion from the source image and scales it to the output size.

The fill color is used to map pixels from the source rect that fall outside the source image.

See: https://pillow.readthedocs.io/en/latest/PIL.html#PIL.ImageTransform.ExtentTransform

__init__(src_rect, output_size, interp=2, fill=0)[source]¶

Parameters

src_rect (x0, y0, x1, y1) – src coordinates
output_size (h, w) – dst image size
interp – PIL interpolation methods
fill – Fill color used when src_rect extends outside image

apply_image(img, interp=None)[source]¶

apply_coords(coords)[source]¶

apply_segmentation(segmentation)[source]¶

class cvpods.data.transforms.ResizeTransform(h, w, new_h, new_w, interp)[source]¶

Bases: cvpods.data.transforms.transform.Transform

Resize the image to a target size.

__init__(h, w, new_h, new_w, interp)[source]¶

Parameters

w (h,) – original image size
new_w (new_h,) – new image size
interp – PIL interpolation methods

apply_image(img, interp=None)[source]¶

apply_coords(coords)[source]¶

apply_segmentation(segmentation)[source]¶

apply_rotated_box(rotated_boxes)¶

Apply the resizing transform on rotated boxes. For details of how these (approximation) formulas are derived, please refer to RotatedBoxes.scale().

Parameters: rotated_boxes (ndarray) – Nx5 floating point array of (x_center, y_center, width, height, angle_degrees) format in absolute coordinates.

class cvpods.data.transforms.GaussianBlurTransform(sigma, p=1.0)[source]¶

Bases: cvpods.data.transforms.transform.Transform

GaussianBlur using PIL.ImageFilter.GaussianBlur

__init__(sigma, p=1.0)[source]¶

Parameters

sigma (List(float)) – sigma of gaussian
p (float) – probability of perform this augmentation

apply_image(img: numpy.ndarray) → numpy.ndarray [source]¶

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

class cvpods.data.transforms.GaussianBlurConvTransform(kernel_size, p=1.0)[source]¶

Bases: cvpods.data.transforms.transform.Transform

apply_image(img: numpy.ndarray) → numpy.ndarray [source]¶

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

class cvpods.data.transforms.SolarizationTransform(thresh=128, p=0.5)[source]¶

Bases: cvpods.data.transforms.transform.Transform

apply_image(img: numpy.ndarray) → numpy.ndarray [source]¶

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

class cvpods.data.transforms.ComposeTransform(tfms)[source]¶

Bases: object

Composes several transforms together.

__init__(tfms)[source]¶

Parameters: transforms (list[Transform]) – list of transforms to compose.

class cvpods.data.transforms.LabSpaceTransform[source]¶

Bases: cvpods.data.transforms.transform.Transform

Convert image from RGB into Lab color space

apply_image(img: numpy.ndarray) → numpy.ndarray [source]¶

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

class cvpods.data.transforms.PadTransform(top: int, left: int, target_h: int, target_w: int, pad_value=0, seg_value=255)[source]¶

Bases: cvpods.data.transforms.transform.Transform

Pad image with pad_value to the specified target_h and target_w.

Adds top rows of pad_value on top, left columns of pad_value on the left, and then pads the image on the bottom and right with pad_value until it has dimensions target_h, target_w.

This op does nothing if top and left is zero and the image already has size target_h by target_w.

__init__(top: int, left: int, target_h: int, target_w: int, pad_value=0, seg_value=255)[source]¶

Parameters

top (int) – number of rows of pad_value to add on top.
left (int) – number of columns of pad_value to add on the left.
target_h (int) – height of output image.
target_w (int) – width of output image.
pad_value (int) – the value used to pad the image.
seg_value (int) – the value used to pad the semantic seg annotaions.

apply_image(img: numpy.ndarray, pad_value=None) → numpy.ndarray [source]¶

apply_coords(coords: numpy.ndarray) → numpy.ndarray [source]¶

apply_segmentation(segmentation: numpy.ndarray) → numpy.ndarray [source]¶

Apply pad transform on the full-image segmentation.

Parameters: segmentation (ndarray) – of shape HxW. The array should have integer or bool dtype.
Returns: ndarray – padded segmentation.

class cvpods.data.transforms.Pad(top, left, target_h, target_w, pad_value=0)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Pad image with pad_value to the specified target_h and target_w.

Adds top rows of pad_value on top, left columns of pad_value on the left, and then pads the image on the bottom and right with pad_value until it has dimensions target_h, target_w.

This op does nothing if top and left is zero and the image already has size target_h by target_w.

__init__(top, left, target_h, target_w, pad_value=0)[source]¶

Parameters

top (int) – number of rows of pad_value to add on top.
left (int) – number of columns of pad_value to add on the left.
target_h (int) – height of output image.
target_w (int) – width of output image.
pad_value (int) – the value used to pad the image.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.RandomScale(output_size, ratio_range=0.1, 2, interp='BILINEAR')[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Randomly scale the image according to the specified output size and scale ratio range.

This transform has the following three steps:

select a random scale factor according to the specified scale ratio range.

recompute the accurate scale_factor using rounded scaled image size.

select non-zero random offset (x, y) if scaled image is larger than output_size.

__init__(output_size, ratio_range=0.1, 2, interp='BILINEAR')[source]¶

Parameters

output_size (tuple) – image output size.
ratio_range (tuple) – range of scale ratio.
interp (str) – the interpolation method. Options includes: * “NEAREST” * “BILINEAR” * “BICUBIC” * “LANCZOS” * “HAMMING” * “BOX”

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.Expand(ratio_range=1, 4, mean=0, 0, 0, prob=0.5)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Random Expand the image & bboxes.

__init__(ratio_range=1, 4, mean=0, 0, 0, prob=0.5)[source]¶

Parameters

ratio_range (tuple) – range of expand ratio.
mean (tuple) – mean value of dataset.
prob (float) – probability of applying this transformation.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.MinIoURandomCrop(min_ious=0.1, 0.3, 0.5, 0.7, 0.9, min_crop_size=0.3)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Random crop the image & bboxes, the cropped patches have minimum IoU requirement with original image & bboxes, the IoU threshold is randomly selected from min_ious.

__init__(min_ious=0.1, 0.3, 0.5, 0.7, 0.9, min_crop_size=0.3)[source]¶

Parameters

min_ious (tuple) – minimum IoU threshold for all intersections with bounding boxes
min_crop_size (float) – minimum crop’s size (i.e. h,w := a*h, a*w, where a >= min_crop_size).

get_transform(img, annotations)[source]¶

Parameters

img (ndarray) – of shape HxWxC(RGB). The array can be of type uint8 in range [0, 255], or floating point in range [0, 255].
annotations (list[dict[str->str]]) –
Each item in the list is a bbox label of an object. The object is
represented by a dict,

which contains:
- bbox (list): bbox coordinates, top left and bottom right.
- bbox_mode (str): bbox label mode, for example: XYXY_ABS,
  XYWH_ABS and so on…

class cvpods.data.transforms.RandomSwapChannels(prob=0.5)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Randomly swap image channels.

__init__(prob=0.5)[source]¶

Parameters: prob (float) – probability of swap channels.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.CenterAffine(boarder, output_size, pad_value=[0, 0, 0], random_aug=True)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Affine Transform for CenterNet

__init__(boarder, output_size, pad_value=[0, 0, 0], random_aug=True)[source]¶: output_size (w, h) shape

get_transform(img, annotations=None)[source]¶

generate_center_and_scale(img_shape)[source]¶: generate center shpae : (h, w)

static generate_src_and_dst(center, scale, output_size)[source]¶

class cvpods.data.transforms.RandomBrightness(intensity_min, intensity_max, prob=1.0)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Randomly transforms image brightness.

Brightness intensity is uniformly sampled in (intensity_min, intensity_max). - intensity < 1 will reduce brightness - intensity = 1 will preserve the input image - intensity > 1 will increase brightness

See: https://pillow.readthedocs.io/en/3.0.x/reference/ImageEnhance.html

__init__(intensity_min, intensity_max, prob=1.0)[source]¶

Parameters

intensity_min (float) – Minimum augmentation.
intensity_max (float) – Maximum augmentation.
prob (float) – probability of transforms image brightness.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.RandomContrast(intensity_min, intensity_max, prob=1.0)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Randomly transforms image contrast.

Contrast intensity is uniformly sampled in (intensity_min, intensity_max). - intensity < 1 will reduce contrast - intensity = 1 will preserve the input image - intensity > 1 will increase contrast

See: https://pillow.readthedocs.io/en/3.0.x/reference/ImageEnhance.html

__init__(intensity_min, intensity_max, prob=1.0)[source]¶

Parameters

intensity_min (float) – Minimum augmentation.
intensity_max (float) – Maximum augmentation.
prob (float) – probability of transforms image contrast.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.RandomCrop(crop_type: str, crop_size, strict_mode=True)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Randomly crop a subimage out of an image.

__init__(crop_type: str, crop_size, strict_mode=True)[source]¶

Parameters

crop_type (str) – one of “relative_range”, “relative”, “absolute”. See config/defaults.py for explanation.
crop_size (tuple[float]) – the relative ratio or absolute pixels of height and width
strict_mode (bool) – if True, the target crop_size must be smaller than the original image size.

get_transform(img, annotations=None)[source]¶

get_crop_size(image_size)[source]¶

Parameters: image_size (tuple) – height, width
Returns: crop_size (tuple) – height, width in absolute pixels

class cvpods.data.transforms.RandomCropWithInstance(crop_type: str, crop_size, strict_mode=True)[source]¶

Bases: cvpods.data.transforms.transform_gen.RandomCrop

Make sure the cropping region contains the center of a random instance from annotations.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.RandomCropWithMaxAreaLimit(crop_type: str, crop_size, strict_mode=True, single_category_max_area=1.0, ignore_value=255)[source]¶

Bases: cvpods.data.transforms.transform_gen.RandomCrop

Find a cropping window such that no single category occupies more than single_category_max_area in sem_seg.

The function retries random cropping 10 times max.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.RandomCropPad(crop_type: str, crop_size, img_value=None, seg_value=None)[source]¶

Bases: cvpods.data.transforms.transform_gen.RandomCrop

Randomly crop and pad a subimage out of an image.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.RandomExtent(scale_range, shift_range)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Outputs an image by cropping a random “subrect” of the source image.

The subrect can be parameterized to include pixels outside the source image, in which case they will be set to zeros (i.e. black). The size of the output image will vary with the size of the random subrect.

__init__(scale_range, shift_range)[source]¶

Parameters

scale_range (l, h) – Range of input-to-output size scaling factor.
shift_range (x, y) – Range of shifts of the cropped subrect. The rect is shifted by [w / 2 * Uniform(-x, x), h / 2 * Uniform(-y, y)], where (w, h) is the (width, height) of the input image. Set each component to zero to crop at the image’s center.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.RandomFlip(prob=0.5, *, horizontal=True, vertical=False)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Flip the image horizontally or vertically with the given probability.

__init__(prob=0.5, *, horizontal=True, vertical=False)[source]¶

Parameters

prob (float) – probability of flip.
horizontal (boolean) – whether to apply horizontal flipping
vertical (boolean) – whether to apply vertical flipping

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.RandomSaturation(intensity_min, intensity_max, prob=1.0)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Randomly transforms image saturation.

Saturation intensity is uniformly sampled in (intensity_min, intensity_max). - intensity < 1 will reduce saturation (make the image more grayscale) - intensity = 1 will preserve the input image - intensity > 1 will increase saturation

See: https://pillow.readthedocs.io/en/3.0.x/reference/ImageEnhance.html

__init__(intensity_min, intensity_max, prob=1.0)[source]¶

Parameters

intensity_min (float) – Minimum augmentation (1 preserves input).
intensity_max (float) – Maximum augmentation (1 preserves input).
prob (float) – probability of transforms image saturation.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.RandomLighting(scale)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Randomly transforms image color using fixed PCA over ImageNet.

The degree of color jittering is randomly sampled via a normal distribution, with standard deviation given by the scale parameter.

__init__(scale)[source]¶

Parameters: scale (float) – Standard deviation of principal component weighting.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.RandomDistortion(hue, saturation, exposure, image_format='BGR')[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Random distort image’s hue, saturation and exposure.

__init__(hue, saturation, exposure, image_format='BGR')[source]¶: RandomDistortion Initialization. :param hue: value of hue :type hue: float :param saturation: value of saturation :type saturation: float :param exposure: value of exposure :type exposure: float

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.Resize(shape, interp=2)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Resize image to a target size

__init__(shape, interp=2)[source]¶

Parameters

shape – (h, w) tuple or a int.
interp – PIL interpolation method.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.ResizeShortestEdge(short_edge_length, max_size=9223372036854775807, sample_style='range', interp=2)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Scale the shorter edge to the given size, with a limit of max_size on the longer edge. If max_size is reached, then downscale so that the longer edge does not exceed max_size.

__init__(short_edge_length, max_size=9223372036854775807, sample_style='range', interp=2)[source]¶

Parameters

short_edge_length (list[int]) – If sample_style=="range", a [min, max] interval from which to sample the shortest edge length. If sample_style=="choice", a list of shortest edge lengths to sample from.
max_size (int) – maximum allowed longest edge length.
sample_style (str) – either “range” or “choice”.
interp – PIL interpolation method.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.ResizeLongestEdge(long_edge_length, sample_style='range', interp=2, jitter=0.0, 32)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Scale the longer edge to the given size.

__init__(long_edge_length, sample_style='range', interp=2, jitter=0.0, 32)[source]¶

Parameters

long_edge_length (list[int]) – If sample_style=="range", a [min, max] interval from which to sample the shortest edge length. If sample_style=="choice", a list of shortest edge lengths to sample from.
sample_style (str) – either “range” or “choice”.
interp – PIL interpolation method.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.ShuffleList(transforms)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Randomly shuffle the transforms order.

__init__(transforms)[source]¶

Parameters: transforms (list[TransformGen]) – List of transform to be shuffled.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.RandomList(transforms, num_layers=2, choice_weights=None)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Random select subset of provided augmentations.

__init__(transforms, num_layers=2, choice_weights=None)[source]¶

Parameters

transforms (List[TorchTransformGen]) – list of transforms need to be performed.
num_layers (int) – parameters of np.random.choice.
choice_weights (optional, float) – parameters of np.random.choice.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.RepeatList(transforms, repeat_times)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Forward several times of provided transforms for a given image.

__init__(transforms, repeat_times)[source]¶

Parameters

transforms (list[TransformGen]) – List of transform to be repeated.
repeat_times (int) – number of duplicates desired.

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.TransformGen[source]¶

Bases: object

TransformGen takes an image of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255] as input.

It creates a Transform based on the given image, sometimes with randomness. The transform can then be used to transform images or other data (boxes, points, annotations, etc.) associated with it.

The assumption made in this class is that the image itself is sufficient to instantiate a transform. When this assumption is not true, you need to create the transforms by your own.

A list of TransformGen can be applied with apply_transform_gens().

abstract get_transform(img, annotations=None)[source]¶

__repr__()[source]¶: Produce something like: “MyTransformGen(field1={self.field1}, field2={self.field2})”

__str__()¶: Produce something like: “MyTransformGen(field1={self.field1}, field2={self.field2})”

class cvpods.data.transforms.TorchTransformGen(tfm)[source]¶

Bases: object

Wrapper transfrom of transforms in torchvision. It convert img (np.ndarray) to PIL image, and convert back to np.ndarray after transform.

class cvpods.data.transforms.GaussianBlur(sigma, p=1.0)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Gaussian blur transform.

__init__(sigma, p=1.0)[source]¶

Parameters

sigma (List(float)) – sigma of gaussian
p (float) – probability of perform this augmentation

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.GaussianBlurConv(kernel_size, p)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.Solarization(threshold=128, p=0.5)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

get_transform(img, annotations=None)[source]¶

class cvpods.data.transforms.AutoAugment(name, prob=0.5, magnitude=10, hparams=None)[source]¶

Bases: cvpods.data.transforms.transform_gen.TransformGen

Convert any of AutoAugment into a cvpods-fashion Transform such that can be configured in: config.py

__init__(name, prob=0.5, magnitude=10, hparams=None)[source]¶

Parameters

name (str) – any type of transforms list in _RAND_TRANSFORMS.
prob (float) – probability of perform current augmentation.
magnitude (int) – intensity / magnitude of each augmentation.
hparams (dict) – hyper-parameters required by each augmentation.

get_transform(img, annotations=None)[source]¶