pvt
Pvt2
Bases: SemanticSegmentationModel
Wrapper around the PvtV2ForSemanticSegmentation model defined in the same module
Source code in src/tcd_pipeline/models/pvt.py
forward(x)
Forward pass of the model. The batch is first run through the processor, which constructs a dictionary of inputs for the model. This processor handles varying types of input, for example tensors, numpy arrays and PIL images. Within the pipeline this function is normally called with images pre-converted to tensors, as they are tiles sampled from a source (geo) image.
Parameters:

Name | Type | Description | Default
---|---|---|---
x | Union[tensor, list[tensor]] | List[torch.Tensor] or torch.Tensor | required
Returns:

Type | Description
---|---
Tensor | Interpolated semantic segmentation predictions with softmax applied
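The output contract above can be sketched with a small post-processing helper (a hypothetical function, not part of the pipeline's API) that interpolates raw logits back to tile size and applies softmax over the class dimension:

```python
import torch
import torch.nn.functional as F

def postprocess(logits: torch.Tensor, size: tuple) -> torch.Tensor:
    """Hypothetical helper mirroring the forward() output contract:
    interpolate raw logits to the tile size, then softmax over classes."""
    logits = F.interpolate(logits, size=size, mode="bilinear", align_corners=False)
    return logits.softmax(dim=1)
```

Because softmax is applied over dim 1, each output pixel holds a probability distribution over classes that sums to one.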
Source code in src/tcd_pipeline/models/pvt.py
load_model()
Load model weights from the HuggingFace Hub or from local storage; the config key model.weights selects the checkpoint. To force the use of local files only, set the environment variable HF_FORCE_LOCAL; this can be useful for testing in offline environments.

It is assumed that the image processor has the same name as the model. If you are providing a local checkpoint, the model.weights path should be a directory containing the state dictionary of the model (saved using save_pretrained) and a preprocessor_config.json file.
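The offline switch described above amounts to checking the environment before loading. A minimal sketch (the helper name is hypothetical; only the HF_FORCE_LOCAL variable comes from the document):

```python
import os

def local_files_only() -> bool:
    """Return True when HF_FORCE_LOCAL is set to any non-empty value,
    signalling that HuggingFace loaders should not touch the network.
    Hypothetical helper mirroring the behaviour described above."""
    return bool(os.environ.get("HF_FORCE_LOCAL"))

# Usage sketch: pass the flag through to from_pretrained, e.g.
#   model = AutoModel.from_pretrained(weights, local_files_only=local_files_only())
```

transformers' from_pretrained accepts a local_files_only keyword, so the flag can be threaded straight through to both the model and the image processor loaders.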
Source code in src/tcd_pipeline/models/pvt.py
setup()
Performs any setup actions; here, this just checks whether HuggingFace should be forced to use local files only.
Source code in src/tcd_pipeline/models/pvt.py
PvtV2ForSemanticSegmentation
Bases: Module
FPN segmentation decoder per https://arxiv.org/pdf/1901.02446 (see Figure 3).

Backbone features are passed to a feature pyramid network. Each feature output is smoothed before being passed into an upscaling sequence:

(conv, gn, relu, 2x upscale)

until it is 1/4 the input size. The scaled feature maps are then summed and convolved with a 1x1 kernel to form the segmentation output, and the final result is upscaled to the input size (i.e. 4x).
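The decoder structure described above can be sketched as follows. All names and the choice of GroupNorm/GELU inside each stage are assumptions for illustration (they echo the UpsampleStage documented later), not the pipeline's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPNDecoder(nn.Module):
    """Sketch of an FPN decoder: lateral 1x1 convs, per-level smoothing
    with (conv, norm, activation, 2x upscale) stages until every map
    reaches 1/4 input scale, an elementwise sum, a 1x1 classifier conv,
    and a final 4x upscale to input resolution."""

    def __init__(self, in_channels: list, hidden: int = 128, num_classes: int = 2):
        super().__init__()
        # Project each backbone level to a common channel width.
        self.lateral = nn.ModuleList(nn.Conv2d(c, hidden, 1) for c in in_channels)

        def stage(upscale: bool) -> nn.Sequential:
            layers = [
                nn.Conv2d(hidden, hidden, 3, padding=1),
                nn.GroupNorm(32, hidden),
                nn.GELU(),
            ]
            if upscale:
                layers.append(nn.Upsample(scale_factor=2, mode="bilinear",
                                          align_corners=False))
            return nn.Sequential(*layers)

        # Level i has stride 4 * 2**i relative to the input, so it needs i
        # doubling stages to reach 1/4 scale; level 0 gets one non-upscaling smooth.
        self.smooth = nn.ModuleList(
            nn.Sequential(*[stage(upscale=j < i) for j in range(max(i, 1))])
            for i in range(len(in_channels))
        )
        self.classifier = nn.Conv2d(hidden, num_classes, 1)

    def forward(self, feats):
        maps = [blk(lat(f)) for f, lat, blk in zip(feats, self.lateral, self.smooth)]
        logits = self.classifier(sum(maps))  # sum the aligned 1/4-scale maps
        # Final 4x upscale back to the input resolution.
        return F.interpolate(logits, scale_factor=4, mode="bilinear",
                             align_corners=False)
```

For a 64x64 input, a four-level PVT backbone would yield features at 16, 8, 4 and 2 pixels; after smoothing and upscaling they all align at 16x16 (1/4 scale) before the sum and final 4x upscale.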
Source code in src/tcd_pipeline/models/pvt.py
UpsampleStage
Bases: Module
FPN Upsample stage which performs a 3x3 convolution followed by a group normalisation, GELU activation and optional upscale.
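A minimal sketch of such a stage, assuming a hypothetical constructor signature (the real one lives in src/tcd_pipeline/models/pvt.py):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UpsampleStage(nn.Module):
    """Sketch of one FPN upsample stage:
    3x3 conv -> GroupNorm -> GELU -> optional 2x bilinear upscale."""

    def __init__(self, channels: int, upscale: bool = True, groups: int = 8):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.norm = nn.GroupNorm(groups, channels)
        self.act = nn.GELU()
        self.upscale = upscale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.act(self.norm(self.conv(x)))
        if self.upscale:
            # Double spatial resolution; chained to walk pyramid levels up to 1/4 scale.
            x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
        return x
```

With upscale=False the stage acts as a pure smoothing block, which is what the finest pyramid level (already at 1/4 scale) needs.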
Source code in src/tcd_pipeline/models/pvt.py