torchvision model zoo's image normalization is:

```
mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
```

CLIP's is:

```
mean=[0.48145466, 0.4578275, 0.40821073], std=[0.26862954, 0.26130258, 0.27577711]
```

What's the story behind the difference? Were CLIP's normalization parameters recalculated on WebImageText?
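
For context, a minimal sketch of how each set of constants is typically plugged into torchvision's `Normalize` transform (standard usage, not taken from either codebase):

```
import torch
from torchvision import transforms

# ImageNet statistics used across the torchvision model zoo.
imagenet_norm = transforms.Normalize(
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225],
)

# Constants shipped with CLIP's preprocessing pipeline.
clip_norm = transforms.Normalize(
    mean=[0.48145466, 0.4578275, 0.40821073],
    std=[0.26862954, 0.26130258, 0.27577711],
)

x = torch.rand(3, 224, 224)  # dummy RGB image in [0, 1]
# The same input lands on slightly different per-channel statistics
# under each scheme, so the constants are not interchangeable.
print(imagenet_norm(x).mean(dim=(1, 2)))
print(clip_norm(x).mean(dim=(1, 2)))
```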