PT2 Archive Spec#
Created On: Jul 16, 2025 | Last Updated On: Sep 09, 2025
The following specification defines the archive format, which can be produced through the following methods:

- torch.export, through calling torch.export.save()
- AOTInductor, through calling torch._inductor.aoti_compile_and_package()
The archive is a zipfile, and can be manipulated using standard zipfile APIs.
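For example, an archive can be inspected with Python's built-in zipfile module. The sketch below builds a toy archive that mimics the PT2 layout (real archives are produced by torch.export.save() or torch._inductor.aoti_compile_and_package(); the entry contents here are placeholders):

```python
import zipfile

# Build a toy archive that mimics the PT2 layout (illustrative contents only).
with zipfile.ZipFile("toy.pt2", "w") as zf:
    zf.writestr("archive_format", "pt2")
    zf.writestr("byteorder", "little")
    zf.writestr(".data/version", "1")
    zf.writestr("models/model1.json", "{}")

# Any standard zipfile API works for inspection.
with zipfile.ZipFile("toy.pt2") as zf:
    names = zf.namelist()
print(names)
```

Because the container is a plain zip, tools like `unzip -l` or `zipinfo` work on .pt2 files as well.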
The following is a sample archive. We will walk through the archive folder by folder.
.
├── archive_format
├── byteorder
├── .data
│   ├── serialization_id
│   └── version
├── data
│   ├── aotinductor
│   │   └── model1
│   │       ├── cf5ez6ifexr7i2hezzz4s7xfusj4wtisvu2gddeamh37bw6bghjw.kernel_metadata.json
│   │       ├── cf5ez6ifexr7i2hezzz4s7xfusj4wtisvu2gddeamh37bw6bghjw.kernel.cpp
│   │       ├── cf5ez6ifexr7i2hezzz4s7xfusj4wtisvu2gddeamh37bw6bghjw.wrapper_metadata.json
│   │       ├── cf5ez6ifexr7i2hezzz4s7xfusj4wtisvu2gddeamh37bw6bghjw.wrapper.cpp
│   │       ├── cf5ez6ifexr7i2hezzz4s7xfusj4wtisvu2gddeamh37bw6bghjw.wrapper.so
│   │       ├── cg7domx3woam3nnliwud7yvtcencqctxkvvcafuriladwxw4nfiv.cubin
│   │       └── cubaaxppb6xmuqdm4bej55h2pftbce3bjyyvljxbtdfuolmv45ex.cubin
│   ├── weights
│   │   ├── model1_weights_config.json
│   │   ├── model2_weights_config.json
│   │   ├── weight_0
│   │   ├── weight_1
│   │   └── weight_2
│   ├── constants
│   │   ├── model1_constants_config.json
│   │   ├── model2_constants_config.json
│   │   ├── tensor_0
│   │   ├── tensor_1
│   │   ├── custom_obj_0
│   │   └── custom_obj_1
│   └── sample_inputs
│       ├── model1.pt
│       └── model2.pt
├── extra
│   └── ....json
└── models
    ├── model1.json
    └── model2.json
Contents#
Archive Headers#
- archive_format: declares the format used by this archive. Currently, it can only be "pt2".
- byteorder: one of "little" or "big", used by the zip file reader.
- /.data/version: contains the archive version. (Note that this is neither export serialization's schema version nor the ATen opset version.)
- /.data/serialization_id: a hash generated for the current archive, used for verification.
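Reading the headers back is straightforward with the standard zipfile module. This is a minimal sketch against a stand-in archive; the entry names come from this spec, but the version and serialization_id values below are made up for illustration:

```python
import zipfile

# Create a stand-in archive containing only the header entries.
with zipfile.ZipFile("headers.pt2", "w") as zf:
    zf.writestr("archive_format", "pt2")
    zf.writestr("byteorder", "little")
    zf.writestr(".data/version", "1")          # illustrative version value
    zf.writestr(".data/serialization_id", "f" * 40)  # hypothetical hash value

# Read the headers back and sanity-check them.
with zipfile.ZipFile("headers.pt2") as zf:
    archive_format = zf.read("archive_format").decode()
    byteorder = zf.read("byteorder").decode()
```

A loader would reject the archive if archive_format is not "pt2", or if the declared byteorder does not match what it expects.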
AOTInductor Compiled Artifact#
Path: /data/aotinductor/<model_name>-<backend>/
AOTInductor compilation artifacts are saved for each model-backend pair. For
example, compilation artifacts for the model1 model on A100 and H100 will be
saved in the model1-a100 and model1-h100 folders, respectively.
The folder typically contains:

- <uuid>.wrapper.so: dynamic library compiled from the generated .cpp files.
- <uuid>.wrapper.cpp: AOTInductor-generated C++ wrapper file.
- <uuid>.kernel.cpp: AOTInductor-generated C++ kernel file.
- *.cubin: Triton kernels compiled from Triton codegen kernels.
- <uuid>.wrapper_metadata.json: metadata passed in from the aot_inductor.metadata inductor config (optional).
- <uuid>.json: external fallback nodes for custom ops to be executed by ProxyExecutor, serialized according to the ExternKernelNode struct. If the model doesn't use custom ops/ProxyExecutor, this file is omitted.
Weights#
Path: /data/weights/*
Model parameters and buffers are saved in the /data/weights/ folder. Each
tensor is saved as a separate file. The file contains only the raw data blob;
tensor metadata and the mapping from model weight FQN to saved raw data blob
are stored separately in <model_name>_weights_config.json.
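Since the blobs are raw bytes, a config-driven reader needs only the stdlib. The sketch below assumes a hypothetical config schema (FQN mapped to blob name, dtype, and shape) purely for illustration; the real schema is defined by the exporter and may differ:

```python
import array
import json
import zipfile

# Hypothetical config schema: weight FQN -> blob name, dtype, shape.
config = {
    "linear.weight": {"blob": "weight_0", "dtype": "float32", "shape": [2, 3]},
}

# Write a toy archive holding the config and one raw little-endian float blob.
values = array.array("f", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
with zipfile.ZipFile("weights.pt2", "w") as zf:
    zf.writestr("data/weights/model1_weights_config.json", json.dumps(config))
    zf.writestr("data/weights/weight_0", values.tobytes())

# Read it back: resolve the blob for an FQN, then decode the raw bytes.
with zipfile.ZipFile("weights.pt2") as zf:
    cfg = json.loads(zf.read("data/weights/model1_weights_config.json"))
    entry = cfg["linear.weight"]
    raw = zf.read("data/weights/" + entry["blob"])
    decoded = array.array("f")  # "f" matches the assumed float32 dtype
    decoded.frombytes(raw)
```

In practice a framework would hand the raw bytes to a tensor constructor instead of the array module; the point is that the blob files carry no metadata of their own.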
Constants#
Path: /data/constants/*
Tensor constants, non-persistent buffers, and TorchBind objects are saved in the
/data/constants/ folder. Metadata and the mapping from model constant FQN to
saved raw data blob are stored separately in <model_name>_constants_config.json.
Sample Inputs#
Path: /data/sample_inputs/<model_name>.pt
The sample_input used by torch.export can be included in the archive for
downstream use. Typically, it is a flattened list of Tensors, combining both the
args and kwargs of the forward() function.

The .pt file is produced by torch.save(sample_input) and can be loaded by
torch.load() in Python and torch::pickle_load() in C++.
When a model has multiple sets of sample inputs, they are packaged as
<model_name>_<index>.pt.
Models Definitions#
Path: /models/<model_name>.json
The model definition is the serialized JSON of the ExportedProgram produced by
torch.export.save, together with other model-level metadata.
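Because the definition is plain JSON inside the zip, it can be inspected without deserializing the full program. A sketch against a toy archive (the JSON content below is a placeholder, not the real ExportedProgram schema):

```python
import json
import zipfile

# Toy archive with stand-in model definitions (real content is the
# serialized ExportedProgram plus model-level metadata).
with zipfile.ZipFile("defs.pt2", "w") as zf:
    zf.writestr("models/model1.json", json.dumps({"name": "model1"}))
    zf.writestr("models/model2.json", json.dumps({"name": "model2"}))

# Enumerate the packaged models and parse one definition.
with zipfile.ZipFile("defs.pt2") as zf:
    model_entries = [n for n in zf.namelist() if n.startswith("models/")]
    model_def = json.loads(zf.read("models/model1.json"))
```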
Multiple Models#
This archive spec supports multiple model definitions coexisting in the same
file, with <model_name> serving as a unique identifier that other folders of
the archive use to reference each model.

Lower-level APIs such as torch.export.pt2_archive._package.package_pt2() and
torch.export.pt2_archive._package.load_pt2() allow finer-grained control over
the packaging and loading process.