Plain Quantization
- uniform_precision_quantization(self, input_model_path: str, output_dir: str, dataset_path: str | None, metric: SimilarityMetric = SimilarityMetric.SNR, weight_precision: QuantizationPrecision = QuantizationPrecision.INT8, activation_precision: QuantizationPrecision = QuantizationPrecision.INT8, input_layers: List[Dict[str, int]] | None = None, wait_until_done: bool = True, sleep_interval: int = 30)
Apply uniform precision quantization to a model, specifying the precision for weights and activations.
This method quantizes all layers in the model uniformly based on the specified precision levels for weights and activations.
- Parameters:
input_model_path (str) – The file path where the model is located.
output_dir (str) – The local folder path to save the quantized model.
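dataset_path (str | None) – Path to the dataset, such as a calibration dataset. Needed only for some quantization settings.
metric (SimilarityMetric) – Metric used to evaluate quantization quality. Defaults to SimilarityMetric.SNR.
weight_precision (QuantizationPrecision) – Precision for weights. Defaults to QuantizationPrecision.INT8.
activation_precision (QuantizationPrecision) – Precision for activations. Defaults to QuantizationPrecision.INT8.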
input_layers (List[InputShape], optional) – Target input shape for quantization (e.g., dynamic batch to static batch).
wait_until_done (bool) – If True, wait for the quantization to finish before returning. If False, submit the quantization request and return immediately.
- Raises:
e – If an error occurs during model quantization.
- Returns:
Quantization metadata.
- Return type:
QuantizerMetadata
Example
from netspresso import NetsPresso
from netspresso.enums import QuantizationPrecision, SimilarityMetric

# Authenticate with your NetsPresso account.
netspresso = NetsPresso(email="YOUR_EMAIL", password="YOUR_PASSWORD")

# Create a quantizer from the authenticated client.
quantizer = netspresso.quantizer()

# Quantize weights and activations to INT8, using SNR as the quality metric.
quantization_result = quantizer.uniform_precision_quantization(
    input_model_path="./examples/sample_models/test.onnx",
    output_dir="./outputs/quantized/uniform_precision_quantization",
    dataset_path="./examples/sample_datasets/pickle_calibration_dataset_128x128.npy",
    metric=SimilarityMetric.SNR,
    weight_precision=QuantizationPrecision.INT8,
    activation_precision=QuantizationPrecision.INT8,
)
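
For intuition about what the INT8 precision and the SNR metric in the example refer to, the following is a minimal, library-independent sketch. It is not NetsPresso's implementation: the helper names (quantize_int8, dequantize, snr_db), the symmetric per-tensor scaling scheme, and the sample weights are all assumptions made for illustration. It maps a list of floats onto 8-bit integer codes with one shared scale, restores them, and reports the signal-to-noise ratio between the original and restored values.

```python
import math


def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization (illustrative assumption).

    Returns the integer codes and the shared scale factor.
    """
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude onto 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]


def snr_db(original, approx):
    """Signal-to-noise ratio in decibels between two equal-length sequences."""
    signal = sum(v * v for v in original)
    noise = sum((v - a) ** 2 for v, a in zip(original, approx))
    if noise == 0:
        return math.inf  # perfect reconstruction
    if signal == 0:
        return -math.inf  # no signal, only error
    return 10 * math.log10(signal / noise)


# Hypothetical weight values, chosen only for demonstration.
weights = [0.021, -0.754, 0.313, 1.27, -1.003, 0.498]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

print("codes:", codes)
print(f"scale: {scale:.5f}")
print(f"SNR: {snr_db(weights, restored):.1f} dB")
```

A higher SNR means the quantized values track the originals more closely. How the service computes its similarity metrics internally is not described in this reference.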