Uniform Precision Quantization
Description
netspresso.quantizer.quantizer.Quantizer

Bases: NetsPressoBase

uniform_precision_quantization(input_model_path, output_dir, dataset_path, metric=SimilarityMetric.SNR, weight_precision=QuantizationPrecision.INT8, activation_precision=QuantizationPrecision.INT8, input_layers=None, wait_until_done=True, sleep_interval=30)

Apply uniform precision quantization to a model, specifying the precision for weights and activations.
This method quantizes every layer in the model to the same precision, using the specified precision levels for weights and activations.
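
To make the idea of a single precision level concrete, here is a minimal sketch of symmetric uniform quantization to signed 8-bit integers in plain Python. It is illustrative only and is not NetsPresso's implementation. The helper names `quantize_uniform` and `dequantize` are hypothetical, and the per-tensor symmetric scheme is an assumption chosen for brevity.

```python
# Illustrative sketch only: symmetric, per-tensor uniform quantization.
# The function names and the scheme are assumptions, not the NetsPresso API.

def quantize_uniform(values, num_bits=8):
    """Map floats onto an evenly spaced signed-integer grid.

    Returns the integer codes and the scale needed to map them back.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 when num_bits is 8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest grid point and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_uniform(weights)
restored = dequantize(codes, scale)
print(codes, scale, restored)
```

With uniform precision, the same bit width is applied to every layer. The rounding error per value is at most half of `scale`, which is why small values such as `0.003` may collapse to zero.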
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_model_path | str | The file path where the model is located. | required |
| output_dir | str | The local folder path to save the quantized model. | required |
| dataset_path | str | Path to the dataset. Useful for certain quantizations. | required |
| metric | SimilarityMetric | Quantization quality metric. | SNR |
| weight_precision | QuantizationPrecision | Weight precision. | INT8 |
| activation_precision | QuantizationPrecision | Activation precision. | INT8 |
| input_layers | List[InputShape] | Target input shape for quantization (e.g., dynamic batch to static batch). | None |
| wait_until_done | bool | If True, wait for the quantization result before returning. If False, submit the quantization request and return immediately. | True |

Raises:
| Type | Description |
|---|---|
| e | If an error occurs during the model quantization. |

Returns:
| Type | Description |
|---|---|
| QuantizerMetadata | Quantization metadata. |

Examples
from netspresso import NetsPresso
from netspresso.enums import QuantizationPrecision, SimilarityMetric
# Login with API key (recommended)
# Get your API token from: https://account.netspresso.ai/api-token
netspresso = NetsPresso(api_key="YOUR_API_KEY")
# Note: Email/password login will be deprecated soon
# netspresso = NetsPresso(email="YOUR_EMAIL", password="YOUR_PASSWORD")
quantizer = netspresso.quantizer()
quantization_result = quantizer.uniform_precision_quantization(
    input_model_path="./examples/sample_models/test.onnx",
    output_dir="./outputs/quantized/uniform_precision_quantization",
    dataset_path="./examples/sample_datasets/pickle_calibration_dataset_128x128.npy",
    metric=SimilarityMetric.SNR,
    weight_precision=QuantizationPrecision.INT8,
    activation_precision=QuantizationPrecision.INT8,
)
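
The `metric` argument defaults to `SimilarityMetric.SNR`. As a rough illustration of what a signal-to-noise ratio measures when comparing original and quantized outputs, the sketch below computes SNR in decibels with the standard textbook formula. It is an assumption that this matches the formula NetsPresso uses internally, and `snr_db` is a hypothetical helper, not part of the netspresso package.

```python
import math

# Illustrative sketch only: textbook SNR in decibels between a reference
# signal and its approximation. Not taken from the netspresso package.

def snr_db(reference, approximation):
    """Return 10 * log10(signal power / noise power)."""
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - a) ** 2 for r, a in zip(reference, approximation))
    if noise_power == 0:
        return math.inf  # identical outputs: no quantization noise
    if signal_power == 0:
        return -math.inf  # all-zero reference: ratio is undefined, treat as worst case
    return 10.0 * math.log10(signal_power / noise_power)


original_output = [1.0, -2.0, 3.0]
quantized_output = [1.0, -2.0, 2.9]
print(snr_db(original_output, quantized_output))
```

A higher value means the quantized output stays closer to the original output.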