Execution providers for onnxruntime

I have written a simple Python script that runs inference on AlexNet from the ONNX Model Zoo. The aim of the script is to compare what kind of improvement the 'SpaceMITExecutionProvider' gives over the 'CPUExecutionProvider'. To my surprise, the SpaceMITExecutionProvider performs worse than the CPUExecutionProvider.

This is the script I have:

import argparse
import os
import numpy as np
import onnxruntime as ort
import spacemit_ort

# Initialize argument parser

parser = argparse.ArgumentParser(description="Run ONNX model inference with terminal inputs.")
parser.add_argument("--model_name", type=str, required=True, help="Name of the ONNX model file (without path).")
parser.add_argument("--execution_provider", type=str, choices=["cpu", "space"], required=True,
                    help="Execution provider to use: 'cpu' or 'space'.")
parser.add_argument("--batch_size", type=int, default=1, help="Batch size of the input tensor.")
parser.add_argument("--height", type=int, default=224, help="Height of the input tensor.")
parser.add_argument("--width", type=int, default=224, help="Width of the input tensor.")
parser.add_argument("--channels", type=int, default=3, help="Number of channels in the input tensor.")
parser.add_argument("--intra_op_threads", type=int, default=2, help="Number of intra-op threads.")
parser.add_argument("--inter_op_threads", type=int, default=1, help="Number of inter-op threads.")
args = parser.parse_args()

# Model directory and full path

model_dir = "/home/kishki/bme688/modelzoo"
onnx_model_path = os.path.join(model_dir, args.model_name)

# Check if the model file exists

if not os.path.isfile(onnx_model_path):
    raise FileNotFoundError(f"Model file '{args.model_name}' not found in directory '{model_dir}'.")

# Set execution provider

if args.execution_provider == "cpu":
    EP = ["CPUExecutionProvider"]
elif args.execution_provider == "space":
    EP = ["SpaceMITExecutionProvider"]
else:
    raise ValueError("Invalid execution provider selected. Choose 'cpu' or 'space'.")

# Configure ONNX Runtime session

sess_options = ort.SessionOptions()
sess_options.intra_op_num_threads = args.intra_op_threads
sess_options.inter_op_num_threads = args.inter_op_threads
#sess_options.log_severity_level = 0

# Create the inference session

session = ort.InferenceSession(onnx_model_path, sess_options, providers=EP)

# Prepare input tensor

batch_size = args.batch_size
height = args.height
width = args.width
channels = args.channels
input_tensor = np.ones((batch_size, channels, height, width), dtype=np.float32)

# Perform inference

input_name = session.get_inputs()[0].name
output_names = [output.name for output in session.get_outputs()]
outputs = session.run(output_names, {input_name: input_tensor})
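
For completeness, here is a minimal sketch (not part of my original script; the warm-up and repetition counts are arbitrary) of how the script could additionally confirm which provider the session actually loaded and time only the inference calls, since perf stat over the whole process also counts interpreter startup, model loading, and session creation:

import time

# ONNX Runtime silently falls back to CPUExecutionProvider if a requested
# provider cannot be loaded, so print what is actually in use.
print("Providers in use:", session.get_providers())

# Warm up, then time only session.run() itself.
for _ in range(3):
    session.run(output_names, {input_name: input_tensor})

num_runs = 20
start = time.perf_counter()
for _ in range(num_runs):
    session.run(output_names, {input_name: input_tensor})
elapsed = time.perf_counter() - start
print(f"Average latency over {num_runs} runs: {elapsed / num_runs * 1000:.2f} ms")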

I run the Python script via the following bash script:

#!/bin/bash

# Array of thread configurations

intra_op_threads=(1 2 4 6 8)
inter_op_threads=(1 2 4 6 8)
execution_providers=('cpu' 'space')

# Output file

output_file="stats_alexnet.txt"

# Clear previous output file

> $output_file

# Loop through each execution provider and thread configuration

for ep in "${execution_providers[@]}"; do
  for intra in "${intra_op_threads[@]}"; do
    for inter in "${inter_op_threads[@]}"; do

      # Echo execution provider and thread settings
      echo "${ep^}ExecutionProvider - intra_op_threads=$intra, inter_op_threads=$inter" | tee -a $output_file

      # Run perf stat and append the results to the output file
      perf stat -e cycles,instructions,vector_inst python3 benchmark.py \
        --model_name alexnet_Opset16.onnx \
        --execution_provider $ep \
        --intra_op_threads $intra \
        --inter_op_threads $inter 2>> $output_file

    done
  done
done

Can someone explain why this is the case? Is it a problem with my scripts? I am new to the platform and am currently writing my master's thesis on RISC-V.

Kind regards!

Currently our plugin doesn't have acceleration for FP32 arithmetic, so if you use an FP32 model directly it may even be slower. We suggest using the xquant tool for quantization: quantizing to INT8 gives a significant speedup. Alternatively, you can use our already-quantized model zoo directly. Thank you.
Here is the URL of our modelzoo
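
As a quick illustration only (this is not the xquant tool itself, and the file names below are just examples), ONNX Runtime's built-in quantization API can also produce an INT8 model; for conv-heavy models such as AlexNet, static quantization with calibration data, or the xquant flow mentioned above, is generally the better fit:

from onnxruntime.quantization import quantize_dynamic, QuantType

# Quantize the FP32 weights to INT8 and write a new model file.
quantize_dynamic(
    model_input="alexnet_Opset16.onnx",
    model_output="alexnet_Opset16_int8.onnx",
    weight_type=QuantType.QInt8,
)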
