Why does ONNX Runtime run 2–3x slower in C++ than in Python?

Kamran Ahmed Khan
2 min read · Jan 26, 2023

ONNX Runtime

There are a few reasons why ONNX Runtime might run slower in C++ than in Python:

Interpreter overhead:

Python is an interpreted language: CPython compiles source code to bytecode, which a virtual machine then executes instruction by instruction at runtime. This adds overhead compared to ahead-of-time compiled languages such as C++. For example, consider the following Python code that calculates the factorial of a number:

def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)

print(factorial(5))  # 120

This code runs fine in Python, but every call passes through the bytecode interpreter, which takes time. A C++ version of the same function is compiled to machine code ahead of time, so the interpreter overhead is avoided entirely.

Memory management:

Python manages memory for the developer automatically, through reference counting plus a cyclic garbage collector. When an object is no longer in use, the memory it occupies is freed without any work by the programmer, but this convenience adds some runtime overhead. For example, the following Python code creates a large list of integers and assigns it to the variable a; once a is no longer referenced, the memory used by the list is reclaimed automatically:

a = [i for i in range(1000000)]

In contrast, C++ leaves memory management to the developer, either manually with new/delete or, more idiomatically, through RAII types such as std::vector and smart pointers that release memory deterministically. This can lead to more efficient memory usage, but it also means the developer must take care to avoid leaks and other memory errors.

Different implementations of the ONNX Runtime library:

The ONNX Runtime Python package is a binding over the same native core that the C++ API exposes, so the execution kernels themselves are identical. In practice, large speed differences between the two usually come from how each build is configured: a C++ application built in Debug mode, linked against an unoptimized build of the library, or run with different session options (graph optimization level, thread counts, execution providers) can easily be several times slower than the prebuilt, fully optimized Python wheel.
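When the C++ build seems slow, a common first check is whether the session is configured the way the Python package configures it by default. A minimal sketch using the ONNX Runtime C++ API might look like the following (the model path and thread count here are illustrative placeholders, and the program assumes the onnxruntime library and a real model are available):

```cpp
#include <onnxruntime_cxx_api.h>

int main() {
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "bench");

    Ort::SessionOptions opts;
    // Enable all graph optimizations, as the Python package does by default.
    opts.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_ALL);
    // Thread count is workload-dependent; 4 is purely illustrative.
    opts.SetIntraOpNumThreads(4);

    // "model.onnx" is a placeholder path for your actual model file.
    Ort::Session session(env, "model.onnx", opts);
    return 0;
}
```

Also make sure the application itself is compiled with optimizations (e.g., -O2/Release); benchmarking a Debug build is a frequent source of "C++ is slower" results.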

Third-party libraries:

Python has a large ecosystem of third-party libraries optimized for machine learning, such as NumPy and TensorFlow. Their speed comes from carefully tuned native code (for example, BLAS routines) under the hood, so for the pre- and post-processing around a model they can outperform a naive hand-written C++ loop that does the same work, unless the C++ code is similarly optimized.

Key Note:

It’s worth noting that performance can vary depending on the specific hardware and operating system you’re using, as well as the specific model and dataset you’re running. Therefore, it’s always a good idea to benchmark your code using representative data and to profile it to identify any performance bottlenecks.

