Commit 15fde1de by Chekov2k Committed by Henry Schreiner

Add `PYBIND11_SIMPLE_GIL_MANAGEMENT` option (cmake, C++ define) (#4216)

* Add option to force the use of the PYPY GIL scoped acquire/release logic to support nested gil access, see https://github.com/pybind/pybind11/issues/1276 and https://github.com/pytorch/pytorch/issues/83101

* Apply suggestions from code review

* Update CMakeLists.txt

* docs: update upgrade guide

* Update docs/upgrade.rst

* All bells & whistles.

* Add Reminder to common.h, so that we will not forget to purge `!WITH_THREAD` branches when dropping Python 3.6

* New sentence instead of semicolon.

* Temporarily pull in snapshot of PR #4246

* Add `test_release_acquire`

* Add more unit tests for nested gil locking

* Add test_report_builtins_internals_keys

* Very minor enhancement: sort list only after filtering.

* Revert change in docs/upgrade.rst

* Add test_multi_acquire_release_cross_module, while also forcing unique PYBIND11_INTERNALS_VERSION for cross_module_gil_utils.cpp

* Hopefully fix apparently new ICC error.

```
2022-10-28T07:57:54.5187728Z -- The CXX compiler identification is Intel 2021.7.0.20220726
...
2022-10-28T07:58:53.6758994Z icpc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler. Use '-diag-disable=10441' to disable this message.
2022-10-28T07:58:54.5801597Z In file included from /home/runner/work/pybind11/pybind11/include/pybind11/detail/../detail/type_caster_base.h(15),
2022-10-28T07:58:54.5803794Z                  from /home/runner/work/pybind11/pybind11/include/pybind11/detail/../cast.h(15),
2022-10-28T07:58:54.5805740Z                  from /home/runner/work/pybind11/pybind11/include/pybind11/detail/../attr.h(14),
2022-10-28T07:58:54.5809556Z                  from /home/runner/work/pybind11/pybind11/include/pybind11/detail/class.h(12),
2022-10-28T07:58:54.5812154Z                  from /home/runner/work/pybind11/pybind11/include/pybind11/pybind11.h(13),
2022-10-28T07:58:54.5948523Z                  from /home/runner/work/pybind11/pybind11/tests/cross_module_gil_utils.cpp(13):
2022-10-28T07:58:54.5949009Z /home/runner/work/pybind11/pybind11/include/pybind11/detail/../detail/internals.h(177): error #2282: unrecognized GCC pragma
2022-10-28T07:58:54.5949374Z       PYBIND11_TLS_KEY_INIT(tstate)
2022-10-28T07:58:54.5949579Z       ^
2022-10-28T07:58:54.5949695Z
```

* clang-tidy fixes

* Workaround for PYPY WIN exitcode None

* Revert "Temporarily pull in snapshot of PR #4246"

This reverts commit 23ac16e859150f27fda25ca865cabcb4444e0770.

* Another workaround for PYPY WIN exitcode None

* Clean up how the tests are run "run in process" Part 1: uniformity

* Clean up how the tests are run "run in process" Part 2: use `@pytest.mark.parametrize` and clean up the naming.

* Skip some tests `#if defined(THREAD_SANITIZER)` (tested with TSAN using the Google-internal toolchain).

* Run all tests again but ignore ThreadSanitizer exitcode 66 (this is less likely to mask unrelated ThreadSanitizer issues in the future).

* bug fix: missing common.h include before using `PYBIND11_SIMPLE_GIL_MANAGEMENT`

For the tests in the github CI this does not matter, because
`PYBIND11_SIMPLE_GIL_MANAGEMENT` is always defined from the command line,
but when monkey-patching common.h locally, it matters.

* if process.exitcode is None: assert t_delta > 9.9

* More sophisiticated `_run_in_process()` implementation, clearly reporting `DEADLOCK`, additionally exercised via added `intentional_deadlock()`

* Wrap m.intentional_deadlock in a Python function, for `ForkingPickler` compatibility.

```
>       ForkingPickler(file, protocol).dump(obj)
E       TypeError: cannot pickle 'PyCapsule' object
```

Observed with all Windows builds including mingw but not PyPy, and macos-latest with Python 3.9, 3.10, 3.11 but not 3.6.

* Add link to potential solution for WOULD-BE-NICE-TO-HAVE feature.

* Add `SKIP_IF_DEADLOCK = True` option, to not pollute the CI results with expected `DEADLOCK` failures while we figure out what to do about them.

* Add COPY-PASTE-THIS: gdb ... command (to be used for debugging the detected deadlock)

* style: pre-commit fixes

* Do better than automatic pre-commit fixes.

* Add `PYBIND11_SIMPLE_GIL_MANAGEMENT` to `pytest_report_header()` (so that we can easily know when harvesting deadlock information from the CI logs).

Co-authored-by: Arnim Balzer <arnim@seechange.ai>
Co-authored-by: Henry Schreiner <HenrySchreinerIII@gmail.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rwgk@google.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
parent 5b395c9b
......@@ -102,10 +102,12 @@ jobs:
run: python -m pip install pytest-github-actions-annotate-failures
# First build - C++11 mode and inplace
# More-or-less randomly adding -DPYBIND11_SIMPLE_GIL_MANAGEMENT=ON here.
- name: Configure C++11 ${{ matrix.args }}
run: >
cmake -S . -B .
-DPYBIND11_WERROR=ON
-DPYBIND11_SIMPLE_GIL_MANAGEMENT=ON
-DDOWNLOAD_CATCH=ON
-DDOWNLOAD_EIGEN=ON
-DCMAKE_CXX_STANDARD=11
......@@ -129,10 +131,12 @@ jobs:
run: git clean -fdx
# Second build - C++17 mode and in a build directory
# More-or-less randomly adding -DPYBIND11_SIMPLE_GIL_MANAGEMENT=OFF here.
- name: Configure C++17
run: >
cmake -S . -B build2
-DPYBIND11_WERROR=ON
-DPYBIND11_SIMPLE_GIL_MANAGEMENT=OFF
-DDOWNLOAD_CATCH=ON
-DDOWNLOAD_EIGEN=ON
-DCMAKE_CXX_STANDARD=17
......
......@@ -91,10 +91,16 @@ endif()
option(PYBIND11_INSTALL "Install pybind11 header files?" ${PYBIND11_MASTER_PROJECT})
option(PYBIND11_TEST "Build pybind11 test suite?" ${PYBIND11_MASTER_PROJECT})
option(PYBIND11_NOPYTHON "Disable search for Python" OFF)
option(PYBIND11_SIMPLE_GIL_MANAGEMENT
"Use simpler GIL management logic that does not support disassociation" OFF)
set(PYBIND11_INTERNALS_VERSION
""
CACHE STRING "Override the ABI version, may be used to enable the unstable ABI.")
if(PYBIND11_SIMPLE_GIL_MANAGEMENT)
add_compile_definitions(PYBIND11_SIMPLE_GIL_MANAGEMENT)
endif()
cmake_dependent_option(
USE_PYTHON_INCLUDE_DIR
"Install pybind11 headers in Python include directory instead of default installation prefix"
......
......@@ -206,6 +206,7 @@
#endif
#include <Python.h>
// Reminder: WITH_THREAD is always defined if PY_VERSION_HEX >= 0x03070000
#if PY_VERSION_HEX < 0x03060000
# error "PYTHON < 3.6 IS UNSUPPORTED. pybind11 v2.9 was the last to support Python 2 and 3.5."
#endif
......@@ -229,6 +230,10 @@
# undef copysign
#endif
#if defined(PYPY_VERSION) && !defined(PYBIND11_SIMPLE_GIL_MANAGEMENT)
# define PYBIND11_SIMPLE_GIL_MANAGEMENT
#endif
#if defined(_MSC_VER)
# if defined(PYBIND11_DEBUG_MARKER)
# define _DEBUG
......
......@@ -9,6 +9,12 @@
#pragma once
#include "common.h"
#if defined(WITH_THREAD) && defined(PYBIND11_SIMPLE_GIL_MANAGEMENT)
# include "../gil.h"
#endif
#include "../pytypes.h"
#include <exception>
......@@ -49,7 +55,7 @@ inline PyObject *make_object_base_type(PyTypeObject *metaclass);
// `Py_LIMITED_API` anyway.
# if PYBIND11_INTERNALS_VERSION > 4
# define PYBIND11_TLS_KEY_REF Py_tss_t &
# ifdef __GNUC__
# if defined(__GNUC__) && !defined(__INTEL_COMPILER)
// Clang on macOS warns due to `Py_tss_NEEDS_INIT` not specifying an initializer
// for every field.
# define PYBIND11_TLS_KEY_INIT(var) \
......@@ -169,10 +175,12 @@ struct internals {
PyTypeObject *default_metaclass;
PyObject *instance_base;
#if defined(WITH_THREAD)
// Unused if PYBIND11_SIMPLE_GIL_MANAGEMENT is defined:
PYBIND11_TLS_KEY_INIT(tstate)
# if PYBIND11_INTERNALS_VERSION > 4
PYBIND11_TLS_KEY_INIT(loader_life_support_tls_key)
# endif // PYBIND11_INTERNALS_VERSION > 4
// Unused if PYBIND11_SIMPLE_GIL_MANAGEMENT is defined:
PyInterpreterState *istate = nullptr;
~internals() {
# if PYBIND11_INTERNALS_VERSION > 4
......@@ -408,6 +416,10 @@ PYBIND11_NOINLINE internals &get_internals() {
return **internals_pp;
}
#if defined(WITH_THREAD)
# if defined(PYBIND11_SIMPLE_GIL_MANAGEMENT)
gil_scoped_acquire gil;
# else
// Ensure that the GIL is held since we will need to make Python calls.
// Cannot use py::gil_scoped_acquire here since that constructor calls get_internals.
struct gil_scoped_acquire_local {
......@@ -417,6 +429,8 @@ PYBIND11_NOINLINE internals &get_internals() {
~gil_scoped_acquire_local() { PyGILState_Release(state); }
const PyGILState_STATE state;
} gil;
# endif
#endif
error_scope err_scope;
PYBIND11_STR_TYPE id(PYBIND11_INTERNALS_ID);
......
......@@ -10,7 +10,10 @@
#pragma once
#include "detail/common.h"
#include "detail/internals.h"
#if defined(WITH_THREAD) && !defined(PYBIND11_SIMPLE_GIL_MANAGEMENT)
# include "detail/internals.h"
#endif
PYBIND11_NAMESPACE_BEGIN(PYBIND11_NAMESPACE)
......@@ -21,7 +24,9 @@ PyThreadState *get_thread_state_unchecked();
PYBIND11_NAMESPACE_END(detail)
#if defined(WITH_THREAD) && !defined(PYPY_VERSION)
#if defined(WITH_THREAD)
# if !defined(PYBIND11_SIMPLE_GIL_MANAGEMENT)
/* The functions below essentially reproduce the PyGILState_* API using a RAII
* pattern, but there are a few important differences:
......@@ -62,11 +67,11 @@ public:
if (!tstate) {
tstate = PyThreadState_New(internals.istate);
# if defined(PYBIND11_DETAILED_ERROR_MESSAGES)
# if defined(PYBIND11_DETAILED_ERROR_MESSAGES)
if (!tstate) {
pybind11_fail("scoped_acquire: could not create thread state!");
}
# endif
# endif
tstate->gilstate_counter = 0;
PYBIND11_TLS_REPLACE_VALUE(internals.tstate, tstate);
} else {
......@@ -87,20 +92,20 @@ public:
PYBIND11_NOINLINE void dec_ref() {
--tstate->gilstate_counter;
# if defined(PYBIND11_DETAILED_ERROR_MESSAGES)
# if defined(PYBIND11_DETAILED_ERROR_MESSAGES)
if (detail::get_thread_state_unchecked() != tstate) {
pybind11_fail("scoped_acquire::dec_ref(): thread state must be current!");
}
if (tstate->gilstate_counter < 0) {
pybind11_fail("scoped_acquire::dec_ref(): reference count underflow!");
}
# endif
# endif
if (tstate->gilstate_counter == 0) {
# if defined(PYBIND11_DETAILED_ERROR_MESSAGES)
# if defined(PYBIND11_DETAILED_ERROR_MESSAGES)
if (!release) {
pybind11_fail("scoped_acquire::dec_ref(): internal error!");
}
# endif
# endif
PyThreadState_Clear(tstate);
if (active) {
PyThreadState_DeleteCurrent();
......@@ -178,12 +183,14 @@ private:
bool disassoc;
bool active = true;
};
#elif defined(PYPY_VERSION)
# else // PYBIND11_SIMPLE_GIL_MANAGEMENT
class gil_scoped_acquire {
PyGILState_STATE state;
public:
gil_scoped_acquire() { state = PyGILState_Ensure(); }
gil_scoped_acquire() : state{PyGILState_Ensure()} {}
gil_scoped_acquire(const gil_scoped_acquire &) = delete;
gil_scoped_acquire &operator=(const gil_scoped_acquire &) = delete;
~gil_scoped_acquire() { PyGILState_Release(state); }
......@@ -194,19 +201,39 @@ class gil_scoped_release {
PyThreadState *state;
public:
gil_scoped_release() { state = PyEval_SaveThread(); }
gil_scoped_release() : state{PyEval_SaveThread()} {}
gil_scoped_release(const gil_scoped_release &) = delete;
gil_scoped_release &operator=(const gil_scoped_acquire &) = delete;
~gil_scoped_release() { PyEval_RestoreThread(state); }
void disarm() {}
};
#else
# endif // PYBIND11_SIMPLE_GIL_MANAGEMENT
#else // WITH_THREAD
class gil_scoped_acquire {
public:
gil_scoped_acquire() {
// Trick to suppress `unused variable` error messages (at call sites).
(void) (this != (this + 1));
}
gil_scoped_acquire(const gil_scoped_acquire &) = delete;
gil_scoped_acquire &operator=(const gil_scoped_acquire &) = delete;
void disarm() {}
};
class gil_scoped_release {
public:
gil_scoped_release() {
// Trick to suppress `unused variable` error messages (at call sites).
(void) (this != (this + 1));
}
gil_scoped_release(const gil_scoped_release &) = delete;
gil_scoped_release &operator=(const gil_scoped_acquire &) = delete;
void disarm() {}
};
#endif
#endif // WITH_THREAD
PYBIND11_NAMESPACE_END(PYBIND11_NAMESPACE)
......@@ -210,4 +210,5 @@ def pytest_report_header(config):
f" {pybind11_tests.compiler_info}"
f" {pybind11_tests.cpp_std}"
f" {pybind11_tests.PYBIND11_INTERNALS_ID}"
f" PYBIND11_SIMPLE_GIL_MANAGEMENT={pybind11_tests.PYBIND11_SIMPLE_GIL_MANAGEMENT}"
)
......@@ -6,9 +6,15 @@
All rights reserved. Use of this source code is governed by a
BSD-style license that can be found in the LICENSE file.
*/
#if defined(PYBIND11_INTERNALS_VERSION)
# undef PYBIND11_INTERNALS_VERSION
#endif
#define PYBIND11_INTERNALS_VERSION 21814642 // Ensure this module has its own `internals` instance.
#include <pybind11/pybind11.h>
#include <cstdint>
#include <string>
#include <thread>
// This file mimics a DSO that makes pybind11 calls but does not define a
// PYBIND11_MODULE. The purpose is to test that such a DSO can create a
......@@ -21,8 +27,54 @@
namespace {
namespace py = pybind11;
void gil_acquire() { py::gil_scoped_acquire gil; }
std::string gil_multi_acquire_release(unsigned bits) {
if ((bits & 0x1u) != 0u) {
py::gil_scoped_acquire gil;
}
if ((bits & 0x2u) != 0u) {
py::gil_scoped_release gil;
}
if ((bits & 0x4u) != 0u) {
py::gil_scoped_acquire gil;
}
if ((bits & 0x8u) != 0u) {
py::gil_scoped_release gil;
}
return PYBIND11_INTERNALS_ID;
}
struct CustomAutoGIL {
CustomAutoGIL() : gstate(PyGILState_Ensure()) {}
~CustomAutoGIL() { PyGILState_Release(gstate); }
PyGILState_STATE gstate;
};
struct CustomAutoNoGIL {
CustomAutoNoGIL() : save(PyEval_SaveThread()) {}
~CustomAutoNoGIL() { PyEval_RestoreThread(save); }
PyThreadState *save;
};
template <typename Acquire, typename Release>
void gil_acquire_inner() {
Acquire acquire_outer;
Acquire acquire_inner;
Release release;
}
template <typename Acquire, typename Release>
void gil_acquire_nested() {
Acquire acquire_outer;
Acquire acquire_inner;
Release release;
auto thread = std::thread(&gil_acquire_inner<Acquire, Release>);
thread.join();
}
constexpr char kModuleName[] = "cross_module_gil_utils";
struct PyModuleDef moduledef = {
......@@ -30,6 +82,9 @@ struct PyModuleDef moduledef = {
} // namespace
#define ADD_FUNCTION(Name, ...) \
PyModule_AddObject(m, Name, PyLong_FromVoidPtr(reinterpret_cast<void *>(&__VA_ARGS__)));
extern "C" PYBIND11_EXPORT PyObject *PyInit_cross_module_gil_utils() {
PyObject *m = PyModule_Create(&moduledef);
......@@ -37,8 +92,16 @@ extern "C" PYBIND11_EXPORT PyObject *PyInit_cross_module_gil_utils() {
if (m != nullptr) {
static_assert(sizeof(&gil_acquire) == sizeof(void *),
"Function pointer must have the same size as void*");
PyModule_AddObject(
m, "gil_acquire_funcaddr", PyLong_FromVoidPtr(reinterpret_cast<void *>(&gil_acquire)));
ADD_FUNCTION("gil_acquire_funcaddr", gil_acquire)
ADD_FUNCTION("gil_multi_acquire_release_funcaddr", gil_multi_acquire_release)
ADD_FUNCTION("gil_acquire_inner_custom_funcaddr",
gil_acquire_inner<CustomAutoGIL, CustomAutoNoGIL>)
ADD_FUNCTION("gil_acquire_nested_custom_funcaddr",
gil_acquire_nested<CustomAutoGIL, CustomAutoNoGIL>)
ADD_FUNCTION("gil_acquire_inner_pybind11_funcaddr",
gil_acquire_inner<py::gil_scoped_acquire, py::gil_scoped_release>)
ADD_FUNCTION("gil_acquire_nested_pybind11_funcaddr",
gil_acquire_nested<py::gil_scoped_acquire, py::gil_scoped_release>)
}
return m;
......
......@@ -89,6 +89,12 @@ PYBIND11_MODULE(pybind11_tests, m) {
#endif
m.attr("cpp_std") = cpp_std();
m.attr("PYBIND11_INTERNALS_ID") = PYBIND11_INTERNALS_ID;
m.attr("PYBIND11_SIMPLE_GIL_MANAGEMENT") =
#if defined(PYBIND11_SIMPLE_GIL_MANAGEMENT)
true;
#else
false;
#endif
bind_ConstructorStats(m);
......
......@@ -293,7 +293,6 @@ TEST_CASE("Threads") {
{
py::gil_scoped_release gil_release{};
REQUIRE(has_pybind11_internals_static());
auto threads = std::vector<std::thread>();
for (auto i = 0; i < num_threads; ++i) {
......
......@@ -11,6 +11,13 @@
#include "pybind11_tests.h"
#include <string>
#include <thread>
#define CROSS_MODULE(Function) \
auto cm = py::module_::import("cross_module_gil_utils"); \
auto target = reinterpret_cast<void (*)()>(PyLong_AsVoidPtr(cm.attr(Function).ptr()));
class VirtClass {
public:
virtual ~VirtClass() = default;
......@@ -28,6 +35,16 @@ class PyVirtClass : public VirtClass {
};
TEST_SUBMODULE(gil_scoped, m) {
m.attr("defined_THREAD_SANITIZER") =
#if defined(THREAD_SANITIZER)
true;
#else
false;
#endif
m.def("intentional_deadlock",
[]() { std::thread([]() { py::gil_scoped_acquire gil_acquired; }).join(); });
py::class_<VirtClass, PyVirtClass>(m, "VirtClass")
.def(py::init<>())
.def("virtual_func", &VirtClass::virtual_func)
......@@ -37,11 +54,91 @@ TEST_SUBMODULE(gil_scoped, m) {
m.def("test_callback_std_func", [](const std::function<void()> &func) { func(); });
m.def("test_callback_virtual_func", [](VirtClass &virt) { virt.virtual_func(); });
m.def("test_callback_pure_virtual_func", [](VirtClass &virt) { virt.pure_virtual_func(); });
m.def("test_cross_module_gil", []() {
auto cm = py::module_::import("cross_module_gil_utils");
auto gil_acquire = reinterpret_cast<void (*)()>(
PyLong_AsVoidPtr(cm.attr("gil_acquire_funcaddr").ptr()));
m.def("test_cross_module_gil_released", []() {
CROSS_MODULE("gil_acquire_funcaddr")
py::gil_scoped_release gil_release;
target();
});
m.def("test_cross_module_gil_acquired", []() {
CROSS_MODULE("gil_acquire_funcaddr")
py::gil_scoped_acquire gil_acquire;
target();
});
m.def("test_cross_module_gil_inner_custom_released", []() {
CROSS_MODULE("gil_acquire_inner_custom_funcaddr")
py::gil_scoped_release gil_release;
target();
});
m.def("test_cross_module_gil_inner_custom_acquired", []() {
CROSS_MODULE("gil_acquire_inner_custom_funcaddr")
py::gil_scoped_acquire gil_acquire;
target();
});
m.def("test_cross_module_gil_inner_pybind11_released", []() {
CROSS_MODULE("gil_acquire_inner_pybind11_funcaddr")
py::gil_scoped_release gil_release;
gil_acquire();
target();
});
m.def("test_cross_module_gil_inner_pybind11_acquired", []() {
CROSS_MODULE("gil_acquire_inner_pybind11_funcaddr")
py::gil_scoped_acquire gil_acquire;
target();
});
m.def("test_cross_module_gil_nested_custom_released", []() {
CROSS_MODULE("gil_acquire_nested_custom_funcaddr")
py::gil_scoped_release gil_release;
target();
});
m.def("test_cross_module_gil_nested_custom_acquired", []() {
CROSS_MODULE("gil_acquire_nested_custom_funcaddr")
py::gil_scoped_acquire gil_acquire;
target();
});
m.def("test_cross_module_gil_nested_pybind11_released", []() {
CROSS_MODULE("gil_acquire_nested_pybind11_funcaddr")
py::gil_scoped_release gil_release;
target();
});
m.def("test_cross_module_gil_nested_pybind11_acquired", []() {
CROSS_MODULE("gil_acquire_nested_pybind11_funcaddr")
py::gil_scoped_acquire gil_acquire;
target();
});
m.def("test_release_acquire", [](const py::object &obj) {
py::gil_scoped_release gil_released;
py::gil_scoped_acquire gil_acquired;
return py::str(obj);
});
m.def("test_nested_acquire", [](const py::object &obj) {
py::gil_scoped_release gil_released;
py::gil_scoped_acquire gil_acquired_outer;
py::gil_scoped_acquire gil_acquired_inner;
return py::str(obj);
});
m.def("test_multi_acquire_release_cross_module", [](unsigned bits) {
py::set internals_ids;
internals_ids.add(PYBIND11_INTERNALS_ID);
{
py::gil_scoped_release gil_released;
auto thread_f = [bits, &internals_ids]() {
py::gil_scoped_acquire gil_acquired;
auto cm = py::module_::import("cross_module_gil_utils");
auto target = reinterpret_cast<std::string (*)(unsigned)>(
PyLong_AsVoidPtr(cm.attr("gil_multi_acquire_release_funcaddr").ptr()));
std::string cm_internals_id = target(bits >> 3);
internals_ids.add(cm_internals_id);
};
if ((bits & 0x1u) != 0u) {
thread_f();
}
if ((bits & 0x2u) != 0u) {
std::thread non_python_thread(thread_f);
non_python_thread.join();
}
if ((bits & 0x4u) != 0u) {
thread_f();
}
}
return internals_ids;
});
}
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment