My setup:
- Given some Legacy™ C++ code
- Using wrappers that use the Python ↔ C/C++ API, this is compiled into a .pyd file
- There is a bunch of Legacy™ Python code that uses this .pyd file
- This Python code is then called from MATLAB
- MATLAB R2018a / Python 3.6 / C++11 / Windows 10 / MSVC 2017 Community
- This setup is rigid, i.e., all this code is used by 100s of people in all sorts of different contexts; this non-ideal setup is already the best possible trade-off
The problem:
- MATLAB crashes due to an Access Violation® somewhere in the C++ code.
Obviously, MATLAB can't "look through" the .pyd binary to determine the root cause, so this is all I have to go on.
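One thing that might give more to go on, even without a debugger attached, is Python's built-in `faulthandler` module. Since Python 3.6 it also installs a handler for Windows exceptions, so an access violation inside the .pyd should produce a Python-level traceback showing which Python call entered the C++ code. The following is a minimal sketch, assuming it can be added near the top of the Legacy™ Python code; the log file name and the `enable_crash_log` helper are made up for illustration, and whether the handler gets to run before MATLAB's own crash handling is an untested assumption.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
CRASH_LOG_PATH = os.path.join(tempfile.gettempdir(), "my_pylib_crash.log")

# Keep a module-level reference: faulthandler writes to the underlying file
# descriptor, so the file must stay open for the lifetime of the process.
_crash_log = open(CRASH_LOG_PATH, "a")


def enable_crash_log():
    """Dump the Python traceback of all threads to the log on a fatal error."""
    faulthandler.enable(file=_crash_log, all_threads=True)
    return CRASH_LOG_PATH
```

Calling `enable_crash_log()` once before the first call into the .pyd is enough. The traceback only covers Python frames, so it narrows down which wrapper call crashes rather than the exact C++ line.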
What I've tried:
1. Using MSVC2017, build the .pyd in Debug mode (setup.py build --debug).
   - In MATLAB:
     pyversion 'c:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\python.exe'
   - Be really annoyed by having to restart MATLAB before you can actually do that.
   - After restarting MATLAB and running said command, in MSVC: Debug → Attach to Process → select MATLAB.exe
   - Run the MATLAB code that causes the crash.
   - MATLAB/Python complains:
     Python Error: ImportError: DLL load failed: The specified module could not be found.
2. Try the same as in 1., this time renaming the my_pylib.pyd file to my_pylib_d.pyd (as found here of all places...)
   - MATLAB/Python complains:
     Python Error: ImportError: cannot import name 'my_pylib'
3. Try the same as 2., this time stating
     pyversion 'c:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\python_d.exe'
   (first making sure that a Python3.6 debugging environment has been installed in the MSVC context).
   - Be REALLY annoyed by having to restart MATLAB Every. Fucken. Time. you run that command!
   - Go to The MathWorks Support site and put in a Feature Request to try and get that fixed in R2019a
   - Restart MATLAB, re-attach MSVC to MATLAB.exe and repeat from the top
   - MATLAB/Python still complains:
     Python Error: ImportError: cannot import name 'my_pylib'
4. Repeat 3., this time forcing MATLAB to also use python36_d.dll because somehow that mechanism seems broken in MATLAB.
   - SCREAM IN RAGE because you have to restart MATLAB AGAIN and not forget to re-attach to the MATLAB.exe process in MSVC AGAIN.
   - This time it at least continues with Python code execution, but it trips on every imported module (numpy, scipy, etc.) with the same ImportError as before ...
5. Give up on Python3.6.
6. Try again, from the top, using a fresh, non-MSVC installation of Python 3.7, including debugging symbols etc.
   - In MATLAB:
     pyversion 'c:\wherever\Python37_64\python.exe'
   - <speak_angrily_through_teeth>Oh. Yeah. I forgot. Restart MATLAB AGAIN, run pyversion command above AGAIN, run offending MATLAB code, SCREAM WHILE PULLING OUT HAIR because you forgot to re-attach MSVC, restart MATLAB AGAIN because it obviously crashed, re-attach MSVC, run offending MATLAB code</speak_angrily_through_teeth>
   - Success! MSVC goes to my API code after MATLAB triggers a breakpoint.

The breakpoint is triggered here:

    PyObject *module = PyModule_Create(&moduledef);

where

    static struct PyModuleDef moduledef = {
        PyModuleDef_HEAD_INIT,
        "my_pylib",
        NULL,
        sizeof(struct module_state),
        my_pylib_methods,
        NULL,
        my_pylib_traverse,
        my_pylib_clear,
        NULL
    };

After downloading the Python 3.7 source code, I can dig a bit deeper. The PyModule_Create call is a wrapper that calls the following function in Objects/moduleobject.c:

    PyObject *
    PyModule_Create2(struct PyModuleDef* module, int module_api_version)
    {
        if (!_PyImport_IsInitialized(PyThreadState_GET()->interp))
            Py_FatalError("Python import machinery not initialized");
        return _PyModule_CreateInitialized(module, module_api_version);
    }

where the breakpoint is inside the if() clause. This means that the moduledef doesn't even matter. _PyImport_IsInitialized() is a function in Python/import.c:

    int
    _PyImport_IsInitialized(PyInterpreterState *interp)
    {
        if (interp->modules == NULL)
            return 0;
        return 1;
    }

which doesn't seem like a very likely candidate to be causing my Access Violation®. Going into the PyThreadState_GET() finally made me realize: I'm actually debugging the Python/C++ API instead of my code...
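The ImportErrors in the earlier attempts look consistent with how CPython on Windows picks extension-module file names: a release interpreter accepts names ending in `.pyd`, while a debug interpreter (`python_d.exe`) only accepts names ending in `_d.pyd`. That would explain why a renamed `my_pylib_d.pyd` is invisible to `python.exe`, and why release builds of numpy and scipy fail to import under the debug DLL. A minimal sketch for checking what the active interpreter expects follows; the `describe_interpreter` helper is hypothetical, and running it through the same interpreter that `pyversion` points at is assumed.

```python
import importlib.machinery
import sys


def describe_interpreter():
    """Collect the facts that decide which extension-module files can be imported."""
    return {
        "executable": sys.executable,
        "version": "%d.%d" % sys.version_info[:2],
        # sys.gettotalrefcount only exists in debug builds of CPython.
        "debug_build": hasattr(sys, "gettotalrefcount"),
        # File-name suffixes the import system accepts for extension modules.
        "extension_suffixes": list(importlib.machinery.EXTENSION_SUFFIXES),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(key, "=", value)
```

If the suffix list and the name of the built .pyd do not match, the import fails before any C++ code runs, so the debugger never gets near the Access Violation®.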
Questions:
- What the heck!! Why is debugging in MATLAB using Python3.6 so damn difficult?! What is the "proper" way to do it? I can't find much in the (online) documentation for it...
- Is there a known issue with the Python 3.7 C/C++ API that could be causing this?
- Am I doing something wrong / stupidly? Any tips/pointers on how I can find the problem in my C++ code more effectively?
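Regarding the last question, one option that might save a lot of MATLAB restarts is to take MATLAB out of the loop and run the offending Python call in a fresh interpreter. The following is only a sketch under assumptions: `run_isolated` is a made-up helper, the snippet passed to it stands in for whatever Legacy™ Python call MATLAB makes, and it presumes the crash also reproduces outside MATLAB, which is not guaranteed.

```python
import subprocess
import sys
import textwrap


def run_isolated(snippet, python_exe=sys.executable, timeout=60):
    """Run `snippet` in a fresh interpreter with faulthandler enabled.

    Returns (returncode, stdout, stderr). A crash in native code shows up as a
    non-zero return code, with faulthandler's traceback in stderr.
    """
    proc = subprocess.run(
        [python_exe, "-X", "faulthandler", "-c", textwrap.dedent(snippet)],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # text mode; also works on Python 3.6
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Placeholder snippet; replace it with the import and call that crash.
    code, out, err = run_isolated("print('interpreter starts fine')")
    print("exit code:", code)
    print(out, err, sep="")
```

If the crash does reproduce this way, the same command line can be started directly under the MSVC debugger with the Debug build of the .pyd, which avoids attaching to MATLAB.exe altogether.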