Cython internals

Reverse-engineering to understand performance costs

Published on the: 21.01.2023
Last modified on the: 06.04.2023
Estimated reading time: ~ 10 min.

tl;dr: in scikit-learn, we hit the limits of Cython for the lowest-level interfaces. In some scenarios, abstractions of the language become costly, and we are trying to resolve this. This post aims at unrolling the implementations of Cython’s main language constructs to understand why such costs exist and how we could eventually work around them.

Cython implementation of extension types

This analysis was performed by reading the source of the code generated by cython==0.29.32 (conda-forge build number: py39h5a03fae_1).

Empty Cython module code

An empty Cython source, e.g. a source containing:

NULL

gets compiled to a _cython_magic_9f4ad119bbe416df9bdf1771145affcf.c source.

This name is built from information about the machine, the version of Cython, the executable and the source (see the code). It also appears in several places in the sources, e.g.:

#define __Pyx_MODULE_NAME "_cython_magic_9f4ad119bbe416df9bdf1771145affcf"

:::info To ease readability in what follows, part of the mangling of symbols based on __Pyx_MODULE_NAME is removed.

This can be done with:

# __Pyx_MODULE_NAME changes at each compilation.
__Pyx_MODULE_NAME=_cython_magic_9f4ad119bbe416df9bdf1771145affcf
# Double quotes are needed so that the shell expands the variable.
sed -i -e "s/${__Pyx_MODULE_NAME}//g" "${__Pyx_MODULE_NAME}.c"

:::

A preamble is injected. This preamble contains:

  • C macros (to deal with compiler-specific behavior for inlining, etc.)
  • Cython macros (to abstract C macros)
  • Declarations of external C functions
  • Declarations and definitions of custom C functions for coercing data from the C world to the CPython world

This preamble is generated from templates spread over several C files in Cython/Utility in the Cython codebase.
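
To illustrate the first two kinds of macros listed above, here is a minimal sketch of how compiler-specific keywords can be hidden behind a portable layer. The names SKETCH_INLINE, SKETCH_LIKELY, SKETCH_UNLIKELY and sketch_get are hypothetical: they are not the identifiers that Cython emits, and the actual definitions should be read in the generated preamble.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro names, for illustration only. */
#if defined(__GNUC__) || defined(__clang__)
  #define SKETCH_INLINE __inline__
  /* Branch hints: tell the compiler which outcome is expected. */
  #define SKETCH_LIKELY(x)   __builtin_expect(!!(x), 1)
  #define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#elif defined(_MSC_VER)
  #define SKETCH_INLINE __inline
  #define SKETCH_LIKELY(x)   (x)
  #define SKETCH_UNLIKELY(x) (x)
#else
  /* Unknown compiler: fall back to no hint at all. */
  #define SKETCH_INLINE
  #define SKETCH_LIKELY(x)   (x)
  #define SKETCH_UNLIKELY(x) (x)
#endif

/* Bounds-checked read: returns 0 and stores the value in *out on success,
   returns -1 and leaves *out untouched when the index is out of range. */
static SKETCH_INLINE int sketch_get(const int *values, size_t n, size_t i, int *out) {
    assert(values != NULL && out != NULL);
    if (SKETCH_UNLIKELY(i >= n)) {
        return -1;  /* error path, expected to be rare */
    }
    *out = values[i];
    return 0;
}
```

The rest of the code only uses the macro names, so the same source compiles unchanged with different compilers.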

In some contexts (e.g. for cpdef functions), a C function has a Python wrapper so that it can be called from the CPython world (and sometimes even from the Python world). The declaration of a Python wrapper is directly preceded by /* Python wrapper */, which makes wrappers easily searchable.

A Python module is defined as follows (note that the name is "" since we removed the occurrences of __Pyx_MODULE_NAME):

static PyMethodDef __pyx_methods[] = {
  {0, 0, 0, 0}
};

#if PY_MAJOR_VERSION >= 3
#if CYTHON_PEP489_MULTI_PHASE_INIT
static PyObject* __pyx_pymod_create(PyObject *spec, PyModuleDef *def); /*proto*/
static int __pyx_pymod_exec_(PyObject* module); /*proto*/
static PyModuleDef_Slot __pyx_moduledef_slots[] = {
  {Py_mod_create, (void*)__pyx_pymod_create},
  {Py_mod_exec, (void*)__pyx_pymod_exec_},
  {0, NULL}
};
#endif

static struct PyModuleDef __pyx_moduledef = {
    PyModuleDef_HEAD_INIT,
    "",
    0, /* m_doc */
  #if CYTHON_PEP489_MULTI_PHASE_INIT
    0, /* m_size */
  #else
    -1, /* m_size */
  #endif
    __pyx_methods /* m_methods */,
  #if CYTHON_PEP489_MULTI_PHASE_INIT
    __pyx_moduledef_slots, /* m_slots */
  #else
    NULL, /* m_reload */
  #endif
    NULL, /* m_traverse */
    NULL, /* m_clear */
    NULL /* m_free */
};
#endif

Empty Extension type

A Cython source containing a simple, empty extension type, e.g.:

cdef class SimpleExtensionType:
    pass

gets compiled and includes a similar preamble.

The extension type gets compiled as:

/*--- Type declarations ---*/
struct __pyx_obj_46_SimpleExtensionType;

/* ".pyx":2
 *
 * cdef class SimpleExtensionType:             # <<<<<<<<<<<<<<
 *     pass
 */
struct __pyx_obj_46_SimpleExtensionType {
  PyObject_HEAD
};

One can inspect all the functions that are declared by filtering the lines containing the /* proto */ comment:

grep proto _cython_magic_5e6b2fb93df2d874fc9643f210523cbd.c
/* Refnanny.proto */
/* PyErrExceptionMatches.proto */
/* PyThreadStateGet.proto */
/* PyErrFetchRestore.proto */
/* PyObjectGetAttrStr.proto */
/* GetAttr.proto */
/* GetAttr3.proto */
/* GetBuiltinName.proto */
/* PyDictVersioning.proto */
/* GetModuleGlobalName.proto */
/* RaiseArgTupleInvalid.proto */
/* RaiseDoubleKeywords.proto */
/* ParseKeywords.proto */
/* PySequenceContains.proto */
/* Import.proto */
/* ImportFrom.proto */
/* PyCFunctionFastCall.proto */
/* PyFunctionFastCall.proto */
/* PyObjectCall.proto */
/* PyObjectCall2Args.proto */
/* PyObjectCallMethO.proto */
/* PyObjectCallOneArg.proto */
/* RaiseException.proto */
/* HasAttr.proto */
/* GetItemInt.proto */
/* PyObject_GenericGetAttrNoDict.proto */
/* PyObject_GenericGetAttr.proto */
/* PyObjectGetAttrStrNoError.proto */
/* SetupReduce.proto */
/* CLineInTraceback.proto */
/* CodeObjectCache.proto */
/* AddTraceback.proto */
/* GCCDiagnostics.proto */
/* CIntFromPy.proto */
/* CIntToPy.proto */
/* CIntFromPy.proto */
/* FastTypeChecks.proto */
/* CheckBinaryVersion.proto */
/* InitStrings.proto */
static PyObject *__pyx_f_46___pyx_unpickle_SimpleExtensionType__set_state(struct __pyx_obj_46_SimpleExtensionType *, PyObject *); /*proto*/
static PyObject *__pyx_pf_46_19SimpleExtensionType___reduce_cython__(struct __pyx_obj_46_SimpleExtensionType *__pyx_v_self); /* proto */
static PyObject *__pyx_pf_46_19SimpleExtensionType_2__setstate_cython__(struct __pyx_obj_46_SimpleExtensionType *__pyx_v_self, PyObject *__pyx_v___pyx_state); /* proto */
static PyObject *__pyx_pf_46___pyx_unpickle_SimpleExtensionType(CYTHON_UNUSED PyObject *__pyx_self, PyObject *__pyx_v___pyx_type, long __pyx_v___pyx_checksum, PyObject *__pyx_v___pyx_state); /* proto */
static PyObject *__pyx_tp_new_46_SimpleExtensionType(PyTypeObject *t, PyObject *a, PyObject *k); /*proto*/
static PyObject *__pyx_pw_46_19SimpleExtensionType_1__reduce_cython__(PyObject *__pyx_v_self, CYTHON_UNUSED PyObject *unused); /*proto*/
static PyObject *__pyx_pw_46_19SimpleExtensionType_3__setstate_cython__(PyObject *__pyx_v_self, PyObject *__pyx_v___pyx_state); /*proto*/
static PyObject *__pyx_pw_46_1__pyx_unpickle_SimpleExtensionType(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/
static PyObject* __pyx_pymod_create(PyObject *spec, PyModuleDef *def); /*proto*/
static int __pyx_pymod_exec_(PyObject* module); /*proto*/
static CYTHON_SMALL_CODE int __Pyx_modinit_global_init_code(void); /*proto*/
static CYTHON_SMALL_CODE int __Pyx_modinit_variable_export_code(void); /*proto*/
static CYTHON_SMALL_CODE int __Pyx_modinit_function_export_code(void); /*proto*/
static CYTHON_SMALL_CODE int __Pyx_modinit_type_init_code(void); /*proto*/
static CYTHON_SMALL_CODE int __Pyx_modinit_type_import_code(void); /*proto*/
static CYTHON_SMALL_CODE int __Pyx_modinit_variable_import_code(void); /*proto*/
static CYTHON_SMALL_CODE int __Pyx_modinit_function_import_code(void); /*proto*/
__Pyx_PyMODINIT_FUNC init(void) CYTHON_SMALL_CODE; /*proto*/
__Pyx_PyMODINIT_FUNC PyInit_(void) CYTHON_SMALL_CODE; /*proto*/

The first /* *.proto */ sections contain various declarations and definitions for the proper handling of Python objects in Cython. They are generated in every case.

The type object structure of SimpleExtensionType is also defined. It is a PyVarObject which registers data relevant to SimpleExtensionType in fields called slots. Slots can be pointers to PyObject or function pointers. For instance, tp_dealloc is a slot whose value is __pyx_tp_dealloc_46_SimpleExtensionType, a destructor of SimpleExtensionType.

static PyTypeObject __pyx_type_46_SimpleExtensionType = {
  PyVarObject_HEAD_INIT(0, 0)
  ".SimpleExtensionType", /*tp_name*/
  sizeof(struct __pyx_obj_46_SimpleExtensionType), /*tp_basicsize*/
  0, /*tp_itemsize*/
  __pyx_tp_dealloc_46_SimpleExtensionType, /*tp_dealloc*/
  #if PY_VERSION_HEX < 0x030800b4
  0, /*tp_print*/
  #endif
  #if PY_VERSION_HEX >= 0x030800b4
  0, /*tp_vectorcall_offset*/
  #endif
  0, /*tp_getattr*/
  0, /*tp_setattr*/
  #if PY_MAJOR_VERSION < 3
  0, /*tp_compare*/
  #endif
  #if PY_MAJOR_VERSION >= 3
  0, /*tp_as_async*/
  #endif
  0, /*tp_repr*/
  0, /*tp_as_number*/
  0, /*tp_as_sequence*/
  0, /*tp_as_mapping*/
  0, /*tp_hash*/
  0, /*tp_call*/
  0, /*tp_str*/
  0, /*tp_getattro*/
  0, /*tp_setattro*/
  0, /*tp_as_buffer*/
  Py_TPFLAGS_DEFAULT|Py_TPFLAGS_HAVE_VERSION_TAG|Py_TPFLAGS_CHECKTYPES|Py_TPFLAGS_HAVE_NEWBUFFER|Py_TPFLAGS_BASETYPE, /*tp_flags*/
  0, /*tp_doc*/
  0, /*tp_traverse*/
  0, /*tp_clear*/
  0, /*tp_richcompare*/
  0, /*tp_weaklistoffset*/
  0, /*tp_iter*/
  0, /*tp_iternext*/
  __pyx_methods_46_SimpleExtensionType, /*tp_methods*/
  0, /*tp_members*/
  0, /*tp_getset*/
  0, /*tp_base*/
  0, /*tp_dict*/
  0, /*tp_descr_get*/
  0, /*tp_descr_set*/
  0, /*tp_dictoffset*/
  0, /*tp_init*/
  0, /*tp_alloc*/
  __pyx_tp_new_46_SimpleExtensionType, /*tp_new*/
  0, /*tp_free*/
  0, /*tp_is_gc*/
  0, /*tp_bases*/
  0, /*tp_mro*/
  0, /*tp_cache*/
  0, /*tp_subclasses*/
  0, /*tp_weaklist*/
  0, /*tp_del*/
  0, /*tp_version_tag*/
  #if PY_VERSION_HEX >= 0x030400a1
  0, /*tp_finalize*/
  #endif
  #if PY_VERSION_HEX >= 0x030800b1 && (!CYTHON_COMPILING_IN_PYPY || PYPY_VERSION_NUM >= 0x07030800)
  0, /*tp_vectorcall*/
  #endif
  #if PY_VERSION_HEX >= 0x030800b4 && PY_VERSION_HEX < 0x03090000
  0, /*tp_print*/
  #endif
  #if CYTHON_COMPILING_IN_PYPY && PY_VERSION_HEX >= 0x03090000
  0, /*tp_pypy_flags*/
  #endif
};
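
To make the notion of slots more concrete, here is a simplified sketch in plain C that does not use the CPython headers. The types sketch_object and sketch_type are hypothetical stand-ins for the object header and for PyTypeObject, reduced to a handful of fields. The point is only that operations such as destruction are looked up through function pointers stored on the type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_type;

/* Simplified stand-in for the object header (what PyObject_HEAD provides). */
struct sketch_object {
    long refcount;
    const struct sketch_type *type;
};

/* Simplified stand-in for PyTypeObject, with a few slots only. */
struct sketch_type {
    const char *name;                                           /* like tp_name */
    size_t basicsize;                                           /* like tp_basicsize */
    void (*dealloc)(struct sketch_object *);                    /* like tp_dealloc */
    struct sketch_object *(*new_)(const struct sketch_type *);  /* like tp_new */
};

static int sketch_live_objects = 0;  /* number of live instances, for the demo */

static struct sketch_object *simple_new(const struct sketch_type *type) {
    struct sketch_object *obj = calloc(1, type->basicsize);
    if (obj == NULL) return NULL;
    obj->refcount = 1;
    obj->type = type;
    sketch_live_objects++;
    return obj;
}

static void simple_dealloc(struct sketch_object *obj) {
    sketch_live_objects--;
    free(obj);
}

/* The type object: a static table filled with sizes and function pointers. */
static const struct sketch_type simple_type = {
    "SimpleExtensionType",
    sizeof(struct sketch_object),
    simple_dealloc,
    simple_new,
};

/* Dropping the last reference calls the destructor found in the type's slot. */
static void sketch_decref(struct sketch_object *obj) {
    assert(obj != NULL && obj->refcount > 0);
    if (--obj->refcount == 0) {
        obj->type->dealloc(obj);
    }
}
```

A caller would create an instance with simple_type.new_(&simple_type) and release it with sketch_decref: neither call site needs to know the concrete functions, only the slots.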

Moreover, a few functions are created for pickling and unpickling instances of SimpleExtensionType (their current state and their extension in the following sections will not be covered).

Python wrappers (the __pyx_pw_* functions in the listing above) are also generated for them.

Adding an attribute

cdef class SimpleExtensionType:
    cdef int an_attribute

gets compiled to:

/*--- Type declarations ---*/
struct __pyx_obj_46_SimpleExtensionType;

/* ".pyx":2
 *
 * cdef class SimpleExtensionType:             # <<<<<<<<<<<<<<
 *     pass
 */
struct __pyx_obj_46_SimpleExtensionType {
  PyObject_HEAD
  int an_attribute;
};

Some helpers for converting Python (long) integers to C (long) integers are also injected.
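
Since the attribute is a plain field placed after the object header, reading it from C amounts to a load at a fixed offset. The following minimal sketch shows this in plain C. It assumes a hypothetical sketch_head struct as a simplified stand-in for what PyObject_HEAD provides; the real header is defined by CPython.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the fields behind PyObject_HEAD. */
struct sketch_head {
    long refcount;
    void *type;
};

struct sketch_simple {
    struct sketch_head head;  /* plays the role of PyObject_HEAD */
    int an_attribute;         /* the cdef attribute, stored inline */
};

/* Given a generic pointer to the header, recover the full object and read
   the attribute. No lookup by name is involved. */
static int sketch_get_attribute(struct sketch_head *generic) {
    assert(generic != NULL);
    /* Valid because `head` is the first member of struct sketch_simple. */
    struct sketch_simple *self = (struct sketch_simple *)generic;
    return self->an_attribute;
}
```

This is the same kind of cast as the one from PyObject * to the extension type's struct that appears in the generated wrappers.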

Adding a Python constructor

When an __init__ method taking an int a_parameter and assigning it to self.an_attribute is added, the following two prototypes are generated:

static int __pyx_pf_46_19SimpleExtensionType___init__(struct __pyx_obj_46_SimpleExtensionType *__pyx_v_self, int __pyx_v_a_parameter); /* proto */
static int __pyx_pw_46_19SimpleExtensionType_1__init__(PyObject *__pyx_v_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/

The first is a C constructor for SimpleExtensionType:

static int __pyx_pf_46_19SimpleExtensionType___init__(struct __pyx_obj_46_SimpleExtensionType *__pyx_v_self, int __pyx_v_a_parameter) {
  int __pyx_r;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("__init__", 0);

  /* ".pyx":6
 *
 *     def __init__(self, int a_parameter):
 *         self.an_attribute = a_parameter             # <<<<<<<<<<<<<<
 */
  __pyx_v_self->an_attribute = __pyx_v_a_parameter;

  /* ".pyx":5
 *     cdef int an_attribute
 *
 *     def __init__(self, int a_parameter):             # <<<<<<<<<<<<<<
 *         self.an_attribute = a_parameter
 */

  /* function exit code */
  __pyx_r = 0;
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}

The second is a Python wrapper around the first constructor for SimpleExtensionType (it is referenced via the tp_init slot of the type object). This is the constructor called by callers in the Python world.

/* ".pyx":5
 *     cdef int an_attribute
 *
 *     def __init__(self, int a_parameter):             # <<<<<<<<<<<<<<
 *         self.an_attribute = a_parameter
 */

/* Python wrapper */
static int __pyx_pw_46_19SimpleExtensionType_1__init__(PyObject *__pyx_v_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/
static int __pyx_pw_46_19SimpleExtensionType_1__init__(PyObject *__pyx_v_self, PyObject *__pyx_args, PyObject *__pyx_kwds) {
  int __pyx_v_a_parameter;
  int __pyx_lineno = 0;
  const char *__pyx_filename = NULL;
  int __pyx_clineno = 0;
  int __pyx_r;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("__init__ (wrapper)", 0);
  {
    static PyObject **__pyx_pyargnames[] = {&__pyx_n_s_a_parameter,0};
    PyObject* values[1] = {0};
    if (unlikely(__pyx_kwds)) {
      Py_ssize_t kw_args;
      const Py_ssize_t pos_args = PyTuple_GET_SIZE(__pyx_args);
      switch (pos_args) {
        case  1: values[0] = PyTuple_GET_ITEM(__pyx_args, 0);
        CYTHON_FALLTHROUGH;
        case  0: break;
        default: goto __pyx_L5_argtuple_error;
      }
      kw_args = PyDict_Size(__pyx_kwds);
      switch (pos_args) {
        case  0:
        if (likely((values[0] = __Pyx_PyDict_GetItemStr(__pyx_kwds, __pyx_n_s_a_parameter)) != 0)) kw_args--;
        else goto __pyx_L5_argtuple_error;
      }
      if (unlikely(kw_args > 0)) {
        if (unlikely(__Pyx_ParseOptionalKeywords(__pyx_kwds, __pyx_pyargnames, 0, values, pos_args, "__init__") < 0)) __PYX_ERR(0, 5, __pyx_L3_error)
      }
    } else if (PyTuple_GET_SIZE(__pyx_args) != 1) {
      goto __pyx_L5_argtuple_error;
    } else {
      values[0] = PyTuple_GET_ITEM(__pyx_args, 0);
    }
    __pyx_v_a_parameter = __Pyx_PyInt_As_int(values[0]); if (unlikely((__pyx_v_a_parameter == (int)-1) && PyErr_Occurred())) __PYX_ERR(0, 5, __pyx_L3_error)
  }
  goto __pyx_L4_argument_unpacking_done;
  __pyx_L5_argtuple_error:;
  __Pyx_RaiseArgtupleInvalid("__init__", 1, 1, 1, PyTuple_GET_SIZE(__pyx_args)); __PYX_ERR(0, 5, __pyx_L3_error)
  __pyx_L3_error:;
  __Pyx_AddTraceback(".SimpleExtensionType.__init__", __pyx_clineno, __pyx_lineno, __pyx_filename);
  __Pyx_RefNannyFinishContext();
  return -1;
  __pyx_L4_argument_unpacking_done:;
  __pyx_r = __pyx_pf_46_19SimpleExtensionType___init__(((struct __pyx_obj_46_SimpleExtensionType *)__pyx_v_self), __pyx_v_a_parameter);

  /* function exit code */
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}
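
The split between the wrapper and the implementation can be summarised with a much smaller sketch in plain C. Everything here is hypothetical: sketch_value is a toy tagged value standing in for a PyObject argument, and the function names only mirror the roles of the __pyx_pw_ and __pyx_pf_ functions. The sketch ignores keyword arguments, which account for a large part of the generated wrapper.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical tagged value standing in for a dynamically typed argument. */
enum sketch_kind { SKETCH_INT, SKETCH_TEXT };

struct sketch_value {
    enum sketch_kind kind;
    long as_long;  /* meaningful only when kind == SKETCH_INT */
};

struct sketch_simple {
    int an_attribute;
};

/* Implementation: typed C arguments only (role of the __pyx_pf_ function). */
static int sketch_init_impl(struct sketch_simple *self, int a_parameter) {
    self->an_attribute = a_parameter;
    return 0;
}

/* Wrapper: checks the argument count, converts the dynamic argument to a C
   int, then delegates (role of the __pyx_pw_ function). Returns -1 on error. */
static int sketch_init_wrapper(struct sketch_simple *self,
                               const struct sketch_value *args, size_t nargs) {
    assert(self != NULL);
    if (nargs != 1) return -1;                  /* wrong number of arguments */
    if (args[0].kind != SKETCH_INT) return -1;  /* not an integer */
    if (args[0].as_long < INT_MIN || args[0].as_long > INT_MAX) {
        return -1;                              /* does not fit in a C int */
    }
    return sketch_init_impl(self, (int)args[0].as_long);
}
```

All the checks and conversions sit in the wrapper, so code that already holds typed C values can call the implementation directly and skip them.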

Adding a cdef method

cdef class SimpleExtensionType:
    cdef int an_attribute

    def __init__(self, int a_parameter):
        self.an_attribute = a_parameter

    cdef int a_method(self, int another_parameter):
        cdef int a_sum = self.an_attribute + another_parameter
        return a_sum

The extension type gets compiled as (with some handwritten comments):

/*--- Type declarations ---*/
struct __pyx_obj_46_SimpleExtensionType;

/* ".pyx":2
 *
 * cdef class SimpleExtensionType:             # <<<<<<<<<<<<<<
 *     cdef int an_attribute
 *
 */
struct __pyx_obj_46_SimpleExtensionType {
  PyObject_HEAD
  struct __pyx_vtabstruct_46_SimpleExtensionType *__pyx_vtab;
  int an_attribute;
};


static int __pyx_f_46_19SimpleExtensionType_a_method(struct __pyx_obj_46_SimpleExtensionType *__pyx_v_self, int __pyx_v_another_parameter); /* proto*/

struct __pyx_vtabstruct_46_SimpleExtensionType {
  int (*a_method)(struct __pyx_obj_46_SimpleExtensionType *, int);
};
static struct __pyx_vtabstruct_46_SimpleExtensionType *__pyx_vtabptr_46_SimpleExtensionType;

// ... just before the `tp_new` implementation
static struct __pyx_vtabstruct_46_SimpleExtensionType __pyx_vtable_46_SimpleExtensionType;

The implementation of a_method naturally maps to C without CPython overhead (there are Cython macros for RefNanny, but those inject nothing once expanded).

__pyx_vtabptr_46_SimpleExtensionType is the unique global (static) pointer referencing __pyx_vtable_46_SimpleExtensionType, the unique global (static) virtual table for __pyx_obj_46_SimpleExtensionType.

When the extension types of the module are initialised in __Pyx_modinit_type_init_code, the following happens (non-exhaustively):

  • the pointer is initialised to point to the SimpleExtensionType vtable
  • the implementation of SimpleExtensionType.a_method is registered in the SimpleExtensionType vtable
  • the SimpleExtensionType vtable is registered in the tp_dict slot of SimpleExtensionType

/// in __Pyx_modinit_type_init_code
// ...

/*--- Type init code ---*/
  __pyx_vtabptr_46_SimpleExtensionType = &__pyx_vtable_46_SimpleExtensionType;
  __pyx_vtable_46_SimpleExtensionType.a_method = (int (*)(struct __pyx_obj_46_SimpleExtensionType *, int))__pyx_f_46_19SimpleExtensionType_a_method;
// ...
  if (__Pyx_SetVtable(__pyx_type_46_SimpleExtensionType.tp_dict, __pyx_vtabptr_46_SimpleExtensionType) < 0) __PYX_ERR(0, 2, __pyx_L1_error)
// ...

Moreover, when instances of SimpleExtensionType are dynamically created via tp_new, the SimpleExtensionType vtable is registered on each new instance through its __pyx_vtab field. My guess is that this is done for safety, in case instances are created via this interface.

static PyObject *__pyx_tp_new_46_SimpleExtensionType(PyTypeObject *t, CYTHON_UNUSED PyObject *a, CYTHON_UNUSED PyObject *k) {
  struct __pyx_obj_46_SimpleExtensionType *p;
  PyObject *o;
  if (likely((t->tp_flags & Py_TPFLAGS_IS_ABSTRACT) == 0)) {
    o = (*t->tp_alloc)(t, 0);
  } else {
    o = (PyObject *) PyBaseObject_Type.tp_new(t, __pyx_empty_tuple, 0);
  }
  if (unlikely(!o)) return 0;
  p = ((struct __pyx_obj_46_SimpleExtensionType *)o);
  p->__pyx_vtab = __pyx_vtabptr_46_SimpleExtensionType;
  return o;
}
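
The whole vtable mechanism can be reproduced in a few lines of plain C, which helps to see where the cost of calling a cdef method comes from. This is a sketch under simplifying assumptions: the sketch_ names are hypothetical, the object header is left out, and the call site in sketch_call is my reading of how such a dispatch is expressed in C, not a copy of generated code.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_simple;

/* One function pointer per cdef method. */
struct sketch_vtab {
    int (*a_method)(struct sketch_simple *, int);
};

struct sketch_simple {
    struct sketch_vtab *vtab;  /* per-instance pointer to the shared table */
    int an_attribute;
};

/* Plain C body of the method: no dynamic-object handling involved. */
static int sketch_a_method(struct sketch_simple *self, int another_parameter) {
    int a_sum = self->an_attribute + another_parameter;
    return a_sum;
}

static struct sketch_vtab sketch_vtable;           /* the single table */
static struct sketch_vtab *sketch_vtabptr = NULL;  /* the single pointer to it */

/* Done once, in the role of the type init code. */
static void sketch_type_init(void) {
    sketch_vtabptr = &sketch_vtable;
    sketch_vtable.a_method = sketch_a_method;
}

/* Done for each instance, in the role of tp_new. */
static void sketch_new(struct sketch_simple *self) {
    assert(sketch_vtabptr != NULL);  /* type init must have run first */
    self->vtab = sketch_vtabptr;
    self->an_attribute = 0;
}

/* Call site: load vtab, load a_method, then make an indirect call. */
static int sketch_call(struct sketch_simple *self, int x) {
    return self->vtab->a_method(self, x);
}
```

Compared with a direct call to sketch_a_method, the dispatch in sketch_call adds two dependent loads and an indirect call, which the C compiler generally cannot inline because the target is only known at run time.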