Experimenting with NumPy internals
Registering a user-defined type that can be specified in the dtype argument
First, I tried to create a user-defined type. The steps are summarized as follows:
- Create a PyTypeObject for the user-defined type that is a subclass of PyGenericArrType_Type.
    PyTypeObject *PyMyArrType_Type;
    Py_INCREF(&PyGenericArrType_Type);
    {
        /* Equivalent to calling type(name, bases, dict). */
        PyObject *args = PyTuple_Pack(3,
            PyString_InternFromString("myarray.myarray"),
            PyTuple_Pack(1, &PyGenericArrType_Type),
            PyDict_New());
        PyMyArrType_Type = (PyTypeObject *)PyType_Type.tp_new(
            &PyType_Type, args, NULL);
        /* Release args before the error check so it is not leaked on failure. */
        Py_DECREF(args);
        if (!PyMyArrType_Type)
            return;
    }
- Create a PyArray_Descr object whose typeobj field refers to the type object created in the previous step.
    PyArray_Descr *PyMyArray_Descr;
    PyMyArray_Descr = PyObject_New(PyArray_Descr, &PyArrayDescr_Type);
    PyMyArray_Descr->typeobj = PyMyArrType_Type;
    PyMyArray_Descr->kind = PyArray_DOUBLELTR;
    PyMyArray_Descr->type = PyArray_DOUBLELTR;
    PyMyArray_Descr->byteorder = '<';
    PyMyArray_Descr->hasobject = 0;
    PyMyArray_Descr->type_num = PyArray_USERDEF;
    PyMyArray_Descr->elsize = sizeof(double);
    PyMyArray_Descr->alignment = sizeof(double);
    PyMyArray_Descr->subarray = NULL;
    PyMyArray_Descr->fields = NULL;
    PyMyArray_Descr->names = NULL;
    PyMyArray_Descr->f = &_PyMyArr_ArrFuncs;
- Register the descriptor with PyArray_RegisterDataType().
    /* PyArray_RegisterDataType() returns the new type number, or -1 on failure. */
    if (PyArray_RegisterDataType(PyMyArray_Descr) < 0)
        return;
- Finally, add the type object created in the first step to the module object.
    PyObject *m = Py_InitModule("myarray", myarray_methods);
    {
        PyObject *d = PyModule_GetDict(m);
        PyDict_SetItemString(d, "myarray", (PyObject *)PyMyArrType_Type);
    }
Arrays of the newly defined user type can then be instantiated as follows:
    import myarray
    import numpy
    a = numpy.ndarray(10, myarray.myarray)
Element retrieval sequence
Next, I looked through the NumPy multiarray code to see how the value of an element is retrieved. The call sequence is as follows:
- PyEval_*()
- array_subscript_nice()
  Reached through PyMappingMethods.
- PyArray_GetPtr()
  Calculates the pointer to the element in question.
- PyArray_Scalar()
  Retrieves the element as a Python object.
- PyArray_Descr.f->getitem() (a PyArray_GetItemFunc)
  This function gets called only when (PyArray_Descr.hasobject & NPY_USE_GETITEM) != 0.
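The pointer calculation and the accessor call can be illustrated without NumPy itself. The following is a minimal standalone sketch, not NumPy's actual code: toy_array, toy_get_ptr, toy_getitem_func, toy_double_getitem and toy_item are hypothetical names that only mimic the roles of the array object, PyArray_GetPtr(), PyArray_GetItemFunc and a descriptor's accessor. It assumes the element address is the base pointer plus the sum of index times stride over all dimensions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the parts of an array object used here. */
typedef struct {
    char *data;               /* base pointer of the buffer */
    int nd;                   /* number of dimensions */
    const ptrdiff_t *dims;    /* extent of each dimension */
    const ptrdiff_t *strides; /* byte step per dimension */
} toy_array;

/* Hypothetical accessor type: it receives only the element address. */
typedef double (*toy_getitem_func)(const void *elem);

/* Same role as PyArray_GetPtr(): base + sum(index[i] * strides[i]). */
static char *toy_get_ptr(const toy_array *a, const ptrdiff_t *index)
{
    char *p = a->data;
    for (int i = 0; i < a->nd; i++) {
        assert(index[i] >= 0 && index[i] < a->dims[i]);
        p += index[i] * a->strides[i];
    }
    return p;
}

/* Accessor for double elements; memcpy avoids alignment assumptions. */
static double toy_double_getitem(const void *elem)
{
    double v;
    memcpy(&v, elem, sizeof v);
    return v;
}

/* Element retrieval: compute the pointer, then hand it to the accessor. */
static double toy_item(const toy_array *a, const ptrdiff_t *index,
                       toy_getitem_func getitem)
{
    return getitem(toy_get_ptr(a, index));
}
```

In this sketch the accessor never sees the index; by the time it runs, the address has already been fixed by the stride arithmetic.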
Conclusion
NumPy doesn't allow us to define user-defined array types whose data has a non-standard memory layout. Although it defines an accessor interface in PyArray_ArrFuncs, those accessors receive a pointer to the element, not the indices that refer to the element, so the element's location is already decided before the accessor is called.
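To make the limitation concrete, here is a small standalone sketch of a layout that an index-based accessor could serve but a pointer-based one cannot. The layout (the upper triangle of a symmetric matrix packed row by row) and the names packed_offset and packed_getitem are hypothetical and chosen only for illustration; nothing here is part of NumPy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: the upper triangle of an n x n symmetric matrix
 * stored row by row, n * (n + 1) / 2 elements in total. */
static size_t packed_offset(size_t n, size_t i, size_t j)
{
    assert(i < n && j < n);
    if (i > j) {                /* symmetric: (i, j) aliases (j, i) */
        size_t t = i;
        i = j;
        j = t;
    }
    /* Rows 0..i-1 hold n, n-1, ..., n-i+1 elements,
     * which sums to i * (2n - i + 1) / 2. */
    return i * (2 * n - i + 1) / 2 + (j - i);
}

/* Hypothetical index-based accessor: it needs (i, j), not an address. */
static double packed_getitem(const double *data, size_t n,
                             size_t i, size_t j)
{
    return data[packed_offset(n, i, j)];
}
```

For n = 3 the diagonal elements sit at offsets 0, 3 and 5. The steps of 3 and then 2 cannot come from one constant pair of strides, so a pointer computed as base plus index times stride would not land on the right element.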
