• 19Nov

    Hi there!

    In my current work I’m optimizing some parallel software; basically, I try to make programs run faster. The focus areas include I/O and memory utilization, which are key when optimizing software. Generally, when people think of highly optimized software, they think of C/C++ and possibly assembly.

    Python is very simple in its syntax, and its speed is not at all bad. From an HPC (High Performance Computing) perspective Python may not look that interesting, but by combining Python’s simplicity with C’s speed, you get the best of both worlds.

    Since Python itself and many of its core modules are written in C, it is possible to extend Python further in C. The official Python documentation (http://docs.python.org) has a whole section covering the Python C API. Once you have installed the Python development files (and of course the GNU compiler; apt-get install build-essential), you’re ready to create C extensions for Python. The overall goal for your C extensions should be to do only the very compute-intensive tasks in C and keep everything else in Python. The key is to identify which parts of your program actually need a C extension, and for that you can use a profiler. Python (“batteries included”) ships with a couple of profilers:

    • import profile
    • import cProfile
    • import hotshot

    I will not cover these Python profilers here, but rather inspire you to try them out. They might save you a lot of work and give you valuable knowledge of your software. Do some reading of python profiling here.

    Now let’s create a very simple (and extremely stupid) Python extension in C. We want it to contain two methods:

    1. a method which allows you to run shell commands by passing a string
    2. a method which returns some text based on your input

    To create this extension we follow these steps: a) decide the Python module name, b) create the C file with the same name as the Python module, c) start programming C! Let’s name our module “cool” (not the best name).

    This is our cool.c file:

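    #include <Python.h>   /* comes first: it may affect the standard headers */
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>   /* strlen, strcpy, strcat */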
     
    /*
     * Takes a string argument, which is the shell command, and runs it.
     * Returns the return code of the system(cmd) call.
     * */
    static PyObject *cool_command(PyObject *self, PyObject *args) {
        const char *cmd;
        int retval;
     
        if(!PyArg_ParseTuple(args, "s", &cmd))
            return NULL;
     
        retval = system( cmd );
        return Py_BuildValue("i", retval);
    }
     
    /*
     * Takes a string argument, and assembles it into a new python string
     * object and returns it.
     * */
    static PyObject *cool_greet(PyObject *self, PyObject *args) {
        const char *input;
        const char *resp = "Hi there: ";
        char *retstr;
        PyObject *retString;
     
        if(!PyArg_ParseTuple(args, "s", &input))
            return NULL;
     
        retstr = (char *) malloc( strlen(input) + strlen(resp) + 1 ); /* +1 for null termination */
        if (retstr == NULL)
            return PyErr_NoMemory(); /* sets MemoryError so Python sees a proper exception */
     
        strcpy( retstr, resp );
        strcat( retstr, input );
     
        /* Py_BuildValue copies the string and returns a new reference that
           the caller owns, so no extra Py_INCREF is needed here. */
        retString = Py_BuildValue("s", retstr);
        free(retstr);
     
        return retString;
    }
     
    /* an array of PyMethodDef structure. A PyMethodDef structure is
    used to describe a method of an extension type. */
    static PyMethodDef CoolMethods[] = {
        {"command", cool_command, METH_VARARGS, "Execute a shell command." },
        {"greet", cool_greet, METH_VARARGS, "Returns a greeting to the caller."},
        {NULL, NULL, 0, NULL}
    };
     
    PyMODINIT_FUNC initcool(void)
    {
        (void) Py_InitModule( "cool", CoolMethods );
    }

    This C extension file is built up with the following key sections:

    • Inclusion of headers.
    • Static functions returning PyObject *, which constitute the visible Python functions of our extension module.
    • An array of PyMethodDef structures. A PyMethodDef structure describes a method of an extension type.
    • An initialization function named “init<modulename>”. Its sole purpose is to call the Py_InitModule function, which takes the module name and the PyMethodDef array. (This is the Python 2 API; Python 3 uses PyModule_Create and an init function named “PyInit_<modulename>”.)

    So there you go, this is our Python extension implemented in C with the help of the Python development headers. Now we’ll have a look at our Distutils script (setup.py) that will assist us with the compilation and creation of the actual python module.

    This is our setup.py file:

    from distutils.core import setup, Extension
     
    module1 = Extension('cool',
            sources = ['cool.c'] )
     
    setup ( name = 'CoolPackage',
            version = '0.1',
            description = 'A descriptive and informal C extension to Python',
            ext_modules = [module1] )


    This file is pretty much self-explanatory. We basically have one function named “setup” which takes a bunch of arguments: the package name, version, description and a list of Extension objects. These Extension objects describe our C extension files. Build the extension by running this command:

    python setup.py build

    When the compilation is done, you’ll have a build folder in the same directory as setup.py. Go into “build/lib.linux-i686-2.6/” (the exact folder name depends on your platform and Python version). Then type “python” or “ipython”, and then “import cool”. Now you have actually loaded the “cool” module, and you may call “cool.greet('Alex')” and/or “cool.command('ls /')”, and the actual computing happens in the C world instead of the Python world.

    Now, keep in mind that this C extension isn’t actually doing anything useful. But given that you have some algorithm or other problem to solve, and time is of the essence, utilizing the power that lies in C extensions can give you significant time savings.

  • 20Aug

    One day I needed to make an embedded Python interpreter in a Fortran/C program aware of its own path, since it should search for a predefined Python source code file in the same directory where the binary is located.

    Now, this introduces some issues, as the Fortran/C program can be run in many ways:

    1. Absolute path (/home/asbjorn/test1/main)
    2. Standing in the directory and running ./main
    3. Having /home/asbjorn/test1/ in your $PATH variable (Linux/UNIX), and typing ‘main’.

    Now, the simplest approach is to use the “int argc, char *argv[]” parameters of your ‘main()’ function. “argv[0]” then contains the binary file name, including the absolute path if that is what was used to start it. But if your application is found through your $PATH variable, it won’t work: argv[0] is then typically just ‘main’, without any directory.
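
    As a rough sketch of what argv[0] can and cannot tell you, the helper below (dir_from_argv0 is a name made up for this example) extracts the directory part when there is one and reports failure when there is none, which is what typically happens when the program is started through $PATH. Note that for ‘./main’ the result is just ‘.’, which is relative to the current working directory.

```c
#include <string.h>

/*
 * Hypothetical helper: copies the directory part of argv0 into dir.
 * Returns 0 on success, -1 if argv0 contains no '/' (so it carries no
 * directory information) or if dir is too small to hold the result.
 */
int dir_from_argv0(const char *argv0, char *dir, size_t dirsize)
{
    const char *slash = strrchr(argv0, '/');
    size_t len;

    if (slash == NULL)
        return -1;              /* e.g. plain "main" */

    len = (size_t)(slash - argv0);
    if (len == 0)
        len = 1;                /* argv0 is "/main": the directory is "/" */
    if (len >= dirsize)
        return -1;

    memcpy(dir, argv0, len);
    dir[len] = '\0';
    return 0;
}
```

    In main() you would call it as dir_from_argv0(argv[0], dir, sizeof(dir)) and fall back to another method when it returns -1.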

    A ‘dirty’ trick could be to use:

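    char path[255] = "";
    FILE *p = popen("which main", "r"); /* system() only returns the exit status, so read the output through popen() */
    if (p != NULL) { fgets(path, sizeof(path), p); pclose(p); } /* path keeps the trailing newline */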

    but it’s not recommended to use this approach. Another very hackish and cool solution (Linux only, since it relies on /proc) is the following:

    #include <stdlib.h>
    #include <stdio.h>
    #include <unistd.h>      /* readlink */
    #include <sys/types.h>   /* ssize_t */
    #include <sys/param.h>   /* MAXPATHLEN */
    int main(int argc, char* argv[])
    {
       char path[MAXPATHLEN];
       ssize_t length;
       /* readlink does not null-terminate, so leave room for the '\0' */
       length = readlink("/proc/self/exe", path, sizeof(path) - 1);
       if(length<0) {
          fprintf(stderr,"error resolving /proc/self/exe!\n");
          exit(1);
       }
       path[length] = '\0';
       printf("The absolute path to this running binary is: %s\n", path);
     
       return 0;
    }

    Another approach which is often used is to include some kind of configuration file that contains this path. Or, in my case, I could specify another folder which should hold my Python files. This approach would probably make more sense in the long run, as it is more flexible.
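
    A minimal sketch of that idea, assuming an environment variable instead of a full configuration file. MYAPP_PYTHON_DIR and python_script_path are hypothetical names chosen for this example, not something my program actually defines.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Builds "<dir>/<script>" in out, where <dir> comes from the
 * (hypothetical) MYAPP_PYTHON_DIR environment variable, or from
 * default_dir when the variable is unset or empty.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int python_script_path(const char *script, const char *default_dir,
                       char *out, size_t outsize)
{
    const char *dir = getenv("MYAPP_PYTHON_DIR");
    size_t len;
    int n;

    if (dir == NULL || dir[0] == '\0')
        dir = default_dir;

    /* Avoid a double slash when the directory already ends in '/'. */
    len = strlen(dir);
    n = snprintf(out, outsize, "%s%s%s", dir,
                 (len > 0 && dir[len - 1] == '/') ? "" : "/", script);

    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

    The default directory could itself be the one resolved from /proc/self/exe, so the variable only needs to be set when the Python files live somewhere else.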

    So, in case you find yourself in this situation, then feel free to copy-paste the code and save the day ;) Have a good summer people!

  • 16Jul

    Hi there!

    “Wow”, you may think. Are people still using the Fortran programming language? I know, I was shocked too. But apparently Fortran is very much used for large mathematical problems and scientific work, and can therefore be seen within HPC communities. As I’m currently an IT consultant working in this kind of environment, I’ve got some hands-on experience with C and Fortran.

    One of the applications I’m working on is a mix of Fortran and some C. Since I’m a software developer at heart and have almost never touched Fortran, I prefer C code over Fortran. However, rewriting all the Fortran code is not an option, as it is very time consuming and one should respect the saying “if it works, don’t change it”. But one of my tasks is to try to optimize the code, and running the whole application through the GNU gprof tool revealed some bad code. I’m not going to dive into exactly what it was, but to solve it I wanted to use C instead of Fortran. I then discovered that I could simply reimplement the subroutine (basically just a void function in C terms) in C and append a “_” to the end of the function name. Fortran passes its arguments by reference, so every parameter of the C function is a pointer.

    This Fortran program, for example, calls the C function “my_func_” under the name “my_func”:

    program testme
        implicit none
        integer*4 dx, dy, direction, i, j
        !real*4, dimension(:,:), allocatable :: data
        real*4 data(5,5)
        dx = 5
        dy = 5
        direction = 1
     
        do i = 1, dy
          do j = 1, dx
              data(i,j) = (i+j)**2
          end do
        end do
     
        ! call our C library function
        call my_func(dx, dy, data, direction)
     
        write (*,*) "Hello world!"
    end program testme

    And then this is the C function “my_func_”:

    #include <stdlib.h>
    #include <stdio.h>
     
    void my_func_(int *dx, int *dy, float *data, int *direction)
    {
      int i,j;
     
      printf("You are now inside the C function.\nAwesome, right?\n\n");
      printf("dx=%d, dy=%d, direction=%d\n", *dx, *dy, *direction);
     
      printf("Lets go through our data matrix..\n");
     
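      /* Fortran stores arrays column by column (column-major), so the
         element data(i+1, j+1) of the dy-by-dx array is at offset j*dy + i. */
      for(i=0;i<(*dy);i++)
      {
          for(j=0;j<(*dx);j++)
          {
              printf("data[%d][%d] = %.2f\n", i+1, j+1, data[(*dy)*j+i]);
          }
      }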
    }
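
    One detail that is easy to get wrong when C receives a Fortran array: Fortran stores arrays column by column (column-major), so the first index varies fastest in memory. A small sketch of the index arithmetic follows; fortran_offset and fortran_get are helper names invented for this example.

```c
#include <stddef.h>

/*
 * Offset of the Fortran element a(i, j) (1-based indices) inside the
 * flat buffer a C function receives, for an array declared as
 * a(nrows, ncols). Column-major: the first index varies fastest.
 */
size_t fortran_offset(int i, int j, int nrows)
{
    return (size_t)(j - 1) * (size_t)nrows + (size_t)(i - 1);
}

/* Reads the Fortran element a(i, j) from the flat float buffer. */
float fortran_get(const float *a, int i, int j, int nrows)
{
    return a[fortran_offset(i, j, nrows)];
}
```

    For the 5x5 array in the Fortran program above, the element data(2,3) would be read as fortran_get(data, 2, 3, 5).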

    Let’s call the Fortran file test.f90 and the C file c_lib.c, and let’s create an SCons file (SConstruct) instead of a Makefile to compile this example. Also, I have only tested this with the “gfortran v4.2” compiler and gcc v4.3.3.

    import os
    env = Environment( ENV = os.environ )  # ENV (upper case) passes the shell environment to the build commands
     
    env.Program( target = "program1", source = ["test.f90", "c_lib.c"])

    And now for the grand finale. Type scons, and the compilation should run. There should now be a binary named “program1”. Run it, and have a closer look at the output.

    So, to summarize, the only requirement for Fortran code to call a C function is that the C function has an underscore right after its function name. This is the default naming convention of gfortran; other compilers may decorate names differently. Also, the C file containing the function needs to be compiled into an object file and be available when the program is linked.