• 18Nov

    Hi everyone.

    A couple of times now I’ve been amazed at how many people are still unaware of IPython. The IPython webpage sums it up very briefly as an “Enhanced interactive Python shell”. The Python programming language is built around its interpreter, which facilitates dynamic typing and interactive execution. This significantly increases productivity, since code snippets can be tested and tried on the fly.

    IPython is an extension of the standard Python interpreter. It adds features to the Python shell such as tab completion for imported modules, syntax highlighting, colors, and a variety of other useful commands.

    IPython is highly flexible: users can extend the Python shell even further with custom commands (called magic commands). There is also a much tighter integration between the Python interpreter and the underlying system shell, such as bash or csh. For instance, listing files and folders is as simple as typing “ls” directly into the shell. Commands such as “mkdir”, “mv” and “rm” are built in as well, and it is trivial to extend the shell command vocabulary with more complex commands. An example of how to add a custom command is shown below.

    Like most flexible software, IPython comes with a main configuration file ($HOME/.ipython/ipythonrc). If we wanted a custom command such as “chmod <mod> <file>” (for example “chmod 755 myfile.py”), we could add this to the “ipythonrc” file:


    # my custom chmod alias. By typing '>>> chmod 755 myprog.py' or
    # '>>> chmod a+rx myprog.py' IPython will execute this
    # statement as a shell command.
    alias chmod chmod %s %s

    Debugging lists, tuples, dictionaries and the like is also more readable within IPython, because it runs such output through the “pprint” (pretty-print) module, which produces a comprehensible representation.
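
    To see what pretty-printing buys you even outside IPython, here is a minimal sketch that uses the standard library “pprint” module directly. The nested “config” dictionary is made up for illustration, and IPython’s own display machinery may format things slightly differently.

```python
from pprint import pformat

# Hypothetical nested data, invented for this illustration.
config = {
    "servers": [
        {"host": "alpha.example.org", "port": 8080, "tags": ["web", "primary"]},
        {"host": "beta.example.org", "port": 8081, "tags": ["web", "backup"]},
    ],
    "timeouts": {"connect": 5, "read": 30},
}

# repr() packs the whole structure onto one long line.
flat = repr(config)

# pformat() returns the string that pprint() would print: containers that do
# not fit within the given width are broken over several indented lines.
pretty = pformat(config, width=60)

print(flat)
print(pretty)
```

    The second print is the one that stays readable as the structure grows.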

    So, if you often find yourself in the Python interpreter, I highly recommend spending half an hour getting to know IPython. I promise you, it will save you a lot of headaches in the future.

  • 27Oct
    Categories: python Comments: 0

    Finally!

    It’s time for the monthly edition of Python Magazine, a highly interesting and technical magazine about the Python programming language.

    Everyone who is into Python should subscribe to this magazine, as it covers many “hot” topics and presents many how-to tutorials for everyday challenges, from cutting-edge web applications and frameworks to desktop applications and backbone server implementations.

  • 03Oct
    Categories: python Comments: 0

    Hi everyone!

    On 2 October the Python development team released Python v2.6. At http://planet.python.org there are plenty of other blogs listing all the cool new features, so I won’t use any space outlining them here.

    However, I encourage everyone to have a look at the documentation, as this release contains some valuable key features which will, in the long run, revolutionize the way we program in Python.

    In time I will cover some of these features.

  • 08Sep

    Hi.

    It has been a while since I last posted anything on this blog, sorry about that. I guess I can partly blame it on my work, which is taking up a whole lot of my time nowadays. However, today I have an interesting concept to discuss: applications and caching.

    Nowadays, delays in software can mean the difference between failure and success. So how do most applications cope with this challenge? Some put a significant amount of work into more efficient algorithms, others focus primarily on the overall design of the application in order to reduce delays, and many compress the data flowing between systems. All of these are worth thinking about, but an often forgotten concept is the use of a caching system.

    In this context, a caching system means a system specifically designed to hold “fresh data”, much like a standard database. Now you may think, “why not just use the database?”. Well, in most cases we will still use a database. However, when your web page (or application) has a wide range of users, and a large share of those users request the same set of data, your database can be overwhelmed by the number of incoming requests. A database carries a large overhead because of all the features it needs to support (think of JOIN operations, relations, constraints, etc.), so there is a need for much faster retrieval of the “fresh data” that is requested frequently.

    How do you think Digg, Reddit or Slashdot handle their requests? (1) Route the HTTP request to the web application, (2) execute a SQL SELECT statement to retrieve the articles, (3) render and return the HTML? That would not scale very well for such large web sites. Instead, they take advantage of a caching system that holds the result of a SQL SELECT statement for a defined amount of time, and only re-run the SELECT statement once that time has passed. With such an approach the database server is hit far less often, and the caching system, which just holds the data, takes over the load. Since the caching system is built to hold data in memory and make it quick and easy to retrieve, the data is loaded into the application much faster, which reduces the delay.
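
    The idea of holding a query result for a defined amount of time can be sketched in a few lines of Python. This is only an in-process illustration under my own assumptions: the “TTLCache” class, the “fetch_front_page” stand-in for the SQL SELECT and the 60 second lifetime are all hypothetical, and a real deployment would use a dedicated cache server rather than a dictionary.

```python
import time


class TTLCache:
    """Keep computed values in memory for a limited number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable, so expiry can be tested
        self._store = {}        # key -> (expiry_time, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]     # still fresh: the database is not touched
        value = compute()       # missing or expired: run the real query
        self._store[key] = (now + self.ttl, value)
        return value


# Stand-in for the expensive SQL SELECT; it only counts how often it runs.
stats = {"queries": 0}


def fetch_front_page():
    stats["queries"] += 1
    return ["article 1", "article 2"]


cache = TTLCache(ttl_seconds=60)
for _ in range(1000):           # a thousand incoming "requests"
    articles = cache.get_or_compute("front_page", fetch_front_page)

print("requests: 1000, database queries:", stats["queries"])
```

    All the requests that arrive within the lifetime of the cached entry are served from memory, and only the first one reaches the “database”.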

    Enough mumbo jumbo, let’s talk technical. Danga Interactive’s memcached is a “high-performance, distributed memory object caching system”. Its main purpose is to hold objects in a distributed manner and to provide fast retrieval of those objects. This is done with the help of a hash table distributed across the nodes running memcached. Setting up and running a memcached server on Ubuntu and Debian is trivial:


    #$ aptitude install memcached
    #$ memcached -d -m 2048 -l 192.168.0.10 -p 12345

    This would fire up a memcached server with 2GB of memory, listening on 192.168.0.10:12345. Now, using Python and a Python memcached client library that provides the “memcache” module:

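    import sys

    try:
        import memcache
    except ImportError:
        # Exit with an error message instead of failing silently.
        sys.exit("The memcache module is not installed.")

    # Connect to the memcached server started above.
    mc = memcache.Client(['192.168.0.10:12345'])

    # Store a value under a key, then read it back.
    mc.set("fellinghaug", "rocks")
    assert mc.get("fellinghaug") == "rocks"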

    This may not be the best example of how to use memcached, but it shows the basic principle. I highly recommend reading the FAQ/Wiki at Danga’s homepage, as well as these articles:

    As a final note: if your application shares the characteristics described in this article, then you should consider using a caching system.
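
    To make the “distributed” part a bit more concrete: a client can hash each key and use the result to pick one node, so that every client sharing the same server list looks for a given key in the same place. The sketch below shows only that idea. The “pick_server” function and the second server address are assumptions of mine, and real client libraries use their own hashing schemes, so this is not how any particular library does it.

```python
import zlib

# 192.168.0.10:12345 is the server started earlier; the second one is hypothetical.
SERVERS = ["192.168.0.10:12345", "192.168.0.11:12345"]


def pick_server(key, servers=SERVERS):
    """Map a key to one server by hashing it (plain modulo, no weighting)."""
    digest = zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF
    return servers[digest % len(servers)]


for key in ("fellinghaug", "front_page", "user:42"):
    print(key, "->", pick_server(key))
```

    The same key always maps to the same server, which is what lets several web servers share one pool of cache nodes without coordinating with each other.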

  • 27Jun

    Now that my master’s thesis has been delivered, I will dedicate some time to cleaning up the code and documenting it thoroughly. When I’m finished and the code is clean, I will make it freely available to the Apache Lucene community.

    I will also make my master’s thesis freely available for download, so the documentation of the code is partly covered by the thesis. The abstract goals of the code (since the code reflects the experiment) are also outlined in the thesis, along with a presentation of the results and observations.

    Since I’m a huge fan of Python, I have also thought of experimenting with the performance of Python and my bigram index. I would love to enhance it further and maybe introduce some new improvements. In time, I will create a project page for the “Bigram index” on my future django/turbogears website, http://asbjorn.fellinghaug.com/

  • 05May

    I’ve been taken by the Twitter storm these days.. Damn, I should focus a whole lot more on my master’s report. Well, this only took me one little hour, so it’s not that much of a waste of time.. :) So, I guess you have heard about the new “facebook” called Twitter? It’s this new web community thing where people can post their current status and tell the world what they are doing. And, of course, one can follow friends and keep track of where they are and what they are doing.

    After some time I found it rather tedious to open the Twitter webpage, log in, and post a new message every time I wanted to update my status. So, being the Python fan I am, I wrote myself a little Python script to solve this problem. It relies on the python-twitter module available on Google Code. So, let’s have a look at the code. I have named this file “update.py”, but feel free to rename it.

    #!/usr/bin/env python
    import sys

    import twitter

    # Fill in your own Twitter credentials.
    USERNAME = ""
    PASSWORD = ""

    MAX_LENGTH = 140


    def postNewMessage(msg):
        """Post msg (a string or a list of words) as a new status update."""
        api = twitter.Api(username=USERNAME, password=PASSWORD)

        if isinstance(msg, list):
            msg = " ".join(msg)
        if isinstance(msg, bytes):
            # Command-line arguments are byte strings on Python 2.
            msg = msg.decode("utf-8")
        if len(msg) > MAX_LENGTH:
            print("ERROR: Message can't be over %i chars." % MAX_LENGTH)
            return
        try:
            api.PostUpdate(msg)
            print("OK. Was %i chars in msg." % len(msg))
        except Exception as e:
            print("ERROR: Could not post the update: %s" % e)
        finally:
            api.ClearCredentials()


    if __name__ == "__main__":
        if len(sys.argv) > 1:
            # Handles both ./update.py "hi there mate" and ./update.py hello world
            postNewMessage(sys.argv[1:])
        else:
            print("Usage: update.py <message>")

  • 19Apr
    Categories: python Comments: 0

    Have you ever needed to extract some information from a webpage automatically? Well, I have, multiple times. And, lazy as I am, and given that the process needed to run many times a day, I wanted to write a fancy script to do it. Now, I’m not going to go into which webpage I needed to extract info from, but let’s say the site with the information is http://example.org. The people behind this website are clever: they don’t want automated scripts to download and use their information. So, if the HTTP headers show that a request does not come from a web browser such as Firefox, Opera, Safari or IE, they deny the request.

    Our nerdy curiosity tells us not to give up! So, how do we cope with this kind of obstacle? Well, Python is a fantastic language that provides modules for almost everything. One built-in module is “urllib2”, which is able to “mimic” a web browser. Here is some simple Python code that achieves this:

    #!/usr/bin/env python
    import urllib2

    main_url = "http://example.org"
    # Present ourselves as Firefox 2 on Linux.
    txheaders = {'User-agent': 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3'}

    # Install an opener that also handles cookies, as a browser would.
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
    urllib2.install_opener(opener)

    # data=None makes this a GET request; any other value would send a POST.
    req = urllib2.Request(main_url, None, txheaders)
    handle = urllib2.urlopen(req)
    print(handle.read())

    And that’s all there is to it. The web server at http://example.org inspects the headers, finds them valid, and concludes that the client is a Firefox web browser running on Linux.
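
    The script above targets Python 2. For readers on Python 3, where urllib2 has been folded into “urllib.request”, the same idea can be sketched as follows. The helper name “build_request” is my own, and the network call is left commented out so that the snippet runs offline.

```python
import urllib.request

MAIN_URL = "http://example.org"
USER_AGENT = ("Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) "
              "Gecko/20070309 Firefox/2.0.0.3")


def build_request(url, user_agent=USER_AGENT):
    """Return a GET request that carries a browser-like User-Agent header."""
    # No data argument is passed, so urllib issues a GET rather than a POST.
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request(MAIN_URL)
print(req.get_method(), req.full_url)
print(req.get_header("User-agent"))

# Uncomment to actually fetch the page:
# with urllib.request.urlopen(req) as handle:
#     print(handle.read().decode("utf-8", errors="replace"))
```

    Inspecting the request object before sending it is a cheap way to check exactly which headers the server will see.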

    One important note: It is illegal to steal information from sites and present it on your own site without an agreement with the source website and without referring to the source. Remember to always “be nice”.

  • 19Apr

    Hepp hepp.. I have a great passion: high-level nerding. I fell head over heels for nerding at an early age, and it is now one of my everyday activities.

    One thing that has come to the fore lately is web programming and the techniques within that field. Here I have really taken a liking to frameworks that are modular and follow the MVC architecture. And since I am a big fan of the Python programming language, Turbogears was a framework that stood out. The Django framework is also a strong competitor.

    Anyway, I would recommend that everyone who wants to learn fun and smart ways of developing large / medium / small web applications take a look at Turbogears. I will try to post a few fun things about Turbogears eventually..