Inspired by this
I wrote a little script to create antialiased circles in python
The Gist:
Example output:
Wednesday, October 12, 2011
Monday, October 3, 2011
difflib
Found another standard library gem: difflib, which allows you to do intelligent string comparisons.
For example:
will find the longest palindrome in a string, e.g.
For example:
will find the longest palindrome in a string, e.g.
>>> from palindrome import * >>> longest_palindrome('asdfracecarasdf') 'racecar' >>> longest_palindrome('asdffasdfaipreferpiasdfasdfa') 'aipreferpia'
Saturday, September 3, 2011
SWIG and numpy
I've been looking into C/C++ interface creator known as SWIG for use with a molecular dynamics program that is in the works. It is nice in that from an interface file, that you can create entirely by hand or just use a standard header file, you can create a python module that looks and feels like python, but is really some nifty C that you've written. There are some drawbacks of course in that if a function returns a pointer, it is actually a wrapped object that can be passed to other wrapped functions, but will have little use in python without some other work. Of course for the purposes of this post I would like these pointers to be an array of floating point numbers, or a numpy array. Wouldn't it be great if you could easily pass numpy arrays back and forth to C/C++ without having to deal with writing your own wrapper using Python.h?
Let's now provide a python interface to these using SWIG. Here, we specify the python module name and the functions that make up the module's functions. Then we include the numpy.i interface. Next, we have to say how we want our C arguments treated. The details of what's written in the file can be found here but I have used most of the common ones: modifying an array in place, taking an input array, and returning an array as output. NOTE: the far right section of the line matches verbatim to the arguments in the C code. The interface file is as follows:
Returns
Wahoo, we are actually slightly faster than the built-in numpy dot product with just a little bit of work. Next time, moving on to SWIGing classes and incorporating numpy with them as well.
Well you can with a little gem called numpy.i which is a SWIG interface wrapper for the numpy array. With it, you specify certain arguments you want to be treated as numpy arrays and it does so automatically.
So on with the example. I have two C functions which fill an array sequentially and can take the dot product of two arrays. They are found below in two files
Then, we need to build the module and test. To make this module we used a setup.py script (which will not be the case for future SWIG posts). I'm not sure of the details here, but it is listed for your information.
After that, let's build with `python setup.py build` and run some speed tests. For example,
Returns
Numpy says: 3.33332833333e+17
SWIG says: 3.33332833333e+17
Time of numpy: 0.0032000541687
Time of SWIG: 0.00278091430664
Wahoo, we are actually slightly faster than the built-in numpy dot product with just a little bit of work. Next time, moving on to SWIGing classes and incorporating numpy with them as well.
Wednesday, August 24, 2011
A Dictionary of Links
I wanted a dictionary of links for an MD type simulation, and I wanted the links to be unique, and I didn't want to have to check for the order the guys appeared in the dictionary keys every time, so I wrote my own dictionary by extending the native one.
Where I put that code on gist. To pull it off, you see that I subclass dict, and define a decorator that pulls out and sorts the first argument. After that it is just decorating the '__setitem__','__getitem__',and '__delitem__' methods. This lets us do nice things now:
Where I put that code on gist. To pull it off, you see that I subclass dict, and define a decorator that pulls out and sorts the first argument. After that it is just decorating the '__setitem__','__getitem__',and '__delitem__' methods. This lets us do nice things now:
>>> z = Links() >>> z[1,2] = 5 >>> z[4,3] = 10 >>> z {(1, 2): 5, (3, 4): 10} >>> z[2,1] 5 >>> z[1,2] 5 >>> z[2,1] = 11 >>> z {(1, 2): 11, (3, 4): 10} >>> del z[2,1] >>> (3,4) in z True >>> (4,3) in z True >>> z {(3, 4): 10} >>> z[5] = 1 Traceback (most recent call last): File "", line 1, in File "links.py", line 10, in orderedversion assert hasattr(first,"__len__"), "Key must be iterable" AssertionError: Key must be iterable >>> z[4,4,3] = 1 Traceback (most recent call last): File " ", line 1, in File "links.py", line 11, in orderedversion assert len(first)==2, "Key must be 2-tuple" AssertionError: Key must be 2-tuple >>> z {(3, 4): 10}
Labels:
decorators,
dict,
extension,
inheritance,
links,
python,
super
Tuesday, August 23, 2011
Setting up ROOT and pyROOT on a new Mac
I just got a Mac and need to re-do the setup of my software stack. I'm keeping a log of anything tricky here, both for my own future reference and so others can use it as a reference. These instructions work for Lion and may work for other versions of OS X, but I haven't tested them.
Getting ROOT and pyROOT up and running:
Note: you can't use the pre-compiled ROOT because it's linked against the wrong version of python.
- Install Xcode from the Mac App Store. I got a few errors the first few times I tried to install it, but eventually it worked.
- Download and install the newest version of the enthought python distribution from http://www.enthought.com. If you hunt around, you can find a link to download it for free for academic use.
- Set up some environment variables. You can set ROOTSYS to the directory you want to install ROOT to.
export PYTHONDIR=/Library/Frameworks/EPD64.framework/Versions/Current export ROOTSYS=/usr/local/root export LD_LIBRARY_PATH=$ROOTSYS/lib:$PYTHONDIR/lib/:$LD_LIBRARY_PATH export PATH=$ROOTSYS/bin:$PYTHONDIR/bin:$PATH export PYTHONPATH=$ROOTSYS/lib:$PYTHONPATH
- Download the source for the newest production version of ROOT from http://root.cern.ch/drupal/content/downloading-root
- Unpack the tarball:
tar -xf root_blah_blah.tar
cd root
- Configure ROOT installation and compile
./configure macosx64 --enable-python --with-python-incdir=$PYTHONDIR/include --with-python-libdir=$PYTHONDIR/lib make sudo su export ROOTSYS=/usr/local/root make install exit
Tuesday, August 16, 2011
Planet Walk 2.0
We can make our own real-time planet walk.
Planetary positions are obtained according to this NASA brief
Code available as a gist here
Results: here
Planetary positions are obtained according to this NASA brief
Code available as a gist here
Results: here
Monday, August 8, 2011
Making animated gifs
I always lose this command, so I'm putting here so I can find it again
And for sticking around, here is an animated gif for you:
convert pics/*.png [-delay 100 -layers Optimize] anim.gif
And for sticking around, here is an animated gif for you:
Sunday, August 7, 2011
Dow Jones Industrial Lottery
I remember reading Malcolm X in highschool. In it, at one point Malcolm X gets involved with a local lottery scheme, where tickets are sold on the closing price for the Dow Jones Industrial average.
I decided to investigate. Turns out you could do fairly well at such a lottery. The below script displays a lot that you can do with python, including grabbing data from the web, csv parsing, generator expressions, use of counters, scipy stuff, matplotlib plotting, etc.
One of the things I really like about Python is that you can do stuff like this, analyise the cents on the closing of the Dow Jones Industrial average for 20,806 days in only a few lines. Really only the first four lines (ignoring imports and comments) does all of the work, the rest is me doing some math and making my plots pretty.
The Code
The output:
In this last plot, I've plotted the odds { (1-p)/p ) for both the Dow Jones numbers with error, as well as randomly chosen numbers between 00-99. Notice that the effects are real. Perhaps most tellingly demonstrated in the chisquare test results, where the random draws get a p value of 0.24, the Dow Jones numbers have a p-value of 3e-228, definitely not a fair lottery.
I decided to investigate. Turns out you could do fairly well at such a lottery. The below script displays a lot that you can do with python, including grabbing data from the web, csv parsing, generator expressions, use of counters, scipy stuff, matplotlib plotting, etc.
One of the things I really like about Python is that you can do stuff like this, analyise the cents on the closing of the Dow Jones Industrial average for 20,806 days in only a few lines. Really only the first four lines (ignoring imports and comments) does all of the work, the rest is me doing some math and making my plots pretty.
The Code
import csv, urllib2 from collections import Counter import scipy as sp import pylab as py from scipy.stats import chisquare #Link format for yahoo finance historical data yahoo_finance_link = "http://ichart.finance.yahoo.com/table.csv?s={symbol}&a=09&b=1&c=1928&d=07&e=7&f=2011&g=d&ignore=.csv" #Use urllib2 to open the historical data for the dow jones #and use the csv module to read each line dow_data = csv.reader( urllib2.urlopen( yahoo_finance_link.format(symbol='^DJI'))) #The first line is the names of the columns, find the column # for closing prices close_index = next(dow_data).index('Close') #extract the closing prices as floats closing_prices = (float(row[close_index]) for row in dow_data) #grab the last two digits of the closing price last_digits = [int( (price%1) * 100 ) for price in closing_prices] #Count the appearance of each closing digit lotto = Counter(last_digits) #Do some statistics lotto_counts = sp.array([lotto[x] for x in range(100)]) total_data_points = sum(lotto.values()) lotto_probs = lotto_counts*1./total_data_points lotto_err = sp.sqrt(lotto_counts)/total_data_points lotto_odds = (1.- lotto_probs )/ lotto_probs lotto_odds_err = sp.sqrt((1./lotto_probs**2)**2 * lotto_err**2) #perform a chisquared test print ( "chisquare results for the Dow Jones closing cents:\n", " chisquare:{0}, p-value: {1}".format(*chisquare(lotto_counts)) ) #draw random numbers to compare random_choices = sp.random.randint(0,100,total_data_points) random = Counter(random_choices) random_counts = sp.array([random[x] for x in range(100)]) random_probs = random_counts*1./total_data_points random_err = sp.sqrt( random_counts ) / total_data_points random_odds = (1.-random_probs )/ random_probs random_odds_err = sp.sqrt((1./random_probs**2)**2 * random_err**2) print ( "chisquare results for the random closing cents:\n", " chisquare:{0}, p-value: {1}".format(*chisquare(random_counts))) #Plot the odds and the errors py.figure(1) py.clf() py.errorbar(sp.arange(100),lotto_odds,lotto_odds_err,fmt=None,label='DJI') py.errorbar(sp.arange(100),random_odds,random_odds_err,fmt=None,label='RANDOM') py.hlines(100,0,100,linestyle='dashed') py.xlim((-1,100)) py.xlabel('Last Two Digits') py.ylabel('Odds') py.title('Dow Lottery and random drawing odds results') py.savefig('oddsplot.png') #Make a heatmap of the lucky and unlucky numbers py.figure(2) py.clf() py.imshow( lotto_odds.reshape((10,10)) , interpolation='nearest') py.colorbar() py.xlabel('Second Digit') py.ylabel('First Digit') py.xticks(range(10)) py.yticks(range(10)) py.ylim((-0.5,9.5)) py.title('Dow Lotto Odds') py.savefig('heatmap.png') #Print the most common digits print "The ten most common digits are: ", [ num for num,count in lotto.most_common(10) ]
The output:
chisquare results for the Dow Jones closing cents: chisquare:1397.69124291, p-value: 3.47521670702e-228 chisquare results for the random closing cents: chisquare:108.207440161, p-value: 0.247576090193 The ten most common digits are: [12, 75, 87, 25, 50, 0, 62, 37, 9, 59]
In this last plot, I've plotted the odds { (1-p)/p ) for both the Dow Jones numbers with error, as well as randomly chosen numbers between 00-99. Notice that the effects are real. Perhaps most tellingly demonstrated in the chisquare test results, where the random draws get a p value of 0.24, the Dow Jones numbers have a p-value of 3e-228, definitely not a fair lottery.
Friday, August 5, 2011
Sets
Sets are python hash tables. You can think of them as just a dictionary without any values. So, they are a lot like lists, except they
- Do not preserve order
- Have fast lookups
- Must be composed of hashable (read: immutable) elements
- Must have unique elements
>>> foo = range(10000) >>> timeit int(rand()*100000) in foo 10000 loops, best of 3: 146 us per loop >>> boo = set(range(10000)) >>> timeit int(rand()*100000) in boo 1000000 loops, best of 3: 479 ns per loopNotice that the lookup time for a set is ~300 times faster. Sets also have nicely overloaded operators, so that we can do set operations
>>> A = set(range(5)) >>> B = set(range(2,10)) >>> 5 in A #testing for element False >>> A | B # A union B set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) >>> A - B # set difference set([0,1]) >>> A & B # set intersection set([2,3,4]) >>> A <= B # testing subsets FalseOther nice functions to now are add and update. Add adds a single element, update will take the union of a set with something else.
>>> A set([0,1,2,3,4]) >>> A.add(10) >>> A set([0,1,2,3,4,10]) >>> A.update(range(12)) >>> A set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])Full documentation here
Including Python Source Code
Following the advice here, we should have syntax highlighting now.
If we put in
We get:
In short use <pre> tags for output and <pre class="prettyprint"> for source.
If we put in
<pre class="prettyprint">
def hello_world():
print "Hello World"
hello_world()
</pre>
returns
<pre>
Hello World
</pre>
We get:
def hello_world(): print "Hello World" hello_world()
returns
Hello World
In short use <pre> tags for output and <pre class="prettyprint"> for source.
Subscribe to:
Posts (Atom)