[Dailydave] A small fun Python puzzle

ergosum ergosum at neurosecurity.com
Tue Apr 1 10:08:41 EDT 2008


Hi all, 

> results in:
> >>> for l in [100000, 1000000, 5000000, 10000000]:
>
> ...   print '%10d %f' % (l, test(l))
> ...
>     100000 0.006711
>    1000000 0.764886
>    5000000 28.554786
>   10000000 111.738498
>
> (wow - so not linear ...)
>

This is very interesting. I'm not a python ninja but AFAIK (quoting from: 
http://etutorials.org/Programming/Python+tutorial/Part+III+Python+Library+and+Extension+Modules/Chapter+17.
+Testing,+Debugging,+and+Optimizing/17.4+Optimization/):

"Chaining two lists of length N1 and N2 is O(N1+N2). Multiplying a list of 
length N by the number M is O(N*M). Accessing or rebinding any list item is 
O(1) (also known as constant time, meaning that the time taken does not 
depend on how many items are in the list). len( ) on a list is also O(1). 
Accessing any slice of length M is O(M). Rebinding a slice of length M with 
one of identical length is also O(M). Rebinding a slice of length M1 with one 
of different length M2 is O(M1+M2+N1), where N1 is the number of items after 
the slice in the target list."

So if this is true, slicing data[1024:] should be O(M) where M = 100000, 
1000000, 5000000, 10000000 respectively, while slicing data[i:i+1024] should 
always be O(N) where N = 1024. There is a huge gain here that accounts for 
the more or less homogeneous times of the second algorithm. Nevertheless it 
puzzles me why the slicing operation isn't linear. Anyone with a better 
python internal knowledge can throw some light?


-- 
http://www.neurosecurity.com

"We must be the change we wish to see in the world"
Mahatma Gandhi


More information about the Dailydave mailing list