[Dailydave] A small fun Python puzzle

Daryl Tester dt-dailydave at handcraftedcomputers.com.au
Mon Mar 31 16:34:49 EDT 2008


Dave Aitel wrote:

> This is part of our smb file putter. With small files it works great. 
> With larger files, it uses 100% of the CPU and takes forever. Can anyone 
> spot why? (Answer forthcoming, of course)

Depending on the value of "large", I suspect slicing and garbage
collection become expensive operations. Given -

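import time
def test(l):
  # Build an l-byte string, then consume it 1024 bytes at a time.
  data = ' ' * l
  t = time.time()
  while data != "":
    # Each slice copies the whole remainder into a new string.
    data = data[1024:]
  return time.time() - t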

results in:

>>> for l in [100000, 1000000, 5000000, 10000000]:
...   print '%10d %f' % (l, test(l))
... 
    100000 0.006711
   1000000 0.764886
   5000000 28.554786
  10000000 111.738498

(wow - so not linear: a 10x longer string takes well over 100x
as long ...)
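
A rough way to see where that growth comes from, as a sketch rather than
a measurement: assuming (as in CPython) that slicing an immutable string
copies the bytes of the slice, each pass of data = data[1024:] copies
everything that is left. The helper name bytes_copied below is made up
for this illustration; it only does the arithmetic and never touches a
real string.

```python
def bytes_copied(length, chunk=1024):
    """Count bytes copied if every data = data[chunk:] copies the remainder.

    Assumption: a slice of an immutable string copies its bytes.
    """
    copied = 0
    remaining = length
    while remaining > 0:
        # The new string holds everything after the first chunk.
        remaining = max(remaining - chunk, 0)
        copied += remaining
    return copied


small = bytes_copied(1000000)
large = bytes_copied(10000000)
ratio = float(large) / small  # growth in copying for a 10x longer string
```

Under that assumption the copying alone grows by a factor of about 100
for a 10x longer string, which is the same ballpark as the timings above.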

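A version that steps an index through the string and slices out only
one 1024-byte chunk per pass appears a lot faster (which, admittedly,
suggests the cost is in copying the remainder rather than in slicing
as such) -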

def test2(l):
  data = ' ' * l
  i = 0
  t = time.time()
  while i < l:
    # Copy out only the current 1024-byte chunk; data itself is untouched.
    data2 = data[i:i+1024]
    i += 1024
  return time.time() - t

>>> for l in [100000, 1000000, 5000000, 10000000]:
...   print '%10d %f' % (l, test2(l))
... 
    100000 0.000320
   1000000 0.003319
   5000000 0.012145
  10000000 0.021329
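
The same indexing idea can be wrapped in a small generator so that a
sender loop never rebuilds the remaining data. This is only a sketch:
iter_chunks is a hypothetical helper, not anything from the actual smb
file putter, whose code isn't shown here.

```python
def iter_chunks(data, size=1024):
    """Yield successive size-byte slices of data without shrinking data."""
    for start in range(0, len(data), size):
        # Only the current chunk is copied; the source string is left alone.
        yield data[start:start + size]


payload = 'x' * 2500
chunk_lengths = [len(chunk) for chunk in iter_chunks(payload)]
rebuilt = ''.join(iter_chunks(payload))
```

Each pass copies at most size bytes, so the total work should stay
proportional to the length of the data.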


-- 
Regards,
  Daryl Tester

"We are sexy, sexy Von Neumann machines."  -- http://www.xkcd.org/387/

