python - Large list generation optimization


I needed a Python function that takes a list of strings of the form:

seq = ['a[0]','b[2:5]','a[4]'] 

and returns a new list of "expanded" elements with the order preserved, so:

expanded = ['a[0]', 'b[2]', 'b[3]', 'b[4]', 'b[5]', 'a[4]'] 

To achieve this, I wrote a simple function:

def expand_seq(seq):
    # ['item[i]' for item in seq for i in the xrange described by item]
    return ['%s[%s]'%(item.split('[')[0],i) for item in seq for i in xrange(int(item.split('[')[-1][:-1].split(':')[0]),int(item.split('[')[-1][:-1].split(':')[-1])+1)]

When dealing with sequences that generate fewer than 500k items it works well, but it slows down quite a bit when generating large lists (more than 1 million items). For example:

import time

# let's generate 10 million items!
seq = ['a[1:5000000]','b[1:5000000]']
t1 = time.clock()
seq = expand_seq(seq)
t2 = time.clock()
print round(t2-t1, 3)
# result: 9.541 seconds

I'm looking for ways to improve this function and speed it up when dealing with large lists. If anyone has suggestions, I'd love to hear them!
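The timing snippet above uses `time.clock`, which is Python 2 era code; that function was removed in Python 3.8. For readers on Python 3, the same kind of measurement can be sketched with `time.perf_counter`. The helper `time_call` and the stand-in workload `make_labels` are hypothetical names introduced only for this illustration, not part of the original question.

```python
import time

def time_call(func, *args):
    """Return (result, elapsed_seconds) for a single call to func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Stand-in workload (an assumption): builds labels shaped like the expanded output.
def make_labels(name, count):
    return [name + "[" + str(i) + "]" for i in range(1, count + 1)]

labels, seconds = time_call(make_labels, "a", 100000)
print(len(labels), round(seconds, 3))
```

A single run like this is noisy; repeating the call a few times and taking the minimum usually gives a steadier number.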

The following seems to give a 35% speedup:

import re

r = re.compile(r"(\w+)\[(\d+)(?::(\d+))?\]")

def expand_seq(seq):
    result = []
    for item in seq:
        m = r.match(item)
        name, start, end = m.group(1), int(m.group(2)), m.group(3)
        # the end index is inclusive, matching the expected output in the question
        rng = xrange(start, int(end) + 1) if end else (start,)
        t = name + "["
        result.extend(t + str(i) + "]" for i in rng)
    return result

With this code:

  • We compile the regular expression once, up front, for use in the function.
  • We concatenate our strings directly instead of using % formatting.
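To check that the two points above do not change the result, and to measure their effect on a given machine, here is a minimal Python 3 sketch. It assumes `range` in place of `xrange`, treats the end index as inclusive (as in the expected output in the question), and uses the hypothetical names `ITEM_RE`, `expand_split` and `expand_regex`. The `ValueError` on malformed items is an addition for the sketch, not part of the original answer.

```python
import re
import timeit

# Compiled once at import time and reused for every item.
ITEM_RE = re.compile(r"(\w+)\[(\d+)(?::(\d+))?\]")

def expand_split(seq):
    """Split-based parsing with % formatting, in the spirit of the question."""
    result = []
    for item in seq:
        name = item.split('[')[0]
        bounds = item.split('[')[-1][:-1].split(':')
        for i in range(int(bounds[0]), int(bounds[-1]) + 1):
            result.append('%s[%s]' % (name, i))
    return result

def expand_regex(seq):
    """Precompiled regex with direct string concatenation."""
    result = []
    for item in seq:
        m = ITEM_RE.match(item)
        if m is None:
            raise ValueError("unrecognised item: %r" % (item,))
        name, start, end = m.group(1), int(m.group(2)), m.group(3)
        stop = int(end) if end else start  # a single index expands to itself
        prefix = name + "["
        result.extend(prefix + str(i) + "]" for i in range(start, stop + 1))
    return result

sample = ['a[0]', 'b[2:5]', 'a[4]']
assert expand_split(sample) == expand_regex(sample)

# Time both versions on the same input; absolute numbers depend on the machine.
big = ['a[1:50000]', 'b[1:50000]']
for func in (expand_split, expand_regex):
    seconds = timeit.timeit(lambda: func(big), number=3)
    print(func.__name__, round(seconds, 3))
```

The speedup will differ between interpreter versions, so the 35% figure is best treated as specific to the original setup.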
