
# Handling lists

I have a question on Python lists. Suppose I have a 2D list

    list = [[10,11,12,13,14,78,79,80,81,300,301,308]]

How do I convert it so that the numbers are arranged into bins? If I have a set of consecutive numbers, I would like to represent them as a range in the list, with only the min and max values of the range. I should get something like

    list = [[10,14],[78,81],[300,308]]

Jul 19 '05 #1
5 Replies

su*******@gmail.com wrote:

> I have a question on Python lists. Suppose I have a 2D list
>
>     list = [[10,11,12,13,14,78,79,80,81,300,301,308]]
>
> How do I convert it so that the numbers are arranged into bins? If I have a set of consecutive numbers, I would like to represent them as a range in the list, with only the min and max values of the range. I should get something like
>
>     list = [[10,14],[78,81],[300,308]]

Maybe:

    list = [10,11,12,13,14,78,79,80,81,300,301,308]

    new_list = []
    start = 0  # index where the current run begins
    for i in range(1,len(list) + 1):
        # Close the run at the end of the list or when the step is not exactly 1
        if i == len(list) or list[i] - list[i-1] != 1:
            new_list.append([list[start],list[i-1]])
            start = i

    print(new_list)

Jul 19 '05 #2
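For anyone who wants the consecutive-run rule above as a reusable function, here is a minimal sketch using `itertools.groupby`. It assumes the input is already sorted and contains distinct integers; the name `consecutive_ranges` is made up for illustration and does not come from the thread.

```python
from itertools import groupby


def consecutive_ranges(values):
    """Collapse runs of consecutive integers into [min, max] pairs.

    Assumes `values` is sorted and holds distinct integers.
    """
    ranges = []
    # Inside a run of consecutive integers, value - index stays constant,
    # so that difference can serve as the grouping key.
    for _, run in groupby(enumerate(values), key=lambda pair: pair[1] - pair[0]):
        run_values = [value for _, value in run]
        ranges.append([run_values[0], run_values[-1]])
    return ranges


print(consecutive_ranges([10, 11, 12, 13, 14, 78, 79, 80, 81, 300, 301, 308]))
# [[10, 14], [78, 81], [300, 301], [308, 308]]
```

Note that under a strict "consecutive" rule, 308 ends up in a range of its own rather than in [300, 308].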

Yes, that makes sense. But the problem I am facing is that if the list is

    list = [300,301,303,305]

I want to consider it as one cluster and include the range as [300,305]; this is where I am missing the ranges. So if the list is

    l = [300,301,302,308,401,402,403,408]

I want to get [[300,308],[401,408]].

Jul 19 '05 #3

On Saturday 23 April 2005 12:50 pm, su*******@gmail.com wrote:

> I have a question on Python lists. Suppose I have a 2D list
>
>     list = [[10,11,12,13,14,78,79,80,81,300,301,308]]
>
> How do I convert it so that the numbers are arranged into bins? If I have a set of consecutive numbers, I would like to represent them as a range in the list, with only the min and max values of the range. I should get something like
>
>     list = [[10,14],[78,81],[300,308]]

Here is an interesting way:

    >>> a = iter([1,2,3,4])
    >>> [(b,next(a)) for b in a]
    [(1, 2), (3, 4)]

James

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095
http://www.jamesstroud.com/

Jul 19 '05 #4
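The pairing trick above works because the loop and the explicit "next" call pull from the same iterator, so every pass consumes two items. A common way to write the same idea, sketched here with a hypothetical helper name `pairs`, is to hand one iterator to `zip` twice:

```python
def pairs(items):
    """Group a flat sequence into consecutive, non-overlapping pairs."""
    it = iter(items)
    # Both arguments are the same iterator object, so each tuple that
    # zip builds consumes two successive items from it.
    return list(zip(it, it))


print(pairs([1, 2, 3, 4]))  # [(1, 2), (3, 4)]
```

With an odd number of items, `zip` stops at the shorter side, so the trailing unpaired item is silently dropped.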

su*******@gmail.com wrote:

> ....
>
>     list = [[10,11,12,13,14,78,79,80,81,300,301,308]]
>
> How do I convert it so that the numbers are arranged into bins? If I have a set of consecutive numbers, I would like to represent them as a range in the list, with only the min and max values of the range. I should get something like
>
>     list = [[10,14],[78,81],[300,308]]

Mage:

> Maybe:
>
>     list = [10,11,12,13,14,78,79,80,81,300,301,308]
>
>     new_list = []
>     start = 0
>     for i in range(1,len(list) + 1):
>         if i == len(list) or list[i] - list[i-1] != 1:
>             new_list.append([list[start],list[i-1]])
>             start = i
>
>     print(new_list)

su*******@gmail.com wrote:

> Yes, that makes sense. But the problem I am facing is that if the list is [300,301,303,305], I want to consider it as one cluster and include the range as [300,305]; this is where I am missing the ranges. So if the list is l = [300,301,302,308,401,402,403,408], I want to get [[300,308],[401,408]].

Mage's solution meets the requirement that you initially stated: treating *consecutive* numbers as a group. Now you also want to consider [300,301,303,305] as a cluster. You need to specify your desired clustering rule, or alternatively specify how many bins you want to create. As an example, here is a naive approach that could be adapted easily to other clustering rules and (a bit less easily) to target a certain number of bins:

    def lstcluster(lst):
        # Separate neighbors that differ by more than the mean difference
        lst.sort()  # note: sorts the caller's list in place
        diffs = [(b-a, (a, b)) for a, b in zip(lst,lst[1:])]
        mean_diff = sum(diff[0] for diff in diffs)/len(diffs)
        breaks = [breaks for diff, breaks in diffs if diff > mean_diff]
        # Flatten to [first, end1, start2, end2, ..., last]
        groups = [lst[0]] + [i for x in breaks for i in x] + [lst[-1]]
        igroups = iter(groups)
        # Pairing mechanism due to James Stroud
        return [[i, next(igroups)] for i in igroups]

Note that this is quite inefficient, because it creates several intermediate lists. But it's not worth optimizing yet, since I'm only guessing at your actual requirement.

    >>> lst0 = [10,11,12,13,14,78,79,80,81,300,301,308]
    >>> lst1 = [10,12,16,24,26,27,54,55,80,100, 105]
    >>> lst2 = [1,5,100,1000,1005,1009,10000, 10010,10019]
    >>> lstcluster(lst0)
    [[10, 14], [78, 81], [300, 308]]
    >>> lstcluster(lst1)
    [[10, 27], [54, 55], [80, 80], [100, 105]]
    >>> lstcluster(lst2)
    [[1, 1009], [10000, 10019]]

Michael

Jul 19 '05 #5
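As one example of a different clustering rule, here is a rough sketch that starts a new bin whenever the gap to the previous number exceeds a fixed threshold. The rule itself is only a guess at what the original poster wants, and the name `cluster_by_gap`, the parameter `max_gap`, and the value 10 are all assumptions chosen to reproduce the two examples given earlier in the thread.

```python
def cluster_by_gap(values, max_gap):
    """Return [min, max] pairs for runs whose neighbors differ by at most max_gap."""
    ordered = sorted(values)  # sorted copy; the caller's list is left untouched
    if not ordered:
        return []
    clusters = [[ordered[0], ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][1] <= max_gap:
            clusters[-1][1] = value          # close enough: extend the current cluster
        else:
            clusters.append([value, value])  # gap too large: start a new cluster
    return clusters


# max_gap=10 is an arbitrary threshold picked to fit the examples in this thread
print(cluster_by_gap([300, 301, 303, 305], max_gap=10))
# [[300, 305]]
print(cluster_by_gap([300, 301, 302, 308, 401, 402, 403, 408], max_gap=10))
# [[300, 308], [401, 408]]
```

Setting `max_gap=1` falls back to the strict consecutive rule, while a larger threshold merges more neighbors; unlike the mean-difference approach, the threshold has to be chosen by hand.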

That helps. Thanks much.

Jul 19 '05 #6
