Timeranges overlapping algorithm in Python -
i have list of different ids, start dates , end dates, let's :
[ (5, d.datetime(2010, 9, 19, 0, 0, 0), d.datetime(2010, 9, 19, 0, 5, 10)), (6, d.datetime(2010, 9, 19, 0, 0, 0), d.datetime(2010, 9, 19, 12, 59, 59)), (4, d.datetime(2010, 9, 19, 10, 30, 17), d.datetime(2010, 9, 19, 20, 20, 59)), (6, d.datetime(2010, 9, 19, 14, 12, 0), d.datetime(2010, 9, 19, 23, 59, 59)), (5, d.datetime(2010, 9, 19, 17, 0, 22), d.datetime(2010, 9, 19, 19, 14, 20)) ]
i need somehow find overlapping timerange , prepare new list ids under coverage @ specific timerange, example list above result should :
[ ('5,6', d.datetime(2010, 9, 19, 0, 0, 0), d.datetime(2010, 9, 19, 0, 5, 10), ('6', d.datetime(2010, 9, 19, 0, 5, 10), d.datetime(2010, 9, 19, 10, 30, 17), ('4,6', d.datetime(2010, 9, 19, 10, 30, 17), d.datetime(2010, 9, 19, 12, 59, 59), ('4', d.datetime(2010, 9, 19, 12, 59, 59), d.datetime(2010, 9, 19, 14, 12, 0), ('4,6', d.datetime(2010, 9, 19, 14, 12, 0), d.datetime(2010, 9, 19, 17, 0, 22), ('4,5,6', d.datetime(2010, 9, 19, 17, 0, 22), d.datetime(2010, 9, 19, 19, 14, 20), ('4,6', d.datetime(2010, 9, 19, 19, 14, 20), d.datetime(2010, 9, 19, 20, 20, 59), ('6', d.datetime(2010, 9, 19, 20, 20, 59), d.datetime(2010, 9, 19, 23, 59, 59) ]
visual concept:
actually i've solution this: i'm getting minimal , maximum dates of whole range, start iterate min_date max_date each 1 second, when in particular second match of intervals target list, save matched ids dictionary key , append time iterator list value, save parent list, next , next. @ final go on dicts in parent list , ids keys , first, last date in value list range need find. solution works slow when count ranges in month. because it's takes time iterate 1 month in seconds.
here code:
def delta(start, end, delta): cur = start while cur < end: yield cur cur += delta final_ranges = [] last_result = none = -1 checker_date in delta( sorted_ranges_by_start[0]['start'], sorted_ranges_by_end[-1]['end'], relativedelta(seconds=1)): aggregator = [] rng in ranges: if rng['start'] <= checker_date <= rng['end']: aggregator.append(str(rng['id'])) if len(aggregator) > 0: ids = ','.join(set(aggregator)) if last_result != ids: final_ranges.append({}) last_result = ids += 1 if ids not in final_ranges[i]: final_ranges[i][ids] = [] final_ranges[i][ids].append(checker_date)
but said it's working slow in big ranges.
in way please me find algorithm can without iteration through month or maybe advice way improve iteration speed ( not sure, maybe try write part on c , embed python )
thanks.
i've make work code below.
basic explanation first detect cut points between provided periods, is, everytime period starts on ends. second, iterate between cutpoints only, not periods, , check if overlap see if active between cut points. accumulate active periods.
process time depends on number of cutpoints , periods, , not on elapsed time.
from datetime import datetime sortedcontainers import sortedset periods = [ (5, datetime(2010, 9, 19, 0, 0, 0), datetime(2010, 9, 19, 0, 5, 10)), (6, datetime(2010, 9, 19, 0, 0, 0), datetime(2010, 9, 19, 12, 59, 59)), (4, datetime(2010, 9, 19, 10, 30, 17), datetime(2010, 9, 19, 20, 20, 59)), (6, datetime(2010, 9, 19, 14, 12, 0), datetime(2010, 9, 19, 23, 59, 59)), (5, datetime(2010, 9, 19, 17, 0, 22), datetime(2010, 9, 19, 19, 14, 20)) ] cutpoints = sortedset() period in periods: cutpoints.add(period[1]) cutpoints.add(period[2]) ranges = [] start_cutpoint = none end_cutpoint in cutpoints: if not start_cutpoint: # skip first start_cutpoint = end_cutpoint continue cut_point_active_periods = [] period in periods: # check if period , cutpoint range overlap start_overlap = max(start_cutpoint, period[1]) end_overlap = min(end_cutpoint, period[2]) if start_overlap < end_overlap: cut_point_active_periods.append(period[0]) ranges.append((cut_point_active_periods, start_cutpoint, end_cutpoint)) start_cutpoint = end_cutpoint
Comments
Post a Comment