Practice exercise: Ranges
10 of May, 2017
Today I’ve got an exercise to complete in a language on my choice. I did it in Scala and there was some time constraint, so solution is not perfect, but interesting, I think.
Goal
Given a number of half open ranges [x, y)
, when provided with number z
return number of ranges it is member of.
There are up to million ranges and up to billions requests.
Sample ranges implementation
An example was given in other language, but equivalent to this:
Is a naive implementation with linear performance O(N) of lookup. It iterates through all ranges at each request. Moreover, the best case scenario = worst case scenario O(N).
Optimisation considerations
The structure is immutable after build and receives orders of magnitude more lookups than it’s size. It would make optimise for quicker lookups by using more efficient data structure with index.
Prototype of optimised ranges
The idea is to plot all range bound points into sorted tree, where each point would have indexed count at it’s point and after it until next point.
The performance or lookup on sorted tree is O(Log(N)).
Practically on my laptop it handled 1 million ranges sustaining 20 million of lookups per second.
The prototype could be generalised further:
- to work with other kind of data, not just count (generalise data and indexing mechanism);
- to work with other types of keys.
Full code with built and run instructions are in my github account.
What do you think, did I do well?
comments powered by Disqus