This work was sponsored by enioka.

The task one is:

Given a finite set of objects pool, and an object query. Look, which ever comes first, within approximately t time, or at most n objects subset of pool that are near query according to the metric m, call that subset candidates.

Most popular approach nowadays, and the one I could start to get under control is to rely on Hierarchical Navigable Small Worlds. HNSW is reminiscent of Ben Goertzel glocals from opencog fame.

HNSW datastructure use a deterministic heuristic with the distance function to construct clusters, still, it use random to spread pression on memory, and compute. It looks like a skip-list with multiple path ways. A regular natural sort will not speed queries, because of that glocals are not sorted, instead nodes are related to points in a higher level layer. The higher the level the less points they are. The highest level has only one point. Each point has a maximum x neighbor in a lower level. Horizontal navigation must be done through the parent point.

While I put that whole on a flashcard, you can delice with the following brute force algorithm that will generate a pool of points nearest to a point query:

(import (chezscheme))
     (import (letloop r999))
     (import (scheme generator))
     ;; brute force algorithm
     (define total (string->number (or (getenv "HYPER_TOTAL") "65536")))
     (define exponent (string->number (or (getenv "HYPER_EXPONENT") "8")))
     (define pk
       (lambda args
         (write args)
         (car (reverse args))))
     (define point-distance
       (lambda (a b) (abs (- a b))))
     (define point-random (lambda () (random (expt 2 exponent))))
     (define pool (map (lambda _ (point-random)) (iota total)))
     (define query (point-random))
     (define ~.two
       (lambda (x)
         (inexact (/ (round (* x 100)) 100))))
     (define candidates
       (let* ((objects (map (lambda (x) (cons (point-distance query x) x)) pool))
              (objects* (sort (lambda (u v) (< (car u) (car v))) objects))
              (distance-max (apply max (map car objects*))))
         (map (lambda (x) (cons (~.two (/ (- distance-max (car x)) distance-max))
                                (cdr x)))
     (pk 'query (string-upcase (number->string query 36)) query)
     (for-each pk (map (lambda (x) (list (car x) (number->string (cdr x) 36) (cdr x))) candidates))