Searching for Fixed-Radius Near Neighbors with Cell Lists Algorithm in Julia Language

Table of Contents
Introduction
This article defines the fixed-radius near neighbors search problem and why efficient algorithms are essential. We present two algorithms for solving the problem, Brute Force and Cell Lists. We explain both algorithms in-depth and analyze their performance. We accompanied this article with a Julia implementation of the Cell Lists algorithm, located in CellLists.jl repository. We can use the algorithm in a simulation or as a reference for implementing it in another language. Cell Lists is applicable in simulations with finite range interaction potentials where we are required to compute all close pairs in each time step, such as particle simulations. For long-range interaction potentials, we recommend reading about Barnes-Hut simulation and Fast multipole method.
Fixed-radius near neighbors is an old computer science problem closely related to sorting. We define the problem as finding all pairs of points within a certain distance apart, referred to as all close pairs. The author of 1 has surveyed different methods for solving the fixed-radius near neighbors problem. These methods include brute force, projection that is, dimension-wise sorting and querying using binary search, cell techniques, and k-d trees. The paper 2 analyzes the brute force, projection, and cell techniques. In the context of particle simulations, 3 discusses Verlet Lists, a method with similarities to Cell Lists. These methods have also been applied to large scale particle simulations using GPU by 4.
Similar to sorting, the computational complexities of the different methods in respect to the number of particles
Regarding analyzing algorithms, we recommend Introduction to Algorithms 5 textbook. Foundations, sorting, hash tables, and computational geometry sections are relevant to this article. The computational geometry section contains a chapter (33.4) about finding the closest pair of points in two dimensions, which is related to finding all close pairs.
Fixed-Radius Near Neighbors Problem
Let
Using the definition of near neighbors, we define all close pairs as
Because the norm is symmetric, that is
The next sections discuss how to implement algorithms for fixed-radius near neighbors search.
Brute Force Algorithm
The Brute Force is the simplest algorithm for computing the near neighbors. It serves as a benchmark and building block for the Cell Lists algorithm, and it is useful for testing the correctness of more complex algorithms.
The Brute Force algorithm directly follows the definition symmetric all close pairs. It iterates over the points pairs, checks the distance condition, and yields all pairs that satisfy the condition. The following Julia code demonstrates the Brute Force approach.
@inline function distance_condition(p1::Vector{T}, p2::Vector{T}, r::T) where T <: AbstractFloat
sum(@. (p1 - p2)^2) β€ r^2
end
function brute_force(p::Array{T, 2}, r::T) where T <: AbstractFloat
ps = Vector{Tuple{Int, Int}}()
n, d = size(p)
for i in 1:(n-1)
for j in (i+1):n
if @inbounds distance_condition(p[i, :], p[j, :], r)
push!(ps, (i, j))
end
end
end
return ps
end
n, d = 10, 2
r = 0.1
p = rand(n, d)
brute_force(p, r)
The Brute Force algorithm performs a total number of iterations:
Since Brute Force has quadratic complexity, it does not scale well for many points, motivating the search of more efficient algorithms.
Cell Lists Algorithm
Definition
The Cell Lists algorithm works by partitioning the space into a grid with cell size
The Julia code is available as a GitHub repository at CellLists.jl.
We implement the Cell Lists algorithm in Julia language. The algorithm has two parts. The first one constructs the Cell List, and the second one iterates over it to produce the symmetric all close pairs.
Constructor
The CellList
constructor maps each cell to the indices of points in that cell.
struct CellList{d}
indices::Dict{CartesianIndex{d}, Vector{Int}}
end
function CellList(p::Array{T, 2}, r::T, offset::Int=0) where T <: AbstractFloat
@assert r > 0
n, d = size(p)
cells = @. Int(fld(p, r))
data = Dict{CartesianIndex{d}, Vector{Int}}()
for j in 1:n
@inbounds cell = CartesianIndex(cells[j, :]...)
if haskey(data, cell)
@inbounds push!(data[cell], j + offset)
else
data[cell] = [j + offset]
end
end
CellList{d}(data)
end
Near Neighbors
We define the indices to the neighboring cells as follows:
@inline function neighbors(d::Int)
n = CartesianIndices((fill(-1:1, d)...,))
return n[1:fld(length(n), 2)]
end
We only include half of the neighboring cells to avoid double counting when iterating over the neighboring cells of all non-empty cells. We also define functions for brute-forcing over a set of points.
@inline function distance_condition(p1::Vector{T}, p2::Vector{T}, r::T) where T <: AbstractFloat
sum(@. (p1 - p2)^2) β€ r^2
end
@inline function brute_force!(ps::Vector{Tuple{Int, Int}}, is::Vector{Int}, p::Array{T, 2}, r::T) where T <: AbstractFloat
for (k, i) in enumerate(is[1:(end-1)])
for j in is[(k+1):end]
if @inbounds distance_condition(p[i, :], p[j, :], r)
push!(ps, (i, j))
end
end
end
end
@inline function brute_force!(ps::Vector{Tuple{Int, Int}}, is::Vector{Int}, js::Vector{Int}, p::Array{T, 2}, r::T) where T <: AbstractFloat
for i in is
for j in js
if @inbounds distance_condition(p[i, :], p[j, :], r)
push!(ps, (i, j))
end
end
end
end
Finding all cell list pairs works by iterating over all non-empty cells and finding the pairs within that cell using Brute Force. Iterating over its neighboring cells checks if the cell is non-empty and finds the pairs between the current and the non-empty neighboring cell using Brute Force.
function near_neighbors(c::CellList{d}, p::Array{T, 2}, r::T) where d where T <: AbstractFloat
ps = Vector{Tuple{Int, Int}}()
offsets = neighbors(d)
# Iterate over non-empty cells
for (cell, is) in c.data
# Pairs of points within the cell
brute_force!(ps, is, p, r)
# Pairs of points with non-empty neighboring cells
for offset in offsets
neigh_cell = cell + offset
if haskey(c.data, neigh_cell)
@inbounds js = c.data[neigh_cell]
brute_force!(ps, is, js, p, r)
end
end
end
return ps
end
Usage
We can use the Cell Lists by first constructing the CellList
and then query the symmetric all close pairs with near_neighbors
.
n, d = 10, 2
r = 0.1
p = rand(n, d)
c = CellList(p, r)
indices = near_neighbors(c, p, r)
We receive the same output as for brute force.
indices2 = brute_force(p, r)
@assert Set(Set.(indices)) == Set(Set.(indices2))
In the next section, we analyze the Cell Lists algorithmβs performance compared to Brute Force and theoretical limits and discuss the assumptions when Cell Lists is practical.
Analysis
Number of Iterations
We denote the number of non-empty cells using
Each non-empty cell has
where
Method 1: For each non-empty cell
Method 2: For each non-empty cell near_neighbors
code), which takes
We derive the dimensionality condition such that method 2 performs fewer iterations compared to method 1:
When true, finding neighboring cells grows linearly, necessary for the Cell Lists to be efficient compared to Brute Force. Note that higher-dimensional spaces require exponentially higher numbers of non-empty cells
Worst-Case Instances
If one cell contains all the points, we have
If all points are in unique cells and we do not satisfy the dimensionality condition, we have
Curse of Dimensionality
Cell Lists also suffers from the curse of dimensionality. As we increase the dimension, the number of possible cells grows exponentially. That is,
where
Practical Instances
In many practical instances, we can assume a limit to how many points can occupy a cell. For example, two and three-dimensional physical simulations particles have finite size and non-overlapping and satisfy the dimensionality condition.
For a problem instance that satisfies the dimensionality condition
With fixed dimension and maximum density, the number of iterations is linearly proportional to the number of points
Benchmarks
Method
We performed each benchmark instance by setting fixed parameter values for size
You can find the benchmarking code from the CellListsBenchmarks.jl GitHub repository and the scripts to run the benchmarks and plotting from cell-lists-benchmarks GitHub repository.
Cell Lists vs. Brute Force
The figures above show how Cell Lists compares to Brute Force as we increase dimension with a constant radius. As can be seen, Cell Lists performs better in low dimensions and beat Brute Force with a small number of points, but requires more points with a larger dimension.
Effects of Dimensionality
The figure above shows the relation between Cell Lists with increasing dimension and constant radius. As we can see, the increase in time scales roughly to
Conclusions
I first encountered the fixed-radius near neighbors problem while developing the Crowd Dynamics simulation, which is mechanistically similar to molecular simulation with finite distance interactions (local interactions). The algorithms presented in this article can be used for research purposes or building small scale simulations. Large scale simulations may require additional improvements such as parallelism, which we discuss more in Multithreading in Julia Language Applied to Cell Lists Algorithm.
Contribute
If you enjoyed or found benefit from this article, it would help me share it with other people who might be interested. If you have feedback, questions, or ideas related to the article, you can contact me via email. For more content, you can follow me on YouTube or join my newsletter. Creating content takes time and effort, so consider supporting me with a one-time donation.
References
Bentley, J. L. (1975). A Survey of techniques for fixed radius near neighbor searching, 37. β©οΈ
Mount, D. (2005). Geometric Basics and Fixed-Radius Near Neighbors, 1β8. β©οΈ
Yao, Z., Wang, J. S., Liu, G. R., & Cheng, M. (2004). Improved neighbor list algorithm in molecular simulations using cell decomposition and data sorting method. Computer Physics Communications, 161(1β2), 27β35. https://doi.org/10.1016/j.cpc.2004.04.004 β©οΈ
Hoetzlein, R. (2014). Fast Fixed-Radius Nearest Neighbors: Interactive Million-Particle Fluids. GPU Technology Conference. β©οΈ
Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms, Third Edition (3rd ed.). The MIT Press. β©οΈ