Multithreaded Boids is a Unity Project exploring steering behaviours with data-oriented design. The project simulates 10,000 boids while maintaining a 16.6ms frame time budget.
Due to browser limitations, this WebGL viewer is single-threaded and limited to 2k boids.
Download the Windows version for multithreaded performance with 10k boids.
Tools: Unity 6, Unity Jobs, Burst, Profiler and Profile Analyzer
Timeline: 80 Hours (Jan 5 – Jan 23)
This project instantiates 10,000 boids steering as a flock contained within an invisible sphere. The goal was to keep the frame time lower than 16.6ms by adopting data-oriented design.
I implemented this project using a GameObject workflow to show that, even without the high-performance ECS paradigm, we can achieve great performance.
Obstacle avoidance contains the boids within a spherical volume. This steering behaviour requires boids to check intersections between themselves and an obstacle.
Since the raycast methods from Unity's built-in physics system are incompatible with Unity Jobs and the Burst compiler, I implemented the algebraic solution for ray-sphere intersection, which is thread-safe and Burst-friendly.
Each boid uses 5 directional probes to detect boundaries and calculates perpendicular steering forces.
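The algebraic ray-sphere test can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the project's Burst-compiled C# code. The function name `ray_sphere_exit_distance`, the tuple-based vectors, and the `probe_length` value in the usage lines are all assumptions made for the example. It assumes a unit-length direction and returns the far root of the quadratic, which is the relevant one when the probe starts inside the containing sphere.

```python
import math

def ray_sphere_exit_distance(origin, direction, center, radius):
    """Distance along a unit-length ray to the far intersection with a sphere.

    Returns None when the ray's line misses the sphere or the sphere lies
    entirely behind the ray origin. All names here are illustrative.
    """
    # Vector from the sphere center to the ray origin.
    ox, oy, oz = (origin[i] - center[i] for i in range(3))
    dx, dy, dz = direction

    # Substituting p = origin + t * direction into |p - center|^2 = radius^2
    # gives t^2 + 2*b*t + c = 0 when the direction has unit length.
    b = ox * dx + oy * dy + oz * dz
    c = ox * ox + oy * oy + oz * oz - radius * radius

    discriminant = b * b - c
    if discriminant < 0.0:
        return None  # no real roots: the line never touches the sphere

    # Far root: for an origin inside the sphere this is the exit point.
    t_far = -b + math.sqrt(discriminant)
    return t_far if t_far >= 0.0 else None


# Hypothetical usage: a probe "hits" the boundary when the exit point
# lies within the probe's reach.
probe_length = 6.0
distance = ray_sphere_exit_distance((5.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                                    (0.0, 0.0, 0.0), 10.0)
probe_hits_boundary = distance is not None and distance <= probe_length
```

Because the test is plain arithmetic on a handful of floats, with no engine calls or shared state, the same logic can run on any worker thread.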
A naïve proximity lookup costs O(n²), because each boid needs to check every other boid to assess whether they are “neighbours”. For 10k boids, that's 100M checks.
By partitioning the space so that each boid only queries adjacent cells for other boids, I was able to reduce the O(n²) complexity to O(nk), where k is the number of boids found in the queried cells.
I implemented a spatial hash grid using the NativeMultiHashMap data type, because it is a one-to-many associative array and a blittable type compatible with Unity Jobs and the Burst compiler.
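The idea behind the spatial hash grid can be sketched as follows. This is a minimal Python illustration in which a dictionary of lists stands in for the one-to-many map. It is not the project's C# code, and the names `cell_of`, `build_grid`, and `neighbours` are hypothetical. The sketch assumes the cell size is at least as large as the neighbour radius, so that checking the 27 surrounding cells is sufficient.

```python
from collections import defaultdict
from itertools import product
import math

def cell_of(position, cell_size):
    """Integer cell coordinates of a 3D position."""
    return tuple(math.floor(coord / cell_size) for coord in position)

def build_grid(positions, cell_size):
    """Map each occupied cell to the indices of the boids inside it."""
    grid = defaultdict(list)
    for index, position in enumerate(positions):
        grid[cell_of(position, cell_size)].append(index)
    return grid

def neighbours(index, positions, grid, cell_size, radius):
    """Indices of boids within `radius` of boid `index`.

    Only the boid's own cell and the 26 adjacent cells are searched,
    which is valid as long as cell_size >= radius (an assumption here).
    """
    px, py, pz = positions[index]
    cx, cy, cz = cell_of(positions[index], cell_size)
    found = []
    for dx, dy, dz in product((-1, 0, 1), repeat=3):
        # .get avoids creating empty entries for unoccupied cells.
        for other in grid.get((cx + dx, cy + dy, cz + dz), ()):
            if other == index:
                continue
            ox, oy, oz = positions[other]
            squared = (ox - px) ** 2 + (oy - py) ** 2 + (oz - pz) ** 2
            if squared <= radius * radius:
                found.append(other)
    return found


# Hypothetical usage with four boids; the third is far from the rest.
positions = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (5.0, 5.0, 5.0), (-0.4, 0.0, 0.0)]
grid = build_grid(positions, cell_size=1.0)
near_first = sorted(neighbours(0, positions, grid, cell_size=1.0, radius=1.0))
```

Each query touches only the boids stored in nearby cells, so the work per boid depends on local density rather than on the total boid count.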
Single-threaded performance was the primary bottleneck for meeting the 16.6ms frame budget, which is why I chose to use Unity Jobs to write multithreaded code.
By distributing the logic across the worker threads, I significantly improved the frame time.
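How per-boid work might be split across workers can be sketched as follows. This is a Python illustration of the general pattern only: the boid range is cut into batches, every batch reads from one shared buffer, and each batch writes only to its own slots in a separate output buffer, so no locks are needed. The names `batch_ranges`, `step_batch`, and `step_parallel`, the batch size, and the read/write buffer split are assumptions rather than details of the project, and Python threads will not show the speed-up that Burst-compiled jobs do.

```python
from concurrent.futures import ThreadPoolExecutor

def batch_ranges(count, batch_size):
    """Split the index range [0, count) into consecutive (start, end) batches."""
    return [(start, min(start + batch_size, count))
            for start in range(0, count, batch_size)]

def step_batch(read_positions, velocities, write_positions, start, end, dt):
    """Advance one batch; it writes only to its own slots in the output."""
    for i in range(start, end):
        px, py, pz = read_positions[i]
        vx, vy, vz = velocities[i]
        write_positions[i] = (px + vx * dt, py + vy * dt, pz + vz * dt)

def step_parallel(positions, velocities, dt, batch_size=64, workers=4):
    """Run every batch on a thread pool and return the new position buffer."""
    next_positions = [None] * len(positions)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(step_batch, positions, velocities,
                        next_positions, start, end, dt)
            for start, end in batch_ranges(len(positions), batch_size)
        ]
        for future in futures:
            future.result()  # re-raises any exception from a worker
    return next_positions


# Hypothetical usage: two boids, one batch each.
moved = step_parallel([(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
                      [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)],
                      dt=0.5, batch_size=1)
```

Keeping the input buffer read-only during the step means the result does not depend on the order in which batches run.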
These performance metrics capture the 'PlayerLoop' marker via the Profile Analyzer across a 1,999-frame sample.
The unoptimized approach proved non-scalable. At 10k boids, the simulation became unresponsive, dropping below measurable framerates.
The spatial partitioning method significantly improved on the unoptimized script and rivaled multithreaded performance at smaller scales. However, its frame time increased drastically when simulating 10k boids.
To achieve high-performance simulations at scale, an approach combining spatial partitioning and multithreading is essential.
Profiled on an Intel Core i7-9750H CPU @ 2.60GHz.
This project demonstrates the critical impact of data layout on maximizing cache hits and instruction throughput. Leveraging data-oriented design, spatial partitioning, and multithreading was essential to meeting the performance thresholds required for large-scale simulation.
CppCon 2014: Mike Acton "Data-Oriented Design and C++" by Mike Acton
Intro to Data Oriented Design for Games by Nic Barker
Steering Behaviors For Autonomous Characters by Craig Reynolds
Spatial Hash Grids & Tales from Game Development by SimonDev
Ray-Sphere Intersection by scratchapixel