I’m working on a web tower defense game based on three.js, and I’m stuck on optimizing its performance.
My game loads two GLTF models, one for the enemy and one for the tower, and both are skinned meshes. When the player creates a tower or the game spawns an enemy, I clone the loaded model with THREE.AnimationUtils.clone and add the clone to the scene. For animation, I use THREE.AnimationObjectGroup to animate all the enemies.
In a performance test with 45 towers and 70 enemies in the scene, this results in an average of 370 draw calls per frame, which is a nightmare for the game.
I think instancing might help, because every tower and every enemy shares the same model and the same state in each frame; only rotation and position differ. But among the instancing examples I studied, none uses instancing with a skinned mesh. (There is a discussion here, but its conclusion doesn't mention any method that uses instancing.)
Is there any chance that this can be done with three.js, or is there some other solution for this situation?
Update
After more research, I found some concepts that may help me implement instancing with a skinned mesh.
Concept
The original post here implements a skinned mesh with instancing in Unity. (It's written in Chinese; I have translated the main concept below.)
After a skinned mesh is loaded, it has an initial state containing all of its vertices (for clarity, each initial vertex is denoted PLT below). In any frame of the animation, the final position of PLT (denoted PI) is given by a series of matrix multiplications: PI = (M_rootlocal * ... * M_2_3 * M_1_2 * M_bind_1 * PLT) + (M_rootlocal * ... * M_2_3 * M_1_2 * M_bind_2 * PLT) + (...)
M_bind_1 is the bone-binding matrix of bone 1. M_m_n is the transformation of bone m relative to its initial state, expressed in the coordinate system of bone n.
For simplicity, let M_f_i = M_rootlocal * ... * M_2_3 * M_1_2 * M_bind_i. M_f_i is the bone-binding matrix of bone i after multiplication at frame f, so PI = (M_f_1 * PLT) + (M_f_2 * PLT) + (...). Once we know M_f_i, we can calculate the position of every vertex in frame f.
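To check my understanding of the formula, here is a minimal CPU sketch of the per-vertex computation. One thing I am adding that the translation above leaves out: in standard linear blend skinning each term is scaled by a per-bone weight, so the sketch computes PI = w_1 * (M_f_1 * PLT) + w_2 * (M_f_2 * PLT) + ... with weights that sum to 1. The function names (transformPoint, skinVertex) are hypothetical, and matrices are assumed to be 16 numbers in column-major order.

```javascript
// Multiply a column-major 4x4 matrix (16 numbers) by the point [x, y, z, 1].
function transformPoint(m, p) {
  const [x, y, z] = p;
  return [
    m[0] * x + m[4] * y + m[8] * z + m[12],
    m[1] * x + m[5] * y + m[9] * z + m[13],
    m[2] * x + m[6] * y + m[10] * z + m[14],
  ];
}

// PI = sum over influencing bones of w_i * (M_f_i * PLT).
// frameMatrices: one 16-element M_f_i per bone, all for the same frame f.
// boneIndices / boneWeights: bones affecting this vertex and their weights
// (assumption: the weights sum to 1).
function skinVertex(plt, frameMatrices, boneIndices, boneWeights) {
  const out = [0, 0, 0];
  for (let k = 0; k < boneIndices.length; k++) {
    const moved = transformPoint(frameMatrices[boneIndices[k]], plt);
    for (let a = 0; a < 3; a++) out[a] += boneWeights[k] * moved[a];
  }
  return out;
}

// Usage: bone 0 stays in place, bone 1 translates by +2 along x.
const identityBone = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1];
const shiftedBone = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 2, 0, 0, 1];
const skinned = skinVertex([1, 0, 0], [identityBone, shiftedBone], [0, 1], [0.5, 0.5]);
console.log(skinned); // [2, 0, 0]
```

The vertex ends up halfway between the two bone results, which is the blending I would expect a vertex shader to reproduce for each vertex.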
This process can be done on the GPU by passing M_f_i packed into a texture. (Assuming the skinned mesh has around 10 animations and a small number of bones, the required memory is about 0.75Mb.) Finally, we can pass a different frame number f to each instance to render the animated skinned meshes in one draw call.
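A rough sketch of how the M_f_i matrices might be packed for a texture. The layout is my own assumption, not something taken from the Unity post: one RGBA float texel per matrix column (4 texels per bone), one texture row per frame. The resulting Float32Array is the kind of data that could be handed to THREE.DataTexture with THREE.RGBAFormat and THREE.FloatType. The helper names (packBoneMatrices, texelOf, textureBytes) are hypothetical.

```javascript
// frames[f][i] is a 16-element column-major M_f_i.
// Assumed layout: row = frame, 4 consecutive RGBA texels = one bone matrix.
function packBoneMatrices(frames) {
  const frameCount = frames.length;
  const boneCount = frames[0].length;
  const width = boneCount * 4; // texels per row
  const data = new Float32Array(width * frameCount * 4);
  for (let f = 0; f < frameCount; f++) {
    for (let i = 0; i < boneCount; i++) {
      data.set(frames[f][i], (f * width + i * 4) * 4);
    }
  }
  return { data, width, height: frameCount };
}

// Texel coordinate of column c of M_f_i; a vertex shader would do the
// same lookup with its per-instance frame number f.
function texelOf(f, i, c) {
  return { x: i * 4 + c, y: f };
}

// Memory needed for 32-bit float storage, in bytes.
function textureBytes(frameCount, boneCount) {
  return frameCount * boneCount * 16 * 4;
}

// Usage: 2 frames x 2 bones, each matrix filled with the marker f * 10 + i.
const sampleFrames = [0, 1].map((f) =>
  [0, 1].map((i) => new Array(16).fill(f * 10 + i))
);
const packed = packBoneMatrices(sampleFrames);
const t = texelOf(1, 1, 0);
console.log(packed.data[(t.y * packed.width + t.x) * 4]); // 11
console.log(textureBytes(300, 30)); // 576000
```

With these made-up numbers (300 frames in total, 30 bones) the texture stays well under 1 MB, so the size quoted in the Unity post seems plausible. As far as I know, three.js can already upload bone matrices through a float texture when the device supports it, which suggests that reading matrices from a texture in a vertex shader is feasible on WebGL.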
Implement with three.js
I haven't built a working example yet because I don't know whether the concept works on WebGL (I'm also not familiar with GLSL), but I think it could be implemented with three.js as follows.
- Follow here to get M_f_i.
- Use THREE.InstancedBufferGeometry and THREE.RawShaderMaterial.
  - In uniforms pass initial geometry, M_f_i and texture.
  - In vertexShader process PI = (M_f_1 * PLT) + (M_f_2 * PLT) + (...).
  - In fragmentShader process texture (I have no idea how to do it).
- Pass f and other instance attribute using THREE.InstancedBufferAttribute.
Problem
- Where is M_f and how to get it by THREE.AnimationClip in step 1?
- How to index each PLT (vertex in geometry)?
- How to deal with texture?
- How to deal with hierarchy Object3D (Object3D.children have THREE.Mesh and THREE.SkinnedMesh at the same time)?
I need someone to tell me whether this idea works in three.js, and how to solve the problems above.
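For the first problem, my current guess is that M_f_i could be sampled by stepping the animation and copying skeleton.boneMatrices after each skeleton.update(), as sketched below. This is untested in a real scene and rests on several assumptions: a three.js version that provides AnimationMixer.setTime, an action for the clip that is already playing on the mixer, and a model root left at the identity transform. The function name bakeClip is hypothetical, and the objects in the usage part are stand-ins so that the snippet runs without three.js.

```javascript
// mixer: THREE.AnimationMixer with the clip's action already playing.
// root: the model root Object3D; skeleton: the SkinnedMesh's THREE.Skeleton.
// Returns one Float32Array per frame holding all bone matrices (16 floats each).
function bakeClip(mixer, root, skeleton, duration, fps) {
  const frameCount = Math.max(1, Math.round(duration * fps));
  const frames = [];
  for (let f = 0; f < frameCount; f++) {
    mixer.setTime(f / fps); // pose the bones at this time
    root.updateMatrixWorld(true); // refresh the bones' world matrices
    skeleton.update(); // recompute skeleton.boneMatrices
    // Copy, because the skeleton reuses the same array on every update.
    frames.push(Float32Array.from(skeleton.boneMatrices));
  }
  return frames;
}

// Stand-in objects (not three.js) that only mimic the calls used above.
const fakeMixer = { time: 0, setTime(t) { this.time = t; } };
const fakeRoot = { updateMatrixWorld() {} };
const fakeSkeleton = {
  boneMatrices: new Float32Array(16),
  update() { this.boneMatrices.fill(fakeMixer.time); },
};
const baked = bakeClip(fakeMixer, fakeRoot, fakeSkeleton, 1, 4);
console.log(baked.length, baked[2][0]); // 4 0.5
```

In real code the skeleton would come from the SkinnedMesh found by traversing the loaded model, and the frames from every clip would be concatenated into one texture.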

