update studio_util neon #383
base: master
@@ -143,8 +143,10 @@ void CrossProduct( const float *v1, const float *v2, float *cross )
memcpy(&v1_reg, v1, sizeof(float) * 3);
memcpy(&v2_reg, v2, sizeof(float) * 3);

float32x4_t yzxy_a = vextq_f32(vextq_f32(v1_reg, v1_reg, 3), v1_reg, 2); // [aj, ak, ai, aj]
float32x4_t yzxy_b = vextq_f32(vextq_f32(v2_reg, v2_reg, 3), v2_reg, 2); // [bj, bk, bi, bj]
float32x2_t xy_a = vget_low_f32(v1_reg);
I can't check this code without actually running it, but did you at least test whether it compiles? Last time it spewed errors about invalid data types on Android and Switch.
It should compile with MSVC and Clang. Shuffling with qwords is faster, so I changed that here.
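One way to check a shuffle-based rewrite like this on any compiler is to compare it against a plain scalar reference. The sketch below is only an illustration and is not code from this PR: `CrossProductRef` and `NearlyEqual3` are hypothetical names, and it uses the same "yzx" rotation idea in portable C++ with no NEON intrinsics.

```cpp
#include <cmath>

// Hypothetical scalar reference (not part of the PR): computes
// cross = a.yzx * b.zxy - a.zxy * b.yzx, the rotated-lane form of the cross product.
void CrossProductRef(const float *v1, const float *v2, float *cross)
{
    const float a_yzx[3] = { v1[1], v1[2], v1[0] };
    const float b_yzx[3] = { v2[1], v2[2], v2[0] };
    const float a_zxy[3] = { v1[2], v1[0], v1[1] };
    const float b_zxy[3] = { v2[2], v2[0], v2[1] };

    for (int i = 0; i < 3; ++i)
        cross[i] = a_yzx[i] * b_zxy[i] - a_zxy[i] * b_yzx[i];
}

// Hypothetical helper: compares two 3-vectors within an absolute tolerance.
bool NearlyEqual3(const float *a, const float *b, float eps)
{
    for (int i = 0; i < 3; ++i)
        if (std::fabs(a[i] - b[i]) > eps)
            return false;
    return true;
}
```

Running the SIMD `CrossProduct` and this reference on the same inputs and comparing with `NearlyEqual3` would catch lane-order mistakes on each target toolchain.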
@@ -349,13 +348,28 @@ void CStudioModelRenderer::StudioSlerpBones( vec4_t q1[], float pos1[][3], vec4_

s1 = 1.0f - s;

switch (m_pStudioHeader->numbones % 4)
What if we have more than 4 bones, but the count is not divisible by 4?
For example, 6? With this code, only the first 4 bones will be interpolated.
Also, the previous set of patches turned out to be glitchy on an AArch64 computer running Linux with GCC 10. Could you check it?
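To make the concern concrete, here is a minimal sketch of the usual pattern for a bone count that is not a multiple of 4: process full blocks of four, then finish the remainder in a scalar tail loop. This is an assumption about how the loop could be structured, not the PR's code. `BlendBone` and `BlendAllBones` are hypothetical names, and a plain linear blend of positions stands in for the real per-bone work.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the per-bone work (linear blend of one position).
static void BlendBone(float *dst, const float *src, float s)
{
    for (int k = 0; k < 3; ++k)
        dst[k] = dst[k] * (1.0f - s) + src[k] * s;
}

// Blends pos2 into pos1 for every bone, regardless of numbones % 4.
void BlendAllBones(float pos1[][3], float pos2[][3], std::size_t numbones, float s)
{
    assert(s >= 0.0f && s <= 1.0f);

    // Largest multiple of 4 that does not exceed numbones.
    const std::size_t blocked = numbones - (numbones % 4);
    std::size_t i = 0;

    // Main loop: a SIMD version would handle these four bones together.
    for (; i < blocked; i += 4) {
        BlendBone(pos1[i + 0], pos2[i + 0], s);
        BlendBone(pos1[i + 1], pos2[i + 1], s);
        BlendBone(pos1[i + 2], pos2[i + 2], s);
        BlendBone(pos1[i + 3], pos2[i + 3], s);
    }

    // Tail loop: the remaining 0 to 3 bones, e.g. bones 4 and 5 when numbones is 6.
    for (; i < numbones; ++i)
        BlendBone(pos1[i], pos2[i], s);
}
```

With 6 bones, the main loop covers bones 0 to 3 and the tail loop covers bones 4 and 5, so none are skipped.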
No description provided.