Set LU matrices to zero when jacobian is a zero element #666
@K20shores Thanks for fixing this bug. It will make me sleep much better for the weekend.
Just a few minor comments but otherwise the PR looks good to me. Great job!
@@ -204,6 +204,64 @@ void testRandomMatrix(std::size_t number_of_blocks)
A, b, x, [&](const FloatingPointType a, const FloatingPointType b) -> void { EXPECT_NEAR(a, b, 1.0e-5); });
can we use a smaller tolerance here now?
Looks good! I only have a minor change request. I think the tests this PR brings in are very valuable.
a230323 to 11bcedc
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #666      +/-   ##
==========================================
+ Coverage   92.69%   93.33%   +0.64%
==========================================
  Files          53       53
  Lines        3585     3603      +18
==========================================
+ Hits         3323     3363      +40
+ Misses        262      240      -22
include/micm/util/vector_matrix.hpp
Outdated
void print() const {
  for (std::size_t i = 0; i < x_dim_; ++i)
  {
    for (std::size_t j = 0; j < y_dim_; ++j)
    {
      std::cout << (*this)[i][j] << " ";
    }
    std::cout << std::endl;
  }
}
It doesn't look like this is used anywhere; could you confirm? Since we don't run clang-tidy automatically, could you name the function in CamelCase?
It was useful when I was debugging. I'd like to keep it.
Yeah, that's fine with me. Could you rename it in CamelCase, though, for consistency with our code base?
Thank you for addressing the comments
* Add CUDA Rosenbrock tests (#579)
  * add sync functions to state variable add cuda rosenbrock tests
  * fix all the compilation errors analytical tests do not work for CUDA rosenbrock
  * fix call to the base class function; bug fix for CuLudecompose and add singularity check
  * fix the compilation error for CUDA decomposition class
  * remove unnecessary calls to the base class functions
  * fix all the compilation errors
  * add crtp to allow calls to function from either base or derived class
  * fix more compilation errors about abstract rosenbrock solver now the cuda test passes for Troe case
  * add lambda functions as arguments for CPU/JIT/CUDA tests
  * initialize Yerror on the GPU every time and pass all the analytical tests
  * turn off the cuda memory check for the integration tests
  * revert back to the original process class
  * clean up unused header
  * update JIT test interface
  * extend state class to cudastate class
  * remove unnecessary cuda device sync
  * add cuda state class and address compilation errors
  * fix broken CI tests
  * more bug fix for CI tests
  * fix the compiler warning for cuda code
  * more fix for broken CI tests
  * resolve the cuda compiler warnings
  * address Matt's PR comments
  Co-authored-by: Jian Sun
* Auto-format code changes (#586)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Use Fill to reset the L and U matrices in Rosenbrock solve (#588)
  use Fill in Rosenbrock solve
* In-place linear solve (#585)
  * removing condensing x and b in nonvectorizable matrix code for linear solve
  * adding alias back
  * adding back comment
  * spacing
  * adding back comment
  * moving comment
  * vectorize version no longer segfaults but something is wrong
  * vectorized passes
  * removing b from jit linear solver
  * removing b from cuda linear solver
  * usin function pointer alias
  * adding a comment
  * fix conflict resolve typo
  Co-authored-by: Jian Sun
* Auto-format code changes (#589)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* 498 mimic camchem substep convergence failure integration acceptance (#582)
  * trying to continue on with current solution
  * mimicing camchem
  * testing backward euler against hires, e5
  * updating citations
  * oregonator is too stiff for backward euler
  * addressing PR comments
  * collecting solver stats
  * Update include/micm/solver/backward_euler.inl (Co-authored-by: Matt Dawson)
  * removing backward euler for oregonator test
  * removing cerr in favor of a solver state
  Co-authored-by: Matt Dawson
* Auto-format code changes (#590)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* 304 reorganize include folder (#591)
  * reorganizing files
  * correcting cuda imports
* Auto-format code changes (#592)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* 577 test all parameter types of the dense matrix cpu rosenbrock on the analytical policy tests (#593)
  Converts HIRES, Oregonator, E5 to chemical equations so that they can be tested on the GPU
  All analytical tests are tested with CPU and GPU rosenbrock. Backward euler as well (except oregonator).
  Renaming to match naming schemes for test files
* Auto-format code changes (#597)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Fix GPU memory leak for the CUDA unit tests (#600)
  * fix most GPU memory leak
  * allocate a device pointer in the device struct
  * remove unused cuda mem copy
  * use swap in the move constructor and assignment of CUDA class initialize the null pointer in the struct definition pass the cuda memory check for all the unit tests
  * remove unnecessary nullptr
  * fix the broken CI tests
  * more bug fixes
  Co-authored-by: Jian Sun
* Auto-format code changes (#601)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Backware Euler with vectorizable matrix types (#596)
  * starting to test all solver parameter types
  * saving progress
  * saving progress
  * testing all stages analytically
  * updating all interfaces
  * correcting cuda build I hope
  * testing jit against hires, e5, oregonator
  * adding cuda solver builder test
  * removing hires, e5, oregonator from cuda tests; they need their own kernels
  * testing e5 from a configuration
  * testing e5 jit integration
  * testing e5 properly
  * removing reset of L and U matrices (#594)
  * oregonator from a configuration
  * renaming things
  * using different tolerances?
  * moving state onto and off of host
  * saving gpu changes
  * updating cuda tests
  * adding some better tolerances for cuda tests
  * adding different tolerances for e5
  * adding citation to e5
  * thing
  * formed hires equations
  * using passing tolerances for cpu tests
  * jit tolerances
  * backward euler tests
  * configuration for hires
  * add AddToDiagonal function on sparse matrix
  * use ForEach in Backward Euler
  * add convergence check function to backward euler
  * fix merge problems
  * add vector matrix to analytical solver tests
  * update JIT analytical tests
  * set up general use analytical test function
  * add general function for stiff analytical tests
  * fix jit analytical tests
  * update remaining analytical tests
  * address review comments
  * update cuda analytical tests
  * update tolerances for cuda analytical tests
  Co-authored-by: Kyle Shores
* Auto-format code changes (#605)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* 572 check for singularity when the solver parameters flag is turned on (#603)
  Add tests to check for singularity in the U matrix after the LU decomposition. If the check for singularity flag is turned on, decrease the timestep and try again.
  Fixes a bug where a zero in the bottom right of the U matrix would not have been detected
* Auto-format code changes (#606)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Provide a way to access the processes_ data member (#607)
  return the process_ member
  Co-authored-by: Jian Sun
* Auto-format code changes (#608)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* adding headers
* Auto-format code changes (#609)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Add missing CUDA tests and fix broken path (#611)
  add missing cuda tests and fix broken path
* throwing error on mismatched size (#610)
  * throwing error on mismatched size
  * using a copy of the paramteres so that a builder can be repeatedly used
  * adding const
  * correcting number of tolerances for robertson
* Auto-format code changes (#612)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Correct usage of third body species (#614)
  using the species map to grab the exact same species for reactants and products
* Auto-format code changes (#615)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* correcting solver builder constructor (#616)
  * correcting solver builder constructor
  * fix a bug
  Co-authored-by: Jiwon Gim
* Relax the criteria to pass the GPU test with nvhpc/24.7 on Derecho (#618)
  relax the criteria to pass the GPU test with nvhpc/24.7 on Derecho
* Auto-format code changes (#623)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Update fill function for CUDA matrix (#626)
  * update the fill function for cuda matrix to avoid data transfer
  * fix compilation errors
  * add a comment about template function
  * update fill function for cuda sparse matrix
  * remove gcc11 CI test and add gcc14 CI test
* Auto-format code changes (#627)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Remove data transfer in cuda matrix constructor and template some CUDA functions (#630)
  * remove data transfer in the cuda dense matrix constructor
  * template many cuda functions for cuda dense and sparse matrix
* Auto-format code changes (#633)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Remove redundant variable and optimize the copy assignment for the CUDA matrix (#636)
  * test to remove forcing variables
  * fix broken unit tests
  * fix the bug of calculating forcing term when substepping happens
  * update the copy assignment operator for CUDA matrix
  * fix the broken unit tests again
* Remove local copy of state in solver functions (#639)
  remove local copy of state in solver functions
* Auto-format code changes (#640)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Add CUDA stream for asynchronous kernel launch (#641)
  * add the functions to create & get cuda stream
  * simplify the CUDA dense matrix destructor
  * add cuda stream to cuda matrix functions
  * add cuda stream to process_set.cu
  * add cuda stream to CudaLuDecomposition
  * add cuda stream to CudaLinearSolver
  * set cuda stream in the cublas handle add cuda stream to rosenbrock.cu
  * switch to singleton class for cuda stream manager
  * update the method to get the cuda stream
  * revise the Gtest main function to cleanup the CUDA resources explicitly
  * fix broken cuda analytical test
  * fix GPU memory leak in the unit test
  * clean up unused files
  * fix Kyle's review comment
  * make cudamemset asynchronous
* Auto-format code changes (#645)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Remove the local copy of Jacobian matrix when doing LU decomposition (#646)
  remove the local copy of jaocbian matrix in the LinFactor function
* Auto-format code changes (#647)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Add const to solver functions (#642)
  add const to solver functions
* Replace json to yaml 619 (#649)
  * reaplce
  * json to yaml
  * yamle to JSON
  * test
  * added .string to yaml file
  * added string to loadFile
  * changes based on the PR. modified the code to use YAML file
* Auto-format code changes (#650)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Add const qualifiers (#651)
  add const qualifiers
* Move Yerror construction outside of the inner solve loop for rosenbrock (#652)
  * added error outside of the loop
  * moved the code to all the way to outer while loop
* Auto-format code changes (#653)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Move temporary variables to the State class (#655)
  * add temporary variables in the solver class
  * declare temporary variable in the State class; initialize temporary variable in the solver
  * fix broken units test build
  * rename base class for temporary variables
  * make destructor of base class virtual so that the GPU memory is freed correctly
  * remove unnecessary data member from the solver class
  * add the copy assignment and constructor for the state class
  * add JIT rosenbrock parameter type
  * maybe this fixes the broken JIT tests
  * try is_convertible instead
* Auto-format code changes (#656)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Use CUDA Rosenbrock parameters (#659)
  * use cuda rosenbrock parameters instead
  * use 0 for fill function
* Added license and copyright (#661)
  added copyright
* Auto-format code changes (#660)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* Misc updates (#665)
  * add back the getnumberofreactions function
  * update cuda thread count to 512
* Set LU matrices to zero when jacobian is a zero element (#666)
  * pushing
  * pushing fix
  * removing unneccesary logic check
  * adding cuda stuff
  * lowering tolerance
  * lowering tolerance
  * modified jit ludecomp
  * raising tolerance
  * testing jit and cuda properly
  * raising tolerance
  * raising again
  * again
  * raising again
  * lowering tolerance
  * adding prints to matrices
  * copy LU to host
  * printing A
  * sparsity
  * bernoulli again
  * manual engine
  * double
  * thing
  * printing values
  * larger matrix
  * 2 cells
  * now
  * dense
  * 20
  * 4000
  * things
  * uncomment
  * uncomment
  * print
  * 9
  * 8
  * 6
  * 5
  * 2e-6
  * lu decomp
  * 10
  * 8
  * 0
  * comment
  * checking
  * uncomment
  * 7
  * 1
  * 10
  * print
  * print
  * again
  * 100
  * data check
  * remove check results
  * 13
  * 16
  * eq
  * equal
  * uncomment
  * 1 block
  * 5
  * print
  * 1
  * testing LU decomp specifically
  * trying to correct cuda test
  * lowering
  * lowering tolerance
  * lowering again
  * thing
  * variable
  * all tests pass on derecho
  * setting values to zero for lu decomp
  * defaulting LU to 0 instead of 1e-30
  * copying block values to other blocks
  * removing small value initialization
  * correcting version copyright
  * using absolute error
  * making index once
  * camel case
* Auto-format code changes (#667)
  Auto-format code using Clang-Format
  Co-authored-by: GitHub Actions
* bumping version
Co-authored-by: Jian Sun
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: GitHub Actions
Co-authored-by: Matt Dawson
Co-authored-by: Jiwon Gim
Co-authored-by: Montek Thind
Better version of #663
Closes #625
See the issue for more detail. This fixes a bug where an element of the L or U matrix was sometimes not set to the corresponding value in the Jacobian matrix, so the values in the LU matrices drifted. The drift produced incorrect results, and eventually the solver stopped converging. We noticed this in the performance repository and learned that we couldn't reuse the state matrix.
We started setting the LU matrices to zero back in #410 but later removed that reset in #594, when we thought the problem was only an issue with a vectorized normalized error function. It wasn't.
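To make the drift described above concrete, here is a minimal sketch of how it can arise. It is not MICM's implementation: `Decompose`, `in_A`, `zero_fill_in`, and `FillInAfterTwoPasses` are hypothetical names, and a small dense array with a mask stands in for the sparse matrices. The assumption is a Doolittle-style decomposition that copies a value from A only where A stores an element and then subtracts products in place, so a fill-in position (present in L or U but not in A) keeps whatever the previous decomposition left there unless it is zeroed first.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t N = 3;
using Matrix = std::array<std::array<double, N>, N>;
using Mask = std::array<std::array<bool, N>, N>;

// Doolittle LU decomposition (A = L * U, unit diagonal in L) that accumulates
// into L and U in place. in_A[i][j] marks positions that A actually stores;
// every other position is fill-in. All names here are hypothetical.
void Decompose(const Matrix& A, const Mask& in_A, Matrix& L, Matrix& U, bool zero_fill_in)
{
  for (std::size_t i = 0; i < N; ++i)
  {
    // Row i of U
    for (std::size_t k = i; k < N; ++k)
    {
      if (in_A[i][k])
        U[i][k] = A[i][k];
      else if (zero_fill_in)
        U[i][k] = 0.0;  // the reset; without it the previous value survives
      for (std::size_t j = 0; j < i; ++j)
        U[i][k] -= L[i][j] * U[j][k];
    }
    assert(U[i][i] != 0.0);  // sketch only: no pivoting or singularity handling
    L[i][i] = 1.0;
    // Column i of L
    for (std::size_t k = i + 1; k < N; ++k)
    {
      if (in_A[k][i])
        L[k][i] = A[k][i];
      else if (zero_fill_in)
        L[k][i] = 0.0;
      for (std::size_t j = 0; j < i; ++j)
        L[k][i] -= L[k][j] * U[j][i];
      L[k][i] /= U[i][i];
    }
  }
}

// Decomposes the same matrix twice while reusing L and U, then returns the
// fill-in element U[1][2] (A[1][2] is zero and therefore not stored in A).
double FillInAfterTwoPasses(bool zero_fill_in)
{
  const Matrix A = {{ {2.0, 1.0, 1.0}, {1.0, 3.0, 0.0}, {1.0, 0.0, 4.0} }};
  Mask in_A{};
  for (std::size_t i = 0; i < N; ++i)
    for (std::size_t j = 0; j < N; ++j)
      in_A[i][j] = (A[i][j] != 0.0);

  Matrix L{};
  Matrix U{};
  Decompose(A, in_A, L, U, zero_fill_in);
  Decompose(A, in_A, L, U, zero_fill_in);  // second pass reuses L and U
  return U[1][2];
}
```

Under these assumptions, `FillInAfterTwoPasses(true)` returns -0.5, the correct value of that element, while `FillInAfterTwoPasses(false)` returns -1.0 because the second pass subtracts from the stale -0.5 instead of starting from zero. Each further pass without the reset would move the value again, which matches the kind of drift the description reports when state is reused.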