Skip to content

Comments

Solver: clean up and optimise SolveBySubstitution() and WriteJacobian()#1563

Merged
phkahler merged 8 commits intosolvespace:masterfrom
iscgar:iscgar/system-substitute
May 9, 2025
Merged

Solver: clean up and optimise SolveBySubstitution() and WriteJacobian()#1563
phkahler merged 8 commits intosolvespace:masterfrom
iscgar:iscgar/system-substitute

Conversation

@iscgar
Copy link
Contributor

@iscgar iscgar commented Apr 8, 2025

This PR cleans up and optimises various parts of the solver, especially around SolveBySubstitution() and WriteJacobian(). The latter can result in noticeable speedups on some models (see initial numbers in my comment here).


Original PR description (superseded)

While trying to look into #1247 I was having trouble reproducing the issue, and the model was failing in weird ways (such as multiple constraints not satisfied, causing horizontal and vertical lines to become diagonal).

After staring hard at the code for a long time, I finally traced the issue down to commit 708a08f (merged in #1159) which made changes in order to speed up solving by substitution. However, the changes inadvertently caused substitution to persist across solve runs and across groups, which broke multiple cases.

There are multiple options for fixing it, and I chose to do it by removing the substd member from Param (in order to prevent
such breakage in the future), and replacing the logic with similar runtime complexity logic that doesn't require modifying Param itself.

@iscgar iscgar force-pushed the iscgar/system-substitute branch from 96bc0df to 3fc9bc5 Compare April 8, 2025 16:27
@iscgar
Copy link
Contributor Author

iscgar commented Apr 8, 2025

CI is failing due to spurious 502 errors when cloning the FreeType repository. The job should be retired, but I don't have the permissions to do so. cc @ruevs.

@iscgar iscgar force-pushed the iscgar/system-substitute branch from 3fc9bc5 to 041af9b Compare April 8, 2025 22:06
@phkahler
Copy link
Member

@iscgar Does this actually fix #1247 ? Also, has conflicts - maybe I merged a PR in the wrong order?

@iscgar
Copy link
Contributor Author

iscgar commented Apr 10, 2025

No, it does not fix #1247. I only encountered the problem this PR is intended to fix while trying to debug that other issue.

However, please hold off integrating this PR for now, because I'm currently unable to reproduce the issue. Also, in hindsight my description of it doesn't make much sense, becuase System::SolveBySubstitution() only works on copies of the parameters that are held by the system, so they should not persist across solve runs, and most definitely not across groups.

While I still think this PR brings some measurable improvement (such as reducing memory usage by getting rid of the substd member of Param, which is only ever used by substitution, and better algorithmic complexity for Expr::ParamsUsedList() through the use of std::unordered_set, as well as better algorithmic complexity by avoiding the traversal of the substitution linked list), I want to first try and reproduce the problem that I encountered in order to understand what caused it and ensure that it is actually fixed.

EDIT: although it doesn't matter for now, I tried to check why GitHub is complaining, and I'm not sure what conflicts it is refering to, as this cleanly rebases on master on my end.

@iscgar iscgar marked this pull request as draft April 10, 2025 18:50
@iscgar iscgar force-pushed the iscgar/system-substitute branch 3 times, most recently from 0c41a56 to 13fc47d Compare April 11, 2025 01:04
iscgar added 8 commits April 15, 2025 22:08
This way is faster to search for large param sets, and gets rid of the
custom `List<T>` container.
This way is faster to search for large param sets.
Commit 708a08f made changes in order to
speed up solving by substitution, but still left some performance on the
table by requiring a traversal of substitution links and also doing
unnecessary allocations.

Optimise memory usage by removing the `substd` member from `Param`, since
it's only used when solving by substitution, and make it faster as well
by avoiding redundant work using better data structures.
Previously the entire substitution vector had to be traversed in order
to update slots that pointed to the same substitution. Add tracking of
such slots in order to be able to quickly update only the relevant
ones.
It's only used inside `System::SolveLeastSquares()`, so move it there as
there's no reason to keep it across function calls.
Avoid allocating memory when no one else is holding onto an expression,
and make sure constant folding is performed during the parameter
resolution instead of only before it takes place, as this misses known
parameters which are converted to constants during resolution.

Also reorder the code a bit in order to be able to bail early in case
an exceptional condition is encountered without doing redundant work.
For some reason it was set to 1, even though it is a valid param handle
value, which means that if such a parameter was part of an equation that
was soluble alone it was incorrectly skipped.
@iscgar iscgar force-pushed the iscgar/system-substitute branch from 13fc47d to 1fbd9c8 Compare April 15, 2025 20:26
@iscgar
Copy link
Contributor Author

iscgar commented Apr 15, 2025

I'm still unable to reproduce the issues I was having, so I'll have to chalk it up to a broken build on my machine.

This PR's description is now incorrect, as it turned into a bunch of small cleanups and optimisations, with a focus on SolveBySubstitution and WriteJacobian. I'll update the description to match the new state of things soon, but I'm marking it as ready to review in the mean time, so please take a look and let me know what you think.

Depending on the model used, different parts of the code are exercised, so the overall speedup will vary as well. The following list tallies the improvement in runtime (in release mode with LTO) for two operations: generating groups (with DoF calculation suppressed; depending on model this can exercise either of SolveBySubstitution() and WriteJacobian(), or both), and marking free points (which exercises WriteJacobian()):

  • 8040_finished_constrained
    • Generate
      • Before: ALL(BB)=57ms, ALL=85ms, DIRTY=33ms
      • After: ALL(BB)=52ms, ALL=79ms, DIRTY=31ms
      • Change: insignificant
    • Mark free
      • Before: 927ms
      • After: 710ms
      • Change: -24%
  • case-vents
    • Generate
      • Before: ALL(BB)=36ms, ALL=1247ms, DIRTY=1132ms
      • After: ALL(BB)=33ms, ALL=1243ms, DIRTY=1118ms
      • Change: insignificant
    • Mark free
      • Before: 1323ms
      • After: 1290ms
      • Change: insignificant
  • Chapiteau vue de dessus
    • Generate
      • Before: ALL(BB)=120ms, ALL=129ms, DIRTY=not measured
      • After: ALL(BB)=118ms, ALL=126ms, DIRTY=not measured
      • Change: insignificant
    • Mark free
      • Before: 4985ms
      • After: 3970ms
      • Change: -20%
  • constrainfreeze
    • Generate
      • Before: ALL(BB)=60ms, ALL=61ms, DIRTY=not measured
      • After: ALL(BB)=55ms, ALL=55ms, DIRTY=not measured
      • Change: insignificant
    • Mark free
      • Before: 245ms
      • After: 125ms
      • Change: -49%
  • LeliaManya2
    • Generate
      • Before: ALL=not measured, DIRTY(BB)=493ms, DIRTY=836ms
      • After: ALL=not measured, DIRTY(BB)=491ms, DIRTY=830ms
      • Change: insignificant
    • Mark free
      • Before: 719ms
      • After: 706ms
      • Change: insignificant

All of my generation tests were done with DoF calculation suppression, and yet SolveBySubstitution() has essentially no impact, and the changes which were originally motivated by what I perceived as a bug are now merely a cleanup which may only benefit models with many substituable equations (I was having trouble coming up with such a model, so I'm not sure this is actually useful).

@iscgar iscgar marked this pull request as ready for review April 15, 2025 20:31
@iscgar iscgar changed the title Solver: fix the broken SolveBySubstitution() Solver: clean up and optimise SolveBySubstitution() and WriteJacobian() Apr 15, 2025
@phkahler
Copy link
Member

phkahler commented Apr 16, 2025

@iscgar These look pretty good. I'm not sure I like this one because it adds extra complexity and the solver is rarely a performance bottleneck. How has it been for you to learn the solver code and work with it? Would that change make such work harder?

Unrelated, there has been discussion in the past about solving more than one group at a time. Maybe by adding a checkbox to a group "solve with previous group" which could be chained to solve several together. This is not an existing feature, but I'm wondering how possible you think this might be, and if maybe the unused "tag" was related?

@iscgar
Copy link
Contributor Author

iscgar commented Apr 16, 2025

@iscgar These look pretty good. I'm not sure I like this one because it adds extra complexity and the solver is rarely a performance bottleneck. How has it been for you to learn the solver code and work with it? Would that change make such work harder?

Yeah, I can't say I like it either, but without it SolveBySubstitution() has O(n^2) complexity in case there are many independent substitutions, and I just couldn't let it stand considering the original Evil-Spirit commit message contained the words "linear complexity" :)

I find the solver code quite easy to work with, since it's mostly a direct implementation of the textbook Gauss-Newton algorithm. I initially started looking into SolveBySubstitution() because it was the only exception, and it took me a moment to figure out what it was doing with dragged parameters and cycle breaking. That specific change is still easier to understand IMO than the original implementation, as everything directly points to the target instead of traversing a linked list, but it is definitely more complex than my initial change in the commit preceding it (which is why I added a bit too many comments to explain why the code works).

As I was unable to come up with a model where SolveBySubstitution() would be the bottleneck in the solver, I'm fine with dropping that change if you feel that it isn't worth the added complexity.

Unrelated, there has been discussion in the past about solving more than one group at a time. Maybe by adding a checkbox to a group "solve with previous group" which could be chained to solve several together. This is not an existing feature, but I'm wondering how possible you think this might be, and if maybe the unused "tag" was related?

Solving groups together should be easy enough to implement in the solver, as it's just a matter of using more than a single value in filtering constraints and entities to use for writing the equation system. However, from a UI standpoint it would probably be harder to reason about, with rank test suppression and e.g. bad constraints from multiple groups. I'm curious though: what's the advantage of solving multiple groups together? Is there a specific use case where it can potentially result in a noticeable speedup?

The tag is only an internal solver parameter for separating subsystems, so it's unlikely to have any connection with solving multiple groups, and is just a leftover from the pre-Eigen solver (originally the Jacobian was written inside NewtonSolve(), and this was changed in bab13b8).

@phkahler
Copy link
Member

@iscgar These look pretty good. I'm not sure I like this one because it adds extra complexity and the solver is rarely a performance bottleneck. How has it been for you to learn the solver code and work with it? Would that change make such work harder?

Yeah, I can't say I like it either, but without it SolveBySubstitution() has O(n^2) complexity in case there are many independent substitutions, and I just couldn't let it stand considering the original Evil-Spirit commit message contained the words "linear complexity" :)

Hahaha. I get it ;-) Let's have it do what it says on the box!

I find the solver code quite easy to work with, since it's mostly a direct implementation of the textbook Gauss-Newton algorithm. I initially started looking into SolveBySubstitution() because it was the only exception, and it took me a moment to figure out what it was doing with dragged parameters and cycle breaking. That specific change is still easier to understand IMO than the original implementation, as everything directly points to the target instead of traversing a linked list, but it is definitely more complex than my initial change in the commit preceding it (which is why I added a bit too many comments to explain why the code works).

As I was unable to come up with a model where SolveBySubstitution() would be the bottleneck in the solver, I'm fine with dropping that change if you feel that it isn't worth the added complexity.

I have one model that had a significant improvement with those patches and the inclusion of Eigen. I'll try it with this when I find time.

Unrelated, there has been discussion in the past about solving more than one group at a time. Maybe by adding a checkbox to a group "solve with previous group" which could be chained to solve several together. This is not an existing feature, but I'm wondering how possible you think this might be, and if maybe the unused "tag" was related?

Solving groups together should be easy enough to implement in the solver, as it's just a matter of using more than a single value in filtering constraints and entities to use for writing the equation system. However, from a UI standpoint it would probably be harder to reason about, with rank test suppression and e.g. bad constraints from multiple groups. I'm curious though: what's the advantage of solving multiple groups together? Is there a specific use case where it can potentially result in a noticeable speedup?

One big advantage would be simulating assemblies (mechanisms). Each part in an assembly is a separate group which means you can drag the first part to drive a machine, but you can't drag a later part to move a machine. Well you can, but only the later parts can move. A machine with 1 DoF must be dragged by the first part in the assembly, but it'd be better to grab it wherever is convenient.

Another use would be designing something where an extrusion length may not be known until later parts of the geometry are constructed. Or when designing something out of 8020 extrusions - it'd be great if an entire assembly could be resized later. As it is today you need to make a sketch-in-3D as a first group with key dimensions and then build everything on top of that. It's actually a pretty good way to do it, but a bit abstract I think for new users.

@iscgar
Copy link
Contributor Author

iscgar commented Apr 17, 2025

One big advantage would be simulating assemblies (mechanisms). Each part in an assembly is a separate group which means you can drag the first part to drive a machine, but you can't drag a later part to move a machine. Well you can, but only the later parts can move. A machine with 1 DoF must be dragged by the first part in the assembly, but it'd be better to grab it wherever is convenient.

Another use would be designing something where an extrusion length may not be known until later parts of the geometry are constructed. Or when designing something out of 8020 extrusions - it'd be great if an entire assembly could be resized later. As it is today you need to make a sketch-in-3D as a first group with key dimensions and then build everything on top of that. It's actually a pretty good way to do it, but a bit abstract I think for new users.

I see. The problem with solving multiple groups together in these cases is that it can lead to an explosion of unknowns, slowing down the solver, and making convergence less likely given that the solver is already struggling today with some simple cases like #1247 or when points are dragged too quickly. Also, we artificially limit the number of unknowns to 2048 in order to prevent a slowdown, and when multiple groups are solved together it would be less obvious how to fix the issue if a "too many unknowns" error is encountered.

So while letting the solver handle multiple groups would be easy, I'm not sure it would be useful, and I think we need to at least improve convergence when solving a single group first. Then we need to think about the UI for solving multiple groups (especially in the error cases) before enabling it.

@phkahler
Copy link
Member

@iscgar A linked sketch creates a group with 7 unknowns regardless of the number of entities in the linked sketch. The upper limit of 2048 is arbitrary and can be dropped but things will definitely slow down with huge numbers of constraints. I agree that fixing those solve failures is higher priority.

}

void Expr::ParamsUsedList(std::vector<hParam> *list) const {
void Expr::ParamsUsedList(ParamSet *list) const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you really tested it? There is no "large" param sets actually exist, linear search is the best suitable for this kind of data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admittedly no expression currently contains more than a handful of parameters, and even less duplicated ones in the largest of expressions, but in my testing it made no difference in performance one way or the other even with models that generate a huge amount of equations, and the reason I went with an unordered set was simply because it made the code simpler since it doesn't require an explicit check for duplicates. I'm fine with dropping this change if the maintainers feel uncomfortable with it, but as noted above this was done as a cleanup rather than as an optimisation.

// A list of parameters that are being dragged; these are the ones that
// we should put as close as possible to their initial positions.
List<hParam> dragged;
ParamSet dragged;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, carefully test it. For small dragging lists std::vector is preferrable, but depending on size of the list, custom container can be created (using linear search for small amount of data and switch to hastable when data is huge). The point of using such things - you can drag single point of huge object and it is more often way to drag, but also you can drag a big number of objects, so it is worth to optimize both cases.

Copy link
Contributor Author

@iscgar iscgar May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is again more of a cleanup than an optimisation. I did test it, and while there can be a potential gain with an extremely large amount of dragged parameters, in practice it doesn't make a meaningful difference in the grand scheme of things the way SolveSpace uses it, especially given the fact it's only used when solving by substitution and scaling the parameters, while when solving the bulk of the time is spent inside Eigen on solving and rank testing anyway.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you test constraint-less dragging?

}

// If both ops are known, then we can evaluate immediately
if(n->a->op == Op::CONSTANT && n->b->op == Op::CONSTANT) {
Copy link
Collaborator

@Evil-Spirit Evil-Spirit May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The best way to do folding - is using optimization druring writing equations (test on constants inside Plus/Minus/Times/Etc and return result of operations). So no need to call FoldConstants or OptimizeExpression(or how it named?). Please, see NoteCAD source code (Exp) as reference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is precisely what I'm doing inside DeepCopyWithParamsAsPointers(), indirectly using FoldConstants() with a depth of 1, since the expression is copied from the leaves to the root (thus not missing either existing constants, nor constants introduced during the copy by parameters being resolved as known).

I did take a look at your code in NoteCAD (as well as the Adjacent port to C++) and it is an impressive work, but as you undoubtably know it's not directly applicable to the way expressions are allocated and constructed in SolveSpace. I have a branch where I created an expression system that does optimise during construction (with more simplification rules than just constant evaluation), but it's not complete yet, and in my testing there didn't seem to be many more opportunities for optimisations of the existing expressions in a way that would meaningfully affect performance further than what is currently achievable with my approach in this PR.

@ruevs ruevs added the solver label May 2, 2025
@phkahler phkahler merged commit 8cff14a into solvespace:master May 9, 2025
4 checks passed
@iscgar iscgar deleted the iscgar/system-substitute branch May 9, 2025 13:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants