
BreakMeshByBlock robustness #12033

Closed
1 of 2 tasks
jiangwen84 opened this issue Aug 23, 2018 · 114 comments
Labels
P: normal A defect affecting operation with a low possibility of significant impact.

Comments

@jiangwen84
Contributor

jiangwen84 commented Aug 23, 2018

Rationale

There are two main issues with existing BreakMeshByBlock

  • the interface sideset is not assigned correctly.
  • the connectivity of the elements with new nodes is not correct -> lindsayad comment: I don't believe we can ever get this correct, since in a strict sense the former neighbors are no longer neighbors after node replacement

Description

Resolve these issues in BreakMeshByBlock.

Impact

Improvements and bug fixes.

@lindsayad lindsayad changed the title Connectivity and assigning interface boundary issues in BreakMeshByBlock BreakMeshByBlock robustness Oct 22, 2019
@lindsayad
Member

It seems that BreakMeshByBlock(Generator) suffers from a fair number of robustness issues. Based on @arovinelli's comment here it does not work with recover (at least when using InterfaceKernels?) nor with DistributedMesh. Additionally, according to the test suite it doesn't work in parallel in debug mode either. It would be nice if we could make this object a bit more robust such that it can work in all the traditional MOOSE scenarios.

@lindsayad lindsayad reopened this Oct 22, 2019
lindsayad added a commit to lindsayad/moose that referenced this issue Oct 22, 2019
When run in parallel in debug mode these split mesh tests
fail assertions in libMesh, specifically some node_touched
assertions. BreakMeshByBlockGenerator needs to be made more
robust in the future

Refs idaholab#14124 idaholab#12033
@arovinelli
Contributor

@lindsayad making it work with distributed mesh should be doable, even if a bit cumbersome. For the restart, I believe the major problem is that the mesh connectivity is not saved there, and it might be difficult to reconstruct it when the mesh is already split (if we have large displacements between the two blocks it is going to be impossible). Any idea on how to solve this?


@permcody
Member

permcody commented Nov 7, 2019

Closing this specific issue based on the merged PRs listed.

@permcody permcody closed this as completed Nov 7, 2019
@lindsayad
Member

Err, I don't think this should be closed; #14216 specifically references this issue because it is still an issue.

@lindsayad lindsayad reopened this Nov 7, 2019
@arovinelli
Contributor

@lindsayad and @permcody I need to run some big jobs on a cluster, so I need a fix, at the least to enable the distributed mesh option. I guess the problem here is syncing the different MPI processes so they add nodes and update elements simultaneously. Can you point me to an example doing something similar that I could follow, to simplify my work? Of course, if I can fix some of the issues, this could be merged.
Thanks again

@permcody
Member

@arovinelli - You can look around at the other generators and see which ones have been updated. It can be pretty complex.

Here's a better idea. Instead of trying to get it working with DistributedMesh, I recommend that you just use the "split mesh" capability in MOOSE. The short explanation is that you run a job with a few extra CLI options that will just run the mesh steps (reading it in, generating it, applying transformations, etc) and will write it back out as separate files ready to run for a larger distributed run. The nice part is when you do this, the memory used by your larger run and the startup time will be drastically reduced. Take a look here:

https://www.mooseframework.org/syntax/Mesh/splitting.html
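For example, the workflow looks roughly like this (a command sketch, not verified here; the binary name is whatever your application is called, and the flags are the ones described on the linked splitting page):

```shell
# Run only the mesh setup steps (read, generate, transform) and write
# pre-split files ready for a 16-processor run
./myapp-opt -i input.i --split-mesh 16 --split-file mesh_split

# Later, run the real job against the pre-split mesh
mpiexec -n 16 ./myapp-opt -i input.i --use-split --split-file mesh_split
```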

@arovinelli
Contributor

@permcody thanks for the reply, this is good to know.
One question: would the file written from the mesh split contain the face-face neighbor information, or will it be reconstructed later? If it is reconstructed later, then I need a way to reassign the correct neighbors.

@permcody
Member

face-face neighbor information? I assume you mean the new internal boundaries you are trying to add?

The splitter system is designed to run all mesh setup tasks including running RelationshipManagers. So if you have Generators that add boundaries that information will be saved. If you have objects that require a "wider stencil", those ghost elements should be written. Ideally everything you need should be preserved.

@arovinelli
Contributor

@permcody by face-face neighbor information I mean the elem->face->neighbor_elem information.
BreakMeshByBlockGenerator adds nodes so that neighboring elements on two different blocks no longer share the same nodes. So the mesh is in fact broken, but the elem->face->neighbor_elem information is still there. If the elem->face->neighbor_elem info is not saved by the mesh split, I need to reconstruct it, because otherwise the interface will not work anymore.

arovinelli pushed a commit to arovinelli/moose that referenced this issue Nov 19, 2019
@arovinelli
Contributor

@permcody and @lindsayad
so I tried to use the split mesh tool; however, when trying to use it, things break.
This is the output:

*** Error in `../../../../tensor_mechanics-opt': malloc(): memory corruption (fast): 0x0000000000f35f10 ***
*** Error in `../../../../tensor_mechanics-opt': malloc(): memory corruption (fast): 0x000000000228aff0 ***
*** Error in `../../../../tensor_mechanics-opt': malloc(): memory corruption (fast): 0x00000000024d4270 ***

@permcody
Member

permcody commented Nov 19, 2019 via email

@arovinelli
Contributor

@permcody it will not, as:

BreakMeshByBlockGenerator::BreakMeshByBlockGenerator(const InputParameters & parameters)
  : BreakMeshByBlockGeneratorBase(parameters), _input(getMesh("input"))
{
  if (typeid(_input).name() == typeid(DistributedMesh).name())
    mooseError("BreakMeshByBlockGenerator only works with ReplicatedMesh.");
}

@permcody
Member

permcody commented Nov 19, 2019 via email

@arovinelli
Contributor

@permcody thanks for the reply.
I believe it would be much easier going the other way around: start from a disconnected mesh and join coincident faces (e.g. the ones having nodes sharing the same locations). I'll work on that.

@permcody
Member

I believe it would be much easier going the other way around: start from a disconnected mesh and join coincident faces (e.g. the ones having nodes sharing the same locations).

It shouldn't be more difficult one way or the other. When working with a DistributedMesh in the MeshGenerator system, you'll have access to the full mesh on each processor (well in almost all cases). The way you deal with DistributedMesh is that you inspect the "processor_id" on each element to know whether the current processor owns it or not.

@arovinelli
Contributor

Ok, I didn't know that. I have a couple of questions:

  1. When adding a new node, which processor should be in charge of adding it? The processor owning the element to which the node will belong?
  2. Also, once a node has been added, how will I broadcast this information to all the processors?


@permcody
Member

During the generation phase you want the mesh to be consistent on all processors. All processors will have to add the same Node with the same IDs. If anything gets out of sync, you'll run into problems. I should say that it is possible to generate a DistributedMesh where each processor truly works with only their portion of the mesh, but that's very advanced and we aren't doing that in more than maybe one place in MOOSE.

If you need to perform parallel communication you may do so. However, hopefully since you are doing the same work on all processors you won't need to worry about this step.

@arovinelli
Contributor

If all processors have access to the whole mesh and all processors must do the same thing, why should I be worried about whether a node or element is remote or local? Are there any caveats for remote nodes and elements?


@permcody
Member

If all processors have access to the whole mesh and all processors must do the same thing, why should I be worried about whether a node or element is remote or local? Are there any caveats for remote nodes and elements?

I'll admit that I'm not sure we've fleshed out every single possible case when creating a tree of generators. We are still polishing up this system and trying to create documentation and guidance for everyone to use. My concern is that you might generate a sequence of generators A -> B -> C. It might turn out that you need to call the "prepare_for_use()" method on the Mesh between the stages of the generator because you need to call certain "exploration methods" like retrieving neighboring elements of elements, etc. I'm not an expert at what you can and can't do but many methods aren't available for use until you've done an intermediate prepare. However, with DistributedMesh once you do that, I believe (again need to test and verify) that libMesh will go ahead and throw away remote elements.

Now you get into one of the later stages and you no longer have all of the elements available on all processors. Now if you plan to add new elements/nodes everything gets significantly more complicated. You no longer need to do the same work on all ranks, but you likely can't just do the work on one rank either. An example would be where you were going to add new elements on a free surface right on a processor boundary. Multiple processors would need to work together to add a new element due to ghosting.

For now, we've kind of glossed over some of these really nasty (literal) edge cases and just considered the more normal case of having the full mesh available on all ranks during the generation phase. That'll satisfy nearly every case and, if you are willing to work with Split Mesh, perhaps 100% of cases. Yeah, we are figuring this all out as we go, too. It's all pretty new and you are once again working on the bleeding edge.

For now, I would recommend that you assume you have all information on all ranks and just do the same work everywhere. In the future we will continue to think about how to make this easier for our developers and give them the tools they need to deal with truly Distributed cases.

@lindsayad
Member

However, with DistributedMesh once you do that, I believe (again need to test and verify) that libMesh will go ahead and throw away remote elements.

By default it will, but there is a method to stop this from happening: MeshBase::allow_remote_element_removal

@permcody
Member

By default it will, but there is a method to stop this from happening: MeshBase::allow_remote_element_removal

True, but is this something we want to always turn on during MeshGeneration? Perhaps, but maybe not. These are design decisions that just haven't been made.

arovinelli pushed a commit to arovinelli/moose that referenced this issue Dec 13, 2019
arovinelli pushed a commit to arovinelli/moose that referenced this issue Dec 13, 2019
arovinelli pushed a commit to arovinelli/moose that referenced this issue Dec 13, 2019
still not Distributed
arovinelli pushed a commit to arovinelli/moose that referenced this issue Dec 18, 2019
arovinelli pushed a commit to arovinelli/moose that referenced this issue Dec 18, 2019
arovinelli pushed a commit to arovinelli/moose that referenced this issue Dec 18, 2019
still not Distributed
arovinelli pushed a commit to arovinelli/moose that referenced this issue Jan 6, 2020
arovinelli pushed a commit to arovinelli/moose that referenced this issue Jan 6, 2020
still not Distributed
@lindsayad
Member

The documentation says "Only [manually set id] in parallel if you are manually keeping ids consistent".

@roystgnr where is this documentation?

lindsayad pushed a commit to arovinelli/moose that referenced this issue Mar 18, 2020
lindsayad pushed a commit to arovinelli/moose that referenced this issue Mar 18, 2020
still not Distributed
lindsayad added a commit to lindsayad/libmesh that referenced this issue Mar 19, 2020
Compare with centroids instead of global node ids. This is extremely
useful when comparing side elements that are coincident but may
not actually share the same nodes. See idaholab/moose#12033 for
application and libMesh#2362 for more inspiration. This probably won't
work but let's give it a shot
lindsayad pushed a commit to lindsayad/moose that referenced this issue Mar 20, 2020
lindsayad pushed a commit to lindsayad/moose that referenced this issue Mar 20, 2020
still not Distributed
arovinelli pushed a commit to arovinelli/moose that referenced this issue Jun 1, 2020
@lindsayad
Member

lindsayad commented Jun 23, 2020

The documentation says "Only [manually set id] in parallel if you are manually keeping ids consistent".

@roystgnr where is the documentation guiding when to manually set ids?

@roystgnr
Contributor

It's in the doxygen comments for MeshBase::add_point and MeshBase::add_elem

@roystgnr
Contributor

But that's when; I'm not sure where we've got proper documentation as to how. I remember writing it out at least once, but that might have been on libmesh-users or libmesh-devel.

@lindsayad
Member

lindsayad commented Jun 24, 2020 via email

@aeslaughter aeslaughter added the P: normal A defect affecting operation with a low possibility of significantly affects. label Apr 12, 2021
@lindsayad
Member

@jiangwen84 can you comment or put an 'x' on whether the first bullet in your original post has been fixed? I want to close this issue in favor of a couple of more specific, newer issues, but first I want to know what we've already addressed.

@jiangwen84
Contributor Author

Yes, I fixed that.

@lindsayad
Member

Great, thanks!

@lindsayad
Member

Closing in favor of the more specific issue #21154. I think an issue has gotten too long when GitHub forces us to expand comments.
