Performance improvements#47
Conversation
# Conflicts: # src/ifcconvert/ColladaSerializer.cpp # src/ifcconvert/ColladaSerializer.h # src/ifcgeom/IfcGeomIterator.h # src/ifcgeom/IfcGeomRenderStyles.h
| if (ifcproduct_iterator == ifcproducts->begin() || settings.get(IteratorSettings::USE_WORLD_COORDS)) { | ||
| next_triangulation = new TriangulationElement<P>(*next_shape_model); | ||
| } else { | ||
| next_triangulation = new TriangulationElement<P>(*next_shape_model, current_triangulation->geometry_pointer()); |
There was a problem hiding this comment.
@Stinkfist0 This does not work after merging, because with your take ownership flag current_triangulation == 0. Is this flag still necessary as the min,max bounds are now computed during initialization?
There was a problem hiding this comment.
Let me test a get back to you soon.
There was a problem hiding this comment.
Centering seems to work OK judging by a quick test:
Logger::Status("Creating geometry...");
if (center_model) {
double* offset = serializer->settings().offset;
gp_XYZ center = (context_iterator.bounds_min() + context_iterator.bounds_max()) * 0.5;
offset[0] = -center.X();
offset[1] = -center.Y();
offset[2] = -center.Z();
//printf("Bounds min. (%g, %g, %g)\n", context_iterator.bounds_min().X(), context_iterator.bounds_min().Y(), context_iterator.bounds_min().Z());
//printf("Bounds max. (%g, %g, %g)\n", context_iterator.bounds_min().X(), context_iterator.bounds_min().X(), context_iterator.bounds_min().Z());
printf("Using model offset (%g, %g, %g)\n", offset[0], offset[1], offset[2]); //TODO Logger::Message(Logger::LOG_NOTICE, ...);
}
size_t num_created = 0;
// The functions IfcGeom::Iterator::get() and IfcGeom::Iterator::next()
// wrap an iterator of all geometrical products in the Ifc file.
// IfcGeom::Iterator::get() returns an IfcGeom::TriangulationElement or
// -BRepElement pointer, based on current settings. (see IfcGeomIterator.h
// for definition) IfcGeom::Iterator::next() is used to poll whether more
// geometrical entities are available. None of these functions throw
// exceptions, neither for parsing errors or geometrical errors. Upon
// calling next() the entity to be returned has already been processed, a
// true return value guarantees that a successfully processed product is
// available.
do {
const IfcGeom::Element<double>* geom_object = context_iterator.get();
++num_created;
if (is_tesselated) {
serializer->write(static_cast<const IfcGeom::TriangulationElement<real_t>*>(geom_object));
}
else {
serializer->write(static_cast<const IfcGeom::BRepElement<real_t>*>(geom_object));
}
const int progress = context_iterator.progress() / 2;
if (old_progress != progress) Logger::ProgressBar(progress);
old_progress = progress;
} while (context_iterator.next());
Logger::Status("\rDone creating geometry (" + boost::lexical_cast<std::string>(num_created) +
" objects) ");
serializer->finalize();
delete serializer;
Logger::Status("\rDone serializing geometry ");
|
Ok, removed the option to take ownership again. Maybe something to look into is proper move and copy constructors? In case users still want this. |
|
Yeah, maybe look into this again if/when a use case rises. |
|
I'm running now into issues with one file at least. The progress bar doesn't progress (context_iterator.progress() returns 0 always) even though while debugging I see geometry being created and num_created increasing steadily. I was able to convert this file OK with the master code. Unfortunately I can't share this file but the file consist of hundreds of mostly identical furnitures. I'll debug deeper and get back to you. |
|
Actually, I'm seeing the progress bar move now, I just had to wait couple minutes in order to see the first progress. |
|
Yeah, the conversion time went from 6 minutes (original code) to 30 (this PR) for this particular file (ENABLE_BUILD_OPTIMIZATIONS is off), but file size from OTOH 950 MB to 31 MB :) |
|
Thanks for testing, but yikes, I'm not too happy with the increase in processing time. Would you mind checking whether this is due to disabling the cache? Could you post back the processing time with |
|
Hi. Enabled the caching and the processing time went down to 37 seconds(!). What was the reasoning for disabling cache by default? I've been using this code for many models at work and everything seems to be in order - are you planning to merge this code soon? |
|
Yeah this should be merged soon. Thanks for testing the caching. Caching accounts for a large part of the memory usage and in some of my tests the caching even proved to be detrimental in terms of processing time if there is little re-use, as it can take some time especially for large models to put everything in maps. Alternatively we can arbitrarily trash our cache once in a while (https://github.com/IfcOpenShell/IfcOpenShell/pull/47/files#diff-4a1897ad94043892537a522bdd7fafd5R328) to prevent memory footprint and cache insertion speed to grow out of control. I will do some more tests. Note that this is a stop-gap measure anyway as in the future I hope to do this more precisely by determining re-use by means of counting inverse relationships and depleting cache entries as soon as possible. On the other hand, this is complicated by features such as multi-threading, so we'll have to see how things unfold. |
# Conflicts: # src/ifcexpressparser/implementation.py # src/ifcgeom/IfcGeomFunctions.cpp # src/ifcparse/Ifc4.cpp
|
I see. Cache with runtime configurable memory usage budget could be a nice solution in the long run. |
|
Also, could test and profile using |
That would be a good idea I think. The keys are basically unique Anyway, I did some plots for memory usage for several files and with several caching configurations. Horizontal axis is time in seconds, vertical memory usage in mb. Results do not appear very consistent although trashing the cache after 100 iterations proves to be a reasonable middle ground. Maybe I will have a go with |
|
@Stinkfist0 Using [1] https://github.com/IfcOpenShell/IfcOpenShell/blob/master/src/ifcparse/IfcFile.h#L35 |
|
Great to hear. The classic std::map look-up has always had O(log n) complexity AFAIK. |
|
I seem to have mis-labeled my builds. I can no longer reproduce a performance gain using Trying to find a sensible number for the interval between cache clears doesn't show much of a sweet spot. Even a single iteration is beneficial as apparently within a single representation there is the most re-use, which is imaginable. I think I am going to settle on 64 as I like power of 2s. Needless to say this is very dependent on the type of file, but at least this shows it is not very sensitive to the exact value, as long as the cache is cleared once in a while. So it can be somewhat arbitrary. |
|
Why not provide a option to set the clear_interval of cache. I have a big model containing more than 5000 ifcproduct. And after generating the geom data, 90% geom data are repeated. The default clear_interval 64 is too small. |
|
@vipcxj I think discussion is a bit outdated. There are smarter ways to do this caching as IfcOpenShell could detect how long data needs to be retained. Also there is going to be multi-threading at some point which will likely have a big impact on caching.
If you want to work on this, I'd happily merge a pull request, shouldn't be hard to implement, but I think it's better if we direct our attention to more clever strategies. |








[1]
Only in the cases listed below. Which seems to work for most windows and doors. But might cause instances not to be detected especially in MEP models where pipes are scaled uniformly and styled on a mapped-item level.
[2]
From some early testing it seems that the caching of geometric elements is detrimental for performance. More analysis needed.
[3]
Thought compilers would optimize this, but they don't. Also removed potentially UB by relying on passing -1 as a NULL enumeration value. Using
boost::optionalnow.