fix(core): concurrent access errors with GDCM when accessing to DICOM data from different threads
Description
Running several services that access DICOM fields on the same image but from different workers may crash from time to time.
Clang thread sanitize reports errors with GDCM:
Data race:
Thread 1:
Write of size 8 at 0x7b0c004761f8 by thread T90:
#0 gdcm::Object::Register() /usr/include/gdcm-3.0/gdcmObject.h:73:19 (libsight_data.so.24.0+0x70c761) (BuildId: 9af6565acac7dc76)
#1 gdcm::SmartPointer<gdcm::SequenceOfItems>::Register() /usr/include/gdcm-3.0/gdcmSmartPointer.h:105:26 (libsight_data.so.24.0+0x70c761)
#2 gdcm::SmartPointer<gdcm::SequenceOfItems>::SmartPointer(gdcm::SmartPointer<gdcm::SequenceOfItems> const&) /usr/include/gdcm-3.0/gdcmSmartPointer.h:44:7 (libsight_data.so.24.0+0x70c761)
#3 gdcm::SmartPointer<gdcm::SequenceOfItems> sight::data::detail::SeriesImpl::get_multi_frame_group_sequence<gdcm::Attribute<(unsigned short)20992, (unsigned short)37424, 262144ll, 1>>(unsigned long) const /home/mschweitzer/dev/projects/src/wonder_project/sight/libs/__/data/detail/series_impl.hxx:839:16 (libsight_data.so.24.0+0x70c761)
#4 gdcm::SmartPointer<gdcm::SequenceOfItems> sight::data::detail::SeriesImpl::get_multi_frame_sequence<gdcm::Attribute<(unsigned short)20992, (unsigned short)37424, 262144ll, 1>, gdcm::Attribute<(unsigned short)32, (unsigned short)37137, 262144ll, 1>>(unsigned long) const /home/mschweitzer/dev/projects/src/wonder_project/sight/libs/__/data/detail/series_impl.hxx:853:38 (libsight_data.so.24.0+0x775dc0) (BuildId: 9af6565acac7dc76)
#5 std::optional<gdcm::Attribute<(unsigned short)24, (unsigned short)36980, 64ll, 1>::ArrayType> sight::data::detail::SeriesImpl::get_multi_frame_value<gdcm::Attribute<(unsigned short)20992, (unsigned short)37424, 262144ll, 1>, gdcm::Attribute<(unsigned short)32, (unsigned short)37137, 262144ll, 1>, gdcm::Attribute<(unsigned short)24, (unsigned short)36980, 64ll, 1>>(unsigned long) const /home/mschweitzer/dev/projects/src/wonder_project/sight/libs/__/data/detail/series_impl.hxx:892:42 (libsight_data.so.24.0+0x775b4e) (BuildId: 9af6565acac7dc76)
#6 sight::data::series::get_frame_acquisition_date_time[abi:cxx11](unsigned long) const /home/mschweitzer/dev/projects/src/wonder_project/sight/libs/__/data/series.cpp:2807:37 (libsight_data.so.24.0+0x6dffbd) (BuildId: 9af6565acac7dc76)
#7 sight::data::series::get_frame_acquisition_time_point(unsigned long) const /home/mschweitzer/dev/projects/src/wonder_project/sight/libs/__/data/series.cpp:2859:35 (libsight_data.so.24.0+0x6e03b4) (BuildId: 9af6565acac7dc76)
#8 wonder_project::filter::nn::inference::base::updating() /home/mschweitzer/dev/projects/src/wonder_project/libs/filter/nn/inference/base.cpp:139:91 (libwonder_project_filter_nn.so.0.1+0xb4960) (BuildId: dd2ec892573b57ae)
Thread2:
#0 gdcm::Object::Register() /usr/include/gdcm-3.0/gdcmObject.h:73:19 (libsight_data.so.24.0+0x70c761) (BuildId: 9af6565acac7dc76)
#1 gdcm::SmartPointer<gdcm::SequenceOfItems>::Register() /usr/include/gdcm-3.0/gdcmSmartPointer.h:105:26 (libsight_data.so.24.0+0x70c761)
#2 gdcm::SmartPointer<gdcm::SequenceOfItems>::SmartPointer(gdcm::SmartPointer<gdcm::SequenceOfItems> const&) /usr/include/gdcm-3.0/gdcmSmartPointer.h:44:7 (libsight_data.so.24.0+0x70c761)
#3 gdcm::SmartPointer<gdcm::SequenceOfItems> sight::data::detail::SeriesImpl::get_multi_frame_group_sequence<gdcm::Attribute<(unsigned short)20992, (unsigned short)37424, 262144ll, 1>>(unsigned long) const /home/mschweitzer/dev/projects/src/wonder_project/sight/libs/__/data/detail/series_impl.hxx:839:16 (libsight_data.so.24.0+0x70c761)
#4 gdcm::SmartPointer<gdcm::SequenceOfItems> sight::data::detail::SeriesImpl::get_multi_frame_sequence<gdcm::Attribute<(unsigned short)20992, (unsigned short)37424, 262144ll, 1>, gdcm::Attribute<(unsigned short)32, (unsigned short)37137, 262144ll, 1>>(unsigned long) const /home/mschweitzer/dev/projects/src/wonder_project/sight/libs/__/data/detail/series_impl.hxx:853:38 (libsight_data.so.24.0+0x775dc0) (BuildId: 9af6565acac7dc76)
#5 std::optional<gdcm::Attribute<(unsigned short)24, (unsigned short)36980, 64ll, 1>::ArrayType> sight::data::detail::SeriesImpl::get_multi_frame_value<gdcm::Attribute<(unsigned short)20992, (unsigned short)37424, 262144ll, 1>, gdcm::Attribute<(unsigned short)32, (unsigned short)37137, 262144ll, 1>, gdcm::Attribute<(unsigned short)24, (unsigned short)36980, 64ll, 1>>(unsigned long) const /home/mschweitzer/dev/projects/src/wonder_project/sight/libs/__/data/detail/series_impl.hxx:892:42 (libsight_data.so.24.0+0x775b4e) (BuildId: 9af6565acac7dc76)
#6 sight::data::series::get_frame_acquisition_date_time[abi:cxx11](unsigned long) const /home/mschweitzer/dev/projects/src/wonder_project/sight/libs/__/data/series.cpp:2807:37 (libsight_data.so.24.0+0x6dffbd) (BuildId: 9af6565acac7dc76)
#7 sight::data::series::get_frame_acquisition_time_point(unsigned long) const /home/mschweitzer/dev/projects/src/wonder_project/sight/libs/__/data/series.cpp:2859:35 (libsight_data.so.24.0+0x6e03b4) (BuildId: 9af6565acac7dc76)
#8 wonder_project::module::filter::image::copy_dicom_timestamp::updating() /home/mschweitzer/dev/projects/src/wonder_project/modules/filter/image/copy_dicom_timestamp.cpp:43:44 (libwonder_project_module_filter_image.so+0x50486) (BuildId: 87d86604aad696bc)
one of the hypothesis is the use of.const &
forGDCM::smart_ptr<>
. This should removed and use safe object copy so the reference counter could be increased
After a quick look at the GDCM code, it appears clear that using
GDCM::smart_ptr<>
are not thread safe.GDCM::smart_ptr<>
will increment/decrement the internal object instance counter without any protection, making the the thing crash if two speartesGDCM::smart_ptr<>
operate on the same pointer at the same time.
GDCM author seems to think that GDCM should be thread safe but some part of it isn't:
https://github.com/SimpleITK/SimpleITK/issues/805
Steps to reproduce
This has been reproduced with an application with several frame producer and consumer on different workers. This is quite hard to reproduce, but the added unit test from !1074 (merged) demonstrates the problem.
Functional specifications
- Write a unit test that reproduce the problem
- No crash with multiple concurrent access to DICOM properties in
ImageSeries
Technical specifications
- Use recursive mutex to protect gdcm objects.
- keep const &, since it limit the number of non-thread safe instance counter.
Test plan
- Unit test