High performance on-demand de-identification of a petabyte-scale medical imaging data lake (2008.01827v1)
Published 4 Aug 2020 in cs.DC and cs.PF
Abstract: With the rise of Artificial Intelligence-driven approaches, researchers are requesting unprecedented volumes of medical imaging data, which far exceed the capacity of traditional on-premises client-server approaches for making the data analysis-ready for research. We are making available a flexible solution for on-demand de-identification that combines mature software technologies with modern cloud-based distributed computing techniques to enable faster turnaround in medical imaging research. The solution is part of a broader platform that supports secure, high-performance clinical data science.