
Data Lakes, Clouds and Commons: A Review of Platforms for Analyzing and Sharing Genomic Data (1809.01699v2)

Published 5 Sep 2018 in q-bio.GN and cs.CY

Abstract: Data commons collate data with cloud computing infrastructure and commonly used software services, tools and applications to create biomedical resources for the large-scale management, analysis, harmonization, and sharing of biomedical data. Over the past few years, data commons have been used to analyze, harmonize and share large-scale genomics datasets. Data ecosystems can be built by interoperating multiple data commons. It can be quite labor-intensive to curate, import and analyze the data in a data commons. Data lakes provide an alternative to data commons and simply provide access to data, with the data curation and analysis deferred until later and delegated to those who access the data. We review software platforms for managing, analyzing and sharing genomic data, with an emphasis on data commons, but also covering data ecosystems and data lakes.

Authors (1)
  1. Robert L. Grossman (18 papers)
Citations (47)
