GenoCraft: A Comprehensive, User-Friendly Web-Based Platform for High-Throughput Omics Data Analysis and Visualization (2312.14249v3)

Published 21 Dec 2023 in q-bio.GN and cs.LG

Abstract: The surge in high-throughput omics data has reshaped the landscape of biological research, underlining the need for powerful, user-friendly data analysis and interpretation tools. This paper presents GenoCraft, a web-based comprehensive software solution designed to handle the entire pipeline of omics data processing. GenoCraft offers a unified platform featuring advanced bioinformatics tools, covering all aspects of omics data analysis. It encompasses a range of functionalities, such as normalization, quality control, differential analysis, network analysis, pathway analysis, and diverse visualization techniques. This software makes state-of-the-art omics data analysis more accessible to a wider range of users. With GenoCraft, researchers and data scientists have access to an array of cutting-edge bioinformatics tools under a user-friendly interface, making it a valuable resource for managing and analyzing large-scale omics data. The API with an interactive web interface is publicly available at https://genocraft.stanford.edu/. We also release all the code at https://github.com/futianfan/GenoCraft.
