Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
133 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
46 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

SCARF: A Biomedical Association Rule Finding Webserver (1709.09850v1)

Published 28 Sep 2017 in cs.DB and q-bio.QM

Abstract: The analysis of enormous datasets with missing data entries is a standard task in biological and medical data processing. Large-scale, multi-institution clinical studies are the typical examples of such datasets. These sets make possible the search for multi-parametric relations since from the plenty of the data one is likely to find a satisfying number of subjects with the required parameter ensembles. Specifically, finding combinatorial biomarkers for some given condition also needs a very large dataset to analyze. For this goal, statistical regression analysis is not the preferred tool of choice, since (i) the {\em a priori} knowledge of the parameter-sets to analyze is missing, and (ii) typically relatively few subjects have the interesting parameter-value ensembles for the analysis. For fast and automatic multi-parametric relation discovery association-rule finding tools are used for more than two decades in the data-mining community. Here we present the SCARF webserver for {\em generalized} association rule mining. Association rules are of the form: $a$ AND $b$ AND ... AND $x \rightarrow y$, meaning that the presence of properties $a$ AND $b$ AND ... AND $x$ implies property $y$; our algorithm finds generalized association rules, since it also finds logical disjunctions (i.e., ORs) at the left-hand side, allowing the discovery of more complex rules in a more compressed form in the database. This feature also helps reducing the typically very large result-tables of such studies, since allowing ORs in the left-hand side of a single rule could include dozens of classical rules. The capabilities of the SCARF algorithm were demonstrated in mining the Alzheimer's database of the Coalition Against Major Diseases (CAMD) in our recent publication (Archives of Gerontology and Geriatrics Vol. 73, pp. 300-307, 2017). Here we describe the webserver implementation of the algorithm.

Summary

We haven't generated a summary for this paper yet.