
Fitting Prediction Rule Ensembles to Psychological Research Data: An Introduction and Tutorial (1907.05302v5)

Published 11 Jul 2019 in stat.AP

Abstract: Prediction rule ensembles (PREs) are a relatively new statistical learning method that aims to strike a balance between predictive accuracy and interpretability. Starting from a decision tree ensemble, such as a boosted tree ensemble or a random forest, PREs retain a small subset of tree nodes in the final predictive model. These nodes can be written as simple rules of the form if [condition] then [prediction]. As a result, PREs are often much less complex than full decision tree ensembles, while they have been found to provide similar predictive accuracy in many situations. The current paper introduces the methodology and shows how PREs can be fitted using the R package pre through several real-data examples from psychological research. The examples also illustrate a number of features of package pre that may be particularly useful for applications in psychology: support for categorical, multivariate and count responses, application of (non-)negativity constraints, inclusion of confirmatory rules and standardized variable importance measures.
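To make the "if [condition] then [prediction]" form concrete, the following is a minimal Python sketch of how a fitted rule ensemble could produce a prediction: an intercept plus the coefficients of all rules whose conditions hold for a given observation. It is an illustration only and does not use or reproduce the R package pre. The variable names (age, anxiety_score, sleep_hours), the two rules, the coefficients and the intercept are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass
class Rule:
    """One rule: if `condition` holds, add `coefficient` to the prediction."""
    description: str
    condition: Callable[[Row], bool]
    coefficient: float


def predict(intercept: float, rules: List[Rule], row: Row) -> float:
    """Return the intercept plus the coefficients of all rules that apply to `row`."""
    return intercept + sum(r.coefficient for r in rules if r.condition(row))


# Hypothetical rules and coefficients, invented purely for illustration.
RULES = [
    Rule(
        "age > 40 and anxiety_score > 10",
        lambda r: r["age"] > 40 and r["anxiety_score"] > 10,
        2.5,
    ),
    Rule("sleep_hours <= 6", lambda r: r["sleep_hours"] <= 6, 1.0),
]
INTERCEPT = 5.0  # hypothetical baseline prediction

# Only the first rule applies to this observation.
example = {"age": 45, "anxiety_score": 12, "sleep_hours": 7}
print(predict(INTERCEPT, RULES, example))
```

Because each retained rule contributes a single additive term, a reader can see which conditions drove a given prediction, which is the interpretability benefit the abstract describes.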
