SoK: Unintended Interactions among Machine Learning Defenses and Risks (2312.04542v2)

Published 7 Dec 2023 in cs.CR and cs.LG

Abstract: Machine learning (ML) models cannot neglect risks to security, privacy, and fairness. Several defenses have been proposed to mitigate such risks. When a defense is effective in mitigating one risk, it may correspond to increased or decreased susceptibility to other risks. Existing research lacks an effective framework to recognize and explain these unintended interactions. We present such a framework, based on the conjecture that overfitting and memorization underlie unintended interactions. We survey existing literature on unintended interactions, accommodating them within our framework. We use our framework to conjecture on two previously unexplored interactions, and empirically validate our conjectures.
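The abstract's central conjecture is that overfitting and memorization underlie how a defense against one risk changes susceptibility to another. Below is a minimal, illustrative sketch of one facet of that idea, assuming scikit-learn, a synthetic dataset, and a simple loss-threshold membership inference test in the spirit of Yeom et al. (CSF 2018); the dataset, models, and the "defense" proxy (tree-depth regularization) are assumptions for illustration, not the paper's experimental setup.

```python
# Illustrative sketch (not the paper's protocol): the train/test loss gap is
# used as an overfitting proxy, and membership inference success is measured
# with a loss-threshold attack. Model, data, and attack details are assumed.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

def per_sample_loss(model, X, y):
    # Negative log-likelihood of the true label for each sample.
    p = np.clip(model.predict_proba(X)[np.arange(len(y)), y], 1e-12, 1.0)
    return -np.log(p)

for name, model in [
    ("overfit (unbounded depth)", RandomForestClassifier(max_depth=None, random_state=0)),
    ("regularized (depth 3)", RandomForestClassifier(max_depth=3, random_state=0)),
]:
    model.fit(X_tr, y_tr)
    loss_tr = per_sample_loss(model, X_tr, y_tr)
    loss_te = per_sample_loss(model, X_te, y_te)
    gap = loss_te.mean() - loss_tr.mean()  # overfitting proxy (generalization gap)

    # Loss-threshold membership inference: lower loss => guess "member".
    scores = np.concatenate([-loss_tr, -loss_te])
    labels = np.concatenate([np.ones_like(loss_tr), np.zeros_like(loss_te)])
    auc = roc_auc_score(labels, scores)
    print(f"{name}: generalization gap = {gap:.3f}, membership-inference AUC = {auc:.3f}")
```

In this toy setting the more overfit model typically shows both a larger generalization gap and a higher membership-inference AUC, which is the kind of coupling between overfitting and risk that the paper's framework uses to reason about unintended interactions among defenses.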
