
Interactive Reweighting for Mitigating Label Quality Issues (2312.05067v1)

Published 8 Dec 2023 in cs.HC

Abstract: Label quality issues, such as noisy labels and imbalanced class distributions, have negative effects on model performance. Automatic reweighting methods identify problematic samples with label quality issues by recognizing their negative effects on validation samples and assigning lower weights to them. However, these methods fail to achieve satisfactory performance when the validation samples are of low quality. To tackle this, we develop Reweighter, a visual analysis tool for sample reweighting. The reweighting relationships between validation samples and training samples are modeled as a bipartite graph. Based on this graph, a validation sample improvement method is developed to improve the quality of validation samples. Since the automatic improvement may not always be perfect, a co-cluster-based bipartite graph visualization is developed to illustrate the reweighting relationships and support the interactive adjustments to validation samples and reweighting results. The adjustments are converted into the constraints of the validation sample improvement method to further improve validation samples. We demonstrate the effectiveness of Reweighter in improving reweighting results through quantitative evaluation and two case studies.
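The core idea of automatic reweighting described in the abstract, scoring each training sample by its effect on validation samples and down-weighting harmful ones, can be sketched with a simple gradient-alignment heuristic. This is a hedged illustration, not the paper's actual method: the function name `reweight_by_validation` and the use of per-sample gradient dot products are assumptions for demonstration, loosely in the spirit of learning-to-reweight approaches.

```python
import numpy as np

def reweight_by_validation(train_grads, val_grads):
    """Weight each training sample by how well its per-sample gradient
    aligns with the mean validation gradient. Samples whose gradients
    conflict with the validation set (e.g. due to noisy labels) get
    weight 0. This is an illustrative sketch, not the paper's algorithm."""
    # Mean gradient direction over the validation samples.
    val_dir = val_grads.mean(axis=0)
    # Alignment score of each training gradient with that direction;
    # conceptually, each score is one edge of the bipartite
    # training-validation reweighting relation.
    scores = train_grads @ val_dir
    # Negative alignment -> likely harmful sample -> clip to zero.
    weights = np.clip(scores, 0.0, None)
    total = weights.sum()
    if total > 0:
        return weights / total          # normalize to sum to 1
    # Fallback: uniform weights if every sample was clipped.
    return np.full(len(train_grads), 1.0 / len(train_grads))
```

Note that if the validation samples themselves are mislabeled, `val_dir` points the wrong way and good samples get suppressed, which is exactly the failure mode that motivates the paper's validation sample improvement step.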

Citations (4)


