Olfactory Label Prediction on Aroma-Chemical Pairs (2312.16124v2)

Published 26 Dec 2023 in cs.LG, physics.chem-ph, and q-bio.QM

Abstract: The application of deep learning techniques on aroma-chemicals has resulted in models more accurate than human experts at predicting olfactory qualities. However, public research in this domain has been limited to predicting the qualities of single molecules, whereas in industry applications, perfumers and food scientists are often concerned with blends of many molecules. In this paper, we apply both existing and novel approaches to a dataset we gathered consisting of labeled pairs of molecules. We present graph neural network models capable of accurately predicting the odor qualities arising from blends of aroma-chemicals, with an analysis of how variations in architecture can lead to significant differences in predictive power.

Summary

  • The paper introduces a deep learning model that predicts olfactory qualities from aroma-chemical pairs using a structured meta-graph dataset.
  • It employs various architectures and identifies a Graph Isomorphism Network as most effective for predicting 33 odor labels with notable accuracy.
  • The study highlights data scarcity challenges while paving the way for improved olfactory prediction models beneficial for perfumers and food scientists.

Introduction

Deep learning has proven effective in several scientific disciplines, including chemistry, where it has been used to predict the olfactory qualities of single molecules. The interactions that arise when aroma-chemicals are blended are of prime interest to perfumers and food scientists, yet they remain much less understood. The paper addresses this gap by applying deep learning to a new dataset of labeled molecule pairs, producing a model that predicts the odor of a two-molecule blend.

Methods and Dataset

To study olfactory label prediction for blends, the authors assembled a dataset from online sources, compiling molecular structures and their associated odor labels and focusing on recommended pairings that produce distinct aromas. They structured the dataset as a meta-graph, in which molecules are linked by the labeled pairs they take part in, and divided it into training and test subsets. The final dataset consists of a larger training set and a smaller test set, filtered so that only pairs with sufficiently frequent labels remain.
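The label filtering and binarization described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the toy records, the `MIN_COUNT` threshold, and the helper names `frequent_labels` and `binarize` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy records: (molecule A SMILES, molecule B SMILES, blend labels).
# These are illustrative only and are not taken from the paper's dataset.
pairs = [
    ("CCO", "CC(=O)O", {"fruity", "sweet"}),
    ("CCO", "CCS", {"alliaceous", "sulfurous"}),
    ("CC(=O)O", "CCS", {"alliaceous", "sour"}),
    ("CCCCO", "CCO", {"fruity", "green"}),
]

# Assumed frequency threshold; the cutoff actually used is not stated here.
MIN_COUNT = 2


def frequent_labels(records, min_count):
    """Return the sorted labels that occur in at least min_count pairs."""
    counts = Counter(label for _, _, labels in records for label in labels)
    return sorted(label for label, n in counts.items() if n >= min_count)


def binarize(records, vocab):
    """Turn each pair's label set into a multi-hot vector over vocab.

    Pairs left with no frequent label are dropped.
    """
    allowed = set(vocab)
    rows = []
    for mol_a, mol_b, labels in records:
        kept = labels & allowed
        if kept:
            rows.append((mol_a, mol_b, [1 if v in kept else 0 for v in vocab]))
    return rows


vocab = frequent_labels(pairs, MIN_COUNT)
dataset = binarize(pairs, vocab)

for mol_a, mol_b, target in dataset:
    print(mol_a, mol_b, dict(zip(vocab, target)))
```

Each pair ends up with one binary target per retained label, which is the form a multi-label classifier needs.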

Model Architecture and Performance

The authors compared several deep learning architectures and found a Graph Isomorphism Network (GIN) to be the most effective for the task. The network produces an embedding for each molecule pair, which is then used to predict the presence or absence of each of 33 olfactory notes as a binary label. Predictive power varied by label: some notes, such as "alliaceous" (garlic), were predicted with high accuracy, while others, such as "earthy", proved harder and point to opportunities for future research.
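To make the idea of a GIN-based pair embedding concrete, here is a minimal pure-Python sketch. It rests on several assumptions that do not come from the paper: a single linear map with ReLU stands in for the MLP of a standard GIN layer, node embeddings are summed for the graph readout, the two molecule embeddings are added to form the pair embedding, and all weights are random rather than trained. The function names and toy molecular graphs are hypothetical.

```python
import math
import random


def gin_layer(features, neighbors, weight, eps=0.0):
    """GIN-style update: h_v' = ReLU(W @ ((1 + eps) * h_v + sum_u h_u)).

    A full GIN layer applies an MLP; one linear map is a simplification.
    """
    updated = []
    for v, h_v in enumerate(features):
        agg = [(1.0 + eps) * x for x in h_v]
        for u in neighbors[v]:
            agg = [a + x for a, x in zip(agg, features[u])]
        updated.append(
            [max(0.0, sum(w * a for w, a in zip(row, agg))) for row in weight]
        )
    return updated


def embed_molecule(features, neighbors, weights):
    """Apply the layers in turn, then sum node embeddings (assumed readout)."""
    h = features
    for weight in weights:
        h = gin_layer(h, neighbors, weight)
    return [sum(column) for column in zip(*h)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_pair(emb_a, emb_b, out_weight):
    """Add the two embeddings (order-invariant, an assumption), then apply
    one sigmoid per label for multi-label prediction."""
    pair = [a + b for a, b in zip(emb_a, emb_b)]
    return [sigmoid(sum(w * x for w, x in zip(row, pair))) for row in out_weight]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy heavy-atom graphs with one-hot atom features over [C, O, S].
ethanol = ([[1, 0, 0], [1, 0, 0], [0, 1, 0]], [[1], [0, 2], [1]])      # C-C-O
ethanethiol = ([[1, 0, 0], [1, 0, 0], [0, 0, 1]], [[1], [0, 2], [1]])  # C-C-S

rng = random.Random(0)
NUM_LABELS = 33  # number of odor labels reported in the summary
layers = [rand_matrix(rng, 4, 3), rand_matrix(rng, 4, 4)]  # untrained weights
head = rand_matrix(rng, NUM_LABELS, 4)

emb_a = embed_molecule(*ethanol, layers)
emb_b = embed_molecule(*ethanethiol, layers)
probs = predict_pair(emb_a, emb_b, head)
print(len(probs), "label probabilities for the blend")
```

Adding the two embeddings makes the prediction independent of the order in which the molecules are given, which suits an unordered blend; other combinations, such as concatenation followed by further layers, are equally plausible.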

Conclusions and Future Work

The paper concludes that the model, designed to predict the non-linear olfactory qualities of aroma-chemical blends, gives promising results and can also be applied to single aroma-chemical prediction. Being able to predict how a blend will be perceived could be directly useful to perfumers and food scientists.

The authors identify the scarcity of well-labeled, publicly available olfactory datasets as a significant limitation for the field. They advocate for new methods to cope with the lack of data and for more publicly shared resources. They present the work as a step toward models that predict continuous labels for chemical blends at varied concentrations, more closely reflecting the nuances of real-life olfactory experience.
